Cash flow prediction of a bank deposit using scalable graph analysis and machine learning
Abstract
Predicting a bank's cash flow is an important task: it bears directly on liquidity risk, and it is regulated by financial authorities. Graph analysis of bank transaction data is a promising way to improve the prediction, but the size, scale-free nature, and varied attributes of such data make the task challenging. In this paper, we propose a graph-based machine learning method for the cash flow prediction task. Our contributions are as follows. (i) We introduce an extensible and scalable shared-memory parallel graph analysis platform that supports the vertex-centric, bulk synchronous parallel programming paradigm. (ii) We introduce two novel graph features built on the platform: (ii-a) an internal money flow feature based on a Markov process approximation, and (ii-b) an anomaly score feature derived from other graph features. The proposed method is evaluated on real bank transaction data. The proposed graph features reduce the error of a long-term (31-day) cash flow prediction by 56% relative to a non-graph-based time-series prediction model. The graph analysis platform computes graph features from a graph with 10 × 10^6 nodes and 593 × 10^6 edges in 2 hours 20 minutes.
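To make the vertex-centric, bulk synchronous parallel (BSP) paradigm and the Markov-process view of money flow concrete, the following is a minimal single-threaded sketch and not the platform or the feature definition used in this paper. It assumes, purely for illustration, that transition probabilities are proportional to the amounts transferred along each edge; the function name `bsp_markov_flow` and the toy accounts are hypothetical.

```python
from collections import defaultdict


def bsp_markov_flow(edges, initial, supersteps):
    """Propagate mass over a transaction graph, one Markov step per superstep.

    edges:   list of (src, dst, amount) transfers (illustrative input format)
    initial: dict mapping vertex -> initial mass
    """
    # Assumption: transition probability = share of a vertex's outgoing amount.
    out_total = defaultdict(float)
    for src, _, amount in edges:
        out_total[src] += amount
    out_edges = defaultdict(list)
    for src, dst, amount in edges:
        out_edges[src].append((dst, amount / out_total[src]))

    state = dict(initial)
    for _ in range(supersteps):
        inbox = defaultdict(float)
        # Compute phase: each vertex reads only its own state and out-edges,
        # then sends messages to its neighbours.
        for vertex, mass in state.items():
            targets = out_edges.get(vertex)
            if not targets:
                # A vertex without outgoing transfers keeps its mass.
                inbox[vertex] += mass
                continue
            for dst, prob in targets:
                inbox[dst] += mass * prob
        # Barrier: messages become visible only in the next superstep.
        state = {v: inbox.get(v, 0.0) for v in set(state) | set(inbox)}
    return state


# Toy graph: account "a" pays "b" and "c"; "b" pays "c".
toy_edges = [("a", "b", 30.0), ("a", "c", 10.0), ("b", "c", 5.0)]
toy_start = {"a": 1.0, "b": 0.0, "c": 0.0}
after_one = bsp_markov_flow(toy_edges, toy_start, 1)
after_two = bsp_markov_flow(toy_edges, toy_start, 2)
```

Because every vertex touches only its own state and its outgoing messages within a superstep, the inner loop could in principle be split across threads, with the barrier as the only synchronization point; the sketch runs it sequentially for clarity.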