Automated Balance Depletion Prediction in Retail Banking

Name
Eldar Hasanov
Abstract
Retail banks employ various solutions and techniques to analyze customer data with the business goal of delivering better service. In general, customer transactions and cash flow can reveal useful information or patterns about a customer’s behavior. One machine learning technique applied to a customer’s cash flow and transactions is balance depletion prediction, which estimates whether a customer will reach a balance of zero, or close to zero, within a given time interval. Balance depletion prediction may support a better economic strategy for customers and help retail banks offer them more effective risk management services. Companies in other sectors have also used such models to identify potential problems in their business and to mitigate adverse outcomes during project development. Although a few studies have analyzed the cash flow of companies, only a limited number have addressed cash flow and balance depletion prediction in retail banking.
Here, we present a case study in which we employ a machine learning solution to build a balance depletion model. Our task is to estimate whether the balance will be depleted after a given prediction window. Our partner financial institution provided datasets that contain a six-month time series of balance records together with data related to the customer and the bank account. Initially, we propose a baseline approach in which we train a LightGBM classifier on the input data. To reduce computational complexity, we integrate two feature selection techniques into the pipeline (Boruta and BoostARoota). Next, to improve model performance, we incorporate three feature engineering techniques: manual, Featuretools and TSFRESH. Each model is evaluated on a real anonymized dataset extracted by the financial institution.
Boruta and BoostARoota do not provide the expected improvement because of the size of the input dataset and the computation time of the algorithms. The feature engineering techniques likewise do not provide a significant improvement over the baseline approach. Feature extraction with TSFRESH is computationally expensive, whereas the other two feature engineering techniques run in a short time.
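To make the prediction task described in the abstract concrete, the following is a minimal sketch of how one training example could be derived from a balance time series. It is an illustration only and not the thesis's actual pipeline: the function names, the threshold parameter, the choice of features, and the reading of "depleted" as "at or below a threshold at any point in the prediction window" are all assumptions made here.

```python
from typing import Dict, Sequence, Tuple


def is_depleted(window: Sequence[float], threshold: float = 0.0) -> bool:
    """Return True if any balance in the window is at or below the threshold.

    Assumption: "zero, or close to zero" is modelled as balance <= threshold.
    """
    return any(balance <= threshold for balance in window)


def make_example(
    series: Sequence[float],
    history_len: int,
    window_len: int,
    threshold: float = 0.0,
) -> Tuple[Dict[str, float], int]:
    """Split a balance series into a history part and a prediction window.

    The history part yields a few hand-written (hypothetical) features;
    the prediction window yields the binary depletion label.
    """
    if history_len <= 0 or window_len <= 0:
        raise ValueError("history_len and window_len must be positive")
    if len(series) < history_len + window_len:
        raise ValueError("series is too short for the requested split")

    history = series[:history_len]
    window = series[history_len:history_len + window_len]

    # Illustrative manual features computed from the observed history only.
    features = {
        "last_balance": float(history[-1]),
        "min_balance": float(min(history)),
        "mean_balance": sum(history) / len(history),
        "net_change": float(history[-1] - history[0]),
    }
    label = int(is_depleted(window, threshold))
    return features, label
```

A classifier such as the LightGBM baseline would then be trained on many such feature and label pairs, one per account. Computing the features strictly from the history part keeps information from the prediction window out of the model inputs.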
Graduation Thesis language
English
Graduation Thesis type
Master - Software Engineering
Supervisor(s)
Marlon Dumas
Defence year
2019
 