Bank Scan: Your personal financial advisor
Developing an AI Fintech application to analyze and determine the financial health of banking clients
In a world where 2–3 billion people are underbanked, including 25% of households in the United States, the need for intelligent banking analytics is hard to overstate. For those unfamiliar with the term, an ‘underbanked’ individual is someone who has access to a bank account but only limited financial services and insights. For many of us millennials especially, the absence of a solid financial advisor, such as Apple Pay, can see us run through our wealth at warp speed.
Not everyone has access to advanced analytics provided by their bank or smartphone, but most of us have access to some form of bank statement. While current analytics services have the luxury of interfacing directly with your bank account and acquiring the information first-hand, a third-party service needs to quite literally read and digest the information printed on your bank statement. This is where a Fintech application with artificial intelligence can step in and add value for the underbanked client. And this is what the AI Fintech application, hereafter referred to as ‘Bank Scan’, will do.
Bank Scan is an application that reads your bank statement, calculates your total credit, total debit and balance, and even works out which categories you have been spending your money in. Finally, it produces a ‘financial health score’ that indicates how healthy your finances are. All of this is done with techniques such as natural language processing, data mining and, most importantly, artificial intelligence. Several Python packages are used in this program to extract data from your bank statement, work that would otherwise have been carried out by a human. Specifically, Tabula extracts the tabular data from your bank statement and feeds it into a Pandas dataframe, while spaCy takes care of the natural language processing: it extracts the human-readable text from the document and uses AI to find items such as name and currency.
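As a rough illustration of this extraction step, the sketch below wraps the two libraries in small helper functions. It is not the application's actual code: the function names, the `en_core_web_sm` model and the `to_amount` clean-up helper are assumptions made for this example. The sketch also assumes the tabula-py package (which needs a Java runtime) and a downloaded spaCy English model.

```python
def read_statement_tables(pdf_path):
    """Return every table Tabula detects in the PDF as a list of pandas DataFrames."""
    import tabula  # from the tabula-py package; imported lazily because it needs Java

    return tabula.read_pdf(pdf_path, pages="all")


def find_names_and_amounts(text):
    """Return (text, label) pairs for the people and money amounts spaCy recognizes."""
    import spacy

    # Assumption: the small English model; any model with an NER component would do.
    nlp = spacy.load("en_core_web_sm")
    doc = nlp(text)
    return [(ent.text, ent.label_) for ent in doc.ents
            if ent.label_ in {"PERSON", "MONEY"}]


def to_amount(cell):
    """Hypothetical helper: turn an extracted cell such as '1,234.50' into a float.

    Empty or missing cells are treated as 0.0.
    """
    text = str(cell).replace(",", "").strip()
    if text in ("", "nan", "None"):
        return 0.0
    return float(text)
```

With the amount columns cleaned in this way, totals such as credit and debit reduce to simple column sums on the dataframe.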
Once the content of the bank statement is in a dataframe, computing the total credit, debit and balance is fairly simple. What is less trivial is determining which categories the credit and debit transactions fall into. For this purpose, a bag-of-words approach was used to analyze the description of each transaction and work out which category the item belongs to. For instance, if the descriptor has any of the following words it will be classified as spending on food:
A dictionary of hundreds of stemmed and lemmatized words was formed for each of the following spending categories to classify each transaction.
Likewise, any credits in the bank statement were categorized in the same manner, using the following classifications.
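The keyword-matching idea described above can be sketched in a few lines of Python. The word lists here are short illustrative placeholders, not the application's actual dictionaries, and the sketch skips stemming and lemmatization for brevity. The names `CATEGORY_KEYWORDS`, `tokenize` and `categorize` are made up for this example.

```python
import re

# Illustrative placeholder keywords; the real dictionaries hold hundreds of
# stemmed and lemmatized words per category.
CATEGORY_KEYWORDS = {
    "food": {"restaurant", "cafe", "grocery", "pizza"},
    "housing": {"rent", "mortgage"},
    "transport": {"fuel", "taxi", "parking"},
}


def tokenize(description):
    """Lower-case the description and return its alphabetic words as a set."""
    return set(re.findall(r"[a-z]+", description.lower()))


def categorize(description, keywords=CATEGORY_KEYWORDS):
    """Return the category whose keyword set overlaps the description the most.

    Falls back to 'other' when no keyword matches at all.
    """
    tokens = tokenize(description)
    best_category, best_hits = "other", 0
    for category, words in keywords.items():
        hits = len(tokens & words)
        if hits > best_hits:
            best_category, best_hits = category, hits
    return best_category
```

For example, `categorize("POS PURCHASE JOE'S PIZZA 0423")` matches the word "pizza" and returns `"food"`, while a description with no known keywords falls into `"other"`.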
The credit/debit of the account is visualized as follows, where blue and red bars denote credit and debit respectively.
Subsequently, the chart is broken down into each of the aforementioned credit/debit categories, whereby each classification is color coded.
Financial Health Score
The spending in each category is scored linearly based on how much under- or overspending is detected. Overspending is penalized while underspending is not, and the threshold for each category was loosely determined from an aggregate of financial pundits’ advice on spending. For example, you should spend no more than 25–35% on housing and no more than 10–15% on food, as a percentage of your total spending.
To compute the financial health score, I first calculated a ‘debit score’ that combines your scores for all eleven spending categories; this score is given a weight of 50%.
Then I calculated the ‘balance score’, which takes into consideration whether you have surplus or deficit spending and normalizes this based on your account’s starting balance, as shown in the relation below. The balance score is likewise given a weight of 50%.
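One way to picture this kind of linear scoring is sketched below. The limits use the upper ends of the ranges quoted above. The 0–100 scale, and the choice to let the score fall to zero at twice the limit, are assumptions made for this example and are not taken from the application.

```python
# Upper ends of the ranges quoted in the text, as fractions of total spending.
LIMITS = {"housing": 0.35, "food": 0.15}


def category_score(share, limit):
    """Score one category on an assumed 0-100 scale.

    Spending at or below the limit is not penalized. Above the limit, the
    score falls linearly and reaches 0 at twice the limit (assumed slope).
    """
    if share <= limit:
        return 100.0
    overshoot = (share - limit) / limit
    return max(0.0, 100.0 * (1.0 - overshoot))


def score_spending(spending, limits=LIMITS):
    """Score each limited category from a mapping of category to amount spent."""
    total = sum(spending.values())
    if total == 0:
        return {category: 100.0 for category in limits}
    return {
        category: category_score(spending.get(category, 0.0) / total, limit)
        for category, limit in limits.items()
    }
```

Under these assumptions, someone spending half of their total outgoings on housing is penalized on that category, while spending a tenth on food keeps the full food score.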
Subsequently, the ‘debit score’ and ‘balance score’ are combined as follows to give you an overall ‘financial health score’.
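The weighting described above amounts to a simple weighted sum, sketched here. The equal 50% weights come from the text. The simple mean used to combine the category scores is an assumption, since the exact combination rule is not spelled out, and the balance score is taken as a ready-made input rather than derived here.

```python
def debit_score(category_scores):
    """Combine per-category scores into one debit score.

    Assumption: a simple mean over the categories.
    """
    scores = list(category_scores.values())
    return sum(scores) / len(scores) if scores else 0.0


def financial_health_score(debit, balance, debit_weight=0.5, balance_weight=0.5):
    """Weighted sum of the debit and balance scores (50% each by default)."""
    return debit_weight * debit + balance_weight * balance
```

For instance, category scores of 100 and 50 give a debit score of 75, and combining that with a balance score of 25 yields an overall score of 50.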
The beta version of this application is complete and can be downloaded at https://github.com/mkhorasani/Bank_Scan. While it goes a long way towards addressing the needs of underbanked clients, it is not exhaustive and remains a work in progress. By releasing it as an open source service and providing access to the source code, I hope to take a collaborative and iterative approach to enhancing this product and to turn it into a web app in the near future.