
Scalable and Imbalance-Resistant Machine Learning Models for Anti-money Laundering: A Two-Layered Approach

Conference paper in: Enterprise Applications, Markets and Services in the Finance Industry (FinanceCom 2020)

Abstract

In this paper, we address the problem of detecting potentially illicit behavior in the context of Anti-Money Laundering (AML). We specifically address two requirements that arise when training machine learning models for AML: scalability and imbalance-resistance. By scalability we mean the ability to train the models on very large transaction datasets. By imbalance-resistance we mean the ability of the model to achieve suitable accuracy despite high class imbalance, i.e. the low number of instances of potentially illicit behavior relative to the number of instances of non-illicit behavior. We propose a two-layered modelling concept. The first layer consists of a Logistic Regression model with simple features, which can be computed with low overhead. These features capture customer profiles as well as global aggregates of transaction histories. This layer filters out a proportion of customers whose activity patterns can be deemed non-illicit with high confidence. In the second layer, a gradient boosting model with complex features is used to classify the remaining customers. We argue that this two-layered approach fulfills the stated requirements. Firstly, feature extraction is more scalable because the more computationally demanding features of the second layer do not need to be extracted for every customer. Secondly, the first layer acts as an undersampling method for the second layer, thus partially addressing the class imbalance. We validate the approach using a real dataset of customer profiles and transaction histories, together with labels provided by AML experts.
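As an illustration of this two-layered concept, the following Python sketch shows one possible way to wire a cheap first-layer logistic regression filter to a CatBoost second layer. It is a minimal sketch, not the authors' pipeline: the feature matrices, the 0.05 filtering threshold, and the helper names are hypothetical.

# Hypothetical sketch of the two-layered cascade described in the abstract.
# Assumes scikit-learn and catboost; inputs are numpy arrays, and the
# threshold value and variable names are illustrative only.
import numpy as np
from sklearn.linear_model import LogisticRegression
from catboost import CatBoostClassifier

def train_two_layer(X_simple, X_complex, y, threshold=0.05):
    """Layer 1: cheap logistic regression on simple features filters out
    customers deemed non-illicit with high confidence.
    Layer 2: CatBoost on complex features classifies the remaining customers."""
    layer1 = LogisticRegression(max_iter=1000, class_weight="balanced")
    layer1.fit(X_simple, y)

    # Keep only customers whose estimated probability of being potentially
    # illicit exceeds the threshold; the rest are filtered out as non-illicit.
    p_illicit = layer1.predict_proba(X_simple)[:, 1]
    keep = p_illicit >= threshold

    layer2 = CatBoostClassifier(iterations=500, verbose=False)
    layer2.fit(X_complex[keep], y[keep])
    return layer1, layer2, threshold

def predict_two_layer(layer1, layer2, threshold, X_simple, X_complex):
    p_illicit = layer1.predict_proba(X_simple)[:, 1]
    keep = p_illicit >= threshold
    y_pred = np.zeros(len(X_simple), dtype=int)  # default: non-illicit
    if keep.any():
        y_pred[keep] = layer2.predict(X_complex[keep]).astype(int).ravel()
    return y_pred

Note that, in this sketch, the second-layer (complex) features are assumed to be available for all customers; in practice the point of the cascade is that they would only be extracted for the customers retained by the first layer.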


Notes

  1. We also conducted experiments using another implementation of extreme gradient boosting (XGBoost), but these experiments consistently led to lower accuracy. In the evaluation reported below, we only include results obtained using CatBoost.

References

  1. Tsui, E., Gao, S., Xu, D., Wang, H., Green, P.: Knowledge-based anti-money laundering: a software agent bank application. J. Knowl. Manage. (2009)

  2. Breslow, S., Hagstroem, M., Mikkelsen, D., Robu, K.: The new frontier in anti-money laundering. McKinsey Insights, November 2017. https://www.mckinsey.com/business-functions/risk/our-insights/the-new-frontier-in-anti-money-laundering

  3. Kotsiantis, S., Koumanakos, E., Tzelepis, D., Tampakas, V.: Forecasting fraudulent financial statements using data mining. Int. J. Comput. Intell. 3(2), 104–110 (2006)

  4. Jayasree, V., Siva Balan, R.V.: Money laundering regulatory risk evaluation using bitmap index-based decision tree. J. Assoc. Arab Univ. Basic Appl. Sci. 23(1), 96–102 (2017)

  5. Nielsen, D.: Tree boosting with XGBoost - why does XGBoost win “every” machine learning competition? Master’s Thesis, NTNU (2016)

  6. Palshikar, G.K., Apte, M.: Financial security against money laundering: a survey. In: Emerging Trends in ICT Security, pp. 577–590. Morgan Kaufmann (2014)

  7. Senator, T.E., et al.: Financial crimes enforcement network AI system (FAIS) identifying potential money laundering from reports of large cash transactions. AI Mag. 16(4), 21 (1995)

  8. Chen, Z., Teoh, E.N., Nazir, A., Karuppiah, E.K., Lam, K.S.: Machine learning techniques for anti-money laundering (AML) solutions in potentially suspicious transaction detection: a review. Knowl. Inf. Syst. 57(2), 245–285 (2018)

  9. Helmy, T.H., Zaki, M., Salah, T., Badran, K.: Design of a monitor for detecting money laundering and terrorist financing. J. Theoret. Appl. Inf. Technol. 85(3), 425 (2016)

  10. Chen, Y.T., Mathe, J.: Fuzzy computing applications for anti-money laundering and distributed storage system load monitoring (2011)

  11. Cortinas, R., et al.: Secure failure detection and consensus in TrustedPals. IEEE Trans. Dependable Secure Comput. 9(4), 610–625 (2012)

  12. Phua, C., Smith-Miles, K., Lee, V., Gayler, R.: Resilient identity crime detection. IEEE Trans. Knowl. Data Eng. 24(3), 533–546 (2010)

  13. Liou, F.M.: Fraudulent financial reporting detection and business failure prediction models: a comparison. Manage. Audit. J. (2008)

  14. tej.com.tw

  15. Lopez-Rojas, E.A., Axelsson, S.: Money laundering detection using synthetic data. In: The 27th Annual Workshop of the Swedish Artificial Intelligence Society (SAIS), Örebro, Sweden, 14–15 May 2012, no. 071, pp. 33–40. Linköping University Electronic Press, May 2012

  16. Prokhorenkova, L., Gusev, G., Vorobev, A., Dorogush, A.V., Gulin, A.: CatBoost: unbiased boosting with categorical features. In: Advances in Neural Information Processing Systems, pp. 6638–6648 (2018)

  17. Leontjeva, A., Goldszmidt, M., Xie, Y., Yu, F., Abadi, M.: Early security classification of Skype users via machine learning. In: Proceedings of the 2013 ACM Workshop on Artificial Intelligence and Security, pp. 35–44. ACM, November 2013

  18. Chawla, N.V., Bowyer, K.W., Hall, L.O., Kegelmeyer, W.P.: SMOTE: synthetic minority over-sampling technique. J. Artif. Intell. Res. 16, 321–357 (2002)

  19. Wilson, D.L.: Asymptotic properties of nearest neighbor rules using edited data. IEEE Trans. Syst. Man Cybern. 3, 408–421 (1972)

  20. Kursa, M.B., Jankowski, A., Rudnicki, W.R.: Boruta - a system for feature selection. Fundamenta Informaticae 101(4), 271–285 (2010)


Acknowledgements

This research was partly funded by the European Regional Development Funds via Archimedes Foundation (NUTIKAS programme).

Author information


Corresponding author

Correspondence to Pavlo Tertychnyi.


A Appendix

Sequence-Based Features Calculation. We assume that the sequences of a customer's transactions are not random but follow some hidden structure. We therefore encode this information for the model via so-called generative log-odds features [17]: we estimate transition probabilities between transaction states separately for potentially illicit and non-illicit customers and then compare them. This approach allows us to capture the dynamics of the transaction history for our classification task while introducing less overhead than methods based on neural networks (e.g. Boltzmann machines) or deep-learning auto-encoders. In the log-odds feature extraction method, we want to generate features based on sequential probabilities. We are interested in the following probability:

$$\begin{aligned} P(X)= P(x_1,x_2,\ldots ,x_n ) \end{aligned}$$
(1)

where \(x_1,x_2,\ldots ,x_n\) are some discrete properties of transactions (e.g. the transaction direction). One particular way to estimate this probability is to use the chain rule:

$$\begin{aligned} P(X)=P(x_1,\ldots ,x_n)=p(x_1)p(x_2 \mid x_1)\ldots p(x_n \mid x_1,\ldots ,x_{n-1}) \end{aligned}$$
(2)

Computing this full factorization is often practically infeasible, so we simplify it using the Markov property:

$$\begin{aligned} P(X_n=x_n \mid X_{n-1}=x_{n-1},\ldots ,X_0=x_0)=P(X_n=x_n \mid X_{n-1}=x_{n-1}) \end{aligned}$$
(3)
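
For instance, for a sequence of transaction directions (in, out, out), the Markov assumption reduces the chain rule to

$$\begin{aligned} P(in,out,out)\approx P(in)\,P(out \mid in)\,P(out \mid out) \end{aligned}$$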

For our task, however, we are more interested in determining whether a particular sequence of transactions is more likely to be potentially illicit than non-illicit. Mathematically, we want to estimate:

$$\begin{aligned} \mathop {{{\,\mathrm{argmax}\,}}}\limits _{y\,\in \,\{potentially\,illicit,\,non-illicit\}} P(Y=y \mid X) \end{aligned}$$
(4)

One way to calculate this probability is to use Bayes' theorem:

$$\begin{aligned} \mathop {{{\,\mathrm{argmax}\,}}}\limits _y P(Y=y \mid X) = \mathop {{{\,\mathrm{argmax}\,}}}\limits _y P(X \mid Y=y)P(Y=y) \end{aligned}$$
(5)

It remains to estimate \( P(X \mid Y=y)\) and \( P(Y=y)\). \(P(X \mid Y=y)\) is estimated from the training set by computing the transition probabilities separately for the potentially illicit class and the non-illicit class. For example, if there are only two states in a transaction sequence, namely in and out, then estimating a transition probability amounts to calculating

$$\begin{aligned} P(in \mid out) = \frac{(count(out \rightarrow in))}{(count(out))} \end{aligned}$$
(6)

The same is done for the other combinations of in and out. \( P(Y=y)\) is the prior probability of being potentially illicit, which is simply the proportion of potentially illicit customers in the full training set. Finally, instead of outputting a binary label 1/0 (potentially illicit sequence or not), we plug this in as a feature of the classifier along with the other features. Rather than a binary feature, we use the so-called log-odds ratio, defined as:

$$\begin{aligned} \log \frac{P(Y=potentially\,illicit \mid X)}{P(Y=non-illicit \mid X)} \end{aligned}$$
(7)
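
The following Python sketch illustrates how such a log-odds feature could be computed from Eqs. (1)-(7). It is an illustration rather than the authors' implementation: sequences are assumed to be lists of discrete states such as "in"/"out" (following the example above), the initial-state term is dropped, and the add-one smoothing of transition counts is our own assumption to avoid zero probabilities.

# Illustrative sketch of the generative log-odds feature; assumptions as
# stated in the lead-in (state alphabet, smoothing, dropped initial term).
from collections import defaultdict
from math import log

def fit_transitions(sequences, states=("in", "out"), alpha=1.0):
    """Estimate first-order transition probabilities P(next | prev),
    as in Eq. (6), with add-one (alpha) smoothing."""
    counts = defaultdict(float)
    totals = defaultdict(float)
    for seq in sequences:
        for prev, nxt in zip(seq, seq[1:]):
            counts[(prev, nxt)] += 1.0
            totals[prev] += 1.0
    return {(p, n): (counts[(p, n)] + alpha) / (totals[p] + alpha * len(states))
            for p in states for n in states}

def sequence_log_prob(seq, trans):
    """Markov approximation of log P(X), Eq. (3), up to the initial-state term."""
    return sum(log(trans[(prev, nxt)]) for prev, nxt in zip(seq, seq[1:]))

def log_odds_feature(seq, trans_illicit, trans_licit, prior_illicit):
    """log-odds ratio of Eq. (7), expanded via Bayes' theorem as in Eq. (5)."""
    return (sequence_log_prob(seq, trans_illicit) + log(prior_illicit)
            - sequence_log_prob(seq, trans_licit) - log(1.0 - prior_illicit))

# Usage: fit one transition table per class on the training sequences, then
# feed the resulting log-odds value to the classifier as an extra feature.
# trans_illicit = fit_transitions(illicit_seqs)
# trans_licit = fit_transitions(licit_seqs)
# feature = log_odds_feature(customer_seq, trans_illicit, trans_licit, prior)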


Copyright information

© 2020 Springer Nature Switzerland AG

About this paper


Cite this paper

Tertychnyi, P., Slobozhan, I., Ollikainen, M., Dumas, M. (2020). Scalable and Imbalance-Resistant Machine Learning Models for Anti-money Laundering: A Two-Layered Approach. In: Clapham, B., Koch, JA. (eds) Enterprise Applications, Markets and Services in the Finance Industry. FinanceCom 2020. Lecture Notes in Business Information Processing, vol 401. Springer, Cham. https://doi.org/10.1007/978-3-030-64466-6_3


  • DOI: https://doi.org/10.1007/978-3-030-64466-6_3

  • Published:

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-64465-9

  • Online ISBN: 978-3-030-64466-6

  • eBook Packages: Computer Science (R0)
