2. Bibliography

AA17

Aderemi O Adewumi and Andronicus A Akinyelu. A survey of machine-learning and nature-inspired based credit card fraud detection techniques. International Journal of System Assurance Engineering and Management, 8(2):937–953, 2017.

AHJ+20

Ayman Alazizi, Amaury Habrard, François Jacquenet, Liyun He-Guelton, and Frédéric Oblé. Dual sequential variational autoencoders for fraud detection. In IDA, 14–26. 2020.

AFR97

Emin Aleskerov, Bernd Freisleben, and Bharat Rao. CARDWATCH: a neural network based database mining system for credit card fraud detection. In Proceedings of the IEEE/IAFE 1997 Computational Intelligence for Financial Engineering (CIFEr), 220–226. IEEE, 1997.

ASH+19

Haseeb Ali, Mohd Najib Mohd Salleh, Kashif Hussain, Arshad Ahmad, Ayaz Ullah, Arshad Muhammad, Rashid Naseem, and Muzammil Khan. A review on data preprocessing methods for class imbalance problem. International Journal of Engineering & Technology, 8:390–397, 2019.

AC15

Jinwon An and Sungzoon Cho. Variational autoencoder based anomaly detection using reconstruction probability. Special Lecture on IE, 2(1):1–18, 2015.

BCB14

Dzmitry Bahdanau, Kyunghyun Cho, and Yoshua Bengio. Neural machine translation by jointly learning to align and translate. arXiv preprint arXiv:1409.0473, 2014.

BASO16

Alejandro Correa Bahnsen, Djamila Aouada, Aleksandar Stojanovic, and Björn Ottersten. Feature engineering strategies for credit card fraud detection. Expert Systems with Applications, 51:134–142, 2016.

Ban20

European Central Bank. 6th report on card fraud. August 2020. [Online; Last consulted 09-October-2020]. URL: https://www.ecb.europa.eu/pub/cardfraud/html/ecb.cardfraudreport202008~521edb602b.en.html#toc2.

BBM+03

Gustavo EAPA Batista, Ana LC Bazzan, Maria Carolina Monard, and others. Balancing training data for automated annotation of keywords: a case study. In WOB, 10–18. 2003.

BPM04

Gustavo EAPA Batista, Ronaldo C Prati, and Maria Carolina Monard. A study of the behavior of several methods for balancing machine learning training data. ACM SIGKDD explorations newsletter, 6(1):20–29, 2004.

BTK+21

Marijan Beg, Juliette Taka, Thomas Kluyver, Alexander Konovalov, Min Ragan-Kelley, Nicolas M Thiéry, and Hans Fangohr. Using Jupyter for reproducible scientific workflows. Computing in Science & Engineering, 23(2):36–46, 2021.

BentejacCsorgoMartinezMunoz21

Candice Bentéjac, Anna Csörgo, and Gonzalo Martínez-Muñoz. A comparative analysis of gradient boosting algorithms. Artificial Intelligence Review, 54(3):1937–1967, 2021.

BB12

James Bergstra and Yoshua Bengio. Random search for hyper-parameter optimization. Journal of machine learning research, 2012.

Bis06

Christopher M Bishop. Pattern recognition and machine learning. Springer, 2006.

Bon21

Gianluca Bontempi. Statistical foundations of machine learning, 2nd edition. Université Libre de Bruxelles, 2021.

BEP13

Kendrick Boyd, Kevin H Eng, and C David Page. Area under the precision-recall curve: point estimates and confidence intervals. In Joint European conference on machine learning and knowledge discovery in databases, 451–466. Springer, 2013.

Bre96

Leo Breiman. Bagging predictors. Machine learning, 24(2):123–140, 1996.

Bre01

Leo Breiman. Random forests. Machine learning, 45(1):5–32, 2001.

Car18

Fabrizio Carcillo. Beyond Supervised Learning in Credit Card Fraud Detection: A Dive into Semi-supervised and Distributed Learning. Université libre de Bruxelles, 2018.

CDPLB+18

Fabrizio Carcillo, Andrea Dal Pozzolo, Yann-Aël Le Borgne, Olivier Caelen, Yannis Mazzer, and Gianluca Bontempi. SCARFF: a scalable framework for streaming credit card fraud detection with Spark. Information fusion, 41:182–194, 2018.

CLBCB18

Fabrizio Carcillo, Yann-Aël Le Borgne, Olivier Caelen, and Gianluca Bontempi. Streaming active learning strategies for real-life credit card fraud detection: assessment and visualization. International Journal of Data Science and Analytics, 5(4):285–300, 2018.

CLBC+19

Fabrizio Carcillo, Yann-Aël Le Borgne, Olivier Caelen, Yacine Kessaci, Frédéric Oblé, and Gianluca Bontempi. Combining unsupervised and supervised learning in credit card fraud detection. Information Sciences, 2019.

CTMozetivc20

Vitor Cerqueira, Luis Torgo, and Igor Mozetič. Evaluating time series forecasting models: an empirical study on performance estimation methods. Machine Learning, 109(11):1997–2028, 2020.

Cha09

Nitesh V Chawla. Data mining for imbalanced datasets: an overview. In Data mining and knowledge discovery handbook, pages 875–886. Springer, 2009.

CBHK02

Nitesh V Chawla, Kevin W Bowyer, Lawrence O Hall, and W Philip Kegelmeyer. SMOTE: synthetic minority over-sampling technique. Journal of artificial intelligence research, 16:321–357, 2002.

CCHJ08

Nitesh V Chawla, David A Cieslak, Lawrence O Hall, and Ajay Joshi. Automatically countering imbalance and its empirical relationship to cost. Data Mining and Knowledge Discovery, 17(2):225–252, 2008.

CJK04

Nitesh V Chawla, Nathalie Japkowicz, and Aleksander Kotcz. Special issue on learning from imbalanced data sets. ACM SIGKDD explorations newsletter, 6(1):1–6, 2004.

CLB+04

Chao Chen, Andy Liaw, Leo Breiman, and others. Using random forest to learn imbalanced data. University of California, Berkeley, 110(1-12):24, 2004.

CG16

Tianqi Chen and Carlos Guestrin. XGBoost: a scalable tree boosting system. In Proceedings of the 22nd ACM SIGKDD international conference on knowledge discovery and data mining, 785–794. 2016.

Cyb89

George Cybenko. Approximation by superpositions of a sigmoidal function. Mathematics of control, signals and systems, 2(4):303–314, 1989.

DP15

Andrea Dal Pozzolo. Adaptive machine learning for credit card fraud detection. Université libre de Bruxelles, 2015.

DPBC+17

Andrea Dal Pozzolo, Giacomo Boracchi, Olivier Caelen, Cesare Alippi, and Gianluca Bontempi. Credit card fraud detection: a realistic modeling and a novel learning strategy. IEEE transactions on neural networks and learning systems, 29(8):3784–3797, 2017.

DPCLB+14

Andrea Dal Pozzolo, Olivier Caelen, Yann-Ael Le Borgne, Serge Waterschoot, and Gianluca Bontempi. Learned lessons in credit card fraud detection from a practitioner perspective. Expert systems with applications, 41(10):4915–4928, 2014.

DJS+20

Kanishka Ghosh Dastidar, Johannes Jurgovsky, Wissam Siblini, Liyun He-Guelton, and Michael Granitzer. NAG: neural feature aggregation framework for credit card fraud detection. In 2020 IEEE International Conference on Data Mining (ICDM), 92–101. IEEE, 2020.

DG06

Jesse Davis and Mark Goadrich. The relationship between precision-recall and ROC curves. In Proceedings of the 23rd international conference on Machine learning, 233–240. 2006.

DCLT18

Jacob Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. BERT: pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805, 2018.

DH00

Pedro Domingos and Geoff Hulten. Mining high-speed data streams. In Proceedings of the sixth ACM SIGKDD international conference on Knowledge discovery and data mining, 71–80. 2000.

DEG18

Anna Veronika Dorogush, Vasily Ershov, and Andrey Gulin. CatBoost: gradient boosting with categorical features support. arXiv preprint arXiv:1810.11363, 2018.

Elk01

Charles Elkan. The foundations of cost-sensitive learning. In International joint conference on artificial intelligence, volume 17, 973–978. Lawrence Erlbaum Associates Ltd, 2001.

EMH19

Thomas Elsken, Jan Hendrik Metzen, and Frank Hutter. Neural architecture search: a survey. The Journal of Machine Learning Research, 20(1):1997–2017, 2019.

FZ11

Guangzhe Fan and Mu Zhu. Detection of rare items with target. Statistics and Its Interface, 4(1):11–17, 2011.

Faw04

Tom Fawcett. ROC graphs: notes and practical considerations for researchers. Machine learning, 31(1):1–38, 2004.

Faw06

Tom Fawcett. An introduction to ROC analysis. Pattern recognition letters, 27(8):861–874, 2006.

FernandezGarciaG+18

Alberto Fernández, Salvador García, Mikel Galar, Ronaldo C Prati, Bartosz Krawczyk, and Francisco Herrera. Learning from imbalanced data sets. Springer, 2018.

FK15

Peter Flach and Meelis Kull. Precision-recall-gain curves: PR analysis done right. In Advances in neural information processing systems, 838–846. 2015.

FS97

Yoav Freund and Robert E Schapire. A decision-theoretic generalization of on-line learning and an application to boosting. Journal of computer and system sciences, 55(1):119–139, 1997.

FHT01

Jerome Friedman, Trevor Hastie, and Robert Tibshirani. The elements of statistical learning. Volume 1. Springer series in statistics New York, 2001.

Fri01

Jerome H Friedman. Greedy function approximation: a gradient boosting machine. Annals of statistics, pages 1189–1232, 2001.

FCTZ16

Kang Fu, Dawei Cheng, Yi Tu, and Liqing Zhang. Credit card fraud detection using convolutional neural networks. In International conference on neural information processing, 483–490. Springer, 2016.

GvZliobaiteB+14

João Gama, Indrė Žliobaitė, Albert Bifet, Mykola Pechenizkiy, and Abdelhamid Bouchachia. A survey on concept drift adaptation. ACM computing surveys (CSUR), 46(4):1–37, 2014.

GR94

Sushmito Ghosh and Douglas L Reilly. Credit card fraud detection with a neural-network. In System Sciences, 1994. Proceedings of the Twenty-Seventh Hawaii International Conference on, volume 3, 621–630. IEEE, 1994.

GBC16

Ian Goodfellow, Yoshua Bengio, and Aaron Courville. Deep learning. MIT press, 2016.

GTM+20

Akhilesh Gupta, Nesime Tatbul, Ryan Marcus, Shengtian Zhou, Insup Lee, and Justin Gottschlich. Class-weighted evaluation metrics for imbalanced data classification. arXiv preprint arXiv:2010.05995, 2020.

HYS+17

Guo Haixiang, Li Yijing, Jennifer Shang, Gu Mingyun, Huang Yuanyue, and Gong Bing. Learning from class-imbalanced data: review of methods and applications. Expert Systems with Applications, 73:220–239, 2017.

HWM05

Hui Han, Wen-Yuan Wang, and Bing-Huan Mao. Borderline-SMOTE: a new over-sampling method in imbalanced data sets learning. In International conference on intelligent computing, 878–887. Springer, 2005.

HBGL08

Haibo He, Yang Bai, Edwardo A Garcia, and Shutao Li. ADASYN: adaptive synthetic sampling approach for imbalanced learning. In 2008 IEEE international joint conference on neural networks (IEEE world congress on computational intelligence), 1322–1328. IEEE, 2008.

HN92

Robert Hecht-Nielsen. Theory of the backpropagation neural network. In Neural networks for perception, pages 65–93. Elsevier, 1992.

Imb21

Imblearn. Imbalanced learning library for Python. 2021. [Online; Last consulted 26-June-2021]. URL: https://imbalanced-learn.org/.

Ins18

Statistic Brain Research Institute. Credit card fraud statistics. April 2018. [Online; Last consulted 30-March-2021]. URL: https://www.statisticbrain.com/credit-card-fraud-statistics/.

JGZ+18

Johannes Jurgovsky, Michael Granitzer, Konstantin Ziegler, Sylvie Calabretto, Pierre-Edouard Portier, Liyun He-Guelton, and Olivier Caelen. Sequence classification for credit-card fraud detection. Expert Systems with Applications, 100:234–245, 2018.

Kag16

Kaggle. Credit card fraud detection dataset. November 2016. [Online; Last consulted 09-March-2021]. URL: https://www.kaggle.com/mlg-ulb/creditcardfraud.

Kag19

Kaggle. IEEE-CIS fraud detection - can you detect fraud from customer transactions? September 2019. [Online; Last consulted 26-August-2021]. URL: https://www.kaggle.com/c/ieee-fraud-detection.

KMF+17

Guolin Ke, Qi Meng, Thomas Finley, Taifeng Wang, Wei Chen, Weidong Ma, Qiwei Ye, and Tie-Yan Liu. LightGBM: a highly efficient gradient boosting decision tree. Advances in neural information processing systems, 30:3146–3154, 2017.

KonevcnyMY+16

Jakub Konečný, H Brendan McMahan, Felix X Yu, Peter Richtárik, Ananda Theertha Suresh, and Dave Bacon. Federated learning: strategies for improving communication efficiency. arXiv preprint arXiv:1610.05492, 2016.

Kri10

M. Krivko. A hybrid model for plastic card fraud detection systems. Expert Systems with Applications, 37(8):6070–6076, 2010. URL: http://www.sciencedirect.com/science/article/pii/S0957417410001582, doi:10.1016/j.eswa.2010.02.119.

LRT14

Balaji Lakshminarayanan, Daniel M Roy, and Yee Whye Teh. Mondrian forests: efficient online random forests. Advances in neural information processing systems, 27:3140–3148, 2014.

LDB17

Felix Last, Georgios Douzas, and Fernando Bacao. Oversampling for imbalanced learning based on k-means and SMOTE. arXiv preprint arXiv:1711.00837, 2017.

Lau01

Jorma Laurikkala. Improving identification of difficult small classes by balancing class distribution. In Conference on Artificial Intelligence in Medicine in Europe, 63–66. Springer, 2001.

LNC+11

Quoc V Le, Jiquan Ngiam, Adam Coates, Abhik Lahiri, Bobby Prochnow, and Andrew Y Ng. On optimization methods for deep learning. In ICML. 2011.

LLBHG+19

Bertrand Lebichot, Yann-Aël Le Borgne, Liyun He-Guelton, Frédéric Oblé, and Gianluca Bontempi. Deep-learning domain adaptation techniques for credit cards fraud detection. In INNS Big Data and Deep Learning conference, 78–88. Springer, 2019.

LPS+21

Bertrand Lebichot, Gian Marco Paldino, W Siblini, L He-Guelton, F Oblé, and G Bontempi. Incremental learning strategies for credit cards fraud detection. International Journal of Data Science and Analytics, pages 1–10, 2021.

LVLB+21

Bertrand Lebichot, Théo Verhelst, Yann-Aël Le Borgne, Liyun He-Guelton, Frédéric Oblé, and Gianluca Bontempi. Transfer learning strategies for credit card fraud detection. IEEE access, 9:114754–114766, 2021.

LemaitreNA17

Guillaume Lemaître, Fernando Nogueira, and Christos K. Aridas. Imbalanced-learn: a Python toolbox to tackle the curse of imbalanced datasets in machine learning. Journal of Machine Learning Research, 18(17):1–5, 2017. URL: http://jmlr.org/papers/v18/16-365.html.

LS08

Charles X Ling and Victor S Sheng. Cost-sensitive learning and the class imbalance problem. Encyclopedia of machine learning, 2011:231–235, 2008.

LJ20

Yvan Lucas and Johannes Jurgovsky. Credit card fraud detection using machine learning: a survey. arXiv preprint arXiv:2010.06479, 2020.

MO97

Richard Maclin and David Opitz. An empirical evaluation of bagging and boosting. AAAI/IAAI, 1997:546–551, 1997.

MAT+19

Sara Makki, Zainab Assaghir, Yehia Taher, Rafiqul Haque, Mohand-Said Hacid, and Hassan Zeineddine. An experimental study with imbalanced classification approaches for credit card fraud detection. IEEE Access, 7:93010–93022, 2019.

MZ03

Inderjeet Mani and I Zhang. kNN approach to unbalanced data distributions: a case study involving information extraction. In Proceedings of workshop on learning from imbalanced datasets, volume 126. ICML United States, 2003.

McK17

Wes McKinney. Python for data analysis: Data wrangling with Pandas, NumPy, and IPython - 2nd Edition. O'Reilly Media, Inc., 2017.

MekterovicBrkicBaranovic18

Igor Mekterović, Ljiljana Brkić, and Mirta Baranović. A systematic review of data mining approaches to credit card fraud detection. WSEAS Transactions on Business and Economics, 15:437–444, 2018.

Mus19

John Muschelli. ROC and AUC with a binary predictor: a potentially misleading metric. Journal of Classification, pages 1–13, 2019.

MullerG16

Andreas C Müller and Sarah Guido. Introduction to machine learning with Python: a guide for data scientists. O'Reilly Media, Inc., 2016.

NH10

Vinod Nair and Geoffrey E Hinton. Rectified linear units improve restricted Boltzmann machines. In Proceedings of the 27th International Conference on Machine Learning, 807–814. 2010.

NCK11

Hien M Nguyen, Eric W Cooper, and Katsuari Kamei. Borderline over-sampling for imbalanced data classification. International Journal of Knowledge Engineering and Soft Data Paradigms, 3(1):4–21, 2011.

PGC+17

Adam Paszke, Sam Gross, Soumith Chintala, Gregory Chanan, Edward Yang, Zachary DeVito, Zeming Lin, Alban Desmaison, Luca Antiga, and Adam Lerer. Automatic differentiation in PyTorch. In NIPS-W. 2017.

PL18

Vipul Patil and Umesh Kumar Lilhore. A survey on different data mining & machine learning methods for credit card fraud detection. International Journal of Scientific Research in Computer Science, Engineering and Information Technology, 3(5):320–325, 2018.

PC18

Rimpal R Popat and Jayesh Chaudhary. A survey on credit card fraud detection using machine learning. In 2018 2nd International Conference on Trends in Electronics and Informatics (ICOEI), 1120–1125. IEEE, 2018.

PP19

C Victoria Priscilla and D Padma Prabha. Credit card fraud detection: a systematic review. In International Conference on Information, Communication and Computing Technology, 290–303. Springer, 2019.

PGV+17

Liudmila Prokhorenkova, Gleb Gusev, Aleksandr Vorobev, Anna Veronika Dorogush, and Andrey Gulin. CatBoost: unbiased boosting with categorical features. arXiv preprint arXiv:1706.09516, 2017.

RWC+19

Alec Radford, Jeffrey Wu, Rewon Child, David Luan, Dario Amodei, Ilya Sutskever, and others. Language models are unsupervised multitask learners. OpenAI blog, 1(8):9, 2019.

rep19

Nilson report. Nilson report issue 1164. November 2019. [Online; Last consulted 09-October-2020]. URL: https://nilsonreport.com/upload/content_promo/The_Nilson_Report_Issue_1164.pdf.

Rud16

Sebastian Ruder. An overview of gradient descent optimization algorithms. arXiv preprint arXiv:1609.04747, 2016.

RHW86

David E Rumelhart, Geoffrey E Hinton, and Ronald J Williams. Learning representations by back-propagating errors. Nature, 323(6088):533–536, 1986.

SSB18

Imane Sadgali, Nawal Sael, and Faouzia Benabbou. Detection of credit card fraud: state of art. International Journal of computer science and network security, 18(11):76–83, 2018.

SR15

Takaya Saito and Marc Rehmsmeier. The precision-recall plot is more informative than the ROC plot when evaluating binary classifiers on imbalanced datasets. PLoS ONE, 10(3):e0118432, 2015.

SCF+21

Wissam Siblini, Guillaume Coter, Rémy Fabry, Liyun He-Guelton, Frédéric Oblé, Bertrand Lebichot, Yann-Aël Le Borgne, and Gianluca Bontempi. Transfer learning for credit card fraud detection: A journey from research to production. In Proceedings of Data Science and Advanced Analytics (DSAA 2021). 2021. URL: https://arxiv.org/abs/2107.09323.

SKK18

Janvier Omar Sinayobye, Fred Kiwanuka, and Swaib Kaawaase Kyanda. A state-of-the-art review of machine learning techniques for fraud detection research. In 2018 IEEE/ACM Symposium on Software Engineering in Africa (SEiA), 11–19. IEEE, 2018.

SHK+14

Nitish Srivastava, Geoffrey Hinton, Alex Krizhevsky, Ilya Sutskever, and Ruslan Salakhutdinov. Dropout: a simple way to prevent neural networks from overfitting. The journal of machine learning research, 15(1):1929–1958, 2014.

STZY18

Yu Sun, Ke Tang, Zexuan Zhu, and Xin Yao. Concept drift adaptation by exploiting historical knowledge. IEEE transactions on neural networks and learning systems, 29(10):4822–4832, 2018.

SVL14

Ilya Sutskever, Oriol Vinyals, and Quoc V Le. Sequence to sequence learning with neural networks. In Advances in neural information processing systems, 3104–3112. 2014.

Tha20

Alaa Tharwat. Classification assessment methods. Applied Computing and Informatics, 2020.

T+76

Ivan Tomek and others. Two modifications of CNN. IEEE Transactions on Systems, Man, and Cybernetics, 1:769–772, 1976.

VVBC+15

Véronique Van Vlasselaer, Cristián Bravo, Olivier Caelen, Tina Eliassi-Rad, Leman Akoglu, Monique Snoeck, and Bart Baesens. APATE: a novel approach for automated credit card transaction fraud detection using network-based extensions. Decision Support Systems, 75:38–48, 2015.

VSP+17

Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N Gomez, Łukasz Kaiser, and Illia Polosukhin. Attention is all you need. In Advances in neural information processing systems, 5998–6008. 2017.

VAK+16

Kalyan Veeramachaneni, Ignacio Arnaldo, Vamsi Korrapati, Constantinos Bassias, and Ke Li. AI^2: training a big data machine to defend. In 2016 IEEE 2nd International Conference on Big Data Security on Cloud (BigDataSecurity), IEEE International Conference on High Performance and Smart Computing (HPSC), and IEEE International Conference on Intelligent Data and Security (IDS), 49–54. IEEE, 2016.

WHJ+09

Christopher Whitrow, David J Hand, Piotr Juszczak, David Weston, and Niall M Adams. Transaction aggregation as a strategy for credit card fraud detection. Data mining and knowledge discovery, 18(1):30–55, 2009.

Wil72

Dennis L Wilson. Asymptotic properties of nearest neighbor rules using edited data. IEEE Transactions on Systems, Man, and Cybernetics, pages 408–421, 1972.

YL09

Show-Jane Yen and Yue-Shi Lee. Cluster-based under-sampling approaches for imbalanced data distributions. Expert Systems with Applications, 36(3):5718–5727, 2009.

YAG19

Niloofar Yousefi, Marie Alaghband, and Ivan Garibay. A comprehensive survey on machine learning techniques and user authentication approaches for credit card fraud detection. arXiv preprint arXiv:1912.02629, 2019.

ZP17

Chong Zhou and Randy C Paffenroth. Anomaly detection with robust deep autoencoders. In Proceedings of the 23rd ACM SIGKDD international conference on knowledge discovery and data mining, 665–674. 2017.

Zho21

Zhi-Hua Zhou. Ensemble learning. In Machine Learning, pages 181–210. Springer, 2021.

ZAM+16

Zahra Zojaji, Reza Ebrahimi Atani, Amir Hassan Monadjemi, and others. A survey of credit card fraud detection techniques: data and technique oriented perspective. arXiv preprint arXiv:1611.06439, 2016.