Machine Learning and Finance
A Review using Latent Dirichlet Allocation Technique (LDA)
DOI: https://doi.org/10.31686/ijier.vol9.iss4.3016
Keywords: Machine Learning, topic modelling, structuring finance research, latent Dirichlet allocation
Abstract
The aim of this paper is to provide a first comprehensive structuring of the literature applying machine learning to finance. We use a probabilistic topic modelling approach to make sense of this diverse body of research, which spans the disciplines of finance, economics, computer sciences, and decision sciences. Using Latent Dirichlet Allocation (LDA), we extract the 14 coherent research topics that are the focus of the 6,148 academic articles published between 1990 and 2019 that we analyse. We first describe and structure these topics, and then show how the topic focus has evolved over the last two decades. Our study thus provides a structured topography for finance researchers seeking to integrate machine learning approaches into their exploration of finance phenomena. We also showcase how probabilistic topic modelling can help finance researchers understand a body of literature in depth, especially when that literature draws on diverse, multidisciplinary contributors.
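To make the topic modelling step concrete, the following is a minimal sketch of LDA fitted by collapsed Gibbs sampling on a toy corpus. It is an illustration only: the abstract does not state which inference procedure, software, or hyperparameters were used, so the function name `fit_lda`, the values of `alpha`, `beta` and `iterations`, and the four-document corpus `toy_docs` are all assumptions introduced here.

```python
import random


def fit_lda(docs, num_topics, alpha=0.1, beta=0.01, iterations=200, seed=0):
    """Collapsed Gibbs sampling for LDA on tokenised documents (illustrative).

    Returns (doc_topic, topic_word, vocab): doc_topic[d][k] is the estimated
    share of topic k in document d, and topic_word[k][v] is the estimated
    probability of vocabulary word v under topic k.
    """
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    word_id = {w: i for i, w in enumerate(vocab)}
    V = len(vocab)

    n_dk = [[0] * num_topics for _ in docs]      # topic counts per document
    n_kw = [[0] * V for _ in range(num_topics)]  # word counts per topic
    n_k = [0] * num_topics                       # total tokens per topic
    assignments = []

    # Give every token a random initial topic and record the counts.
    for d, doc in enumerate(docs):
        z_d = []
        for w in doc:
            k = rng.randrange(num_topics)
            z_d.append(k)
            n_dk[d][k] += 1
            n_kw[k][word_id[w]] += 1
            n_k[k] += 1
        assignments.append(z_d)

    for _ in range(iterations):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                v = word_id[w]
                k = assignments[d][i]
                # Remove the token's current assignment from the counts.
                n_dk[d][k] -= 1
                n_kw[k][v] -= 1
                n_k[k] -= 1
                # Unnormalised conditional probability of each topic.
                weights = [
                    (n_dk[d][t] + alpha) * (n_kw[t][v] + beta) / (n_k[t] + V * beta)
                    for t in range(num_topics)
                ]
                k = rng.choices(range(num_topics), weights=weights)[0]
                assignments[d][i] = k
                n_dk[d][k] += 1
                n_kw[k][v] += 1
                n_k[k] += 1

    # Smoothed point estimates from the final sample.
    doc_topic = [
        [(n_dk[d][k] + alpha) / (len(doc) + num_topics * alpha)
         for k in range(num_topics)]
        for d, doc in enumerate(docs)
    ]
    topic_word = [
        [(n_kw[k][v] + beta) / (n_k[k] + V * beta) for v in range(V)]
        for k in range(num_topics)
    ]
    return doc_topic, topic_word, vocab


# Hypothetical toy corpus: two credit-risk abstracts, two forecasting abstracts.
toy_docs = [
    "credit default risk scoring loan".split(),
    "loan credit scoring bank default".split(),
    "stock price forecast neural network".split(),
    "neural network stock forecast price".split(),
]
doc_topic, topic_word, vocab = fit_lda(toy_docs, num_topics=2)
```

Each row of `doc_topic` is a probability distribution over topics for one document, and each row of `topic_word` is a distribution over the vocabulary for one topic. In a study of this kind, the highest-probability words in each `topic_word` row are what a researcher would read to label a topic, and averaging `doc_topic` rows by publication year is one way to trace how topic focus shifts over time.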
Copyright (c) 2021 Ahmed Sameer El Khatib
This work is licensed under a Creative Commons Attribution-NoDerivatives 4.0 International License.
Copyrights for articles published in IJIER journals are retained by the authors, with first publication rights granted to the journal. The journal/publisher is not responsible for subsequent uses of the work. It is the author's responsibility to bring an infringement action if the author so desires. For more information, see Copyright & License.
Accepted 2021-03-21
Published 2021-04-01