Leveraging Machine Learning for Corporate Fraud Detection: A Random Forest Study


Szu-Hsien Lin
Tzu-Pu Chang
Huei-Hwa Lai
Yan Ting Chen

Abstract

Corporate fraud often causes significant losses to stakeholders and society. This study therefore constructs a model to predict corporate fraud, with the goal of providing early warning of potential fraudulent activity. The research sample consists of fraudulent listed companies in Taiwan, matched with non-fraudulent companies at a ratio of 1:2. To capture the factors contributing to fraud comprehensively, 53 indicators are selected from four dimensions: financial statements, corporate governance, market transactions, and the overall economy. The study further divides fraud methods into financial statement fraud and non-financial statement fraud (i.e., hollowing out, misappropriating assets, or manipulating stock prices) and applies two machine learning techniques, decision tree and random forest algorithms, for prediction and analysis. The empirical results indicate that: (1) the random forest method, based on ensemble learning, achieves higher prediction accuracy than the decision tree model, and prediction accuracy improves when fraud methods are distinguished; (2) the type I error of the random forest model is zero, meaning that in the study's sample every company the model predicted to be fraudulent did commit fraud in the following year; and (3) the detailed techniques of fraud evolve structurally over time, leading to a relatively high type II error.
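The two error types reported in results (2) and (3) can be illustrated with a short calculation. The sketch below is not the authors' code. It assumes the reading suggested by the abstract's own gloss: a type I error is a non-fraudulent company flagged as fraudulent, and a type II error is a fraudulent company the model misses. The function name `error_rates` and the toy labels (three fraudulent and six non-fraudulent companies, one possible reading of the 1:2 matching ratio) are hypothetical and are not data from the study.

```python
from typing import Dict, Sequence


def error_rates(actual: Sequence[int], predicted: Sequence[int]) -> Dict[str, float]:
    """Compute accuracy and type I / type II error rates.

    Labels: 1 = fraudulent, 0 = non-fraudulent.
    Assumed definitions (not stated as formulas in the source):
      type I  = share of non-fraudulent companies flagged as fraudulent
      type II = share of fraudulent companies predicted as non-fraudulent
    """
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("actual and predicted must be non-empty and equal in length")

    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a == 1 and p == 1)  # fraud, flagged
    fn = sum(1 for a, p in pairs if a == 1 and p == 0)  # fraud, missed
    fp = sum(1 for a, p in pairs if a == 0 and p == 1)  # clean, flagged
    tn = sum(1 for a, p in pairs if a == 0 and p == 0)  # clean, not flagged

    return {
        "accuracy": (tp + tn) / len(pairs),
        "type_i": fp / (fp + tn) if (fp + tn) else 0.0,
        "type_ii": fn / (fn + tp) if (fn + tp) else 0.0,
    }


# Hypothetical toy sample: 3 fraudulent and 6 non-fraudulent companies.
actual = [1, 1, 1, 0, 0, 0, 0, 0, 0]
# A cautious model: it flags only one company, and that flag is correct.
predicted = [1, 0, 0, 0, 0, 0, 0, 0, 0]

rates = error_rates(actual, predicted)
print(rates)
```

In this toy case no non-fraudulent company is flagged, so the type I error is zero and every flag is correct, while two of the three fraudulent companies go undetected, which gives a high type II error. This is the same qualitative pattern the abstract describes, shown here only on made-up labels.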

Article Details

How to Cite
Lin, S.-H., Chang, T.-P., Lai, H.-H., & Chen, Y. T. (2025). Leveraging Machine Learning for Corporate Fraud Detection: A Random Forest Study. Journal of Cultural Analysis and Social Change, 10(3), 209–222. https://doi.org/10.64753/jcasc.v10i3.2399