Evaluation of Machine Learning-Based Algorithm to Predicting Loan Default in Nigeria

Authors

  • K. O Efekodo Department of Computer Sciences, Lead City University, Ibadan, Oyo State, Nigeria
  • O. S Akinola Department of Computer Science, University of Ibadan, Ibadan, Oyo State, Nigeria
  • A. A Waheed Department of Computer Sciences, Lead City University, Ibadan, Oyo State, Nigeria

Keywords:

Accuracy, Classifier, Decision Trees, Gaussian Naive Bayes, Gradient Boosting Classifiers, Logistic Regression, Random Forest

Abstract

Accurately predicting loan defaults is critical in the financial sector to minimize losses and optimize credit risk
management. Traditional creditworthiness assessment methods often fail to capture the complex, dynamic
interactions in financial data, leading to inaccurate predictions. This study harnesses advanced machine learning
techniques to enhance the prediction of loan defaults, aiming to outperform traditional statistical models. A
dataset containing 50,000 borrower records with diverse characteristics, including demographic, financial, and
loan-specific features, was utilized. The data was split into training (70%) and test (30%) sets for model
development and evaluation. Various machine learning algorithms were tested, including Logistic Regression,
Decision Trees, Gradient Boosting Classifiers, Random Forest, and Gaussian Naive Bayes. The Gaussian Naive
Bayes (GaussianNB) model demonstrated superior performance, achieving an accuracy of 78.8% on the test set.
This model effectively captured complex patterns in the high-dimensional data, significantly reducing false
positives and false negatives compared to other models. The findings suggest that machine learning models,
particularly GaussianNB, offer substantial improvements in predictive accuracy for loan default risk assessments.
These findings can enhance lenders' decision-making processes by improving risk stratification and resource
allocation. Future research should explore integrating non-traditional data sources, such as behavioral and
macroeconomic variables, and employing deep learning techniques to further refine predictive accuracy.
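
The evaluation protocol described in the abstract (a 70/30 train/test split, a Gaussian Naive Bayes classifier, and test-set accuracy alongside false positive and false negative counts) can be sketched as follows. This is a minimal illustration only, not the authors' implementation: the study's dataset and code are not given here, so the data are synthetic, the two features (income, debt_ratio) are hypothetical, and the classifier is written in plain Python rather than with whatever library the authors used.

```python
import math
import random


def train_test_split(rows, labels, test_fraction=0.30, seed=0):
    """Shuffle record indices and hold out test_fraction of them for testing."""
    idx = list(range(len(rows)))
    random.Random(seed).shuffle(idx)
    n_test = int(round(len(rows) * test_fraction))
    test, train = idx[:n_test], idx[n_test:]
    return ([rows[i] for i in train], [labels[i] for i in train],
            [rows[i] for i in test], [labels[i] for i in test])


def fit_gaussian_nb(rows, labels, var_floor=1e-9):
    """Estimate the log prior and per-feature mean/variance for each class."""
    model = {}
    for c in set(labels):
        members = [r for r, l in zip(rows, labels) if l == c]
        n = len(members)
        columns = list(zip(*members))
        means = [sum(col) / n for col in columns]
        # var_floor guards against zero variance (an assumption of this sketch).
        variances = [sum((v - m) ** 2 for v in col) / n + var_floor
                     for col, m in zip(columns, means)]
        model[c] = (math.log(n / len(rows)), means, variances)
    return model


def predict(model, row):
    """Return the class with the highest log posterior under the naive assumption."""
    def log_posterior(c):
        log_prior, means, variances = model[c]
        return log_prior + sum(
            -0.5 * math.log(2 * math.pi * var) - (x - m) ** 2 / (2 * var)
            for x, m, var in zip(row, means, variances))
    return max(model, key=log_posterior)


def evaluate(model, rows, labels, positive=1):
    """Compute accuracy plus false positive and false negative counts."""
    preds = [predict(model, r) for r in rows]
    fp = sum(p == positive and l != positive for p, l in zip(preds, labels))
    fn = sum(p != positive and l == positive for p, l in zip(preds, labels))
    accuracy = sum(p == l for p, l in zip(preds, labels)) / len(labels)
    return accuracy, fp, fn


# Synthetic stand-in for borrower records; label 1 means "default".
rng = random.Random(42)
rows, labels = [], []
for _ in range(1000):
    default = 1 if rng.random() < 0.3 else 0
    income = rng.gauss(40 if default else 60, 10)          # hypothetical feature
    debt_ratio = rng.gauss(0.6 if default else 0.3, 0.1)   # hypothetical feature
    rows.append([income, debt_ratio])
    labels.append(default)

X_train, y_train, X_test, y_test = train_test_split(rows, labels)
model = fit_gaussian_nb(X_train, y_train)
accuracy, fp, fn = evaluate(model, X_test, y_test)
print(f"test accuracy={accuracy:.3f} false_positives={fp} false_negatives={fn}")
```

Reporting false positives and false negatives next to accuracy matters for loan default data, where defaults are usually the minority class and a model can score well on accuracy while missing many of them. The other algorithms named in the abstract would be compared by fitting each on the same training split and scoring each on the same held-out test split.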

Published

2025-03-07