Prediction of Loan Defaulters Using Machine Learning

Authors

  • L. Oluchi Eze Department of Computer Science, University of Ibadan, Ibadan, Nigeria.
  • Mutiat A. Ogunrinde Department of Mathematical and Computer Sciences, Fountain University, Osogbo, Nigeria.
  • Solomon O. Akinola Department of Computer Science, University of Ibadan, Ibadan, Nigeria.

Keywords:

Defaulters, Linear Model, Performance, Financial Institutions

Abstract

Financial institutions face significant challenges in accurately assessing the risk of loan default, which can lead to substantial financial losses and undermine overall stability. The primary objective of this study is to develop predictive models that accurately identify potential loan defaulters, enabling lenders to make better-informed lending decisions. The study addresses the need for more reliable, data-driven credit risk assessment tools by employing logistic regression, random forest, and decision tree algorithms. The research design follows a systematic sequence of data collection, preprocessing, feature selection, model development, and evaluation. The dataset, sourced from Coursera's Loan Default Prediction Challenge, includes 255,347 instances and 18 features relevant to loan default prediction. An undersampling technique was used to address class imbalance, and a train-test split was used to evaluate model performance. Logistic regression, random forest, and decision tree models were trained and assessed for their predictive capabilities. The results indicate that the logistic regression and random forest models performed best, with accuracy rates of approximately 69% and 68%, respectively. The feature importance analysis revealed key factors influencing loan defaults, such as credit score, loan amount, and employment history.
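
The two data-preparation steps named in the abstract, undersampling and a train-test split, can be illustrated with a minimal standard-library sketch. It is not the authors' code: the column names `CreditScore` and `Default`, the 80/20 split ratio, the random seeds, and the synthetic rows are assumptions made for illustration, and the abstract does not state whether undersampling was applied before or after the split.

```python
import random


def undersample(rows, label_key="Default", seed=0):
    """Randomly drop rows from larger classes until every class has the
    same number of rows as the smallest class."""
    rng = random.Random(seed)
    by_class = {}
    for row in rows:
        by_class.setdefault(row[label_key], []).append(row)
    n_min = min(len(group) for group in by_class.values())
    balanced = []
    for group in by_class.values():
        balanced.extend(rng.sample(group, n_min))
    rng.shuffle(balanced)
    return balanced


def train_test_split(rows, test_fraction=0.2, seed=0):
    """Shuffle a copy of the rows and hold out a fraction for testing."""
    rng = random.Random(seed)
    shuffled = rows[:]
    rng.shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


# Synthetic, imbalanced toy data (hypothetical columns): 10% defaulters.
rows = [
    {"CreditScore": 300 + i, "Default": 1 if i % 10 == 0 else 0}
    for i in range(1000)
]

balanced = undersample(rows)              # equal numbers of each class
train, test = train_test_split(balanced)  # assumed 80/20 split
```

Undersampling before the split, as done here, is only one possible ordering. Splitting first and undersampling only the training portion keeps the test set at the original class ratio, which changes how an accuracy figure should be read.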

Published

2025-12-26