Predicting Student Academic Performance Using a Scalable Regression Based Data Mining Approach

Authors

  • A. Adejumo, University of Ibadan, Department of Computer Science, Ibadan, Nigeria
  • N. C. Woods, University of Ibadan, Department of Computer Science, Ibadan, Nigeria
  • A. K. Ojo, University of Ibadan, Department of Computer Science, Ibadan, Nigeria

Keywords:

Student Performance Prediction, Machine Learning in Education, Educational Data Mining, Predictive Analytics, Regression Analysis

Abstract

Predicting student academic performance supports academic planning and helps identify students who may need extra help. This study develops a regression-based model for forecasting academic outcomes among students at the University of Ibadan, Nigeria. Data were collected from 92 departments over a three-year period, covering both academic records and non-academic factors. After data preparation (cleaning, feature selection, and encoding), three regression techniques were applied: Stochastic Gradient Descent (SGD), Gradient Boosting Machine (GBM), and Extra Trees Regressor (ETR). Among these, the ETR model gave the most accurate predictions as measured by Mean Squared Error (MSE), Root Mean Squared Error (RMSE), and R-squared (R²). Using robust loss functions such as the Huber loss further improved the model’s ability to handle outliers. The findings show that the model can help identify students at risk of poor academic performance and support better decisions in academic advising, resource planning, and policy implementation.
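
The evaluation metrics and the Huber loss named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the grade-point values below are hypothetical, the function names are chosen only for illustration, and the Huber threshold `delta=1.0` is an assumed default rather than a value reported in the study.

```python
import math


def mse(y_true, y_pred):
    """Mean Squared Error: the average of the squared residuals."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root Mean Squared Error: square root of MSE, in the target's units."""
    return math.sqrt(mse(y_true, y_pred))


def r_squared(y_true, y_pred):
    """R-squared: 1 minus the ratio of residual to total sum of squares."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def huber_loss(y_true, y_pred, delta=1.0):
    """Mean Huber loss: quadratic for |residual| <= delta, linear beyond it."""
    total = 0.0
    for t, p in zip(y_true, y_pred):
        r = abs(t - p)
        if r <= delta:
            total += 0.5 * r * r
        else:
            total += delta * (r - 0.5 * delta)
    return total / len(y_true)


# Hypothetical grade-point values; the last prediction is a deliberate outlier.
actual = [3.0, 2.0, 4.0, 1.0]
predicted = [2.5, 2.0, 3.5, 3.0]

print("MSE:  ", mse(actual, predicted))
print("RMSE: ", rmse(actual, predicted))
print("R^2:  ", r_squared(actual, predicted))
print("Huber:", huber_loss(actual, predicted))
```

On this toy data the single large residual dominates the MSE, while the Huber loss penalises it only linearly. That behaviour is the usual reason a Huber-type loss is described as more tolerant of outliers than squared error.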

Published

2025-12-19