Cargo Revenue Prediction Model Using Machine Learning Approach
Keywords:
Cargo Revenue Prediction, Gradient Boosting Regression, Machine Learning, Predictive Modeling
Abstract:
This study develops a machine learning (ML) predictive model tailored to Nigeria's cargo ecosystem, with the aim of improving revenue forecasting, strategic planning, and operational efficiency. A dataset of 1,133 records from Nigerian logistics firms, covering variables such as cargo weight, shipping rates, and transaction dates, was preprocessed using one-hot encoding, normalization, and median imputation. Four regression models (Decision Tree [DTR], Random Forest [RFR], Gradient Boosting [GBR], and a Stacked Adaptive Multi-Input Regression Algorithm [SAMIRA]) were implemented in Google Colab. Exploratory Data Analysis (EDA) revealed right-skewed revenue distributions and seasonal operational peaks in January, August, and September. Model evaluation showed that GBR outperformed the other models, achieving an R² of 0.9989, a Mean Squared Error (MSE) of ₦2.74 billion, a Root Mean Squared Error (RMSE) of ₦52,350.29, and a Mean Absolute Error (MAE) of ₦11,213.09. This performance was validated through 10-fold cross-validation (mean R² = 0.9969) and visualized with a normalized error heatmap. The best-performing model was then prototyped as a Kotlin-based Android application for real-time forecasting. The findings indicate that GBR can reach an R² above 0.99 on this dataset, presenting a robust alternative to traditional forecasting methods and offering actionable insights for dynamic pricing and resource optimization in emerging markets.
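To make the reported evaluation metrics concrete, the following is a minimal sketch of how MSE, RMSE, MAE, and R² relate to one another for a set of actual and predicted revenues. The function name `regression_metrics` and the sample values are hypothetical and chosen for illustration only; they are not taken from the study's data or code.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return MSE, RMSE, MAE and R^2 for paired actual/predicted values."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]

    ss_res = sum(e * e for e in errors)        # residual sum of squares
    mse = ss_res / n                           # mean squared error
    mae = sum(abs(e) for e in errors) / n      # mean absolute error

    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    # R^2 is undefined when all actual values are identical.
    r2 = 1.0 - ss_res / ss_tot if ss_tot > 0 else float("nan")

    return {"mse": mse, "rmse": math.sqrt(mse), "mae": mae, "r2": r2}


# Hypothetical revenues in naira (illustrative only, not the study's data).
actual = [120_000.0, 450_000.0, 80_000.0, 1_500_000.0, 300_000.0]
predicted = [118_000.0, 455_000.0, 83_000.0, 1_490_000.0, 296_000.0]

metrics = regression_metrics(actual, predicted)
for name, value in metrics.items():
    print(f"{name}: {value:,.4f}")
```

Because RMSE is the square root of MSE, the two figures reported above can be cross-checked: the square root of 2.74 billion is roughly 52,345, which agrees with the stated RMSE of ₦52,350.29 once rounding of the MSE is taken into account. In this sketch MSE is in squared currency units, while RMSE and MAE are in naira.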