A Comparative Analysis of Ensemble Machine Learning Algorithms for Bank Customer Churn Prediction
Keywords:
Customers, Churn, LSTM, Ensemble, Random Forest
Abstract
Customer churn has become a serious issue for bank managers because customers have numerous options for where to
save their money, which explains why many researchers are attracted to this area. This study developed a predictive
model for bank customer churn. The study used a dataset from the kaggle.com repository consisting of 10,127
instances and 20 parameters. One-hot encoding was used to preprocess the dataset. The data was divided
into 80% for training and 20% for testing. The predictive model was created using Long Short-Term Memory
(LSTM), Ensemble LSTM, and Random Forest (RF). LSTM achieved an F1 score of
0.94, accuracy of 0.9235, specificity of 0.6635, sensitivity of 0.97, AUC of 0.95, and loss value of 0.1663. Ensemble
LSTM achieved an F1 score of 0.94, accuracy of 0.9057, specificity of 0.554, sensitivity of 0.98, AUC of 0.92, and loss
value of 0.238. RF achieved an F1 score of 0.97, accuracy of 0.95, specificity of 0.774, sensitivity of 0.99, AUC of 0.99,
and loss value of 0.15. The study concluded that RF outperformed both LSTM and Ensemble LSTM. It also found
that customer gender, marital status, income category, and age in relation to attrition are determining factors
for customer churn prediction. The model is recommended for the banking sector to assist in decision making. Future
work could apply more ensemble techniques and perform further data exploration.
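
The preprocessing and evaluation steps named in the abstract (one-hot encoding, an 80/20 split, and the sensitivity, specificity, accuracy, and F1 metrics) can be illustrated with a minimal sketch. This is not the authors' pipeline: the function names, the example category values, the random seed, and the convention that label 1 is the positive class are all assumptions made for illustration, and the abstract does not state which class the study treated as positive.

```python
import random


def one_hot(values):
    """Encode categorical values as 0/1 indicator vectors, one column per category."""
    categories = sorted(set(values))
    encoded = [[1 if v == c else 0 for c in categories] for v in values]
    return encoded, categories


def split_80_20(rows, seed=0):
    """Shuffle rows reproducibly and return (train, test) in an 80/20 ratio."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)  # seed is an arbitrary illustrative choice
    cut = int(0.8 * len(shuffled))
    return shuffled[:cut], shuffled[cut:]


def _ratio(numerator, denominator):
    """Return numerator / denominator, or 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def binary_metrics(y_true, y_pred):
    """Compute confusion-matrix metrics, assuming label 1 is the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    sensitivity = _ratio(tp, tp + fn)  # true positive rate (recall)
    specificity = _ratio(tn, tn + fp)  # true negative rate
    precision = _ratio(tp, tp + fp)
    return {
        "accuracy": _ratio(tp + tn, len(pairs)),
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f1": _ratio(2 * precision * sensitivity, precision + sensitivity),
    }


# Hypothetical marital-status values, used only to show the encoding.
encoded, categories = one_hot(["Married", "Single", "Married", "Divorced"])

# A dataset of 10,127 instances yields 8,101 training and 2,026 test rows.
train, test = split_80_20(range(10127))

# Toy labels: 3 true positives, 1 false negative, 1 false positive, 1 true negative.
metrics = binary_metrics([1, 1, 1, 1, 0, 0], [1, 1, 1, 0, 1, 0])
```

Because sensitivity and specificity are computed on different classes, a model can score high on one and low on the other when the classes are imbalanced, which is one reason the abstract reports both alongside accuracy.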