Comparative Analysis of Machine Learning Algorithms for the Classification of Twitter Bots

Authors

  • B. A. Ayogu, Federal University, Oye Ekiti, Nigeria
  • G. O. Ogunleye, Federal University, Oye Ekiti, Nigeria
  • L. B. Adewole, Federal University, Oye Ekiti, Nigeria
  • M. Olagunju, Federal University, Oye Ekiti, Nigeria
  • W. A. Oyatoyinbo, Federal University, Oye Ekiti, Nigeria

Keywords:

CatBoost, Decision Tree, Feature Selection, Logistic Regression, Random Forest

Abstract

Social media platforms have become risky for genuine users because of the growing number of bots. Existing security mechanisms for separating bot accounts from legitimate human accounts have significant drawbacks, such as misclassifying accounts whose behavior changes. Studies on Twitter bot identification show that bots can be useful, but can also harm users by broadcasting misleading news, spamming, or posing as fake followers to boost an account's popularity. This study employed Logistic Regression, CatBoost, and Random Forest algorithms to develop Twitter bot classification systems capable of distinguishing between useful and harmful bot accounts, in order to limit their impact on users and the Twitter community. The algorithms were tested on a Twitter spam bots dataset obtained from Kaggle containing eight (8) features, which were reduced to two (2) using a decision tree. The selected features were then used to develop the bot classification systems. Comparative analysis of the results showed that the Random Forest classifier performed best on the training set, while Logistic Regression performed best on the test set in terms of accuracy, precision, and F1 score, achieving 83%, 78%, and 81%, respectively. The classification systems can help identify and mitigate the impact of harmful bots on Twitter, such as those used for spamming or disseminating fake news. The study demonstrates the effectiveness of machine learning algorithms in classifying Twitter bots and offers a potential approach to improving online social media platforms.
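
The evaluation metrics named in the abstract (accuracy, precision, and F1 score) can be illustrated with a minimal Python sketch. This is not the authors' code: the function names `classification_metrics` and `implied_recall`, and the convention that label 1 marks the positive class, are assumptions made for illustration. The second helper rearranges the standard definition F1 = 2PR / (P + R) to estimate recall from a reported precision and F1 score. Because the abstract's figures are rounded, any recall derived this way is only approximate.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall, and F1 for binary labels.

    `positive` is the label treated as the positive class (an assumption;
    the abstract does not state how the classes were encoded).
    """
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equal in length")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    accuracy = (tp + tn) / len(pairs)
    # Fall back to 0.0 when a denominator is zero (no predicted or no actual positives).
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


def implied_recall(precision, f1):
    """Solve F1 = 2PR / (P + R) for R, given precision P and F1."""
    denominator = 2 * precision - f1
    if denominator <= 0:
        raise ValueError("no valid recall exists for this precision and F1")
    return f1 * precision / denominator


if __name__ == "__main__":
    # Toy labels, invented purely for illustration (not data from the study).
    y_true = [1, 1, 1, 0, 0, 0, 1, 0]
    y_pred = [1, 1, 0, 0, 0, 1, 1, 0]
    print(classification_metrics(y_true, y_pred))

    # Recall implied by the rounded test-set precision and F1 in the abstract.
    print(round(implied_recall(0.78, 0.81), 4))
```

Applied to the rounded test-set figures reported for Logistic Regression, the rearranged formula suggests a recall of roughly 84%, a value the abstract itself does not report.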

Published

2024-04-15