An Experimental Study on the Effectiveness of Random Forest (RF) Algorithm in Predicting Website Trustworthiness
Keywords:
Random Forest, trustworthiness, website, machine learning algorithm
Abstract
Web users frequently depend on the presentation and layout of a website to evaluate the trustworthiness of
the information it contains. These cues can be misleading, because widely available professionally designed
templates make web information seem trustworthy regardless of its actual quality or source. As a result, web
users are liable to arrive at false conclusions about the trustworthiness of the information available to them. This
study seeks to improve the assessment of website credibility by evaluating the effectiveness of the Random Forest (RF) algorithm
in predicting web trustworthiness. The dataset comprises scraped data for nine thousand, five hundred and forty
(9,540) websites, collected from the training set and raw web files provided by Kaggle. The variables used to
predict web trustworthiness were average daily visitors, child safety, average daily page views, privacy, and
traffic rank. The dataset was divided into two groups in an 80:20 ratio: 80% of the data was used for model
training, while the remaining 20% was used for testing (validation). The experiment was performed using
the scikit-learn (sklearn) Python library. The results showed that the RF model achieved a precision, recall, and
F-measure of 1 for each class of website trustworthiness. The experimental study revealed that RF is effective in
predicting web trustworthiness on the basis of average daily visitors, child safety, average daily page views,
privacy, and traffic rank, with privacy and child safety being the most important input features for the model.
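
The workflow described in the abstract (an 80/20 split, a Random Forest fitted with scikit-learn, per-class precision, recall and F-measure, and feature importances) might look roughly like the sketch below. It is an illustration only, not the study's code: the data are synthetic, the feature names in FEATURES are hypothetical stand-ins for the listed variables, and the labelling rule and hyperparameters are assumptions, since the abstract does not state them.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import precision_recall_fscore_support
from sklearn.model_selection import train_test_split

# Hypothetical column names standing in for the variables named in the abstract.
FEATURES = [
    "avg_daily_visitors",
    "child_safety",
    "avg_daily_pageviews",
    "privacy",
    "traffic_rank",
]

# Synthetic stand-in data; the real study used 9,540 scraped websites.
rng = np.random.default_rng(0)
n_sites = 1000
X = np.column_stack([
    rng.lognormal(8, 2, n_sites),          # average daily visitors
    rng.uniform(0, 100, n_sites),          # child safety score
    rng.lognormal(9, 2, n_sites),          # average daily page views
    rng.uniform(0, 100, n_sites),          # privacy score
    rng.integers(1, 1_000_000, n_sites),   # traffic rank
])

# Assumed labelling rule for illustration only: a site counts as trustworthy (1)
# when the mean of its child safety and privacy scores is at least 50.
y = ((X[:, 1] + X[:, 3]) / 2 >= 50).astype(int)

# 80% of the rows for training, 20% held out for testing.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Hyperparameters are assumptions; the abstract does not report them.
model = RandomForestClassifier(n_estimators=100, random_state=0)
model.fit(X_train, y_train)

# Per-class precision, recall and F-measure on the held-out 20%.
precision, recall, f_measure, support = precision_recall_fscore_support(
    y_test, model.predict(X_test)
)

# Impurity-based importance of each input feature (the values sum to 1).
importances = dict(zip(FEATURES, model.feature_importances_))

for name, score in sorted(importances.items(), key=lambda kv: -kv[1]):
    print(f"{name}: {score:.3f}")
print("precision:", precision, "recall:", recall, "F-measure:", f_measure)
```

Reporting the metrics per class, rather than as a single average, mirrors the way the abstract states its results, and feature_importances_ gives one common way to rank the input features.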