Sentiment Analysis of Low-Resource Yorùbá Tweets Using Fine-Tuned BERT Models
Keywords:
Sentiment Analysis, Low-Resource Yorùbá Language, BERT, Natural Language Processing

Abstract
Sentiment analysis in low-resource languages poses a notable challenge because of the scarcity of labelled data and language-specific models. This study addresses the challenge of Yorùbá sentiment analysis using fine-tuned variants of the Bidirectional Encoder Representations from Transformers (BERT) model. Yorùbá, being a low-resource language, lacks effective sentiment analysis tools for detecting the sentiment polarity of content written in it. Solving this problem is important for understanding public opinion and cultural sentiment, and for enhancing communication analytics in Yorùbá-speaking communities. The paper employs transfer learning techniques to adapt pretrained models to the unique linguistic properties of Yorùbá. The chosen models are BERT Base (Uncased), African Bidirectional Encoder Representations from Transformers (AfriBERTa), multilingual BERT (mBERT), and the multilingual version of RoBERTa (XLM-RoBERTa). A comparative analysis of the four models on two different datasets shows that AfriBERTa performs best at capturing sentiment nuances specific to Yorùbá-language tweets.
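As a concrete illustration of the transfer-learning setup described above, the following minimal Python sketch fine-tunes a pretrained checkpoint for Yorùbá sentiment classification with the Hugging Face Transformers library. The checkpoint name (castorini/afriberta_large), the three-class label set, and the toy training data are assumptions made for illustration; the paper does not specify its exact training configuration.

# Minimal sketch: fine-tuning a pretrained transformer for Yorùbá tweet
# sentiment classification with Hugging Face Transformers.
# Assumptions (not from the paper): the AfriBERTa checkpoint name, a
# three-class label set, and a tiny stand-in dataset with text/label columns.
from datasets import Dataset
from transformers import (
    AutoModelForSequenceClassification,
    AutoTokenizer,
    Trainer,
    TrainingArguments,
)

MODEL_NAME = "castorini/afriberta_large"  # assumed AfriBERTa checkpoint

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForSequenceClassification.from_pretrained(
    MODEL_NAME, num_labels=3  # e.g. 0 = negative, 1 = neutral, 2 = positive
)

# Toy stand-in for a labelled Yorùbá tweet corpus.
train_data = Dataset.from_dict({
    "text": ["Mo nífẹ̀ẹ́ fíìmù yìí", "Fíìmù yìí burú gan-an"],
    "label": [2, 0],
})

def tokenize(batch):
    # Truncate and pad tweets to a fixed length before fine-tuning.
    return tokenizer(batch["text"], truncation=True,
                     padding="max_length", max_length=128)

train_data = train_data.map(tokenize, batched=True)

args = TrainingArguments(
    output_dir="yoruba-sentiment",
    num_train_epochs=3,
    per_device_train_batch_size=16,
    learning_rate=2e-5,
)

trainer = Trainer(model=model, args=args, train_dataset=train_data)
trainer.train()

The same script applies to the other three models by swapping the checkpoint name (for example, bert-base-uncased, bert-base-multilingual-cased, or xlm-roberta-base), which is what makes a like-for-like comparison across the four models straightforward.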