

Authors: Lyndia Anak Libin and Mohd Hanafi Ahmad Hijazi



Sentiment Analysis (SA) has gained popularity over the years for the benefits it brings to economic, sociological, and political development. SA enables the observation, experimentation, and quantification of public emotion toward a particular issue. However, little SA has been done on the Malay language, especially on its dialects as written on social media. The research presented in this paper aims to perform SA on one Malay dialect, the Sabah language. The Sabah language has no fixed spelling and an unstructured form, both of which make SA difficult. This project takes a lexicon-based approach to SA of the Sabah language on social media data. The corpus consists of 443 Facebook posts and tweets written in the Sabah language, which were pre-processed and classified into positive and negative polarity classes. Sentiment lexicon (SL) building, pre-processing, several polarity assignment methods, and several classification methods were tested in order to investigate their effect on SA performance. The highest classification accuracy, 85.10%, was obtained by combining SL expansion, the pre-processing stage, and switch negation with the Simple Polarity Score Assignation and Bias-Aware Thresholding methods.
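
The lexicon-based pipeline summarised above can be illustrated with a minimal sketch. It is not the authors' implementation: the lexicon entries, the negator list, the rule that a negator flips only the next sentiment word, and the `threshold` parameter (a simple stand-in for Bias-Aware Thresholding, whose exact definition is not given in this abstract) are all assumptions made for illustration.

```python
# Illustrative entries only; NOT the sentiment lexicon built in the paper.
LEXICON = {"bagus": 1, "sedap": 1, "cantik": 1, "teruk": -1, "busuk": -1}

# Hypothetical negation words; the paper's actual list is not given here.
NEGATORS = {"tidak", "nda", "bukan"}


def polarity_score(text):
    """Sum word polarities, flipping the sign of a sentiment word
    that follows a negator (a simple form of switch negation)."""
    score = 0
    negate = False
    for token in text.lower().split():
        if token in NEGATORS:
            negate = True
            continue
        value = LEXICON.get(token, 0)
        if value != 0:
            score += -value if negate else value
            negate = False  # the negator is consumed by this sentiment word
    return score


def classify(text, threshold=0):
    """Label text as positive when its score exceeds the threshold.
    Shifting the threshold away from 0 is one assumed way to offset
    a lexicon that leans toward one polarity class."""
    return "positive" if polarity_score(text) > threshold else "negative"


if __name__ == "__main__":
    for post in ["makanan ini sedap", "nda bagus", "tidak teruk"]:
        print(post, "->", classify(post))
```

Under these assumptions, "nda bagus" scores -1 because the negator flips the positive word, and raising `threshold` makes the classifier stricter about assigning the positive class.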