Thai Language Tweet Emotion Prediction Based on
Use of Emojis
Volume 4 - Issue 3
Anocha Rugchatjaroen*
- National Science and Technology Development Agency, NECTEC, Thailand
Received: June 24, 2021; Published: July 06, 2021
Corresponding author: Anocha Rugchatjaroen, National Science and Technology Development Agency, NECTEC, Thailand
DOI: 10.32474/JAAS.2021.04.000189
Abstract
Thai can be grouped with Chinese and Japanese as a language in which no explicit spaces separate
words. This article presents work on identifying the emotion of tweets from the emojis they contain, focusing on a Thai
language context. The emojis in a user's tweet indicate the writer's emotions. The first phase of this study was to collect Thai
tweets, clean them, and then make a primary classification of the emojis into groups using K-nearest [1]. These clusters
are used as target outputs for the prediction of emoji classes. It was found that 22 is the appropriate K for the 70 emojis considered in
the collected set of tweets. The corpus includes every level of Thai language usage, which means that the processed data can contain
suffixes, slang, and unknown words produced by the tokenization process. The vector representation helps in handling such unknown forms. In sum, this
research created a corpus of short messages collected from Twitter, grouped into 22 emoji classes. The corpus includes
7,825,857 messages prepared for emotion classification using 2 biLSTM layers. A table of emojis is proposed based
on Ekman's six basic emotions (anger, disgust, fear, joy, sadness, and surprise), and it was evaluated in both objective and subjective tests.
The results show that word vectors work well for classifying emotions through the use of emojis.
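
As a rough illustration of how emoji clusters could serve as target outputs, the sketch below strips known emojis from a tweet and assigns the tweet the cluster of its most frequent emoji. It is only a sketch under stated assumptions and not the procedure used in the study: the `EMOJI_CLUSTER` table, the function names `split_tweet` and `label_tweet`, and the majority-vote labelling rule are all hypothetical, and the real table would cover 70 emojis in 22 clusters.

```python
from collections import Counter

# Hypothetical emoji-to-cluster table (assumption for illustration only).
# The study groups 70 emojis into 22 clusters; only a few are shown here.
EMOJI_CLUSTER = {
    "😂": 0,
    "🤣": 0,
    "😭": 1,
    "😢": 1,
    "😡": 2,
}


def split_tweet(tweet):
    """Separate a tweet into its non-emoji text and the known emojis it contains.

    Only single-code-point emojis listed in EMOJI_CLUSTER are recognised;
    multi-code-point emoji sequences would need a dedicated tokenizer.
    """
    text_chars = []
    emojis = []
    for ch in tweet:
        if ch in EMOJI_CLUSTER:
            emojis.append(ch)
        else:
            text_chars.append(ch)
    return "".join(text_chars).strip(), emojis


def label_tweet(tweet):
    """Return (text, cluster_id), or None when the tweet has no known emoji.

    Assumption: the target class is the most frequent cluster among the
    tweet's emojis, with ties broken by first appearance.
    """
    text, emojis = split_tweet(tweet)
    if not emojis:
        return None
    counts = Counter(EMOJI_CLUSTER[e] for e in emojis)
    cluster_id, _ = counts.most_common(1)[0]
    return text, cluster_id
```

Each resulting pair of emoji-free text and cluster id could then be used as one training example for a classifier, while tweets without a known emoji are skipped.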