Abstract
Text-based communication has become a key means of interaction across many sectors. Previous studies have applied supervised learning algorithms to emotion classification in text using a variety of datasets, but this diversity also introduces a risk of overfitting. Cross-validation and hyperparameter optimization are therefore needed to ensure that a model generalizes. This research compares the performance of two supervised learning algorithms, Decision Tree (DT) and Support Vector Machine (SVM), for emotion classification on an English-language text dataset of 16,000 labeled entries (anger, fear, joy, love, sadness, surprise) sourced from Kaggle. The dataset undergoes cleaning, tokenization, stopword removal, and lemmatization, after which features are extracted using TF-IDF. Both algorithms are evaluated with K-Fold and Stratified K-Fold cross-validation, and accuracy, precision, recall, and F1-score are computed for each. The hyperparameter-tuned DT achieved an average accuracy of 88%, while the hyperparameter-tuned SVM achieved 89%. Stratified K-Fold cross-validation yielded an accuracy variance of just 0.02% for DT and 0.15% for SVM. The study concludes that Stratified K-Fold performs better than standard K-Fold on imbalanced datasets, and that the hyperparameter-tuned SVM outperforms the hyperparameter-tuned DT.
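The difference between K-Fold and Stratified K-Fold on imbalanced labels can be illustrated with a minimal sketch. It is not the study's implementation: the label counts, the function names, and the choice of five folds are assumptions made for illustration, and only the Python standard library is used. In practice a library such as scikit-learn offers `KFold` and `StratifiedKFold` for the same purpose.

```python
import random
from collections import Counter, defaultdict


def k_fold_indices(n_samples, k, seed=0):
    """Plain K-Fold: shuffle all indices, then deal them into k folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def stratified_k_fold_indices(labels, k, seed=0):
    """Stratified K-Fold: deal each class's indices into the folds separately,
    so every fold keeps roughly the overall class proportions."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    folds = [[] for _ in range(k)]
    for label in sorted(by_class):
        members = by_class[label]
        rng.shuffle(members)
        for position, idx in enumerate(members):
            folds[position % k].append(idx)
    return folds


def class_counts(labels, fold):
    """Count how many samples of each class a fold contains."""
    return Counter(labels[i] for i in fold)


# Hypothetical imbalanced label set (assumed for illustration, not the Kaggle data).
labels = ["joy"] * 50 + ["sadness"] * 30 + ["anger"] * 15 + ["surprise"] * 5

stratified = stratified_k_fold_indices(labels, k=5)
plain = k_fold_indices(len(labels), k=5)

for name, folds in (("stratified", stratified), ("plain", plain)):
    print(name)
    for fold in folds:
        print("  ", dict(sorted(class_counts(labels, fold).items())))
```

With these assumed counts, every stratified fold holds exactly 10 joy, 6 sadness, 3 anger, and 1 surprise sample. The plain folds depend on the shuffle, so a rare class such as surprise may be over-represented in one fold and missing from another, which is one plausible source of higher variance in per-fold accuracy.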