APPLICATION OF WESTCLASS FOR SENTIMENT ANALYSIS OF THE 12% VAT POLICY BASED ON X SOCIAL MEDIA DATA
DOI:
https://doi.org/10.36499/psnst.v15i1.14863
Abstract
The increase of the Value Added Tax (VAT) rate to 12% drew a broad public response on the social media platform X, making sentiment analysis a useful way to understand public perception. Because most of the available data is unlabeled, however, a fully supervised learning approach is suboptimal. This study applies WeSTClass, a weakly supervised learning method that uses seed words and pseudo-documents generated from the von Mises–Fisher (vMF) distribution to build initial representations without requiring a large amount of labeled data. The dataset consists of 13,962 tweets related to the 12% VAT issue, of which 2,980 were manually labeled by an expert. Pseudo-document generation produced 150 pseudo-documents that enrich the semantic distribution of each sentiment class. A CNN served as the main classifier, trained through a pre-training stage followed by self-training on high-confidence pseudo-labels. In the evaluation, the CNN performed best, with an accuracy of 0.83 and a macro-F1 of 0.72, outperforming BiLSTM and SVM. These findings indicate that weak supervision through WeSTClass is effective in overcoming the scarcity of labeled data and in improving model stability for social media sentiment analysis.
Keywords: Sentiment Analysis, CNN, 12% VAT, Pseudo-Document, WeSTClass
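
The self-training stage mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not show the study's implementation, so everything below is an assumption: the function names `sharpen_targets` and `select_confident`, the 0.9 confidence threshold, and the toy probabilities are hypothetical. The sharpening rule follows the target distribution commonly described for WeSTClass self-training, where each predicted probability is squared and divided by the soft class frequency before renormalizing.

```python
def sharpen_targets(probs):
    """Sharpen predicted class probabilities into self-training targets.

    Assumed rule: q_ij is proportional to p_ij**2 / f_j, where
    f_j = sum_i p_ij is the soft frequency of class j.
    """
    n_classes = len(probs[0])
    # Soft class frequencies f_j over all documents.
    freq = [sum(row[j] for row in probs) for j in range(n_classes)]
    targets = []
    for row in probs:
        weights = [row[j] ** 2 / freq[j] for j in range(n_classes)]
        total = sum(weights)
        # Renormalize so each target row sums to 1.
        targets.append([w / total for w in weights])
    return targets


def select_confident(probs, threshold=0.9):
    """Return (index, label) pairs whose top probability meets the threshold.

    The threshold value is a hypothetical choice, not taken from the study.
    """
    selected = []
    for i, row in enumerate(probs):
        label = max(range(len(row)), key=row.__getitem__)
        if row[label] >= threshold:
            selected.append((i, label))
    return selected


# Toy predictions for three tweets over three sentiment classes (made up).
probs = [
    [0.95, 0.03, 0.02],
    [0.40, 0.35, 0.25],
    [0.05, 0.05, 0.90],
]
targets = sharpen_targets(probs)
confident = select_confident(probs, threshold=0.9)
```

In this sketch, `confident` keeps only the first and third tweets as pseudo-labeled examples, while `targets` gives soft labels that a classifier could be refit against in each self-training round.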