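Thiang, Steven (2025) Komparasi kinerja algoritme Naive Bayes Classifier dan K-Nearest Neighbor dalam analisis sentimen pada media sosial X dengan Vader Lexicon [Performance comparison of the Naive Bayes Classifier and K-Nearest Neighbor algorithms in sentiment analysis on social media X with the Vader Lexicon]. Bachelor thesis, Universitas Pelita Harapan.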
Available under License Creative Commons Attribution Non-commercial Share Alike.
Abstract
The growing use of social media to express public opinion has made X (formerly Twitter) an important data source for sentiment analysis. The ever-increasing volume of data, however, makes manual analysis inefficient and calls for automated methods that are both accurate and efficient. This study compares the performance of the Naïve Bayes Classifier and K-Nearest Neighbor (KNN) algorithms in classifying sentiment about the Value Added Tax (VAT; Pajak Pertambahan Nilai, PPN) increase on X. To support classification accuracy, sentiment labels are assigned automatically using the Vader Lexicon. The methodology comprises scraping data from X, automatic sentiment labeling, implementing and training the classification models, and evaluating performance with a confusion matrix and ROC curve. KNN with k = 1 performed best, with an accuracy of 93.19%, precision of 94.07%, recall of 92.96%, misclassification error of 6.81%, and AUC of 0.95. The Naïve Bayes Classifier achieved an accuracy of 88.29%, precision of 87.43%, recall of 86.67%, misclassification error of 11.71%, and AUC of 0.93. The study therefore concludes that KNN classifies sentiment more accurately and efficiently than the Naïve Bayes Classifier.
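The abstract reports accuracy, precision, recall, and misclassification error derived from a confusion matrix. As a minimal sketch of how those four quantities relate, the snippet below computes them for a binary case. The `ConfusionMatrix` class and the counts are hypothetical illustrations, not the thesis's code or data, and the thesis may have used more than two sentiment classes or averaged per-class scores, which the abstract does not specify.

```python
from dataclasses import dataclass


@dataclass
class ConfusionMatrix:
    """Binary confusion matrix (hypothetical helper, not from the thesis)."""
    tp: int  # true positives
    fp: int  # false positives
    fn: int  # false negatives
    tn: int  # true negatives

    @property
    def total(self) -> int:
        return self.tp + self.fp + self.fn + self.tn

    def accuracy(self) -> float:
        # Share of all predictions that are correct.
        return (self.tp + self.tn) / self.total

    def precision(self) -> float:
        # Share of predicted positives that are truly positive.
        return self.tp / (self.tp + self.fp)

    def recall(self) -> float:
        # Share of actual positives that were predicted positive.
        return self.tp / (self.tp + self.fn)

    def misclassification_error(self) -> float:
        # Share of all predictions that are wrong; equals 1 - accuracy.
        return (self.fp + self.fn) / self.total


# Made-up counts for illustration only.
cm = ConfusionMatrix(tp=45, fp=5, fn=10, tn=40)
print(f"accuracy  = {cm.accuracy():.2%}")
print(f"precision = {cm.precision():.2%}")
print(f"recall    = {cm.recall():.2%}")
print(f"error     = {cm.misclassification_error():.2%}")
```

Because misclassification error is the complement of accuracy, the two should sum to 100%. The reported figures are consistent with this: 93.19% + 6.81% = 100% for KNN and 88.29% + 11.71% = 100% for the Naïve Bayes Classifier.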
Item Type: | Thesis (Bachelor) |
---|---|
Creators: | Thiang, Steven (NIM: NIM03082210020; Email: steventhiang3@gmail.com; ORCID: UNSPECIFIED) |
Contributors: | Thesis advisor: Chandra, Wenripin (NIDN/NIDK: NIDN0116088001; Email: wenripin@lecturer.uph.edu) |
Uncontrolled Keywords: | Sentiment Analysis; Social Media X; Naïve Bayes Classifier; K-Nearest Neighbor; Vader Lexicon; Text Classification |
Subjects: | Q Science > QA Mathematics > QA75 Electronic computers. Computer science |
Depositing User: | Steven Thiang |
Date Deposited: | 18 Jul 2025 07:38 |
Last Modified: | 18 Jul 2025 07:38 |
URI: | http://repository.uph.edu/id/eprint/69722 |