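Kaory, Nicholas (2024) Analisis perbandingan metode naïve bayes dan metode support vector machine terhadap klasifikasi kepada spam email [Comparative analysis of the naïve Bayes method and the support vector machine method for spam email classification]. Bachelor thesis, Universitas Pelita Harapan.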
Title.pdf
Available under License Creative Commons Attribution Non-commercial Share Alike.
Download (39kB)
Abstract.pdf
Available under License Creative Commons Attribution Non-commercial Share Alike.
Download (343kB)
ToC.pdf
Available under License Creative Commons Attribution Non-commercial Share Alike.
Download (326kB)
Chapter1.pdf
Available under License Creative Commons Attribution Non-commercial Share Alike.
Download (361kB)
Chapter2.pdf
Restricted to Registered users only
Available under License Creative Commons Attribution Non-commercial Share Alike.
Download (468kB)
Chapter3.pdf
Restricted to Registered users only
Available under License Creative Commons Attribution Non-commercial Share Alike.
Download (800kB)
Chapter4.pdf
Restricted to Registered users only
Available under License Creative Commons Attribution Non-commercial Share Alike.
Download (634kB)
Chapter5.pdf
Restricted to Registered users only
Available under License Creative Commons Attribution Non-commercial Share Alike.
Download (291kB)
Bibliography.pdf
Available under License Creative Commons Attribution Non-commercial Share Alike.
Download (299kB)
Appendices.pdf
Restricted to Repository staff only
Available under License Creative Commons Attribution Non-commercial Share Alike.
Download (1MB)
Abstract
Email is a communication tool that people use to deliver information quickly in
digital form. Alongside legitimate messages, however, users also receive spam
intended to steal information or deliver viruses, so detection is needed to
separate legitimate email from spam. This study builds an application in Orange3
using the support vector machine (SVM) and Naïve Bayes methods and compares the
two to determine which classifies email more accurately. The dataset, obtained
from the Kaggle website, contains 5,157 records. Processing consists of data
preprocessing, transformation to TF-IDF, and classification with SVM and Naïve
Bayes. In the evaluation, Naïve Bayes achieved an AUC of 0.949, an accuracy of
0.969, an F1 score of 0.970, a precision of 0.971, a recall of 0.971, and an MCC
of 0.854, whereas SVM achieved an AUC of 0.950, an accuracy of 0.948, an F1
score of 0.950, a precision of 0.953, a recall of 0.948, and an MCC of 0.760.
The results indicate that Naïve Bayes is the more accurate method for
classifying spam emails: it scores higher than SVM on every reported metric
except AUC, where the two are nearly equal.
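The TF-IDF transformation and the evaluation metrics named in the abstract can be illustrated with a minimal Python sketch. It is not the Orange3 workflow used in the thesis: the function names `tf_idf` and `binary_metrics`, the toy messages, and the toy labels are all assumptions made for illustration, and the weighting shown (raw term frequency times log(N/df)) may differ from the smoothing or normalization Orange3 applies. The abstract also does not state how the thesis averaged precision, recall, and F1 across classes, so the sketch simply treats "spam" as the positive class. AUC is omitted because it requires ranked scores rather than hard labels.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: weight} dict per document.

    Assumed weighting: raw term frequency times log(N / df).
    Other toolkits may smooth or normalize differently.
    """
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents containing each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    weights = []
    for tokens in tokenized:
        tf = Counter(tokens)
        weights.append({t: tf[t] * math.log(n_docs / df[t]) for t in tf})
    return weights


def binary_metrics(y_true, y_pred, positive="spam"):
    """Accuracy, precision, recall, F1 and MCC for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    # MCC uses all four confusion-matrix cells, so it penalizes
    # errors on the minority class more visibly than accuracy does.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "mcc": mcc,
    }


# Toy data, invented purely for illustration (not from the thesis dataset).
docs = ["win free prize now", "meeting at noon", "free prize inside"]
weights = tf_idf(docs)

y_true = ["spam", "ham", "spam", "ham", "ham", "spam"]
y_pred = ["spam", "ham", "ham", "ham", "spam", "spam"]
scores = binary_metrics(y_true, y_pred)
```

In the toy corpus, a term that appears in only one message ("win") receives a larger weight than one shared by two messages ("free"), which is the property that makes TF-IDF useful as classifier input. Because MCC draws on the whole confusion matrix, two classifiers with similar accuracy can still differ noticeably in MCC, as the figures reported in the abstract do.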
Item Type: | Thesis (Bachelor) |
---|---|
Creators: | Kaory, Nicholas (NIM: NIM03082200015; Email: nick.kaory@gmail.com; ORCID: UNSPECIFIED) |
Contributors: | Thesis advisor: Romindo, Romindo (NIDN/NIDK: NIDN0111119101; Email: romindo@uph.edu) |
Uncontrolled Keywords: | Email; classification; naïve Bayes; spam; SVM |
Subjects: | Q Science > QA Mathematics > QA75 Electronic computers. Computer science |
Divisions: | University Subject > Current > Faculty/School - UPH Medan > School of Information Science and Technology > Informatics; Current > Faculty/School - UPH Medan > School of Information Science and Technology > Informatics |
Depositing User: | Nicholas Kaory |
Date Deposited: | 25 Apr 2025 01:34 |
Last Modified: | 25 Apr 2025 01:34 |
URI: | http://repository.uph.edu/id/eprint/68197 |