Analisis perbandingan metode naïve bayes dan metode support vector machine terhadap klasifikasi kepada spam email

Kaory, Nicholas (2024) Analisis perbandingan metode naïve bayes dan metode support vector machine terhadap klasifikasi kepada spam email. Bachelor thesis, Universitas Pelita Harapan.

Files (each available under License Creative Commons Attribution Non-commercial Share Alike):
Title.pdf, Text (Title), 39kB
Abstract.pdf, Text (Abstract), 343kB
Chapter1.pdf, Text (Chapter 1), 361kB
Chapter2.pdf, Text (Chapter 2), 468kB, restricted to registered users only
Chapter3.pdf, Text (Chapter 3), 800kB, restricted to registered users only
Chapter4.pdf, Text (Chapter 4), 634kB, restricted to registered users only
Chapter5.pdf, Text (Chapter 5), 291kB, restricted to registered users only
Bibliography.pdf, Text (Bibliography), 299kB
Appendices.pdf, Text (Appendices), 1MB, restricted to repository staff only

Abstract

Email is a communication tool that people use to deliver information quickly in digital form. Although email is meant to convey information to recipients, spam is also sent to users to steal information or to deliver viruses. Detection is therefore needed to separate legitimate email from spam. This study aims to build an application in Orange3 using the SVM and Naïve Bayes methods and to compare the two in order to find the more accurate one for classifying emails. The dataset, obtained from the Kaggle website, contains 5,157 records. Processing consists of data preprocessing, transformation to TF-IDF, and classification with the SVM and Naïve Bayes methods.
The evaluation shows that Naïve Bayes achieves an AUC of 0.949, an accuracy of 0.969, an F1 score of 0.970, a precision of 0.971, a recall of 0.971, and an MCC of 0.854, whereas SVM achieves an AUC of 0.950, an accuracy of 0.948, an F1 score of 0.950, a precision of 0.953, a recall of 0.948, and an MCC of 0.760. The results indicate that Naïve Bayes is the more accurate method for classifying spam emails: it scores higher on every reported metric except AUC, where SVM leads by 0.001.
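To make the reported metrics concrete, the following is a minimal sketch of how accuracy, precision, recall, F1 and MCC can be computed from a binary confusion matrix. It is not the thesis workflow, which was built in Orange3. The function name `binary_metrics` and the counts passed to it are hypothetical and chosen only for illustration. Orange3 may also report precision, recall and F1 averaged over both classes, so figures computed this way for the spam class alone need not match the values in the abstract.

```python
from math import sqrt


def binary_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Compute metrics for the positive (spam) class from confusion-matrix counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    # Guard against empty denominators so degenerate inputs do not raise.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    # Matthews correlation coefficient: uses all four cells, so it stays
    # informative when one class (here, legitimate email) dominates.
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "mcc": mcc,
    }


# Hypothetical counts, NOT taken from the thesis:
# 80 spam caught, 10 legitimate emails flagged, 20 spam missed, 890 legitimate passed.
scores = binary_metrics(tp=80, fp=10, fn=20, tn=890)
for name, value in scores.items():
    print(f"{name}: {value:.3f}")
```

With imbalanced classes like these, accuracy can stay high while MCC drops noticeably. This is one plausible reason why the gap between the two methods in the abstract is larger for MCC than for accuracy.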
Item Type: Thesis (Bachelor)
Creators:
Creators | NIM | Email | ORCID
Kaory, Nicholas | NIM03082200015 | nick.kaory@gmail.com | UNSPECIFIED
Contributors:
Contribution | Contributors | NIDN/NIDK | Email
Thesis advisor | Romindo, Romindo | NIDN0111119101 | romindo@uph.edu
Uncontrolled Keywords: Email; classification; naïve Bayes; spam; SVM
Subjects: Q Science > QA Mathematics > QA75 Electronic computers. Computer science
Divisions: University Subject > Current > Faculty/School - UPH Medan > School of Information Science and Technology > Informatics
Current > Faculty/School - UPH Medan > School of Information Science and Technology > Informatics
Depositing User: Nicholas Kaory
Date Deposited: 25 Apr 2025 01:34
Last Modified: 25 Apr 2025 01:34
URI: http://repository.uph.edu/id/eprint/68197
