Implementation of the TF-IDF algorithm on the comment section of the Cretivox YouTube account to detect hate speech

Jackie, Jackie (2022) Implementasi algoritma tf-idf pada kolom komentar akun Youtube Cretivox untuk mendeteksi ujaran kebencian. Bachelor thesis, Universitas Pelita Harapan.

Files:

All files are available under License Creative Commons Attribution Non-commercial Share Alike.

Description	File	Access restriction	Size
Text (Title)	Title.pdf	None stated	58kB
Text (Abstract)	Abstract.pdf	None stated	242kB
Text (ToC)	ToC.pdf	None stated	293kB
Text (Chapter1)	Chapter1.pdf	None stated	379kB
Text (Chapter2)	Chapter2.pdf	Restricted to Registered users only	595kB
Text (Chapter3)	Chapter3.pdf	Restricted to Registered users only	858kB
Text (Chapter4)	Chapter4.pdf	None stated	601kB
Text (Chapter5)	Chapter5.pdf	Restricted to Registered users only	198kB
Text (Bibliography)	Bibliography.pdf	None stated	298kB
Text (Appendices)	Appendices.pdf	Restricted to Repository staff only	1MB

Abstract

YouTube is an online video-sharing website that anyone can access and where anyone can leave remarks in the comment section of a video. Because commenting on YouTube is largely unrestricted, many people are unaware of the limits they should observe when leaving comments, and it is not uncommon to come across hate speech in a video's comment section. Yet only a few of us can reliably tell hate speech from non-hate speech. Using data mining, it is possible to build a hate speech detection system that reduces the amount of hate speech displayed in the YouTube comment section and distinguishes comments that indicate hate speech from those that do not. The methods used are Term Frequency-Inverse Document Frequency (TF-IDF) and Support Vector Machine (SVM). The first step is to collect comments, which the author then labels independently. The data then undergoes text preprocessing and is weighted using the TF-IDF method before the SVM algorithm is applied to detect hate speech. In the analysis carried out, TF-IDF weighting combined with the SVM algorithm, using 80% of the data for training and 20% for testing, produced an accuracy of 99%, a precision of 100%, a recall of 58.3%, and an f1-score of 73.68%.
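The TF-IDF weighting step named in the abstract can be illustrated with a minimal sketch. The abstract does not say which TF-IDF variant, tokenizer, or library the thesis used, so everything below is an assumption: the plain textbook form tf × log(N/df), whitespace tokenization, and hypothetical toy comments that are not taken from the thesis data. The helper `f1_from` is likewise only an illustration of how the f1-score relates to precision and recall.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (a stand-in for real preprocessing)."""
    return text.lower().split()


def tf_idf(documents):
    """Return one {term: weight} dict per document.

    Assumed variant: tf = count / document length, idf = log(N / df).
    Other variants (smoothing, normalization) exist and may differ.
    """
    tokenized = [tokenize(doc) for doc in documents]
    n_docs = len(tokenized)

    # Document frequency: number of documents containing each term.
    df = Counter()
    for tokens in tokenized:
        df.update(set(tokens))

    weights = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        weights.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights


def f1_from(precision, recall):
    """F1 is the harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Hypothetical toy comments, not taken from the thesis data.
comments = [
    "video ini bagus sekali",
    "video ini jelek sekali",
    "konten bagus",
]
weights = tf_idf(comments)
```

In this toy corpus a term that appears in only one comment ("jelek") receives a higher weight than a term shared by two comments ("video"), which is the property a downstream classifier such as an SVM would rely on. Calling `f1_from(1.0, 0.583)` gives roughly 0.737, which agrees with the reported f1-score of 73.68% up to rounding of the recall.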

Item Type:

Thesis (Bachelor)

Creators:

Creators	NIM	Email
Jackie, Jackie	NIM03082180006	leonarjackie@gmail.com

Contributors:

Contribution	Contributors	NIDN/NIDK	Email
Thesis advisor	Damanik, Rudolfo Rizki	NIDN0125049001	rudolfo.damanik@uph.edu

Uncontrolled Keywords:

web scraping; hate speech; tf-idf; svm

Subjects:

Q Science > QA Mathematics > QA75 Electronic computers. Computer science

Divisions:

University Subject > Current > Faculty/School - UPH Medan > School of Information Science and Technology > Informatics
Current > Faculty/School - UPH Medan > School of Information Science and Technology > Informatics

Date Deposited:

20 Aug 2022 12:01

Last Modified:

20 Aug 2022 12:01

URI:

http://repository.uph.edu/id/eprint/49194
