Predicting kindergarten, primary, and secondary level student retention at a private school in Jakarta using machine learning

Panggabean, Abiella (2023) Predicting kindergarten, primary, and secondary level student retention at a private school in Jakarta using machine learning. Masters thesis, Universitas Pelita Harapan.

Files (all available under License Creative Commons Attribution Non-commercial Share Alike):
Title: Cover Tesis - Abiella N.A.P. Panggabean - MI.pdf (109kB)
Abstract: Abstract.pdf (287kB)
ToC: ToC.pdf (532kB)
Chapter1: Chapter1.pdf (609kB)
Chapter2: Chapter2.pdf (1MB), restricted to registered users only
Chapter3: Chapter3.pdf (1MB), restricted to registered users only
Chapter4: Chapter4.pdf (1MB), restricted to registered users only
Chapter5: Chapter5.pdf (283kB), restricted to registered users only
Bibliography: Bibliography.pdf (655kB)
Appendices: Appendices.pdf (1MB), restricted to repository staff only

Abstract

Customer retention matters to companies in many industries, including education. Schools need student retention prediction because it identifies the students most likely to re-enroll in the next academic year and helps school management evaluate and improve its services and strategy. This research presents a case study of a private school in Jakarta, Indonesia, that has a student retention target. The data was collected from that school, and the experiment was carried out in a tool called Orange3. After data pre-processing, models were trained using Random Forest (RF), Logistic Regression (LR), Support Vector Machine (SVM), and Neural Network (NN). The experiment also searched for the best predictor using Gini impurity. Although all machine learning models in this research generally showed good results, the analysis reveals that Random Forest was the best model for predicting student retention, with an accuracy score of 0.891 in the training phase and 0.917 in the testing phase. The most important feature, and the best predictor of student retention, was the customer service team's observation of parents' daily behavior.
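The abstract ranks predictors by Gini impurity but does not spell out the calculation. A minimal sketch of the idea in plain Python follows. It is an illustration, not the thesis's Orange3 workflow: the feature names (`parent_behavior`, `sibling_enrolled`), the toy rows, and the helper functions are all hypothetical, and it scores a single split per feature rather than averaging impurity decreases over the trees of a Random Forest.

```python
from collections import Counter


def gini_impurity(labels):
    """Gini impurity 1 - sum(p_k^2) over the classes present in labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = Counter(labels)
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


def gini_decrease(rows, feature, target):
    """Drop in Gini impurity after grouping rows by a categorical feature."""
    parent = gini_impurity([r[target] for r in rows])
    groups = {}
    for r in rows:
        groups.setdefault(r[feature], []).append(r[target])
    # Weight each group's impurity by its share of the rows.
    weighted = sum(len(g) / len(rows) * gini_impurity(g) for g in groups.values())
    return parent - weighted


def rank_features(rows, features, target):
    """Sort features from largest to smallest impurity decrease."""
    return sorted(features, key=lambda f: gini_decrease(rows, f, target), reverse=True)


# Hypothetical toy data: 1 = student retained, 0 = not retained.
retained = [1, 1, 1, 1, 1, 0, 0, 0]
behavior = ["positive"] * 5 + ["negative"] * 3
sibling = ["yes", "no", "yes", "no", "yes", "no", "yes", "no"]
rows = [
    {"parent_behavior": b, "sibling_enrolled": s, "retained": y}
    for b, s, y in zip(behavior, sibling, retained)
]

ranking = rank_features(rows, ["sibling_enrolled", "parent_behavior"], "retained")
print(ranking)
```

In this toy data `parent_behavior` separates the two classes perfectly, so its impurity decrease equals the full parent impurity and it ranks first; a feature that barely changes the class mix within its groups scores close to zero.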
Item Type: Thesis (Masters)
Creators: Panggabean, Abiella (NIM: 01679210009; Email: abiella.andrea@gmail.com; ORCID: unspecified)
Contributors: Thesis advisor: Yugopuspito, Pujianto (NIDN: 0324086701; Email: yugopuspito@uph.edu)
Uncontrolled Keywords: student retention ; educational data mining ; random forest ; logistic regression ; support vector machine ; neural network
Subjects: Q Science > QA Mathematics > QA75 Electronic computers. Computer science
Divisions: University Subject > Current > Faculty/School - UPH Karawaci > School of Information Science and Technology > Master of Informatics
Current > Faculty/School - UPH Karawaci > School of Information Science and Technology > Master of Informatics
Date Deposited: 01 Aug 2023 05:48
Last Modified: 01 Sep 2023 02:45
URI: http://repository.uph.edu/id/eprint/57196
