Predicting kindergarten, primary, and secondary level student retention at a private school in Jakarta using machine learning

Panggabean, Abiella (2023) Predicting kindergarten, primary, and secondary level student retention at a private school in Jakarta using machine learning. Masters thesis, Universitas Pelita Harapan.

Files (all available under License Creative Commons Attribution Non-commercial Share Alike):
Text (Title): Cover Tesis - Abiella N.A.P. Panggabean - MI.pdf (109kB)
Text (Abstract): Abstract.pdf (287kB)
Text (ToC): ToC.pdf (532kB)
Text (Chapter1): Chapter1.pdf (609kB)
Text (Chapter2): Chapter2.pdf (1MB), restricted to registered users only
Text (Chapter3): Chapter3.pdf (1MB), restricted to registered users only
Text (Chapter4): Chapter4.pdf (1MB), restricted to registered users only
Text (Chapter5): Chapter5.pdf (283kB), restricted to registered users only
Text (Bibliography): Bibliography.pdf (655kB)
Text (Appendices): Appendices.pdf (1MB), restricted to repository staff only

Abstract

Customer retention is important for companies in many industries, including education. Predicting student retention is valuable for schools because it can identify the students who are most likely to re-enroll in the next academic year and help school management evaluate and improve its services and strategy. This research presents a case study of a private school in Jakarta, Indonesia, that has a student retention target. The data was collected from the school, and the experiment was carried out with a tool called Orange3. After data pre-processing, models were trained using Random Forest (RF), Logistic Regression (LR), Support Vector Machine (SVM), and Neural Network (NN). The experiment also searched for the best predictor using Gini impurity. Although all machine learning models in this research generally showed good results, the analysis reveals that Random Forest was the best model for predicting student retention, achieving an accuracy score of 0.891 in the training phase and 0.917 in the testing phase. The most important feature, and the best predictor of student retention, was the customer service team's observation of parents' daily behavior.
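As a rough illustration of how Gini impurity can rank predictors, the sketch below computes the impurity of a label set and the impurity decrease obtained by splitting on a categorical feature. It is a hand-rolled example under stated assumptions, not the thesis's Orange3 workflow: the feature names (`parent_behavior`, `grade_level`), the toy records, and the multiway split on every distinct value are all hypothetical.

```python
from collections import Counter


def gini_impurity(labels):
    """Gini impurity 1 - sum(p_k^2) over the class proportions p_k."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = Counter(labels)
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


def gini_decrease(feature_values, labels):
    """Impurity decrease from splitting the labels by each distinct feature value."""
    n = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    # Weighted average impurity of the child groups after the split
    weighted = sum(len(g) / n * gini_impurity(g) for g in groups.values())
    return gini_impurity(labels) - weighted


# Hypothetical toy records (NOT the thesis data): 1 = retained, 0 = not retained
retained = [1, 1, 1, 1, 0, 0, 1, 0]
features = {
    "parent_behavior": ["good", "good", "good", "good", "poor", "poor", "good", "poor"],
    "grade_level": ["K", "P", "S", "K", "P", "S", "P", "K"],
}

# Rank features by impurity decrease, largest first
ranking = sorted(
    features,
    key=lambda name: gini_decrease(features[name], retained),
    reverse=True,
)
print(ranking)
```

In this toy data the hypothetical `parent_behavior` feature separates the two classes perfectly, so it ranks first; real data would rarely be this clean.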

Item Type: Thesis (Masters)
Creators:
Creators | NIM | Email
Panggabean, Abiella | NIM01679210009 | abiella.andrea@gmail.com
Contributors:
Contribution | Contributors | NIDN/NIDK | Email
Thesis advisor | Yugopuspito, Pujianto | NIDN0324086701 | yugopuspito@uph.edu
Uncontrolled Keywords: student retention ; educational data mining ; random forest ; logistic regression ; support vector machine ; neural network
Subjects: Q Science > QA Mathematics > QA75 Electronic computers. Computer science
Divisions: University Subject > Current > Faculty/School - UPH Karawaci > School of Information Science and Technology > Master of Informatics
Current > Faculty/School - UPH Karawaci > School of Information Science and Technology > Master of Informatics
Date Deposited: 01 Aug 2023 05:48
Last Modified: 01 Sep 2023 02:45
URI: http://repository.uph.edu/id/eprint/57196
