Hayashi, Sergius Tadao (2019) Imputation Using Statistical and Machine Learning Methods in Forecasting Life Expectancy. Bachelor thesis, Universitas Pelita Harapan.
Text (Title): Title.pdf (1MB). Restricted to Registered users only.
Text (Abstract): Abstract.pdf (274kB).
Text (ToC): ToC.pdf (303kB).
Text (Chapter1): Chapter1.pdf (270kB).
Text (Chapter2): Chapter2.pdf (381kB). Restricted to Registered users only.
Text (Chapter3): Chapter3.pdf (581kB). Restricted to Registered users only.
Text (Chapter4): Chapter4.pdf (339kB). Restricted to Registered users only.
Text (Chapter5): Chapter5.pdf (1MB). Restricted to Registered users only.
Text (Chapter6): Chapter6.pdf (267kB). Restricted to Registered users only.
Text (Bibliography): Bibliography.pdf (317kB).
Text (Appendices): Appendices.pdf (836kB). Restricted to Repository staff only.
All files are available under License Creative Commons Attribution Non-commercial Share Alike.
Abstract
Collecting, processing, and reporting data are essential to decision making. Yet even in a well-designed and controlled study, missing data are not improbable. Missing data reduce the statistical power of a dataset and its usefulness for training machine learning models. This thesis compares six imputation methods, three statistical and three based on machine learning, on life expectancy data in order to determine an optimal method for data with this type of missingness pattern. The life expectancy data consist of 22 variables relating to the social, economic, and health conditions of 194 countries, collected from the World Health Organization's and World Bank's databases. An artificial dataset was built to simulate the missingness of the original dataset so that the performance of each method could be measured with error metrics. The artificial dataset mimics the original dataset's missingness patterns and the nullity correlation between variables. The imputed artificial dataset was evaluated by its mean squared error, mean absolute error, and mean absolute percentage error, while the original dataset was evaluated by the changes in its mean and variance. Surprisingly, given that the multi-layer perceptron had 10 iterations, the Hot-Deck and KNN methods showed the best results among the statistical and machine learning methods, respectively, with Hot-Deck slightly outperforming KNN.
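The evaluation idea described in the abstract (hide known values according to a missingness pattern, impute them, then score the imputations against the hidden truth) can be illustrated with a minimal sketch. This is not the thesis's code: the function names, the toy numbers, and the simplified KNN distance are all assumptions made for illustration, and only two of the six methods (mean and KNN imputation) are shown.

```python
import math

def apply_mask(complete, pattern):
    """Hide every cell whose pattern flag is True by replacing it with None."""
    return [[None if hide else v for v, hide in zip(row, flags)]
            for row, flags in zip(complete, pattern)]

def mean_impute(rows):
    """Fill each missing cell with the mean of the observed values in its column."""
    means = []
    for col in zip(*rows):
        observed = [v for v in col if v is not None]
        means.append(sum(observed) / len(observed) if observed else 0.0)
    return [[means[j] if v is None else v for j, v in enumerate(row)]
            for row in rows]

def knn_impute(rows, k=3):
    """Fill each missing cell with the average of that column over the k rows
    closest on the features both rows have observed (simplified assumption)."""
    fallback = mean_impute(rows)
    filled = [list(row) for row in rows]
    for i, row in enumerate(rows):
        for j, v in enumerate(row):
            if v is not None:
                continue
            candidates = []
            for m, other in enumerate(rows):
                if m == i or other[j] is None:
                    continue
                shared = [(a, b) for a, b in zip(row, other)
                          if a is not None and b is not None]
                if shared:
                    dist = math.sqrt(sum((a - b) ** 2 for a, b in shared) / len(shared))
                    candidates.append((dist, other[j]))
            candidates.sort(key=lambda pair: pair[0])
            nearest = [value for _, value in candidates[:k]]
            # Fall back to the column mean when no comparable row exists.
            filled[i][j] = sum(nearest) / len(nearest) if nearest else fallback[i][j]
    return filled

def error_metrics(truth, imputed, pattern):
    """MSE, MAE, and MAPE (in percent) over the hidden cells only."""
    pairs = [(t, p)
             for t_row, p_row, flags in zip(truth, imputed, pattern)
             for t, p, hide in zip(t_row, p_row, flags) if hide]
    if not pairs:
        raise ValueError("pattern hides no cells, nothing to score")
    n = len(pairs)
    return {
        "mse": sum((t - p) ** 2 for t, p in pairs) / n,
        "mae": sum(abs(t - p) for t, p in pairs) / n,
        "mape": 100.0 * sum(abs((t - p) / t) for t, p in pairs) / n,
    }

if __name__ == "__main__":
    # Hypothetical complete rows: (life expectancy, schooling years, GDP in thousands).
    complete = [[71.0, 10.0, 4.0], [82.0, 13.0, 40.0], [65.0, 7.0, 1.5],
                [78.0, 12.0, 15.0], [69.0, 9.0, 3.0]]
    # Hypothetical missingness pattern, standing in for one copied from real data.
    pattern = [[False, True, False], [False, False, True], [False, False, False],
               [True, False, False], [False, False, False]]
    masked = apply_mask(complete, pattern)
    for name, method in (("mean", mean_impute), ("knn", knn_impute)):
        print(name, error_metrics(complete, method(masked), pattern))
```

Because the distance here is computed on raw values, variables on larger scales dominate the neighbor search; a fuller treatment would standardize the columns first. MAPE also assumes that no true value is zero.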
Item Type: | Thesis (Bachelor)
---|---
Creators: |
Contributors: |
Additional Information: | SK 112-15 HAY i 2019; 31001000244419
Uncontrolled Keywords: | Missing Data; Imputation; Statistical Imputation; Machine Learning Imputation; Mean Imputation; Hot-Deck Imputation; Multiple Imputation; Multi-Layer Perceptron; Self-Organizing Map; K-Nearest Neighbor
Subjects: | Q Science > QA Mathematics
Divisions: | University Subject > Current > Faculty/School - UPH Karawaci > Faculty of Science and Technology > Mathematics; Current > Faculty/School - UPH Karawaci > Faculty of Science and Technology > Mathematics
Depositing User: | Nicholas Sio Pradiva
Date Deposited: | 09 Nov 2021 08:10
Last Modified: | 09 Nov 2021 08:10
URI: | http://repository.uph.edu/id/eprint/42926