Imputation Using Statistical and Machine Learning Methods in Forecasting Life Expectancy

Hayashi, Sergius Tadao (2019) Imputation Using Statistical and Machine Learning Methods in Forecasting Life Expectancy. Bachelor thesis, Universitas Pelita Harapan.

Files (all available under License Creative Commons Attribution Non-commercial Share Alike):
Title.pdf, Text (Title), 1MB, restricted to registered users only
Abstract.pdf, Text (Abstract), 274kB, preview available
ToC.pdf, Text (ToC), 303kB, preview available
Chapter1.pdf, Text (Chapter1), 270kB, preview available
Chapter2.pdf, Text (Chapter2), 381kB, restricted to registered users only
Chapter3.pdf, Text (Chapter3), 581kB, restricted to registered users only
Chapter4.pdf, Text (Chapter4), 339kB, restricted to registered users only
Chapter5.pdf, Text (Chapter5), 1MB, restricted to registered users only
Chapter6.pdf, Text (Chapter6), 267kB, restricted to registered users only
Bibliography.pdf, Text (Bibliography), 317kB, preview available
Appendices.pdf, Text (Appendices), 836kB, restricted to repository staff only

Abstract

Collecting, processing and reporting data is essential in making decisions. Yet even in a well-designed and controlled study, missing data is not improbable. Missing data decreases the statistical power of a dataset and its training power for machine learning purposes. This thesis compares six imputation methods, three statistical and three machine learning, on life expectancy data to determine an optimal method for cases with this type of missingness pattern. The life expectancy data consist of 22 variables relating to social, economic and health conditions of 194 countries, collected from the World Health Organization's and World Bank's databases. An artificial dataset was built to simulate the missingness of the original dataset so that the performance of each method could be measured by error metrics. The artificial dataset mimics the original dataset's missingness patterns and the nullity correlation between variables. The imputed artificial dataset was evaluated through its mean squared error, mean absolute error, and mean absolute percentage error, while the imputed original dataset was evaluated through changes in its mean and variance. Surprisingly, given that the multi-layer perceptron had 10 iterations, the Hot-Deck and KNN methods showed the best results among the statistical and machine learning methods, respectively, with Hot-Deck slightly outperforming KNN.
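
The evaluation idea described in the abstract (hide known values, impute them, then score the imputed values against the truth) can be sketched as follows. This is a minimal illustration and not the thesis's implementation: the data are synthetic, the masking is uniformly random rather than mimicking the original missingness patterns and nullity correlation, only mean imputation is shown, and the function names mask_values, mean_impute and error_metrics are hypothetical.

```python
import random
from statistics import mean


def mask_values(column, missing_rate, rng):
    """Return a copy of column with a random share of entries set to None,
    plus the sorted indices that were hidden."""
    n_missing = int(round(missing_rate * len(column)))
    missing_idx = set(rng.sample(range(len(column)), n_missing))
    masked = [None if i in missing_idx else v for i, v in enumerate(column)]
    return masked, sorted(missing_idx)


def mean_impute(column):
    """Replace every None with the mean of the observed entries."""
    fill = mean(v for v in column if v is not None)
    return [fill if v is None else v for v in column]


def error_metrics(truth, imputed, missing_idx):
    """Compute MSE, MAE and MAPE (in percent) over the hidden entries only."""
    errors = [imputed[i] - truth[i] for i in missing_idx]
    mse = mean(e * e for e in errors)
    mae = mean(abs(e) for e in errors)
    mape = 100.0 * mean(abs(imputed[i] - truth[i]) / abs(truth[i])
                        for i in missing_idx)
    return {"mse": mse, "mae": mae, "mape": mape}


rng = random.Random(0)
# Synthetic stand-in for one variable (assumption: values between 50 and 85).
truth = [rng.uniform(50.0, 85.0) for _ in range(200)]
masked, missing_idx = mask_values(truth, 0.2, rng)
imputed = mean_impute(masked)
metrics = error_metrics(truth, imputed, missing_idx)
print(metrics)
```

Scoring only the hidden entries keeps the metrics from being diluted by values that were never missing. Comparing several methods would amount to swapping mean_impute for another imputer while keeping the same mask.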

Item Type: Thesis (Bachelor)
Creators: Hayashi, Sergius Tadao (NIM: NIM00000013162; Email: UNSPECIFIED)
Contributors:
Thesis advisor: Saputra, Kie Van Ivanky (NIDN/NIDK: NIDN0401038203; Email: UNSPECIFIED)
Thesis advisor: Ferdinand, Ferry Vincenttius (NIDN/NIDK: NIDN0323059001; Email: UNSPECIFIED)
Additional Information: SK 112-15 HAY i 2019; 31001000244419
Uncontrolled Keywords: Missing Data; Imputation; Statistical Imputation; Machine Learning Imputation; Mean Imputation; Hot-Deck Imputation; Multiple Imputation; Multi-Layer Perceptron; Self-Organizing Map; K-Nearest Neighbor
Subjects: Q Science > QA Mathematics
Divisions: University Subject > Current > Faculty/School - UPH Karawaci > Faculty of Science and Technology > Mathematics
Current > Faculty/School - UPH Karawaci > Faculty of Science and Technology > Mathematics
Depositing User: Nicholas Sio Pradiva
Date Deposited: 09 Nov 2021 08:10
Last Modified: 09 Nov 2021 08:10
URI: http://repository.uph.edu/id/eprint/42926
