Comparison of Four Methods for Handing Missing Data in Longitudinal Data Analysis through a Simulation Study
Author affiliation: Biostatistics & Data Management, Regeneron Pharmaceuticals Inc., Basking Ridge, USA
Publication: Open Journal of Statistics
Year, volume, issue: 2014, Vol. 4, No. 11
Pages: 933-944
Subject classification: 1002 [Medicine - Clinical Medicine] 100214 [Medicine - Oncology] 10 [Medicine]
Topics: MCAR, MAR, Complete Case, Mean Substitution, LOCF, Multiple Imputation
Abstract: Missing data occur frequently in longitudinal data analysis, and many methods have been proposed in the literature to handle them. Complete case (CC), mean substitution (MS), last observation carried forward (LOCF), and multiple imputation (MI) are the four methods most frequently used in practice. In a real-world analysis, the data can be missing completely at random (MCAR), missing at random (MAR), or missing not at random (MNAR), depending on why they are missing. In this paper, simulations under various settings (missing mechanisms, missing rates, and slope sizes) were conducted to evaluate the performance of the four methods, using bias, RMSE, and 95% coverage probability as evaluation criteria. The results showed that LOCF has the largest bias and the poorest 95% coverage probability in most cases under both the MAR and MCAR missing mechanisms. Hence, LOCF should not be used in longitudinal data analysis. Under the MCAR mechanism, the CC and MI methods performed equally well. Under the MAR mechanism, MI had the smallest bias, the smallest RMSE, and the best 95% coverage probability. Therefore, either CC or MI is appropriate under MCAR, while MI is the more reliable and better-grounded statistical method under MAR.
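As an illustration of three of the four methods named in the abstract, the following is a minimal sketch in plain Python, not the authors' simulation code. The function names, the list-of-rows layout (one row per subject, one column per visit, `None` for a missing value), and the toy numbers are all assumptions made for this example. Multiple imputation is left out because it requires fitting an imputation model, drawing several completed datasets, and pooling the estimates, which does not fit in a short sketch.

```python
from statistics import mean

def complete_case(rows):
    """CC: keep only subjects with every visit observed."""
    return [r for r in rows if all(v is not None for v in r)]

def mean_substitution(rows):
    """MS: replace a missing value with the observed mean of its visit.

    Assumes every visit has at least one observed value; otherwise
    statistics.mean raises StatisticsError.
    """
    n_visits = len(rows[0])
    visit_means = [
        mean(r[j] for r in rows if r[j] is not None) for j in range(n_visits)
    ]
    return [
        [visit_means[j] if v is None else v for j, v in enumerate(r)]
        for r in rows
    ]

def locf(rows):
    """LOCF: carry each subject's last observed value forward.

    A gap before the first observed visit stays missing (None).
    """
    filled_rows = []
    for r in rows:
        last, filled = None, []
        for v in r:
            if v is not None:
                last = v
            filled.append(last)
        filled_rows.append(filled)
    return filled_rows

# Hypothetical toy data: three subjects, three visits, rising trend.
rows = [
    [1.0, 2.0, 3.0],
    [2.0, None, None],   # dropped out after visit 1
    [0.0, 1.0, None],    # dropped out after visit 2
]

cc = complete_case(rows)      # only the first subject remains
ms = mean_substitution(rows)  # gaps filled with per-visit means
lo = locf(rows)               # gaps filled with each subject's last value
```

In this toy data the second subject becomes `[2.0, 2.0, 2.0]` under LOCF, a flat trajectory. Freezing values after dropout in this way pulls an estimated slope toward zero when the true trend is rising, which is in line with the large LOCF bias the abstract reports.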