I just tried missing data imputation with several packages (Hmisc, mice, missForest, Amelia). How can I compare the results from each package and find out which one is best?
There's a section on comparing packages at the end of Thomas Leeper's Multiple Imputation tutorial, with code. See the post for details, but he describes the approach as follows:
It is useful at this point to compare the coefficients from each of our multiple imputation methods. To do so, we'll pull out the coefficients from each of the three packages' results, our original observed results (with case deletion), and the results for the real data-generating process (before we introduced missingness).
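To make that idea concrete, here is a minimal base-R sketch of the same kind of coefficient comparison. It is not Leeper's code. The simulated data, the variable names, and the two simple stand-in imputations (mean and single regression imputation) are all assumptions for illustration. In a real comparison you would swap in the pooled estimates from each package you tried.

```r
set.seed(42)

# Simulate a known data-generating process: y = 1 + 2*x1 - 1*x2 + noise
n  <- 500
x1 <- rnorm(n)
x2 <- 0.5 * x1 + rnorm(n)
y  <- 1 + 2 * x1 - 1 * x2 + rnorm(n)
full <- data.frame(y, x1, x2)

# Introduce missingness in x2 completely at random (30% of rows)
miss <- full
miss$x2[sample(n, size = round(0.3 * n))] <- NA

# Stand-in imputation 1 (assumption): fill missing x2 with the observed mean
mean_imp <- miss
mean_imp$x2[is.na(mean_imp$x2)] <- mean(miss$x2, na.rm = TRUE)

# Stand-in imputation 2 (assumption): predict missing x2 from x1 and y
reg_imp <- miss
fit_x2  <- lm(x2 ~ x1 + y, data = miss)  # lm() drops incomplete rows by default
na_rows <- is.na(reg_imp$x2)
reg_imp$x2[na_rows] <- predict(fit_x2, newdata = reg_imp[na_rows, ])

# Fit the same analysis model to every version of the data
model <- y ~ x1 + x2
coef_table <- rbind(
  true_dgp      = c(1, 2, -1),
  full_data     = coef(lm(model, data = full)),
  case_deletion = coef(lm(model, data = miss)),
  mean_impute   = coef(lm(model, data = mean_imp)),
  reg_impute    = coef(lm(model, data = reg_imp))
)
colnames(coef_table) <- c("(Intercept)", "x1", "x2")
print(round(coef_table, 3))
```

Each row is one way of handling the missing data, so you can see at a glance which rows sit closest to the `true_dgp` row. Note that the two stand-ins are single imputations, so unlike proper multiple imputation they do not reflect imputation uncertainty in the standard errors.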
What constitutes the "best" likely depends on a number of factors, including the nature of your data/analysis. See, for example (don't worry, they're all Open Access):
- Olanrewaju Akande, Fan Li & Jerome Reiter (2017) An Empirical Comparison of Multiple Imputation Methods for Categorical Data, The American Statistician, 71:2, 162-170, DOI: 10.1080/00031305.2016.1277158
- Schmitt P, Mandel J, Guedj M (2015) A Comparison of Six Methods for Missing Data Imputation. J Biom Biostat 6:224. doi: 10.4172/2155-6180.1000224
- Beck, Marcus W, Neeraj Bokde, Gualberto Asencio-Cortés, and Kishore Kulat. 2018. “R Package imputeTestbench to Compare Imputation Methods for Univariate Time Series.” The R Journal 10 (July): 218–33. https://journal.r-project.org/archive/2018/RJ-2018-024/RJ-2018-024.pdf
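If you care more about how close the imputed values are to the truth than about model coefficients, one common setup is to hide values you actually know, impute them, and score the error. The base-R sketch below illustrates that idea. It is not necessarily the exact protocol used in the papers above. The data, the helper functions `impute_mean`, `impute_regression`, and `rmse`, and the choice of RMSE as the metric are assumptions for illustration.

```r
set.seed(1)

# Hypothetical complete data with two correlated numeric columns
n <- 300
a <- rnorm(n, mean = 10, sd = 2)
b <- 3 + 0.8 * a + rnorm(n)
truth <- data.frame(a, b)

# Mask 20% of column b so the true values are known for scoring
masked_idx <- sample(n, size = round(0.2 * n))
masked <- truth
masked$b[masked_idx] <- NA

# Candidate imputers: each takes a data frame and returns it with b filled in
impute_mean <- function(d) {
  d$b[is.na(d$b)] <- mean(d$b, na.rm = TRUE)
  d
}
impute_regression <- function(d) {
  fit <- lm(b ~ a, data = d)  # fitted on the rows where b is observed
  na_rows <- is.na(d$b)
  d$b[na_rows] <- predict(fit, newdata = d[na_rows, , drop = FALSE])
  d
}

# Root mean squared error, computed on the masked cells only
rmse <- function(imputed) {
  sqrt(mean((imputed$b[masked_idx] - truth$b[masked_idx])^2))
}

scores <- c(
  mean       = rmse(impute_mean(masked)),
  regression = rmse(impute_regression(masked))
)
print(round(scores, 3))
```

To compare real packages, you would wrap each one in a function with the same shape as `impute_mean` and add it to `scores`. Lower RMSE means the imputed values are closer to the hidden truth, but which metric matters still depends on your data and analysis.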
Thank you for your response.