
On Hierarchical Multiple Imputation Method for Handling Missing Data
Journal of Mahani Mathematical Research
Volume 10, Issue 2, Dey 2021, Pages 103-114
Article Type: Research Paper
DOI: 10.22103/jmmrc.2021.17749.1153
Authors
Ayyub Sheikhi*¹; Alireza Arabpour¹; Khosravi Mohsen¹; Mashallah Mashinchi¹; Reza Pourmousa¹; Mohsen Rezapour¹; Mohammad Javad Roastami²; Amin Abbdollah Nejad³; Abed Badakhshan¹
¹ Department of Statistics, Faculty of Mathematics and Computer, Shahid Bahonar University of Kerman, Kerman, Iran
² Department of Computer Engineering, Shahid Bahonar University of Kerman, Kerman, Iran; and Kerman Chamber of Commerce, Industries, Mines and Agriculture, Kerman, Iran
³ Kerman Chamber of Commerce, Industries, Mines and Agriculture, Kerman, Iran
Abstract
In this work, we apply a multiple imputation technique to handle missing observations. We propose an algorithm that performs hierarchical multiple imputation, using editing rules to impute missing values. We assess the algorithm in a simulation study and illustrate it with a numerical application to a dataset from the Kerman Chamber of Commerce, Industries, Mines and Agriculture.
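The abstract does not spell out the algorithm, so the following is only a minimal sketch of the general idea of multiple imputation constrained by editing rules, not the authors' hierarchical procedure. It assumes a simple hot-deck draw from observed donor values, rejection of draws that violate an edit rule, and pooling of a column mean with Rubin's rules. The column names, the rules, and the helpers `impute_once` and `pool_mean` are hypothetical and chosen for illustration.

```python
import random
import statistics

def impute_once(rows, rules, rng, max_tries=1000):
    """Return one completed copy of rows (hypothetical helper).

    Each missing value (None) is replaced by a randomly chosen observed
    value from the same column; the draw is repeated until every edit
    rule holds for the completed record.
    """
    columns = list(rows[0].keys())
    donors = {c: [r[c] for r in rows if r[c] is not None] for c in columns}
    completed = []
    for row in rows:
        missing = [c for c in columns if row[c] is None]
        candidate = dict(row)
        if missing:
            for _ in range(max_tries):
                for c in missing:
                    candidate[c] = rng.choice(donors[c])
                if all(rule(candidate) for rule in rules):
                    break
            else:
                raise ValueError("no draw satisfied the edit rules")
        completed.append(candidate)
    return completed

def pool_mean(completed_sets, column):
    """Pool the column mean over m completed data sets with Rubin's rules."""
    m = len(completed_sets)
    estimates = [statistics.mean(r[column] for r in data) for data in completed_sets]
    # Within-imputation variance: average squared standard error of the mean.
    within = statistics.mean(
        statistics.variance([r[column] for r in data]) / len(data)
        for data in completed_sets
    )
    # Between-imputation variance of the m point estimates.
    between = statistics.variance(estimates)
    return statistics.mean(estimates), within + (1 + 1 / m) * between

# Illustrative records; None marks a missing observation.
rows = [
    {"employees": 10, "payroll": 50.0},
    {"employees": 3, "payroll": None},
    {"employees": None, "payroll": 20.0},
    {"employees": 25, "payroll": 130.0},
    {"employees": 8, "payroll": 40.0},
]

# Assumed edit rules: at least one employee, and payroll capped per employee.
rules = [
    lambda r: r["employees"] >= 1,
    lambda r: r["payroll"] <= 10 * r["employees"],
]

rng = random.Random(0)
completed_sets = [impute_once(rows, rules, rng) for _ in range(5)]
estimate, total_var = pool_mean(completed_sets, "employees")
```

Rejecting draws that break a rule keeps every completed record consistent with the edits, at the cost of extra draws when the rules are tight. A model-based imputation step could replace the hot-deck draw without changing the pooling.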
Keywords
Missing Data; Multiple Imputation; Editing Rules; Data Cleaning