
A multi-objective optimization approach for online streaming feature selection using fuzzy Pareto dominance | ||
Journal of Mahani Mathematical Research | ||
Volume 13, Issue 1 - Serial Number 26, Bahman 2023, Pages 467-490 Full text (760.83 K) | ||
Article type: Research Paper | ||
Digital Object Identifier (DOI): 10.22103/jmmr.2023.21044.1402 | ||
Authors | ||
Amin Hashemi1; Mohammad-Reza Pajoohan*1; Mohammad Bagher Dowlatshahi2 | ||
1Department of Computer Engineering, Faculty of Engineering, Yazd University, Yazd, Iran | ||
2Department of Computer Engineering, Faculty of Engineering, Lorestan University, Khorramabad, Iran. | ||
Abstract | ||
Feature selection is one of the most important tasks in machine learning. Traditional feature selection methods are inadequate for reducing the dimensionality of online data streams because they assume a fixed feature space: every time a feature is added, the algorithm must be rerun from the beginning, which rules out real-time processing and wastes computation and resources. In many real-world applications, such as weather forecasting, stock markets, clinical research, natural disasters, and vital-sign monitoring, the feature space changes dynamically, and feature streams are added to the data over time. Existing online streaming feature selection (OSFS) methods suffer from high computational complexity, long processing time, sensitivity to parameters, and failure to account for redundancy between features. In this paper, the OSFS process is modeled as a multi-objective optimization problem for the first time. When a feature stream arrives, it is evaluated in the multi-objective space using fuzzy Pareto dominance, with three feature selection methods serving as the objectives. Features are ranked by the degree to which they dominate other features in the multi-objective space. The proposed method is designed to select a minimal subset of features in a short time. Experiments were conducted using two classifiers and eight OSFS algorithms on real-world datasets. The results show that the proposed method selects a minimal subset of features in a reasonable time for all datasets. | ||
Keywords | ||
Online streaming feature selection; Fuzzy Pareto dominance; High-dimensional data; Multi-objective optimization | ||
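
The ranking step described in the abstract can be illustrated with a minimal sketch. The abstract does not give the paper's exact dominance formula, so everything here is an assumption: the product-based `dominance_degree` is one common style of fuzzy Pareto dominance, the function names are hypothetical, and the three columns of `example_scores` stand in for the outputs of three unspecified feature selection methods, normalized to strictly positive values where higher is better.

```python
from math import prod


def dominance_degree(a, b):
    """Degree in (0, 1] to which score vector a dominates b.

    Assumed definition (not taken from the paper): the product of the
    element-wise minima divided by the product of b. It equals 1 when a is
    at least as good as b on every objective and shrinks as a falls behind.
    All scores must be strictly positive.
    """
    return prod(min(x, y) for x, y in zip(a, b)) / prod(b)


def rank_features(scores):
    """Rank features by their mean dominance degree over all other features.

    scores: list of per-feature score vectors, one entry per objective.
    Returns (order, strength), where order lists feature indices from the
    most dominant to the least dominant.
    """
    n = len(scores)
    strength = []
    for i in range(n):
        total = sum(
            dominance_degree(scores[i], scores[j]) for j in range(n) if j != i
        )
        strength.append(total / max(n - 1, 1))
    order = sorted(range(n), key=lambda i: -strength[i])
    return order, strength


# Hypothetical scores for three newly arrived features under three objectives.
example_scores = [
    [0.9, 0.8, 0.7],
    [0.3, 0.2, 0.1],
    [0.5, 0.9, 0.4],
]
order, strength = rank_features(example_scores)
print(order, [round(s, 3) for s in strength])
```

In a streaming setting, a ranking like this would be recomputed for each arriving group of features and only the top-ranked ones kept; how the paper sets that cutoff is not stated in the abstract.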