
Multi-rules mining algorithm for combinatorially exploded decision trees with modified Aitchison-Aitken function-based Bayesian optimization (2310.02633v1)

Published 4 Oct 2023 in cs.LG and cs.AI

Abstract: Decision trees offer the benefit of easy interpretation because they classify input data based on if-then rules. However, because decision trees are constructed by algorithms that achieve clear classification with the minimum necessary rules, they extract only a minimal rule set, even when various latent rules exist in the data. Approaches that construct multiple trees from randomly selected feature subsets do exist, but the number of trees that can be constructed in practice remains small relative to the combinatorially exploding number of feature subsets. Additionally, constructing multiple trees generates numerous rules, many of which are untrustworthy and/or highly similar. Therefore, we propose the "MAABO-MT" and "GS-MRM" algorithms, which strategically construct trees with high estimation performance from among all possible trees at small computational cost and extract only reliable, dissimilar rules, respectively. Experiments on several open datasets analyze the effectiveness of the proposed method. The results confirm that MAABO-MT can discover reliable rules at a lower computational cost than other methods that rely on randomness. Furthermore, the proposed method is confirmed to provide deeper insights than the single decision trees commonly used in previous studies. Therefore, MAABO-MT and GS-MRM can efficiently extract rules from combinatorially exploded decision trees.
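The abstract does not give the algorithmic details, but the search space it describes (one decision tree per feature subset, with an Aitchison-Aitken kernel guiding which subsets to evaluate) can be illustrated with a small sketch. The code below is an illustrative toy only, not the authors' MAABO-MT/GS-MRM implementation: it exhaustively enumerates feature subsets on a small dataset, fits a shallow scikit-learn decision tree per subset, and shows a binary Aitchison-Aitken kernel that a Bayesian-optimization variant could use to weight candidate subsets. The function name `aitchison_aitken` and the bandwidth `lam=0.8` are assumptions made for this sketch.

```python
import itertools
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

def aitchison_aitken(u, v, lam=0.8):
    """Aitchison-Aitken kernel between two binary feature-inclusion vectors.

    Each dimension contributes `lam` when the vectors agree and
    `1 - lam` when they disagree (binary case, two categories).
    The bandwidth lam=0.8 is an illustrative choice, not the paper's.
    """
    u, v = np.asarray(u), np.asarray(v)
    return float(np.prod(np.where(u == v, lam, 1.0 - lam)))

X, y = load_iris(return_X_y=True)
n_features = X.shape[1]

# Enumerate all non-empty feature subsets as binary inclusion vectors.
# For d features there are 2^d - 1 subsets; this is the combinatorial
# explosion the paper addresses. Here d = 4, so exhaustive search is cheap.
subsets = [np.array(bits)
           for bits in itertools.product([0, 1], repeat=n_features)
           if any(bits)]

results = []
for mask in subsets:
    cols = np.flatnonzero(mask)
    tree = DecisionTreeClassifier(max_depth=3, random_state=0)
    score = cross_val_score(tree, X[:, cols], y, cv=5).mean()
    results.append((mask, score))

# Rank subsets by estimated performance. A Bayesian-optimization variant
# would instead use the kernel above to decide which subset to try next,
# rather than enumerating every subset as done here.
results.sort(key=lambda t: t[1], reverse=True)
best_mask, best_score = results[0]
print("best feature subset:", np.flatnonzero(best_mask),
      "cv accuracy: %.3f" % best_score)

# Kernel similarity between the two best subsets, as a proxy for how
# close two candidate subsets are in the discrete search space.
print("AA kernel to runner-up: %.3f" % aitchison_aitken(best_mask, results[1][0]))
```

The exhaustive loop makes the cost explicit: the number of trees grows as 2^d - 1 in the number of features d, which is exactly the regime where a strategic, kernel-guided search such as the one the paper proposes becomes necessary.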

