WildWood: a new Random Forest algorithm (2109.08010v2)

Published 16 Sep 2021 in cs.LG and stat.ML

Abstract: We introduce WildWood (WW), a new ensemble algorithm for supervised learning of Random Forest (RF) type. While standard RF algorithms use bootstrap out-of-bag samples only to compute out-of-bag scores, WW uses these samples to produce improved predictions given by an aggregation of the predictions of all possible subtrees of each fully grown tree in the forest. This is achieved by aggregation with exponential weights computed over out-of-bag samples; these weights are computed exactly and very efficiently thanks to an algorithm called context tree weighting. This improvement, combined with a histogram strategy to accelerate split finding, makes WW fast and competitive compared with other well-established ensemble methods, such as standard RF and extreme gradient boosting algorithms.
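The core mechanism described in the abstract, aggregating the predictions of all prunings (subtrees) of a fully grown tree with exponential weights fitted on out-of-bag samples, follows the usual context-tree-weighting recursion. The sketch below is illustrative only and is not the authors' implementation: the Node layout, the cumulative out-of-bag loss stored in node.loss, and the temperature parameter eta are assumptions made for the example, and the histogram split-finding part of WildWood is omitted.

```python
# Minimal sketch of exponential-weights aggregation over all prunings of a
# decision tree, computed with a context-tree-weighting-style recursion.
# Assumes each node already stores the cumulative loss of its own prediction
# on the out-of-bag samples that reach it (node.loss); names are hypothetical.
import math
from dataclasses import dataclass
from typing import Optional, Sequence

@dataclass
class Node:
    prediction: float                 # node's own prediction (e.g. mean of targets)
    loss: float                       # cumulative OOB loss of that prediction
    feature: int = -1                 # split feature (internal nodes only)
    threshold: float = 0.0            # split threshold (internal nodes only)
    left: Optional["Node"] = None
    right: Optional["Node"] = None

def _log_add(a: float, b: float) -> float:
    """Stable log(exp(a) + exp(b))."""
    m = max(a, b)
    return m + math.log(math.exp(a - m) + math.exp(b - m))

def log_weight(node: Node, eta: float) -> float:
    """CTW-style log-weight: a leaf contributes exp(-eta * loss); an internal
    node mixes 'stop here' and 'keep splitting' with prior mass 1/2 each."""
    if node.left is None:             # leaf
        return -eta * node.loss
    stay = math.log(0.5) - eta * node.loss
    split = math.log(0.5) + log_weight(node.left, eta) + log_weight(node.right, eta)
    return _log_add(stay, split)

def aggregate_predict(node: Node, x: Sequence[float], eta: float) -> float:
    """Prediction aggregated over all prunings, following only the path of x."""
    if node.left is None:
        return node.prediction
    stay = math.log(0.5) - eta * node.loss
    split = math.log(0.5) + log_weight(node.left, eta) + log_weight(node.right, eta)
    alpha = math.exp(stay - _log_add(stay, split))   # posterior mass on "stop at this node"
    child = node.left if x[node.feature] <= node.threshold else node.right
    return alpha * node.prediction + (1.0 - alpha) * aggregate_predict(child, x, eta)
```

In practice the per-node log-weights would be computed once and cached, so that each aggregated prediction costs a single root-to-leaf pass; this is what makes the exact aggregation over the exponentially many subtrees efficient, as the abstract claims.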
