
Efficient Learning of Minimax Risk Classifiers in High Dimensions (2306.06649v1)

Published 11 Jun 2023 in stat.ML and cs.LG

Abstract: High-dimensional data is common in multiple areas, such as health care and genomics, where the number of features can be tens of thousands. In such scenarios, the large number of features often leads to inefficient learning. Constraint generation methods have recently enabled efficient learning of L1-regularized support vector machines (SVMs). In this paper, we leverage such methods to obtain an efficient learning algorithm for the recently proposed minimax risk classifiers (MRCs). The proposed iterative algorithm also provides a sequence of worst-case error probabilities and performs feature selection. Experiments on multiple high-dimensional datasets show that the proposed algorithm is efficient in high-dimensional scenarios. In addition, the worst-case error probability provides useful information about the classifier performance, and the features selected by the algorithm are competitive with the state-of-the-art.
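
The abstract builds on constraint generation: rather than solving an optimization problem with all of its constraints at once, one solves a small restricted problem and repeatedly adds only the constraints that are violated at the current solution. The sketch below illustrates that generic idea on a toy linear program; it is not the paper's MRC formulation, and the problem sizes, tolerance, and random data are illustrative assumptions only.

```python
# A minimal sketch of constraint generation (cutting planes) for a linear
# program with many more constraints than variables. This is a generic
# illustration of the idea leveraged in the paper, not the MRC algorithm.
import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(0)

# Hypothetical LP: minimize c @ x  subject to  A @ x <= b,  -1 <= x <= 1.
n_vars, n_cons = 20, 5000              # far more constraints than variables
A = rng.normal(size=(n_cons, n_vars))
b = rng.uniform(0.5, 1.5, size=n_cons)
c = rng.normal(size=n_vars)
bounds = [(-1.0, 1.0)] * n_vars

active = list(range(10))               # start from a small working set
for it in range(100):
    # Solve the restricted LP using only the active constraints.
    res = linprog(c, A_ub=A[active], b_ub=b[active], bounds=bounds,
                  method="highs")
    x = res.x
    violation = A @ x - b              # evaluate all constraints at x
    worst = int(np.argmax(violation))
    if violation[worst] <= 1e-8:       # nothing violated: x solves the full LP
        break
    active.append(worst)               # add the most violated constraint

print(f"iterations: {it + 1}, active constraints: {len(active)}, "
      f"objective: {res.fun:.4f}")
```

In high-dimensional learning the same loop is typically applied in the dual (column generation over features), which is what lets methods of this kind perform implicit feature selection: only the features whose constraints ever become active enter the model.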

