Difference of Submodular Minimization via DC Programming (2305.11046v2)
Abstract: Minimizing the difference of two submodular (DS) functions is a problem that arises naturally in many machine learning applications. Although it is well known that a DS problem can be equivalently formulated as the minimization of the difference of two convex (DC) functions, existing algorithms do not fully exploit this connection. A classical algorithm for DC problems is the DC algorithm (DCA). We introduce variants of DCA and of its complete form (CDCA), and apply them to the DC program corresponding to DS minimization. We extend existing convergence properties of DCA and connect them to convergence properties on the DS problem. Our results on DCA match the theoretical guarantees satisfied by existing DS algorithms, while providing a more complete characterization of convergence properties. In the case of CDCA, we obtain a stronger local minimality guarantee. Our numerical results show that the proposed algorithms outperform existing baselines on two applications: speech corpus selection and feature selection.
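As a rough illustration of the DS-to-DC connection mentioned in the abstract, the sketch below runs a plain DCA-style loop on a toy DS problem F(X) = G(X) - H(X). It is not one of the variants proposed in the paper. It assumes both functions are normalized (value 0 on the empty set), linearizes the Lovász extension of H with a greedy subgradient, and solves each convex subproblem by brute force over subsets, which is only viable for tiny ground sets. With iterates restricted to sets, this closely resembles the submodular-supermodular procedure. All names (`greedy_subgradient`, `argmin_brute_force`, `dca_ds`, `G_example`, `H_example`) are hypothetical.

```python
import itertools
import math

import numpy as np


def greedy_subgradient(H, x):
    """Subgradient of the Lovász extension of H at x (greedy rule).

    Assumes H is submodular with H(empty set) = 0.
    """
    # Visit coordinates in order of decreasing x; ties keep index order.
    order = np.argsort(-np.asarray(x, dtype=float), kind="stable")
    y = np.zeros(len(x))
    prefix, prev = [], H(frozenset())
    for i in order:
        prefix.append(int(i))
        cur = H(frozenset(prefix))
        y[int(i)] = cur - prev  # marginal gain of adding element i
        prev = cur
    return y


def argmin_brute_force(G, y):
    """Minimize G(S) - y(S) over all subsets S (assumption: tiny ground set)."""
    d = len(y)
    best_set = frozenset()
    best_val = G(best_set)
    for r in range(1, d + 1):
        for combo in itertools.combinations(range(d), r):
            S = frozenset(combo)
            val = G(S) - sum(y[i] for i in S)
            if val < best_val - 1e-12:
                best_set, best_val = S, val
    return best_set


def dca_ds(G, H, d, max_iter=50, tol=1e-12):
    """DCA-style loop for min_X G(X) - H(X), starting from the empty set."""
    X = frozenset()
    trace = [G(X) - H(X)]
    for _ in range(max_iter):
        # Indicator vector of the current set.
        x = np.array([1.0 if i in X else 0.0 for i in range(d)])
        # Linearize the subtracted convex part at x.
        y = greedy_subgradient(H, x)
        # Solve the resulting subproblem: minimize G minus a modular function.
        X_new = argmin_brute_force(G, y)
        f_new = G(X_new) - H(X_new)
        if f_new >= trace[-1] - tol:  # stop when there is no strict decrease
            break
        X = X_new
        trace.append(f_new)
    return X, trace


# Toy instance (hypothetical): both functions are concave-of-modular,
# hence submodular, and both vanish on the empty set.
WEIGHTS = [1.0, 2.0, 3.0, 4.0]


def G_example(S):
    return math.sqrt(sum(WEIGHTS[i] for i in S))


def H_example(S):
    return 1.5 * min(len(S), 2)
```

Calling `dca_ds(G_example, H_example, d=4)` returns the final set together with the sequence of objective values. Because the greedy subgradient is tight at the current set and lies below H everywhere, each accepted step cannot increase F. In practice the brute-force step would be replaced by a proper submodular minimization routine.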