A DEIM-CUR factorization with iterative SVDs (2310.00636v3)
Abstract: A CUR factorization is often used as a substitute for the singular value decomposition (SVD), especially when the singular vectors are difficult to interpret concretely. Moreover, if the original data matrix has properties such as nonnegativity and sparsity, a CUR decomposition can preserve them better than the SVD. An essential aspect of this approach is the method used to select a subset of columns and rows from the original matrix. This study investigates the effectiveness of one-round sampling and iterative subselection techniques and introduces new iterative subselection strategies based on iterative SVDs. One provably appropriate technique for index selection in constructing a CUR factorization is the discrete empirical interpolation method (DEIM). Our contribution aims to improve the approximation quality of the DEIM scheme by invoking it iteratively over several rounds, so that subsequent columns and rows are selected based on the previously selected ones. To this end, we modify the data matrix $A$ after each round by removing the information already captured by the previously selected columns and rows. We also discuss how iterative procedures for computing a few singular vectors of large data matrices can be integrated with the new iterative subselection strategies. We present numerical experiments that compare one-round sampling with iterative subselection and demonstrate the improved approximation quality of the latter.
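The idea described in the abstract can be illustrated with a minimal NumPy sketch. It is not the paper's algorithm: the function names (`deim`, `cur_from_indices`, `one_round_deim_cur`, `iterative_deim_cur`), the `block` parameter, the choice $U = C^+ A R^+$, and the residual $E = A - CUR$ used to "remove captured information" are all assumptions made for illustration. A dense `np.linalg.svd` stands in for the iterative SVD solvers the abstract mentions; for large sparse matrices a truncated solver such as `scipy.sparse.linalg.svds` could be substituted.

```python
import numpy as np

def deim(V, taken=()):
    """Greedy DEIM index selection from the columns of V (n x k).

    `taken` lists indices that must not be picked again (an assumption
    of this sketch, used to keep indices distinct across rounds).
    """
    V = np.asarray(V, dtype=float)
    n, k = V.shape
    blocked = np.zeros(n, dtype=bool)
    blocked[np.asarray(taken, dtype=int)] = True
    idx = []
    for j in range(k):
        r = V[:, j].copy()
        if j > 0:
            # Interpolate column j at the chosen indices; keep the residual.
            c = np.linalg.solve(V[idx, :j], V[idx, j])
            r -= V[:, :j] @ c
        score = np.abs(r)
        score[blocked] = -1.0          # never re-select an index
        i = int(np.argmax(score))
        idx.append(i)
        blocked[i] = True
    return idx

def cur_from_indices(A, rows, cols):
    """Form C, U, R with U = pinv(C) @ A @ pinv(R)."""
    C, R = A[:, cols], A[rows, :]
    U = np.linalg.pinv(C) @ A @ np.linalg.pinv(R)
    return C, U, R

def one_round_deim_cur(A, k):
    """Select k rows and k columns from the k leading singular vectors."""
    W, _, Zt = np.linalg.svd(A, full_matrices=False)
    return deim(W[:, :k]), deim(Zt[:k].T)

def iterative_deim_cur(A, k, block):
    """Select k indices in rounds of at most `block` indices each.

    After every round the residual E = A - C U R replaces A as the
    matrix whose leading singular vectors drive the next selection
    (assumed deflation, chosen here for simplicity).
    """
    rows, cols = [], []
    E = np.array(A, dtype=float)
    while len(cols) < k:
        b = min(block, k - len(cols))
        W, _, Zt = np.linalg.svd(E, full_matrices=False)
        rows += deim(W[:, :b], taken=rows)
        cols += deim(Zt[:b].T, taken=cols)
        C, U, R = cur_from_indices(A, rows, cols)
        E = A - C @ U @ R
    return rows, cols
```

Calling `iterative_deim_cur(A, k, block=k)` performs a single round and so reduces to the one-round scheme, while smaller `block` values trade additional SVD computations for selections that react to what the earlier indices already captured. Whether this improves the error for a given matrix has to be checked numerically, for instance by comparing `np.linalg.norm(A - C @ U @ R, 2)` for both index sets.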