CUR matrix approximation through convex optimization for feature selection
The singular value decomposition (SVD) is commonly used in applications that require a low-rank matrix approximation. However, the singular vectors cannot be interpreted in terms of the original data. For applications requiring this type of interpretation, e.g., selection of important data matrix co...
| Main Authors: | Kathryn Linehan, Radu Balan |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | Frontiers Media S.A., 2025-08-01 |
| Series: | Frontiers in Applied Mathematics and Statistics |
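| Subjects: | CUR matrix approximation; convex optimization; low-rank matrix approximation; feature selection; interpretation |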
| Online Access: | https://www.frontiersin.org/articles/10.3389/fams.2025.1632218/full |
| _version_ | 1849229130404462592 |
|---|---|
| author | Kathryn Linehan; Radu Balan |
| author_facet | Kathryn Linehan; Radu Balan |
| author_sort | Kathryn Linehan |
| collection | DOAJ |
| description | The singular value decomposition (SVD) is commonly used in applications that require a low-rank matrix approximation. However, the singular vectors cannot be interpreted in terms of the original data. For applications requiring this type of interpretation, e.g., selection of important data matrix columns or rows, the approximate CUR matrix factorization can be used. Work on the CUR matrix approximation has generally focused on algorithm development, theoretical guarantees, and applications. In this study, we present a novel deterministic CUR formulation and algorithm with theoretical convergence guarantees. The algorithm utilizes convex optimization, finds important columns and rows separately, and allows the user to control the number of important columns and rows selected from the original data matrix. We present numerical results and demonstrate the effectiveness of our CUR algorithm as a feature selection method on gene expression data. These results are compared to those using the SVD and other CUR algorithms as the feature selection method. Finally, we present a novel application of CUR as a feature selection method to determine discriminant proteins when clustering protein expression data in a self-organizing map (SOM), and compare the performance of multiple CUR algorithms in this application. |
| format | Article |
| id | doaj-art-e4512c329b7e40a7b27cc0d22e5ed5bb |
| institution | Kabale University |
| issn | 2297-4687 |
| language | English |
| publishDate | 2025-08-01 |
| publisher | Frontiers Media S.A. |
| record_format | Article |
| series | Frontiers in Applied Mathematics and Statistics |
| spelling | doaj-art-e4512c329b7e40a7b27cc0d22e5ed5bb; 2025-08-22T05:45:44Z; eng; Frontiers Media S.A.; Frontiers in Applied Mathematics and Statistics; 2297-4687; 2025-08-01; 11; 10.3389/fams.2025.1632218; 1632218; CUR matrix approximation through convex optimization for feature selection; Kathryn Linehan [0]; Kathryn Linehan [1]; Radu Balan [2]; [0] Department of Mathematics, University of Maryland, College Park, MD, United States; [1] Research Computing, University of Virginia, Charlottesville, VA, United States; [2] Department of Mathematics, University of Maryland, College Park, MD, United States; The singular value decomposition (SVD) is commonly used in applications that require a low-rank matrix approximation. However, the singular vectors cannot be interpreted in terms of the original data. For applications requiring this type of interpretation, e.g., selection of important data matrix columns or rows, the approximate CUR matrix factorization can be used. Work on the CUR matrix approximation has generally focused on algorithm development, theoretical guarantees, and applications. In this study, we present a novel deterministic CUR formulation and algorithm with theoretical convergence guarantees. The algorithm utilizes convex optimization, finds important columns and rows separately, and allows the user to control the number of important columns and rows selected from the original data matrix. We present numerical results and demonstrate the effectiveness of our CUR algorithm as a feature selection method on gene expression data. These results are compared to those using the SVD and other CUR algorithms as the feature selection method. Finally, we present a novel application of CUR as a feature selection method to determine discriminant proteins when clustering protein expression data in a self-organizing map (SOM), and compare the performance of multiple CUR algorithms in this application.; https://www.frontiersin.org/articles/10.3389/fams.2025.1632218/full; CUR matrix approximation; convex optimization; low-rank matrix approximation; feature selection; interpretation |
| title | CUR matrix approximation through convex optimization for feature selection |
| topic | CUR matrix approximation; convex optimization; low-rank matrix approximation; feature selection; interpretation |
| url | https://www.frontiersin.org/articles/10.3389/fams.2025.1632218/full |
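
The description above contrasts the SVD, whose singular vectors are not expressed in terms of the original data, with a CUR factorization built from actual columns and rows. As a rough illustration of that idea only, the sketch below selects columns and rows by rank-k leverage scores and forms A ≈ C U R. This is a simple, well-known heuristic swapped in for illustration; it is not the convex-optimization algorithm of the article, and the function name `cur_approximation` and its parameters are hypothetical.

```python
import numpy as np


def cur_approximation(A, k, n_cols, n_rows):
    """Sketch of a deterministic CUR factorization, A ~ C @ U @ R.

    Assumption: columns and rows are picked by largest rank-k leverage
    score. This heuristic is NOT the article's convex formulation.
    """
    U_svd, _, Vt = np.linalg.svd(A, full_matrices=False)
    # Leverage score of a column (row): squared norm of its coordinates
    # in the top-k right (left) singular basis.
    col_scores = np.sum(Vt[:k] ** 2, axis=0)
    row_scores = np.sum(U_svd[:, :k] ** 2, axis=1)
    col_idx = np.sort(np.argsort(col_scores)[-n_cols:])
    row_idx = np.sort(np.argsort(row_scores)[-n_rows:])
    C = A[:, col_idx]  # actual columns of the data matrix
    R = A[row_idx, :]  # actual rows of the data matrix
    # For fixed C and R, U = pinv(C) @ A @ pinv(R) minimizes the
    # Frobenius-norm error ||A - C U R||.
    U = np.linalg.pinv(C) @ A @ np.linalg.pinv(R)
    return C, U, R, col_idx, row_idx


# Synthetic example: a 50 x 20 matrix of exact rank 3.
rng = np.random.default_rng(0)
A = rng.standard_normal((50, 3)) @ rng.standard_normal((3, 20))
C, U, R, col_idx, row_idx = cur_approximation(A, k=3, n_cols=3, n_rows=3)
rel_err = np.linalg.norm(A - C @ U @ R) / np.linalg.norm(A)
print("selected columns:", col_idx)
print("selected rows:", row_idx)
print("relative error:", rel_err)
```

Because `C` and `R` are slices of `A`, the returned `col_idx` and `row_idx` can be read directly as selected features and samples, which is the interpretability property the description attributes to CUR. The user-chosen `n_cols` and `n_rows` mirror the control over the number of selected columns and rows mentioned there, though the selection rule here is a different one.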