Rootlets Hierarchical Principal Component Analysis for Revealing Nested Dependencies in Hierarchical Data

Hierarchical clustering analysis (HCA) is a widely used unsupervised learning method. Limitations of HCA, however, include imposing an artificial hierarchy onto non-hierarchical data and fixed two-way mergers at every level. To address this, the current work describes a novel rootlets hierarchical principal component analysis (hPCA). This method extends typical hPCA using multivariate statistics to construct adaptive multiway mergers and Riemannian geometry to visualize nested dependencies. The rootlets hPCA algorithm and its projection onto the Poincaré disk are presented as examples of this extended framework. The algorithm constructs high-dimensional mergers using a single parameter, interpreted as a <i>p</i>-value. It decomposes a similarity matrix from GL(<i>m</i>, ℝ) using a sequence of rotations from SO(<i>k</i>), <i>k</i> << <i>m</i>. Analysis shows that the rootlets algorithm limits the number of distinct eigenvalues for any merger. Nested clusters of arbitrary size but equal correlations are constructed and merged using their leading principal components. The visualization method then maps elements of SO(<i>k</i>) onto a low-dimensional hyperbolic manifold, the Poincaré disk. Rootlets hPCA was validated using simulated datasets with known hierarchical structure, and a neuroimaging dataset with an unknown hierarchy. Experiments demonstrate that rootlets hPCA accurately reconstructs known hierarchies and, unlike HCA, does not impose a hierarchy on data.


Bibliographic Details
Main Authors: Korey P. Wylie, Jason R. Tregellas
Format: Article
Language:English
Published: MDPI AG 2024-12-01
Series:Mathematics
Subjects: eigendecomposition, multivariate statistics, hyperbolic manifold, Riemannian geometry, manifold learning
Online Access:https://www.mdpi.com/2227-7390/13/1/72
_version_ 1841549147357839360
author Korey P. Wylie
Jason R. Tregellas
author_facet Korey P. Wylie
Jason R. Tregellas
author_sort Korey P. Wylie
collection DOAJ
description Hierarchical clustering analysis (HCA) is a widely used unsupervised learning method. Limitations of HCA, however, include imposing an artificial hierarchy onto non-hierarchical data and fixed two-way mergers at every level. To address this, the current work describes a novel rootlets hierarchical principal component analysis (hPCA). This method extends typical hPCA using multivariate statistics to construct adaptive multiway mergers and Riemannian geometry to visualize nested dependencies. The rootlets hPCA algorithm and its projection onto the Poincaré disk are presented as examples of this extended framework. The algorithm constructs high-dimensional mergers using a single parameter, interpreted as a <i>p</i>-value. It decomposes a similarity matrix from GL(<i>m</i>, ℝ) using a sequence of rotations from SO(<i>k</i>), <i>k</i> << <i>m</i>. Analysis shows that the rootlets algorithm limits the number of distinct eigenvalues for any merger. Nested clusters of arbitrary size but equal correlations are constructed and merged using their leading principal components. The visualization method then maps elements of SO(<i>k</i>) onto a low-dimensional hyperbolic manifold, the Poincaré disk. Rootlets hPCA was validated using simulated datasets with known hierarchical structure, and a neuroimaging dataset with an unknown hierarchy. Experiments demonstrate that rootlets hPCA accurately reconstructs known hierarchies and, unlike HCA, does not impose a hierarchy on data.
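The abstract's claim that mergers have a limited number of distinct eigenvalues can be illustrated for the equal-correlation clusters it describes. The sketch below (not the authors' implementation; the cluster size <i>n</i> = 5 and correlation ρ = 0.6 are arbitrary illustrative values) builds an equicorrelation matrix, which has exactly two distinct eigenvalues, and shows that its leading principal component weights all variables equally, so the cluster can be summarized by a single component:

```python
import numpy as np

# Equicorrelation matrix: unit diagonal, constant off-diagonal rho.
n, rho = 5, 0.6
S = np.full((n, n), rho) + (1 - rho) * np.eye(n)

# Exactly two distinct eigenvalues: 1 + (n-1)*rho (once, for the
# uniform direction) and 1 - rho (with multiplicity n-1).
eigvals, eigvecs = np.linalg.eigh(S)
print(np.round(eigvals, 6))  # ascending, ≈ [0.4 0.4 0.4 0.4 3.4]

# The leading principal component (eigenvector of the largest
# eigenvalue) loads every variable equally, up to sign.
leading = eigvecs[:, -1]
print(np.round(np.abs(leading), 4))  # all ≈ 0.4472, i.e. 1/sqrt(5)
```

Merging such a cluster into its leading component is lossy only in the 1 − ρ "noise" directions, which is consistent with the abstract's use of leading principal components to merge nested clusters.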
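The abstract also visualizes nested dependencies on the Poincaré disk. The paper's specific projection from SO(<i>k</i>) is not reproduced here; the following sketch only shows the standard Poincaré disk distance, whose rapid growth near the unit-circle boundary is what makes hyperbolic manifolds well suited to tree-like, hierarchical structure:

```python
import numpy as np

def poincare_distance(u, v):
    """Hyperbolic distance between two points strictly inside the
    unit disk, under the standard Poincare disk metric."""
    u, v = np.asarray(u, float), np.asarray(v, float)
    num = 2 * np.sum((u - v) ** 2)
    den = (1 - np.sum(u ** 2)) * (1 - np.sum(v ** 2))
    return np.arccosh(1 + num / den)

# From the origin, d(0, r) = log((1 + r) / (1 - r)).
print(poincare_distance([0.0, 0.0], [0.5, 0.0]))  # = log(3) ≈ 1.0986

# Near the boundary, small Euclidean steps are large hyperbolic
# steps, leaving room to embed exponentially branching hierarchies.
print(poincare_distance([0.9, 0.0], [0.95, 0.0]))
```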
format Article
id doaj-art-6e6559bbc831491e9e5bfc30c6cf519d
institution Kabale University
issn 2227-7390
language English
publishDate 2024-12-01
publisher MDPI AG
record_format Article
series Mathematics
spelling doaj-art-6e6559bbc831491e9e5bfc30c6cf519d
2025-01-10T13:18:10Z
eng
MDPI AG
Mathematics
2227-7390
2024-12-01
Vol. 13, Iss. 1, Article 72
10.3390/math13010072
Rootlets Hierarchical Principal Component Analysis for Revealing Nested Dependencies in Hierarchical Data
Korey P. Wylie (Department of Psychiatry, University of Colorado School of Medicine, Anschutz Medical Campus, Anschutz Health Sciences Building, 1890 N Revere Ct, Aurora, CO 80045, USA)
Jason R. Tregellas (Department of Psychiatry, University of Colorado School of Medicine, Anschutz Medical Campus, Anschutz Health Sciences Building, 1890 N Revere Ct, Aurora, CO 80045, USA)
https://www.mdpi.com/2227-7390/13/1/72
eigendecomposition
multivariate statistics
hyperbolic manifold
Riemannian geometry
manifold learning
spellingShingle Korey P. Wylie
Jason R. Tregellas
Rootlets Hierarchical Principal Component Analysis for Revealing Nested Dependencies in Hierarchical Data
Mathematics
eigendecomposition
multivariate statistics
hyperbolic manifold
Riemannian geometry
manifold learning
title Rootlets Hierarchical Principal Component Analysis for Revealing Nested Dependencies in Hierarchical Data
title_full Rootlets Hierarchical Principal Component Analysis for Revealing Nested Dependencies in Hierarchical Data
title_fullStr Rootlets Hierarchical Principal Component Analysis for Revealing Nested Dependencies in Hierarchical Data
title_full_unstemmed Rootlets Hierarchical Principal Component Analysis for Revealing Nested Dependencies in Hierarchical Data
title_short Rootlets Hierarchical Principal Component Analysis for Revealing Nested Dependencies in Hierarchical Data
title_sort rootlets hierarchical principal component analysis for revealing nested dependencies in hierarchical data
topic eigendecomposition
multivariate statistics
hyperbolic manifold
Riemannian geometry
manifold learning
url https://www.mdpi.com/2227-7390/13/1/72
work_keys_str_mv AT koreypwylie rootletshierarchicalprincipalcomponentanalysisforrevealingnesteddependenciesinhierarchicaldata
AT jasonrtregellas rootletshierarchicalprincipalcomponentanalysisforrevealingnesteddependenciesinhierarchicaldata