Leveraging Subjective Parameters and Biomarkers in Machine Learning Models: The Feasibility of <i>lnc-IL7R</i> for Managing Emphysema Progression

Bibliographic Details
Main Authors: Tzu-Tao Chen, Tzu-Yu Cheng, I-Jung Liu, Shu-Chuan Ho, Kang-Yun Lee, Huei-Tyng Huang, Po-Hao Feng, Kuan-Yuan Chen, Ching-Shan Luo, Chien-Hua Tseng, Yueh-His Chen, Arnab Majumdar, Cheng-Yu Tsai, Sheng-Ming Wu
Format: Article
Language: English
Published: MDPI AG 2025-05-01
Series: Diagnostics
Online Access: https://www.mdpi.com/2075-4418/15/9/1165
Description
Summary: <b>Background/Objectives:</b> Chronic obstructive pulmonary disease (COPD) remains a leading cause of death worldwide, and emphysema progression provides valuable insight into disease development. Clinical assessment approaches, including pulmonary function tests and high-resolution computed tomography, are limited by accessibility constraints and radiation exposure. This study therefore proposed an alternative approach that integrates the novel biomarker long non-coding interleukin-7 receptor α-subunit gene (<i>lnc-IL7R</i>), along with other easily accessible clinical and biochemical metrics, into machine learning (ML) models. <b>Methods:</b> This cohort study collected baseline characteristics, COPD Assessment Test (CAT) scores, and biochemical details from the enrolled participants. Associations with emphysema severity, defined by a low attenuation area percentage (LAA%) threshold of 15%, were evaluated using simple and multivariate-adjusted models. The dataset was then split into a combined training and validation subset (80%) and a test subset (20%). Five ML models were employed, and the best-performing model was further analyzed for feature importance. <b>Results:</b> The majority of participants were elderly males. Compared to the LAA% <15% group, the LAA% ≥15% group demonstrated a significantly higher body mass index (BMI), poorer pulmonary function, and lower expression levels of <i>lnc-IL7R</i> (all <i>p</i> < 0.01). Fold changes in <i>lnc-IL7R</i> were strongly and negatively associated with LAA% (<i>p</i> < 0.01). The random forest (RF) model achieved the highest accuracy and area under the receiver operating characteristic curve (AUROC) across datasets. A feature importance analysis identified <i>lnc-IL7R</i> fold changes as the strongest predictor for emphysema classification (LAA% ≥15%), followed by CAT scores and BMI.
<b>Conclusions:</b> Machine learning models incorporating accessible clinical and biochemical markers, particularly the novel biomarker <i>lnc-IL7R</i>, achieved classification accuracy and AUROC exceeding 75% in emphysema assessment. These findings offer promising opportunities for improving emphysema classification and COPD management.
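The labelling, splitting, and scoring steps named in the summary can be illustrated with a short sketch. This is not the authors' code: the function names are hypothetical, the shuffle seed is an assumption, and the summary does not say whether the 80/20 split was stratified. Only the 15% LAA threshold, the 80/20 proportions, and the two metrics (accuracy and AUROC) come from the record. The models themselves (including the random forest) are not reproduced here.

```python
import random

LAA_THRESHOLD = 15.0  # percent; the emphysema cut-off stated in the summary


def label_emphysema(laa_percent):
    """Return 1 for LAA% >= 15 (emphysema group), else 0."""
    return 1 if laa_percent >= LAA_THRESHOLD else 0


def split_80_20(rows, seed=0):
    """Shuffle a copy of rows and split it 80/20.

    The seed and the plain (unstratified) shuffle are assumptions.
    """
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    cut = int(round(0.8 * len(shuffled)))
    return shuffled[:cut], shuffled[cut:]


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def auroc(y_true, scores):
    """AUROC as the probability that a positive case outscores a negative one.

    Tied scores count as half a win (the Mann-Whitney formulation).
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one case from each class")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

In use, each participant's LAA% would be passed through `label_emphysema` to obtain the class label, the labelled rows through `split_80_20`, and a fitted model's predictions on the 20% test subset through `accuracy` and `auroc`.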
ISSN: 2075-4418