Adaptive machine learning framework: Predicting UHPC performance from data to modelling
| Main Authors: | , , , , , |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | Elsevier, 2025-09-01 |
| Series: | Results in Engineering |
| Subjects: | |
| Online Access: | http://www.sciencedirect.com/science/article/pii/S2590123025027914 |
| Summary: | Ultra-High Performance Concrete (UHPC) is vital for next-generation infrastructure, and predicting its performance requires modeling complex interactions beyond the reach of empirical methods. This study proposes an interpretable machine learning (ML) framework to predict the compressive strength (CS) of UHPC and to analyze the influence of the input variables. The framework comprises six modules: data preprocessing, feature selection, outlier detection, model training, hyperparameter optimization, and model interpretation. First, a CS dataset of 924 samples with 20 input features was constructed. Outliers were removed using the Isolation Forest algorithm, which is based on binary search trees. Four feature subsets (FSs) were generated via F-score and mutual information score analyses, then used to train Ridge Regression, Support Vector Regression (SVR), Random Forest (RF), Gradient Boosting Decision Tree (GBDT), XGBoost, and LightGBM models. Bayesian optimization tuned each model's hyperparameters to identify the optimal model-FS combination. Finally, SHapley Additive exPlanations (SHAP) was used to interpret the input feature contributions for the best model. Results showed that outlier detection reduced extreme values and improved the data distribution. LightGBM demonstrated the most stable performance among all models. As the number of features decreased, model performance first increased and then decreased, peaking at FS_14 (test set: R² = 0.9677, MAE = 4.4621 MPa, RMSE = 6.9226 MPa), indicating that the mutual information scoring method can improve model performance by reducing redundant information. SHAP analysis identified the six most influential features on CS in FS_14, in order: age, silica fume, steel fiber, steel fiber diameter, cement, and polycarboxylate superplasticizer. As CS increases, the relative contributions of the constituent materials increase, while the contribution of age decreases. Overall, the proposed framework enhances the accuracy and generalization of UHPC strength prediction, supporting its engineering application. |
|---|---|
| ISSN: | 2590-1230 |
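
The summary reports test-set performance as R², MAE, and RMSE. As a minimal sketch of how these three metrics are conventionally computed, the following uses only the Python standard library. The function name `regression_metrics` and the sample strength values are hypothetical and chosen for illustration; they are not taken from the article or its dataset.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return (R^2, MAE, RMSE) for paired observed and predicted values."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")

    n = len(y_true)
    mean_true = sum(y_true) / n
    errors = [t - p for t, p in zip(y_true, y_pred)]

    # Residual and total sums of squares, as used in the usual R^2 definition.
    ss_res = sum(e * e for e in errors)
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all observed values are equal")

    r2 = 1.0 - ss_res / ss_tot
    mae = sum(abs(e) for e in errors) / n   # mean absolute error
    rmse = math.sqrt(ss_res / n)            # root mean squared error
    return r2, mae, rmse


# Hypothetical compressive strengths in MPa (illustrative only, not from the study).
observed = [120.0, 135.0, 150.0, 165.0]
predicted = [118.0, 139.0, 147.0, 166.0]

r2, mae, rmse = regression_metrics(observed, predicted)
print(f"R2 = {r2:.4f}, MAE = {mae:.4f} MPa, RMSE = {rmse:.4f} MPa")
```

MAE and RMSE carry the units of the target (MPa here), while R² is dimensionless. RMSE penalizes large errors more heavily than MAE, which is consistent with the reported RMSE exceeding the reported MAE.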