Applicability of Machine Learning and Mathematical Equations to the Prediction of Total Organic Carbon in Cambrian Shale, Sichuan Basin, China

Bibliographic Details
Main Authors: Majia Zheng, Meng Zhao, Ya Wu, Kangjun Chen, Jiwei Zheng, Xianglu Tang, Dadong Liu
Format: Article
Language: English
Published: MDPI AG 2025-04-01
Series: Applied Sciences
Online Access: https://www.mdpi.com/2076-3417/15/9/4957
Description
Summary: Accurate Total Organic Carbon (TOC) prediction in the deeply buried Lower Cambrian Qiongzhusi Formation shale is constrained by extreme heterogeneity (TOC variability: 0.5–12 wt.%, mineral composition coefficient of variation > 40%) and ambiguous geophysical responses. This study introduces three key innovations to address these challenges: (1) a Dynamic Weighting–Calibrated Random Forest Regression (DW-RFR) model that integrates high-resolution gamma-ray-guided dynamic time warping (±0.06 m depth alignment precision, derived from 237 core-log calibration points using cross-validation), Principal Component Analysis–Shapley Additive Explanations (PCA-SHAP) hybrid feature engineering (89.3% cumulative variance, VIF < 4), and Bayesian-optimized ensemble learning; (2) systematic benchmarking against the conventional ΔlogR method (R² = 0.700, RMSE = 0.264) and multi-attribute joint inversion (R² = 0.734, RMSE = 0.213), against which DW-RFR demonstrates superior accuracy (R² = 0.917, RMSE = 0.171); (3) identification of gamma ray (r = 0.82) and bulk density (r = −0.76) as the principal TOC predictors, in contrast to resistivity, whose signal attenuates with thermal maturity (r = 0.32 at Ro > 3.0%). The methodology establishes a transferable framework for organic-rich shale evaluation, directly applicable to the Longmaxi Formation and global Precambrian–Cambrian transition sequences. Future directions emphasize real-time drilling data integration and quantum computing-enhanced modeling for ultra-deep shale systems, advancing predictive capabilities in tectonically complex basins.
ISSN: 2076-3417
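
The summary benchmarks the model against the conventional ΔlogR method and reports R² and RMSE for each approach. As a rough illustration of what those quantities involve, the sketch below uses the widely cited Passey et al. (1990) form of ΔlogR and the standard definitions of RMSE and R². The record does not state which ΔlogR variant, baselines, or maturity value the authors used, so the function names, the baseline readings, the level of organic metamorphism (LOM), and the sample numbers are all hypothetical placeholders, not values from the study.

```python
import math


def delta_log_r(rt, dt, rt_base, dt_base):
    """Passey-style separation between resistivity and sonic curves.

    rt, rt_base: deep resistivity and its baseline (ohm-m).
    dt, dt_base: sonic transit time and its baseline (microseconds/ft).
    """
    return math.log10(rt / rt_base) + 0.02 * (dt - dt_base)


def toc_from_delta_log_r(dlr, lom):
    """Convert a ΔlogR value to TOC (wt.%) for a given LOM."""
    return dlr * 10 ** (2.297 - 0.1688 * lom)


def rmse(y_true, y_pred):
    """Root-mean-square error between measured and predicted values."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


if __name__ == "__main__":
    # Hypothetical log readings and core TOC values, for illustration only.
    resistivity = [20.0, 45.0, 80.0, 150.0]   # ohm-m
    sonic = [70.0, 75.0, 82.0, 88.0]          # microseconds/ft
    core_toc = [0.9, 1.8, 3.1, 4.6]           # wt.%
    RT_BASE, DT_BASE, LOM = 10.0, 65.0, 12.0  # assumed baselines and maturity

    predicted = [
        toc_from_delta_log_r(delta_log_r(rt, dt, RT_BASE, DT_BASE), LOM)
        for rt, dt in zip(resistivity, sonic)
    ]
    print("predicted TOC:", [round(p, 2) for p in predicted])
    print("RMSE:", round(rmse(core_toc, predicted), 3))
    print("R^2:", round(r_squared(core_toc, predicted), 3))
```

The resistivity term is one reason a method of this kind could struggle in overmature shale: the summary reports that the resistivity correlation with TOC falls to r = 0.32 at Ro > 3.0%, and ΔlogR depends directly on that log.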