Estimating area under the curve from graph-derived summary data: a systematic comparison of standard and Monte Carlo approaches

Bibliographic Details
Main Authors: Sean Titensor, Joshua Ebbert, Karen Della Corte, Dennis Della Corte
Format: Article
Language: English
Published: BMC 2025-08-01
Series: BMC Medical Research Methodology
Online Access: https://doi.org/10.1186/s12874-025-02645-8
Description
Summary:
Background: Response curves are widely used in biomedical literature to summarize time-dependent outcomes, yet raw data are not always available in published reports. Meta-analysts must frequently extract means and standard errors from figures and estimate outcome measures like the area under the curve (AUC) without access to participant-level data. No standardized method exists for calculating AUC or propagating error under these constraints.
Methods: We evaluate two methods for estimating AUC from figure-derived data: (1) a trapezoidal integration approach with extrema variance propagation, and (2) a Monte Carlo method that samples plausible response curves and integrates over their posterior distribution. We generated 3,920 synthetic datasets from seven functional response types commonly found in glycemic response and pharmacokinetic research, varying the number of timepoints (4–10) and participants (5–40). All response curves were normalized to a true AUC of 1.0.
Results: The standard method consistently underestimated the true AUC, especially in curves with skewed or long-tailed structures. The Monte Carlo method produced near-unbiased estimates with tighter alignment to the known AUC across all settings. Increasing the number of datapoints and participants improved performance for both methods, but the Monte Carlo approach retained robustness even under sparse conditions.
Conclusion: This is the first large-scale benchmarking of AUC estimation accuracy from graphically extracted data. The Monte Carlo method outperforms standard approaches in both accuracy and uncertainty quantification. We recommend its adoption in meta-analytic contexts where only figure-derived data are available and advocate for improved data sharing practices in primary publications.
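Illustrative sketch (not part of the catalog record or the article): the two estimation strategies named in the Methods can be approximated as follows. This is a minimal sketch under stated assumptions, not the authors' implementation. It assumes that "extrema variance propagation" means integrating the mean-plus-SE and mean-minus-SE curves to bound the AUC, and that the Monte Carlo step draws each timepoint independently from a normal distribution centered on the extracted mean with the extracted standard error as its scale. The function names, the number of draws, and the example values are all hypothetical.

```python
import numpy as np


def trapezoid_auc(t, y):
    """Trapezoidal AUC along the last axis of y (works for one curve or many)."""
    t = np.asarray(t, dtype=float)
    y = np.asarray(y, dtype=float)
    return np.sum(np.diff(t) * (y[..., 1:] + y[..., :-1]) / 2.0, axis=-1)


def auc_extrema(t, mean, se):
    """Point estimate plus bounds from the mean - SE and mean + SE curves.

    Assumption: this is one plausible reading of 'extrema' propagation.
    """
    mean = np.asarray(mean, dtype=float)
    se = np.asarray(se, dtype=float)
    point = trapezoid_auc(t, mean)
    lower = trapezoid_auc(t, mean - se)
    upper = trapezoid_auc(t, mean + se)
    return point, lower, upper


def auc_monte_carlo(t, mean, se, n_draws=10_000, seed=0):
    """Sample plausible curves, integrate each, and summarize the AUC draws.

    Assumption: timepoints are sampled independently from Normal(mean, se).
    """
    rng = np.random.default_rng(seed)
    mean = np.asarray(mean, dtype=float)
    se = np.asarray(se, dtype=float)
    draws = rng.normal(loc=mean, scale=se, size=(n_draws, mean.size))
    aucs = trapezoid_auc(t, draws)
    return aucs.mean(), aucs.std(ddof=1)


# Hypothetical figure-derived values: time in minutes, mean response, standard error.
t = [0, 30, 60, 90, 120]
mean = [0.0, 2.0, 1.5, 0.8, 0.2]
se = [0.1, 0.3, 0.3, 0.2, 0.1]

point, lower, upper = auc_extrema(t, mean, se)
mc_mean, mc_sd = auc_monte_carlo(t, mean, se)
print(point, lower, upper, mc_mean, mc_sd)
```

Because the trapezoidal rule is linear in the response values, the Monte Carlo mean in this simplified sketch agrees with the trapezoidal point estimate in expectation; the two approaches differ here only in how they express uncertainty. The bias reduction reported in the Summary therefore depends on details of the published method that this sketch does not reproduce.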
ISSN: 1471-2288