Comparative Evaluation of Ensemble Machine Learning Models for Methane Production from Anaerobic Digestion

Bibliographic Details
Main Authors: Dorijan Radočaj, Mladen Jurišić
Format: Article
Language: English
Published: MDPI AG 2025-03-01
Series: Fermentation
Online Access: https://www.mdpi.com/2311-5637/11/3/130
Description
Summary: This study provides a comparative evaluation of several ensemble model constructions for predicting specific methane yield (SMY) from anaerobic digestion. To the authors’ knowledge, the prediction accuracy of such ensembles and their use in anaerobic digestion modeling, relative to individual machine learning methods, have not been fully characterized. Three input datasets were used, compiled from anaerobic digestion samples of agricultural and forestry lignocellulosic residues reported in previous studies. Six individual machine learning methods and five ensemble constructions were evaluated per dataset, and their prediction accuracy was assessed using a robust 10-fold cross-validation with 100 repetitions. Ensemble models outperformed individual methods in one out of three datasets in terms of prediction accuracy. They also produced notably lower coefficients of variation in root-mean-square error (RMSE) than the most accurate individual methods (0.031 to 0.393 for dataset A, 0.026 to 0.272 for dataset B, and 0.021 to 0.217 for dataset AB), making them much less prone to randomness in the training and test data split. The optimal ensemble constructions generally benefited from a higher number of included individual methods, as well as from the diversity of their prediction principles. Because prediction accuracy reported from final model fitting and the single split-sample approach is highly prone to randomness, the adoption of cross-validation in multiple repetitions is proposed as a standard for future studies.
ISSN: 2311-5637
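
The evaluation protocol named in the summary (10-fold cross-validation in 100 repetitions, with the coefficient of variation of RMSE as a stability measure) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and not the authors' implementation: the synthetic data, the two toy learners, the simple averaging ensemble, and all function names are hypothetical, and the record does not say which methods, ensemble constructions, or software the study used.

```python
import random
import statistics


def rmse(y_true, y_pred):
    """Root-mean-square error between two equal-length sequences."""
    return (sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)) ** 0.5


def fit_mean(xs, ys):
    """Toy learner (assumption): always predicts the training mean."""
    mean_y = statistics.fmean(ys)
    return lambda x: mean_y


def fit_linear(xs, ys):
    """Toy learner (assumption): least squares on a single feature."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx if sxx else 0.0
    return lambda x: my + slope * (x - mx)


def fit_ensemble(xs, ys):
    """Averaging ensemble of the two toy learners (illustrative only)."""
    models = [fit_mean(xs, ys), fit_linear(xs, ys)]
    return lambda x: statistics.fmean(m(x) for m in models)


def repeated_kfold_rmse(fit, xs, ys, k=10, repetitions=100, seed=0):
    """Return one pooled out-of-fold RMSE per repetition."""
    rng = random.Random(seed)
    n = len(xs)
    scores = []
    for _ in range(repetitions):
        idx = list(range(n))
        rng.shuffle(idx)                      # new random partition each repetition
        folds = [idx[i::k] for i in range(k)]
        preds = [0.0] * n
        for fold in folds:
            held_out = set(fold)
            train = [i for i in idx if i not in held_out]
            model = fit([xs[i] for i in train], [ys[i] for i in train])
            for i in fold:                    # predict only on held-out samples
                preds[i] = model(xs[i])
        scores.append(rmse(ys, preds))
    return scores


def coefficient_of_variation(values):
    """Sample standard deviation divided by the mean."""
    return statistics.stdev(values) / statistics.fmean(values)


# Synthetic stand-in data (assumption): one feature, noisy linear response.
data_rng = random.Random(42)
xs = [data_rng.uniform(0, 10) for _ in range(60)]
ys = [200 + 15 * x + data_rng.gauss(0, 20) for x in xs]

for name, fit in [("mean", fit_mean), ("linear", fit_linear), ("ensemble", fit_ensemble)]:
    scores = repeated_kfold_rmse(fit, xs, ys)
    print(name, round(statistics.fmean(scores), 2), round(coefficient_of_variation(scores), 4))
```

Reporting the spread of RMSE across repetitions, rather than a single split, is what makes the sensitivity to the training and test partition visible. With real learners, the same loop would apply unchanged as long as each `fit` function returns a callable predictor.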