Ensemble learning-based crop yield estimation: a scalable approach for supporting agricultural statistics

Detailed and accurate statistics on crop productivity are key to informing decision-making related to sustainable food production and supply that ensure global food security. However, annual and high-resolution crop yield data provided by official agricultural statistics are generally lacking. Earth obser...

Bibliographic Details
Main Authors: Patric Brandt, Florian Beyer, Peter Borrmann, Markus Möller, Heike Gerighausen
Format: Article
Language: English
Published: Taylor & Francis Group 2024-12-01
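Series: GIScience & Remote Sensing
Subjects: Agricultural statistics; Copernicus; crop yield; earth observation; machine learning; meta-learning
Online Access: https://www.tandfonline.com/doi/10.1080/15481603.2024.2367808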
author Patric Brandt
Florian Beyer
Peter Borrmann
Markus Möller
Heike Gerighausen
collection DOAJ
description Detailed and accurate statistics on crop productivity are key to informing decision-making related to sustainable food production and supply that ensure global food security. However, annual, high-resolution crop yield data are generally lacking in official agricultural statistics. Earth observation (EO) imagery, geodata on meteorological and soil conditions, and advances in machine learning (ML) offer major opportunities for model-based crop yield estimation that covers large spatial scales with unprecedented granularity. This study proposes a novel yield estimation approach that scales bottom-up from parcel to administrative levels by leveraging ML ensembles comprising six regression estimators (base estimators) and multi-source geodata, including EO imagery. To ensure the approach’s robustness, two ensemble learning techniques are investigated: meta-learning through model stacking, and majority voting. The ML ensembles were evaluated multi-annually and crop-specifically for three major winter crops, namely winter wheat (WW), winter barley (WB), and winter rapeseed (WR), in two German federal states, covering 140,000 to 155,000 parcels per year. They were evaluated at the parcel and district levels against official yield reports from 2019 to 2022, using metrics such as the coefficient of determination (R²) and the normalized root mean square error (nRMSE). Overall, majority voting was the most robust ensemble learning technique, yielding R² and nRMSE values of 0.74 and 13.4% for WW, 0.68 and 16.9% for WB, and 0.66 and 14.1% for WR, respectively, in cross-validation at the parcel level. At the district level, majority voting reached R² and nRMSE ranges of 0.79–0.89 and 7.2–8.1% for WW, 0.80–0.84 and 6.0–9.9% for WB, and 0.60–0.78 and 6.1–10.4% for WR, respectively. Capitalizing on ensemble learning-based majority voting, examples of unprecedented high-resolution crop yield maps at [Formula: see text] spatial resolution are presented. Integrating a scalable yield estimation approach, as proposed in this study, into the crop yield reporting frameworks of public authorities mandated to provide official agricultural statistics would increase the spatial resolution of annually reported yields, eventually covering the entire cropland available. Such data products, delivered through map services, may improve decision-making support for a variety of stakeholders across spatial scales, from parcel to higher administrative levels.
format Article
id doaj-art-77def4b21fca44c1a6c6d1bdc799879c
institution Kabale University
issn 1548-1603
1943-7226
language English
publishDate 2024-12-01
publisher Taylor & Francis Group
record_format Article
series GIScience & Remote Sensing
spelling doaj-art-77def4b21fca44c1a6c6d1bdc799879c | 2024-12-06T13:51:51Z | eng | Taylor & Francis Group | GIScience & Remote Sensing | 1548-1603 | 1943-7226 | 2024-12-01 | 61 | 1 | 10.1080/15481603.2024.2367808 | Ensemble learning-based crop yield estimation: a scalable approach for supporting agricultural statistics | Patric Brandt (0), Florian Beyer (1), Peter Borrmann (2), Markus Möller (3), Heike Gerighausen (4) | Julius Kühn Institute (JKI) – Federal Research Centre for Cultivated Plants, Institute for Crop and Soil Science, Braunschweig, Germany (listed for each of the five authors) | https://www.tandfonline.com/doi/10.1080/15481603.2024.2367808 | Agricultural statistics; Copernicus; crop yield; earth observation; machine learning; meta-learning
topic Agricultural statistics
Copernicus
crop yield
earth observation
machine learning
meta-learning
url https://www.tandfonline.com/doi/10.1080/15481603.2024.2367808
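
Illustrative note on the abstract: the description in this record names two evaluation metrics (coefficient of determination and normalized root mean square error), an ensemble that combines several base estimators, and scaling from parcel to district level. The sketch below shows one way these pieces could fit together. It is not the authors' implementation. Three points are assumptions: the voting ensemble is taken to average the base estimators' predictions (a common choice for regression), the RMSE is normalized by the mean observed yield, and district yields are formed as area-weighted means of parcel predictions. All function names and sample values are hypothetical.

```python
from math import sqrt
from statistics import mean


def r2(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    obs_mean = mean(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - obs_mean) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def nrmse_percent(observed, predicted):
    """RMSE as a percentage of the mean observed value.

    Assumption: normalization by the observed mean; the record does not
    state which normalization the study uses.
    """
    squared_errors = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return 100.0 * sqrt(mean(squared_errors)) / mean(observed)


def average_ensemble(base_predictions):
    """Combine base estimators by averaging their predictions per parcel.

    base_predictions holds one list of parcel-level predictions per base
    estimator. Assumption: voting for regression means a plain average.
    """
    return [mean(per_parcel) for per_parcel in zip(*base_predictions)]


def district_yield(parcel_yields, parcel_areas):
    """Area-weighted mean yield of the parcels in one district (assumption)."""
    total_area = sum(parcel_areas)
    weighted = sum(y * a for y, a in zip(parcel_yields, parcel_areas))
    return weighted / total_area


if __name__ == "__main__":
    # Hypothetical yields in t/ha for four parcels and three base estimators.
    observed = [8.0, 7.0, 9.0, 6.0]
    base_predictions = [
        [7.5, 7.2, 8.4, 6.3],
        [8.3, 6.6, 9.1, 5.8],
        [7.9, 7.1, 8.8, 6.4],
    ]
    areas_ha = [12.0, 5.5, 20.0, 8.0]

    ensemble = average_ensemble(base_predictions)
    print("R2:", round(r2(observed, ensemble), 3))
    print("nRMSE (%):", round(nrmse_percent(observed, ensemble), 2))
    print("District yield (t/ha):", round(district_yield(ensemble, areas_ha), 2))
```

Model stacking, the second technique named in the abstract, would replace the plain average with a meta-learner trained on the base estimators' predictions; that step is omitted here.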