Soil Science-Informed Machine Learning

Machine learning (ML) applications in soil science have significantly increased over the past two decades, reflecting a growing trend towards data-driven research addressing soil security. This extensive application has mainly focused on enhancing predictions of soil properties, particularly soil or...

Bibliographic Details
Main Authors: Budiman Minasny, Toshiyuki Bandai, Teamrat A. Ghezzehei, Yin-Chung Huang, Yuxin Ma, Alex B. McBratney, Wartini Ng, Sarem Norouzi, Jose Padarian, Rudiyanto, Amin Sharififar, Quentin Styc, Marliana Widyastuti
Format: Article
Language: English
Published: Elsevier 2024-12-01
Series: Geoderma
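Subjects: Artificial Intelligence; Process-based models; Physics Informed Neural Networks; Informed Machine Learning; Mechanistic models; Pedology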
Online Access: http://www.sciencedirect.com/science/article/pii/S0016706124003239
author Budiman Minasny
Toshiyuki Bandai
Teamrat A. Ghezzehei
Yin-Chung Huang
Yuxin Ma
Alex B. McBratney
Wartini Ng
Sarem Norouzi
Jose Padarian
Rudiyanto
Amin Sharififar
Quentin Styc
Marliana Widyastuti
author_sort Budiman Minasny
collection DOAJ
description Machine learning (ML) applications in soil science have significantly increased over the past two decades, reflecting a growing trend towards data-driven research addressing soil security. This extensive application has mainly focused on enhancing predictions of soil properties, particularly soil organic carbon, and improving the accuracy of digital soil mapping (DSM). Despite these advancements, the application of ML in soil science faces challenges related to data scarcity and the interpretability of ML models. There is a need for a shift towards Soil Science-Informed ML (SoilML) models that use the power of ML but also incorporate soil science knowledge in the training process to make predictions more reliable and generalisable. This paper proposes methodologies for embedding ML models with soil science knowledge to overcome current limitations. Incorporating soil science knowledge into ML models involves using observational priors to enhance training datasets, designing model structures which reflect soil science principles, and supervising model training with soil science-informed loss functions. The informed loss functions include observational constraints, coherency rules such as regularisation to avoid overfitting, and prior or soil-knowledge constraints that incorporate existing information about the parameters or outputs. By way of illustration, we present examples from four fields: digital soil mapping, soil spectroscopy, pedotransfer functions, and dynamic soil property models. We discuss the potential to integrate process-based models for improved prediction, the use of physics-informed neural networks, limitations, and the issue of overparametrisation. These approaches improve the relevance of ML predictions in soil science and enhance the models’ ability to generalise across different scenarios while maintaining soil science principles, transparency and reliability.
format Article
id doaj-art-b96fd0fe95a043a689a5f2030a96a07f
institution Kabale University
issn 1872-6259
language English
publishDate 2024-12-01
publisher Elsevier
record_format Article
series Geoderma
spelling doaj-art-b96fd0fe95a043a689a5f2030a96a07f
2024-12-06T05:12:42Z
eng
Elsevier
Geoderma
1872-6259
2024-12-01
452
117094
Soil Science-Informed Machine Learning
Budiman Minasny (School of Life and Environmental Sciences & Sydney Institute of Agriculture, The University of Sydney, NSW 2006, Australia)
Toshiyuki Bandai (Earth and Environmental Sciences Area, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA)
Teamrat A. Ghezzehei (Life & Environmental Sciences Department, University of California, Merced, CA 95343, USA)
Yin-Chung Huang (School of Life and Environmental Sciences & Sydney Institute of Agriculture, The University of Sydney, NSW 2006, Australia)
Yuxin Ma (New South Wales Department of Climate Change, Energy, the Environment and Water, Parramatta, NSW 2150, Australia)
Alex B. McBratney (School of Life and Environmental Sciences & Sydney Institute of Agriculture, The University of Sydney, NSW 2006, Australia)
Wartini Ng (School of Life and Environmental Sciences & Sydney Institute of Agriculture, The University of Sydney, NSW 2006, Australia)
Sarem Norouzi (Department of Agroecology, Aarhus University, 8830 Tjele, Denmark)
Jose Padarian (School of Life and Environmental Sciences & Sydney Institute of Agriculture, The University of Sydney, NSW 2006, Australia)
Rudiyanto (Faculty of Fisheries and Food Science, Universiti Malaysia Terengganu, 21030 Kuala Nerus, Terengganu, Malaysia)
Amin Sharififar (School of Life and Environmental Sciences & Sydney Institute of Agriculture, The University of Sydney, NSW 2006, Australia)
Quentin Styc (School of Life and Environmental Sciences & Sydney Institute of Agriculture, The University of Sydney, NSW 2006, Australia)
Marliana Widyastuti (School of Life and Environmental Sciences & Sydney Institute of Agriculture, The University of Sydney, NSW 2006, Australia)
http://www.sciencedirect.com/science/article/pii/S0016706124003239
Artificial Intelligence
Process-based models
Physics Informed Neural Networks
Informed Machine Learning
Mechanistic models
Pedology
title Soil Science-Informed Machine Learning
topic Artificial Intelligence
Process-based models
Physics Informed Neural Networks
Informed Machine Learning
Mechanistic models
Pedology
url http://www.sciencedirect.com/science/article/pii/S0016706124003239