The utility of combining deep learning with metabarcoding to model biodiversity dynamics at a national scale

Bibliographic Details
Main Authors: Adrian Baggström, Robert Goodsell, Laura van Dijk, Ela Iwaszkiewicz-Eggebrecht, Andreia Miraldo, Ayco J.M. Tack, Tobias Andermann
Format: Article
Language: English
Published: Elsevier 2025-12-01
Series: Ecological Informatics
Online Access: http://www.sciencedirect.com/science/article/pii/S1574954125003279
Description
Summary: To make informed decisions on how to effectively protect biodiversity, we need knowledge of its spatial and temporal dynamics. By combining detailed biodiversity surveys, geospatial data, and machine learning, we can model biodiversity with the aim of gaining insights into how these complex patterns behave. Here, we present a biodiversity modeling approach that utilizes metabarcoding-derived biodiversity data, remote sensing, and convolutional neural networks (CNNs). We apply CNNs to predict the spatial pattern of seasonal arthropod richness across Sweden and compare the results with other statistical models commonly used in spatial modeling. The biodiversity data used to train the models constitute a state-of-the-art metabarcoding dataset composed of arthropod bulk samples, collected weekly from 198 locations. In addition, we compile 25 environmental features from public spatial data sources, describing each site's conditions. We find that CNN models do not outperform the other models in the applied performance metrics, despite their conceptual advantage of incorporating contextual information. Most of the tested models capture the general seasonal diversity trends, resulting in similar performance metrics. However, when inspecting the predicted spatial patterns more closely, we find that, unlike the other approaches, the CNN predictions yield more ecologically sensible patterns that distinguish different habitat types. We conclude that while CNNs offer structural advantages for processing complex spatial data, their predictive performance does not surpass that of less complex statistical models when applied to biodiversity datasets representing relatively few sites. Nonetheless, as metabarcoding biodiversity datasets continue to grow through large-scale sampling efforts, CNNs constitute a promising modeling approach to capture the complex correlations between biodiversity dynamics and the surrounding multidimensional environmental matrix.
ISSN: 1574-9541