A bird song detector for improving bird identification through deep learning: A case study from Doñana
| Main Authors: | , , , , , , |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | Elsevier, 2025-12-01 |
| Series: | Ecological Informatics |
| Subjects: | |
| Online Access: | http://www.sciencedirect.com/science/article/pii/S1574954125002638 |
| Summary: | Passive Acoustic Monitoring (PAM), which uses devices like automatic audio recorders, has become a fundamental tool for conserving and managing natural ecosystems. However, the large volume of unsupervised audio data that PAM generates poses a major challenge for extracting meaningful information. Deep Learning techniques, particularly automated species identification models based on computer vision, offer a promising solution. BirdNET, a widely used model for bird identification, has shown success in many study systems but is limited at the local scale due to biases in its training data, which focus on specific locations and target sounds rather than entire soundscapes. A key challenge in bird species detection is that many recordings either lack target species or contain overlapping vocalizations, complicating automatic identification. To overcome these problems, we developed a three-stage pipeline for automatic bird vocalization identification in Doñana National Park (SW Spain), a wetland facing significant conservation threats. We deployed AudioMoth recorders in three main habitats across nine locations within Doñana and manually annotated 461 min of audio data, yielding 3749 annotations covering 34 classes. Our pipeline first developed a Bird Song Detector to isolate bird vocalizations, using spectrograms as graphical representations of the audio and applying image-processing methods. Second, we classified bird species by training custom classifiers at the local scale on BirdNET’s embeddings. The best-performing detection model incorporated synthetic background audio through data augmentation using the ESC-50 environmental sound library. Applying the Bird Song Detector before classification improved species identification, as all classification models performed better when analyzing only the segments where birds were detected. Specifically, the combination of the Bird Song Detector and fine-tuned BirdNET increased weighted precision (from 0.18 to 0.37), recall (from 0.21 to 0.30), and F1 score (from 0.17 to 0.28) compared to the baseline without the Bird Song Detector. Our approach demonstrates the effectiveness of integrating a Bird Song Detector with fine-tuned classification models for bird identification in local soundscapes. These findings highlight the need to adapt general-purpose tools to specific ecological challenges, as demonstrated in Doñana. Automatically detecting bird species serves to track the health of this threatened ecosystem, given the sensitivity of birds to environmental change, and helps design conservation measures to reduce biodiversity loss. |
| ISSN: | 1574-9541 |
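The two core stages summarized above (a Bird Song Detector gating a species classifier) can be sketched as follows. This is a minimal illustration, not the authors' code: `detector` and `classifier` are hypothetical callables standing in for the trained models.

```python
from typing import Callable, Iterable


def identify_species(
    segments: Iterable[object],
    detector: Callable[[object], bool],
    classifier: Callable[[object], str],
) -> list[str]:
    """Run the species classifier only where the detector finds a bird.

    Stage 1: the Bird Song Detector filters out segments without bird
    vocalizations, so the classifier never sees empty background audio.
    Stage 2: a species classifier labels each remaining segment.
    """
    detected = [s for s in segments if detector(s)]
    return [classifier(s) for s in detected]


# Hypothetical usage with dummy stand-ins: keep "segments" the detector
# accepts, then label only those.
labels = identify_species(
    [1, 2, 3, 4],
    detector=lambda s: s % 2 == 0,
    classifier=lambda s: f"species_{s}",
)
```

Gating the classifier this way mirrors the abstract's finding that all classification models improved when run only on detected segments.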
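The reported gains are in support-weighted precision, recall, and F1 score. A dependency-free sketch of how such weighted scores are computed, using toy labels rather than the study's data:

```python
from collections import Counter


def weighted_prf(y_true: list[str], y_pred: list[str]) -> tuple[float, float, float]:
    """Support-weighted precision, recall, and F1 over the classes in y_true.

    Each class's score is weighted by its share of the true labels, so
    common classes dominate the average, as in weighted evaluation metrics.
    """
    support = Counter(y_true)
    n = len(y_true)
    prec = rec = f1 = 0.0
    for cls, s in support.items():
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == cls and p == cls)
        predicted = sum(1 for p in y_pred if p == cls)
        p = tp / predicted if predicted else 0.0  # class precision
        r = tp / s                                # class recall
        f = 2 * p * r / (p + r) if (p + r) else 0.0
        w = s / n                                 # support weight
        prec += w * p
        rec += w * r
        f1 += w * f
    return prec, rec, f1


# Toy labels (hypothetical, not from the study)
y_true = ["sparrow", "sparrow", "heron", "heron", "heron", "none"]
y_pred = ["sparrow", "heron", "heron", "heron", "none", "none"]
scores = weighted_prf(y_true, y_pred)
```

The same quantities are commonly computed with scikit-learn's `precision_recall_fscore_support(..., average="weighted")`; the plain-Python version above just makes the weighting explicit.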