Audio-Language Datasets of Scenes and Events: A Survey
Audio-language models (ALMs) generate linguistic descriptions of sound-producing events and scenes. Advances in dataset creation and computational power have led to significant progress in this domain. This paper surveys 69 datasets used to train ALMs, covering research up to September 2024 (<uri...
| Main Authors: | Gijs Wijngaard, Elia Formisano, Michele Esposito, Michel Dumontier |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | IEEE, 2025-01-01 |
| Series: | IEEE Access |
| Online Access: | https://ieeexplore.ieee.org/document/10854210/ |
Similar Items
- The Development of Digital Audio Coding
  by: Guo Ke
  Published: (1995-01-01)
- Trends in audio scene source counting and analysis
  by: Michael Nigro, et al.
  Published: (2024-12-01)
- Estimating rainfall intensity based on surveillance audio and deep-learning
  by: Meizhen Wang, et al.
  Published: (2024-11-01)
- AUDIO BRANDING GUIDANCE MODEL IN THE CASE OF SMALL AND MEDIUM-SIZED BUSINESSES
  by: Justinas Kisieliauskas, et al.
  Published: (2024-12-01)
- Audio Features and Crowdfunding Success: An Empirical Study Using Audio Mining
  by: Miao Miao, et al.
  Published: (2024-11-01)