Audio-Language Datasets of Scenes and Events: A Survey
Audio-language models (ALMs) generate linguistic descriptions of sound-producing events and scenes. Advances in dataset creation and computational power have led to significant progress in this domain. This paper surveys 69 datasets used to train ALMs, covering research up to September 2024 (<uri...
Main Authors: | Gijs Wijngaard, Elia Formisano, Michele Esposito, Michel Dumontier |
---|---|
Format: | Article |
Language: | English |
Published: | IEEE, 2025-01-01 |
Series: | IEEE Access |
Subjects: | |
Online Access: | https://ieeexplore.ieee.org/document/10854210/ |
Similar Items
- The Development of Digital Audio Coding
  by: Guo Ke
  Published: (1995-01-01)
- A Novel Audio Copy Move Forgery Detection Method With Classification of Graph-Based Representations
  by: Beste Ustubioglu, et al.
  Published: (2025-01-01)
- Deep convolutional neural networks for double compressed AMR audio detection
  by: Aykut Büker, et al.
  Published: (2021-06-01)
- Synchronization and blind detect algorithm for dual channel audio watermark
  by: FENG Tao, et al.
  Published: (2006-01-01)
- Audiogmenter: a MATLAB toolbox for audio data augmentation
  by: Gianluca Maguolo, et al.
  Published: (2025-01-01)