Acoustic scene classification using inter- and intra-subarray spatial features in distributed microphone array

Bibliographic Details
Main Authors: Takao Kawamura, Yuma Kinoshita, Nobutaka Ono, Robin Scheibler
Format: Article
Language: English
Published: SpringerOpen, 2024-12-01
Series: EURASIP Journal on Audio, Speech, and Music Processing
ISSN: 1687-4722
Subjects: Domestic activity monitoring; Acoustic scene classification; Distributed microphone array; Subarray; Generalized cross-correlation phase transform; Middle integration
Online Access: https://doi.org/10.1186/s13636-024-00386-y
Abstract: In this study, we investigate the effectiveness of spatial features in acoustic scene classification using distributed microphone arrays. Under the assumption that multiple subarrays, each equipped with microphones, are synchronized, we investigate two types of spatial feature: intra- and inter-generalized cross-correlation phase transforms (GCC-PHATs). These are derived from channels within the same subarray and between different subarrays, respectively. Our approach treats the log-Mel spectrogram as a spectral feature and intra- and/or inter-GCC-PHAT as a spatial feature. We propose two integration methods for spectral and spatial features: (a) middle integration, which fuses embeddings obtained by spectral and spatial features, and (b) late integration, which fuses decisions estimated using spectral and spatial features. The evaluation experiments showed that, when using only spectral features, employing all channels did not markedly improve the F1-score compared with the single-channel case. In contrast, integrating both spectral and spatial features improved the F1-score compared with using only spectral features. Additionally, we confirmed that the F1-score for late integration was slightly higher than that for middle integration.
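The GCC-PHAT feature named in the abstract can be sketched for a single channel pair as follows. This is a minimal numpy illustration of the standard phase transform, not the authors' implementation; the function and parameter names are my own. In the paper's setting, "intra" features would come from pairs within one subarray and "inter" features from pairs spanning two subarrays.

```python
import numpy as np

def gcc_phat(x, y, n_fft=1024):
    """GCC-PHAT between two channels: the cross-spectrum is whitened to
    unit magnitude, so only phase (time-difference) information remains."""
    X = np.fft.rfft(x, n=n_fft)
    Y = np.fft.rfft(y, n=n_fft)
    cross = np.conj(X) * Y          # cross-spectrum of the pair
    cross /= np.abs(cross) + 1e-12  # PHAT weighting: keep phase only
    # Back to the time domain; the peak index is the (circular) delay
    # of y relative to x.
    return np.fft.irfft(cross, n=n_fft)

# Toy check: y is x delayed by 5 samples, so the GCC-PHAT peak
# should sit at lag 5.
rng = np.random.default_rng(0)
x = rng.standard_normal(1024)
y = np.roll(x, 5)
cc = gcc_phat(x, y)
lag = np.argmax(cc)  # 5
```

In a classification pipeline such as the one described, a window of these correlation vectors (one per microphone pair) would be stacked and fed to the network alongside the log-Mel spectrogram.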
Author affiliations: Takao Kawamura, Yuma Kinoshita, and Nobutaka Ono: Department of Computer Science, Tokyo Metropolitan University; Robin Scheibler: Music Processing Team, LY Corporation.