MuIm: Analyzing Music–Image Correlations from an Artistic Perspective
Cross-modality understanding is essential for AI to tackle complex tasks that require both deterministic and generative capabilities, such as correlating music and visual art. The existing state-of-the-art methods of audio-visual correlation often rely on single-dimension information, focusing either on semantic or emotional attributes, thus failing to capture the full depth of these inherently complex modalities. Addressing this limitation, we introduce a novel approach that perceives music–image correlation as multilayered rather than as a direct one-to-one correspondence. To this end, we present a pioneering dataset with two segments: an artistic segment that pairs music with art based on both emotional and semantic attributes, and a realistic segment that links music with images through affective–semantic layers. In modeling emotional layers for the artistic segment, we found traditional 2D affective models inadequate, prompting us to propose a more interpretable hybrid-emotional rating system that serves both experts and non-experts. For the realistic segment, we utilize a web-based dataset with tags, dividing tag information into semantic and affective components to ensure a balanced and nuanced representation of music–image correlation. We conducted an in-depth statistical analysis and user study to evaluate our dataset’s effectiveness and applicability for AI-driven understanding. This work provides a foundation for advanced explorations into the complex relationships between auditory and visual art modalities, advancing the development of more sophisticated cross-modal AI systems.
| Main Authors: | Ubaid Ullah, Hyun-Chul Choi |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | MDPI AG, 2024-12-01 |
| Series: | Applied Sciences |
| Subjects: | music–image; cross-modality; neural networks; multi-modality |
| Online Access: | https://www.mdpi.com/2076-3417/14/23/11470 |
| author | Ubaid Ullah; Hyun-Chul Choi |
| collection | DOAJ |
| description | Cross-modality understanding is essential for AI to tackle complex tasks that require both deterministic and generative capabilities, such as correlating music and visual art. The existing state-of-the-art methods of audio-visual correlation often rely on single-dimension information, focusing either on semantic or emotional attributes, thus failing to capture the full depth of these inherently complex modalities. Addressing this limitation, we introduce a novel approach that perceives music–image correlation as multilayered rather than as a direct one-to-one correspondence. To this end, we present a pioneering dataset with two segments: an artistic segment that pairs music with art based on both emotional and semantic attributes, and a realistic segment that links music with images through affective–semantic layers. In modeling emotional layers for the artistic segment, we found traditional 2D affective models inadequate, prompting us to propose a more interpretable hybrid-emotional rating system that serves both experts and non-experts. For the realistic segment, we utilize a web-based dataset with tags, dividing tag information into semantic and affective components to ensure a balanced and nuanced representation of music–image correlation. We conducted an in-depth statistical analysis and user study to evaluate our dataset’s effectiveness and applicability for AI-driven understanding. This work provides a foundation for advanced explorations into the complex relationships between auditory and visual art modalities, advancing the development of more sophisticated cross-modal AI systems. |
| format | Article |
| id | doaj-art-be07e9bb6fed4aaeaac0bcffa8d163ae |
| institution | Kabale University |
| issn | 2076-3417 |
| language | English |
| publishDate | 2024-12-01 |
| publisher | MDPI AG |
| record_format | Article |
| series | Applied Sciences |
| doi | 10.3390/app142311470 |
| affiliation | Intelligent Computer Vision Software Laboratory (ICVSLab), Department of Electronic Engineering, Yeungnam University, 280 Daehak-Ro, Gyeongsan 38541, Gyeongbuk, Republic of Korea |
| title | MuIm: Analyzing Music–Image Correlations from an Artistic Perspective |
| topic | music–image; cross-modality; neural networks; multi-modality |
| url | https://www.mdpi.com/2076-3417/14/23/11470 |