Deep learning driven multi-scale spatiotemporal fusion dance spectrum generation network: A method based on human pose fusion

With the integration of dance art and computer technology, automatic dance score generation has become a new research direction in computer vision and machine learning, but generating the corresponding Laban symbols by capturing the skeletal key points of dance movements is a challenging task. In th...

Bibliographic Details
Main Authors:	Doudou Sun, Gang Wang
Format:	Article
Language:	English
Published:	Elsevier 2024-11-01
Series:	Alexandria Engineering Journal
Subjects:	Deep learning; Dance score generation; Transformer; Neural network; Feature extraction
Online Access:	http://www.sciencedirect.com/science/article/pii/S1110016824008020

_version_	1846167123168067584
author	Doudou Sun, Gang Wang
collection	DOAJ
description	With the integration of dance art and computer technology, automatic dance score generation has become a new research direction in computer vision and machine learning, but generating the corresponding Laban symbols by capturing the skeletal key points of dance movements is a challenging task. In this study, we propose an automatic dance score generation model that utilizes local spatio-temporal features to address the inefficiency and creativity limitations of traditional choreography methods. Specifically, we propose the Multiscale Spatio-Temporal Convolution (MSConv) module to capture local spatio-temporal features in human skeletal motion sequences. In addition, the Compressed Pyramid Attention (CPA) mechanism is used to achieve effective fusion of global and local features. This mechanism facilitates the interaction between global and local spatio-temporal information and automatically generates dance sequences by analyzing motion data from dance videos to extract key features. We validate the proposed method on Laban 16 and Laban 48 dance score datasets, and the generated Laban sequences preserve the original style of the dance sequences with a combined accuracy of 94.2% and 93.7%, respectively.
format	Article
id	doaj-art-872f7625af604e56bc240df7c5af9212
institution	Kabale University
issn	1110-0168
language	English
publishDate	2024-11-01
publisher	Elsevier
record_format	Article
series	Alexandria Engineering Journal
spelling	doaj-art-872f7625af604e56bc240df7c5af9212 | 2024-11-15T06:11:14Z | eng | Elsevier | Alexandria Engineering Journal | 1110-0168 | 2024-11-01 | 107 | 634-642 | Doudou Sun (Center for Aesthetic & Art Education, Beijing Information Science and Technology University, 100101, Beijing, China; corresponding author) | Gang Wang (School of Computing and Data Engineering, NingboTech University, 315100, Ningbo, China; Department of Bioengineering, Imperial College London, SW7 2AZ, London, United Kingdom)
title	Deep learning driven multi-scale spatiotemporal fusion dance spectrum generation network: A method based on human pose fusion
topic	Deep learning; Dance score generation; Transformer; Neural network; Feature extraction
url	http://www.sciencedirect.com/science/article/pii/S1110016824008020

Similar Items