Deep learning driven multi-scale spatiotemporal fusion dance spectrum generation network: A method based on human pose fusion

With the integration of dance art and computer technology, automatic dance score generation has become a new research direction in computer vision and machine learning, but generating the corresponding Laban symbols by capturing the skeletal key points of dance movements is a challenging task. In th...

Bibliographic Details
Main Authors: Doudou Sun, Gang Wang
Format: Article
Language: English
Published: Elsevier 2024-11-01
Series: Alexandria Engineering Journal
Subjects: Deep learning; Dance score generation; Transformer; Neural network; Feature extraction
Online Access: http://www.sciencedirect.com/science/article/pii/S1110016824008020
collection DOAJ
description With the integration of dance art and computer technology, automatic dance score generation has become a new research direction in computer vision and machine learning, but generating the corresponding Laban symbols by capturing the skeletal key points of dance movements is a challenging task. In this study, we propose an automatic dance score generation model that utilizes local spatio-temporal features to address the inefficiency and creativity limitations of traditional choreography methods. Specifically, we propose the Multiscale Spatio-Temporal Convolution (MSConv) module to capture local spatio-temporal features in human skeletal motion sequences. In addition, the Compressed Pyramid Attention (CPA) mechanism is used to achieve effective fusion of global and local features. This mechanism facilitates the interaction between global and local spatio-temporal information and automatically generates dance sequences by analyzing motion data from dance videos to extract key features. We validate the proposed method on Laban 16 and Laban 48 dance score datasets, and the generated Laban sequences preserve the original style of the dance sequences with a combined accuracy of 94.2% and 93.7%, respectively.
format Article
id doaj-art-872f7625af604e56bc240df7c5af9212
institution Kabale University
issn 1110-0168
language English
publishDate 2024-11-01
publisher Elsevier
record_format Article
series Alexandria Engineering Journal
spelling doaj-art-872f7625af604e56bc240df7c5af9212; 2024-11-15T06:11:14Z; eng; Elsevier; Alexandria Engineering Journal; ISSN 1110-0168; 2024-11-01; volume 107, pages 634-642
Doudou Sun (Center for Aesthetic & Art Education, Beijing Information Science and Technology University, 100101, Beijing, China; corresponding author)
Gang Wang (School of Computing and Data Engineering, NingboTech University, 315100, Ningbo, China; Department of Bioengineering, Imperial College London, SW7 2AZ, London, United Kingdom)
topic Deep learning
Dance score generation
Transformer
Neural network
Feature extraction
url http://www.sciencedirect.com/science/article/pii/S1110016824008020