Deep learning driven multi-scale spatiotemporal fusion dance spectrum generation network: A method based on human pose fusion
With the integration of dance art and computer technology, automatic dance score generation has become a new research direction in computer vision and machine learning, but generating the corresponding Laban symbols by capturing the skeletal key points of dance movements is a challenging task. In th...
| Main Authors: | Doudou Sun, Gang Wang |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | Elsevier, 2024-11-01 |
| Series: | Alexandria Engineering Journal |
| Subjects: | Deep learning; Dance score generation; Transformer; Neural network; Feature extraction |
| Online Access: | http://www.sciencedirect.com/science/article/pii/S1110016824008020 |
| _version_ | 1846167123168067584 |
|---|---|
| author | Doudou Sun; Gang Wang |
| collection | DOAJ |
| description | With the integration of dance art and computer technology, automatic dance score generation has become a new research direction in computer vision and machine learning, but generating the corresponding Laban symbols by capturing the skeletal key points of dance movements is a challenging task. In this study, we propose an automatic dance score generation model that utilizes local spatio-temporal features to address the inefficiency and creativity limitations of traditional choreography methods. Specifically, we propose the Multiscale Spatio-Temporal Convolution (MSConv) module to capture local spatio-temporal features in human skeletal motion sequences. In addition, the Compressed Pyramid Attention (CPA) mechanism is used to achieve effective fusion of global and local features. This mechanism facilitates the interaction between global and local spatio-temporal information and automatically generates dance sequences by analyzing motion data from dance videos to extract key features. We validate the proposed method on Laban 16 and Laban 48 dance score datasets, and the generated Laban sequences preserve the original style of the dance sequences with a combined accuracy of 94.2% and 93.7%, respectively. |
| format | Article |
| id | doaj-art-872f7625af604e56bc240df7c5af9212 |
| institution | Kabale University |
| issn | 1110-0168 |
| language | English |
| publishDate | 2024-11-01 |
| publisher | Elsevier |
| record_format | Article |
| series | Alexandria Engineering Journal |
| spelling | doaj-art-872f7625af604e56bc240df7c5af9212; 2024-11-15T06:11:14Z; eng; Elsevier; Alexandria Engineering Journal; 1110-0168; 2024-11-01; vol. 107, pp. 634-642; Deep learning driven multi-scale spatiotemporal fusion dance spectrum generation network: A method based on human pose fusion; Doudou Sun (Center for Aesthetic & Art Education, Beijing Information Science and Technology University, 100101, Beijing, China; corresponding author); Gang Wang (School of Computing and Data Engineering, NingboTech University, 315100, Ningbo, China; Department of Bioengineering, Imperial College London, SW7 2AZ, London, United Kingdom); http://www.sciencedirect.com/science/article/pii/S1110016824008020; Deep learning; Dance score generation; Transformer; Neural network; Feature extraction |
| title | Deep learning driven multi-scale spatiotemporal fusion dance spectrum generation network: A method based on human pose fusion |
| topic | Deep learning; Dance score generation; Transformer; Neural network; Feature extraction |
| url | http://www.sciencedirect.com/science/article/pii/S1110016824008020 |