Auxiliary Task Graph Convolution Network: A Skeleton-Based Action Recognition for Practical Use
Graph convolution networks (GCNs) have been extensively researched for action recognition by estimating human skeletons from video clips. However, their image sampling methods are impractical because they require the video length before frames can be sampled. In this study, we propose an Auxiliary...
Main Authors: | Junsu Cho, Seungwon Kim, Chi-Min Oh, Jeong-Min Park |
---|---|
Format: | Article |
Language: | English |
Published: | MDPI AG, 2024-12-01 |
Series: | Applied Sciences |
Subjects: | action recognition; auxiliary task; multi stream; frame rate; 3D skeleton; GCN |
Online Access: | https://www.mdpi.com/2076-3417/15/1/198 |
_version_ | 1841549468513599488 |
---|---|
author | Junsu Cho Seungwon Kim Chi-Min Oh Jeong-Min Park |
author_facet | Junsu Cho Seungwon Kim Chi-Min Oh Jeong-Min Park |
author_sort | Junsu Cho |
collection | DOAJ |
description | Graph convolution networks (GCNs) have been extensively researched for action recognition by estimating human skeletons from video clips. However, their image sampling methods are impractical because they require the video length before frames can be sampled. In this study, we propose an Auxiliary Task Graph Convolution Network (AT-GCN) with low- and high-frame-rate pathways that supports a new sampling method. AT-GCN learns actions at a defined frame rate within a defined range using three losses: fuse, slow, and fast. The slow and fast losses are handled in two auxiliary tasks, while the main stream handles the fuse loss. AT-GCN outperforms the original state-of-the-art model on the NTU RGB+D, NTU RGB+D 120, and NW-UCLA datasets while maintaining the same inference time. In top-1 accuracy, AT-GCN achieves 90.3% on the cross-subject and 95.2% on the cross-view benchmark of NTU RGB+D, 86.5% on the cross-subject and 87.6% on the cross-setup benchmark of NTU RGB+D 120, and 93.5% on NW-UCLA. |
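The abstract above describes two ideas that can be illustrated concretely: fixed-stride frame sampling that needs no video-length information, and a total training loss combining the main (fuse) loss with the two auxiliary (slow, fast) losses. The sketch below is a minimal illustration under stated assumptions, not the paper's implementation: the function names, the stride value, and the loss weights `w_slow`/`w_fast` are all hypothetical, since the abstract does not specify them.

```python
import numpy as np

def sample_frames(clip, stride):
    """Fixed-stride sampling: take every `stride`-th frame, so no
    video-length information is needed up front (the impracticality
    the abstract attributes to prior sampling methods).
    `clip` has shape (T, J, C): frames, joints, channels."""
    return clip[::stride]

def at_gcn_total_loss(fuse_loss, slow_loss, fast_loss,
                      w_slow=0.5, w_fast=0.5):
    """Combine the mainstream fuse loss with the two auxiliary-task
    losses. The weights here are illustrative assumptions; the
    abstract only states that three losses are used."""
    return fuse_loss + w_slow * slow_loss + w_fast * fast_loss

# A 32-frame skeleton clip (25 joints, 3D coordinates), sampled at
# stride 4 for a hypothetical low-frame-rate pathway:
clip = np.zeros((32, 25, 3))
slow_input = sample_frames(clip, 4)   # shape (8, 25, 3)
```

At training time the two pathways would each produce a classification loss (slow, fast) alongside the fused prediction's loss, and `at_gcn_total_loss` would be the quantity backpropagated; at inference only the fused prediction is needed, which is consistent with the abstract's claim of unchanged inference time.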
format | Article |
id | doaj-art-df82f77b153d4635b45b6b318ebb4441 |
institution | Kabale University |
issn | 2076-3417 |
language | English |
publishDate | 2024-12-01 |
publisher | MDPI AG |
record_format | Article |
series | Applied Sciences |
spelling | doaj-art-df82f77b153d4635b45b6b318ebb44412025-01-10T13:14:46ZengMDPI AGApplied Sciences2076-34172024-12-0115119810.3390/app15010198Auxiliary Task Graph Convolution Network: A Skeleton-Based Action Recognition for Practical UseJunsu Cho0Seungwon Kim1Chi-Min Oh2Jeong-Min Park3Department of AI Convergence, Chonnam National University, Gwangju 61186, Republic of KoreaDepartment of AI Convergence, Chonnam National University, Gwangju 61186, Republic of KoreaSafeMotion, Gwangju 61011, Republic of KoreaSafeMotion, Gwangju 61011, Republic of KoreaGraph convolution networks (GCNs) have been extensively researched for action recognition by estimating human skeletons from video clips. However, their image sampling methods are not practical because they require video-length information for sampling images. In this study, we propose an Auxiliary Task Graph Convolution Network (AT-GCN) with low and high-frame pathways while supporting a new sampling method. AT-GCN learns actions at a defined frame rate in the defined range with three losses: fuse, slow, and fast losses. AT-GCN handles the slow and fast losses in two auxiliary tasks, while the mainstream handles the fuse loss. AT-GCN outperforms the original State-of-the-Art model on the NTU RGB+D, NTU RGB+D 120, and NW-UCLA datasets while maintaining the same inference time. AT-GCN shows the best performance on the NTU RGB+D dataset at 90.3% from subjects, 95.2 from view benchmarks, on the NTU RGB+D 120 dataset at 86.5% from subjects, 87.6% from set benchmarks, and at 93.5% on the NW-UCLA dataset as top-1 accuracy.https://www.mdpi.com/2076-3417/15/1/198action recognitionauxiliary taskmulti streamframe rate3D skeletonGCN |
spellingShingle | Junsu Cho Seungwon Kim Chi-Min Oh Jeong-Min Park Auxiliary Task Graph Convolution Network: A Skeleton-Based Action Recognition for Practical Use Applied Sciences action recognition auxiliary task multi stream frame rate 3D skeleton GCN |
title | Auxiliary Task Graph Convolution Network: A Skeleton-Based Action Recognition for Practical Use |
title_full | Auxiliary Task Graph Convolution Network: A Skeleton-Based Action Recognition for Practical Use |
title_fullStr | Auxiliary Task Graph Convolution Network: A Skeleton-Based Action Recognition for Practical Use |
title_full_unstemmed | Auxiliary Task Graph Convolution Network: A Skeleton-Based Action Recognition for Practical Use |
title_short | Auxiliary Task Graph Convolution Network: A Skeleton-Based Action Recognition for Practical Use |
title_sort | auxiliary task graph convolution network a skeleton based action recognition for practical use |
topic | action recognition auxiliary task multi stream frame rate 3D skeleton GCN |
url | https://www.mdpi.com/2076-3417/15/1/198 |
work_keys_str_mv | AT junsucho auxiliarytaskgraphconvolutionnetworkaskeletonbasedactionrecognitionforpracticaluse AT seungwonkim auxiliarytaskgraphconvolutionnetworkaskeletonbasedactionrecognitionforpracticaluse AT chiminoh auxiliarytaskgraphconvolutionnetworkaskeletonbasedactionrecognitionforpracticaluse AT jeongminpark auxiliarytaskgraphconvolutionnetworkaskeletonbasedactionrecognitionforpracticaluse |