YOLO-I3D: Optimizing Inflated 3D Models for Real-Time Human Activity Recognition

Bibliographic Details
Main Authors: Ruikang Luo, Aman Anand, Farhana Zulkernine, Francois Rivest
Format: Article
Language: English
Published: MDPI AG 2024-10-01
Series: Journal of Imaging
Online Access: https://www.mdpi.com/2313-433X/10/11/269
Description
Summary: Human Activity Recognition (HAR) plays a critical role in applications such as security surveillance and healthcare. However, existing methods, particularly two-stream models like Inflated 3D (I3D), face significant challenges in real-time applications due to their high computational demand, especially from the optical flow branch. In this work, we address these limitations by proposing two major improvements. First, we introduce a lightweight motion information branch that replaces the computationally expensive optical flow component with a lower-resolution RGB input, significantly reducing computation time. Second, we incorporate YOLOv5, an efficient object detector, to further optimize the RGB branch for faster real-time performance. Experimental results on the Kinetics-400 dataset demonstrate that our proposed two-stream I3D Light model improves the original I3D model’s accuracy by 4.13% while reducing computational cost. Additionally, the integration of YOLOv5 into the I3D model enhances accuracy by 1.42%, providing a more efficient solution for real-time HAR tasks.
ISSN: 2313-433X
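
Illustrative sketch: The summary describes a two-stream design in which a lower-resolution RGB input stands in for optical flow. The Python sketch below is only a rough illustration of that idea, not the authors' implementation. The function names (`downsample`, `softmax`, `fuse`), the average-pooling downsampler, the single-channel frames, and the late fusion by averaging class probabilities are all assumptions made for this example; the record does not state how the paper downsamples its input or combines the two streams.

```python
import math


def downsample(frame, factor):
    """Average-pool a 2D frame (list of rows) by an integer factor.

    Hypothetical stand-in for producing the lower-resolution input of the
    lightweight motion branch; a single channel is used for brevity.
    """
    h, w = len(frame), len(frame[0])
    out = []
    for y in range(0, h - h % factor, factor):
        row = []
        for x in range(0, w - w % factor, factor):
            block = [frame[y + dy][x + dx]
                     for dy in range(factor) for dx in range(factor)]
            row.append(sum(block) / len(block))
        out.append(row)
    return out


def softmax(logits):
    """Numerically stable softmax over a list of class scores."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def fuse(rgb_logits, motion_logits):
    """Assumed late fusion: average the per-class probabilities of both streams."""
    p_rgb, p_motion = softmax(rgb_logits), softmax(motion_logits)
    return [(a + b) / 2 for a, b in zip(p_rgb, p_motion)]
```

Under this assumption, halving each spatial dimension leaves one quarter of the pixels for the motion branch to process, which gives a sense of where a cost saving could come from; the actual resolutions used in the paper are not given in this record.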