Dual Focus-3D: A Hybrid Deep Learning Approach for Robust 3D Gaze Estimation
| Main Authors: | , , , , , |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | MDPI AG, 2025-06-01 |
| Series: | Sensors |
| ISSN: | 1424-8220 |
| Online Access: | https://www.mdpi.com/1424-8220/25/13/4086 |
| Summary: | Estimating gaze direction is a key task in computer vision, especially for understanding where a person is focusing their attention. It is essential for applications in assistive technology, medical diagnostics, virtual environments, and human–computer interaction. In this work, we introduce Dual Focus-3D, a novel hybrid deep learning architecture that combines appearance-based features from eye images with 3D head orientation data. This fusion enhances the model's prediction accuracy and robustness, particularly in challenging natural environments. To support training and evaluation, we present EyeLis, a new dataset containing 5206 annotated samples with corresponding 3D gaze and head pose information. Our model achieves state-of-the-art performance, with a mean absolute error (MAE) of 1.64° on EyeLis, demonstrating its ability to generalize effectively across both synthetic and real datasets. Key innovations include a multimodal feature fusion strategy, an angular loss function optimized for 3D gaze prediction, and regularization techniques to mitigate overfitting. Our results show that including 3D spatial information directly in the learning process significantly improves accuracy. |
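The record does not specify the form of the angular loss mentioned in the summary; a common choice for 3D gaze estimation penalizes the angle between predicted and ground-truth unit gaze vectors. A minimal sketch of that idea (the function name and use of NumPy are illustrative, not taken from the paper):

```python
import numpy as np

def angular_error_deg(pred, target, eps=1e-7):
    """Mean angular error in degrees between predicted and
    ground-truth 3D gaze direction vectors (shape: [N, 3])."""
    # Normalize both sets of vectors to unit length.
    pred = pred / np.linalg.norm(pred, axis=-1, keepdims=True)
    target = target / np.linalg.norm(target, axis=-1, keepdims=True)
    # Cosine of the angle between corresponding vectors,
    # clipped to keep arccos numerically stable.
    cos = np.clip(np.sum(pred * target, axis=-1), -1.0 + eps, 1.0 - eps)
    return float(np.degrees(np.arccos(cos)).mean())

# Identical directions yield a near-zero error;
# orthogonal directions yield 90 degrees.
forward = np.array([[0.0, 0.0, 1.0]])
right = np.array([[1.0, 0.0, 0.0]])
print(angular_error_deg(forward, forward))
print(angular_error_deg(forward, right))
```

In a training loop this quantity (in radians, via a differentiable framework) would serve directly as the loss, which aligns the optimization objective with the MAE metric reported above.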