Enhancing human-centered dynamic scene understanding via multiple LLMs collaborated reasoning
Abstract: Human-centered dynamic scene understanding plays a pivotal role in enhancing the capability of robotic and autonomous systems, where video-based human-object interaction (V-HOI) detection is a crucial task in semantic scene understanding, which aims to comprehensively understand HOI relatio...
| Main Authors: | Hang Zhang, Wenxiao Zhang, Haoxuan Qu, Jun Liu |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | Springer, 2025-03-01 |
| Series: | Visual Intelligence |
| Online Access: | https://doi.org/10.1007/s44267-025-00074-1 |
Similar Items
- MuRelSGG: Multimodal Relationship Prediction for Neurosymbolic Scene Graph Generation
  by: Muhammad Junaid Khan, et al.
  Published: (2025-01-01)
- Domain-Incremental Learning Paradigm for scene understanding via Pseudo-Replay Generation
  by: Zhifeng Xie, et al.
  Published: (2025-09-01)
- From object to context: Scene knowledge enhanced visual grounding for geospatial understanding
  by: Ke She, et al.
  Published: (2025-08-01)
- Knowledge reasoning with multiple relational paths
  by: Hang Su, et al.
  Published: (2023-12-01)
- A Proactive Agent Collaborative Framework for Zero‐Shot Multimodal Medical Reasoning
  by: Zishan Gu, et al.
  Published: (2025-08-01)