CVCPSG: Discovering Composite Visual Clues for Panoptic Scene Graph Generation
Abstract Panoptic Scene Graph Generation (PSG) aims to segment objects and predict the relation triplets <subject, relation, object> within an image. Despite the impressive achievements in PSG, current methods still struggle to capture fine-grained visual context, eschewing spatial and situati...
Saved in:
| Main Authors: | , , , |
|---|---|
| Format: | Article |
| Language: | English |
| Published: |
Springer
2025-05-01
|
| Series: | Journal of King Saud University: Computer and Information Sciences |
| Subjects: | |
| Online Access: | https://doi.org/10.1007/s44443-025-00063-w |
| Tags: |
Add Tag
No Tags, Be the first to tag this record!
|
| Summary: | Abstract Panoptic Scene Graph Generation (PSG) aims to segment objects and predict the relation triplets <subject, relation, object> within an image. Despite the impressive achievements in PSG, current methods still struggle to capture fine-grained visual context, eschewing spatial and situational information in favor of visual features related to object identity. This limitation naturally impedes the model’s ability to distinguish subtle visual differences between relation triplets, such as “cat-on-person” and “cat-lying on-person”. To address this challenge, we propose CVCPSG, a novel DETR-based method that uncovers composite visual clues for PSG. Specifically, drawing inspiration from how humans capture visual context using diverse visual clues, we first construct a composite visual clues bank based on three key aspects: object, spatial, and situational. Then, we introduce a multi-level visual extractor to align visual features from objects, interactions, and image levels with the composite visual clues bank. Additionally, we incorporate a cross-modal learning module with a multitower architecture to seamlessly integrate visual clues into the relation decoder, thereby improving PSG detection. Extensive experiments on two PSG benchmarks confirm the effectiveness and interpretability of CVCPSG. |
|---|---|
| ISSN: | 1319-1578 2213-1248 |