A Stereo Disparity Map Refinement Method Without Training Based on Monocular Segmentation and Surface Normal
| Main Authors: | , |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | MDPI AG, 2025-04-01 |
| Series: | Remote Sensing |
| Subjects: | |
| Online Access: | https://www.mdpi.com/2072-4292/17/9/1587 |
| Summary: | Stereo disparity estimation is an essential component of computer vision and photogrammetry, with many applications. However, the domain lacks large real-world datasets and large-scale models. Inspired by recent advances in foundation models for image segmentation, we explore RANSAC-based disparity refinement using zero-shot monocular surface normal prediction and SAM segmentation masks, combining stereo matching models with advanced large-scale monocular vision models. The disparity refinement problem is formulated in three steps: extracting geometric structures from SAM masks and surface normal predictions, building disparity map hypotheses for those geometric structures, and selecting among the hypotheses with a weighted RANSAC method. We believe that once the geometric structures are obtained, even if only part of the disparity within a structure is correct, the entire structure can be reconstructed correctly from the geometric prior. Our method can refine the results of traditional methods such as SGM as well as deep learning models such as MC-CNN. Without training, the method obtains a 15.48% D1-error on the US3D dataset, and a 6.09% bad 2.0 error and a 3.65% bad 4.0 error on the Middlebury dataset. The research helps to promote scene and geometric structure understanding in stereo disparity estimation and the combination of advanced large-scale monocular vision models with stereo matching methods. |
|---|---|
| ISSN: | 2072-4292 |
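
The refinement idea in the summary (recover a whole geometric structure from the partly correct disparities inside one segment) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes each segment is a single plane in disparity space, d = a*x + b*y + c, which holds for planar surfaces in a rectified stereo pair, and it scores each candidate plane by the summed weight of its inliers. The function names, the `weights` confidence map, the threshold, and the iteration count are all hypothetical, and the surface normal prior described in the summary is not used here.

```python
import numpy as np


def ransac_plane_disparity(disp, mask, weights=None, n_iter=200, thresh=1.0, seed=0):
    """Fit d = a*x + b*y + c to the disparities inside `mask` with RANSAC.

    Returns the best-scoring (a, b, c), or None if no plane could be fitted.
    The score is the summed weight of inlier pixels (an assumption of this
    sketch, not the scoring rule of the paper).
    """
    rng = np.random.default_rng(seed)
    ys, xs = np.nonzero(mask)
    if xs.size < 3:
        return None
    d = disp[ys, xs].astype(float)
    w = np.ones_like(d) if weights is None else weights[ys, xs].astype(float)
    # One row [x, y, 1] per pixel of the segment.
    A = np.column_stack([xs, ys, np.ones_like(xs)]).astype(float)

    best_score, best_plane = -1.0, None
    for _ in range(n_iter):
        idx = rng.choice(xs.size, size=3, replace=False)
        try:
            plane = np.linalg.solve(A[idx], d[idx])  # exact plane through 3 pixels
        except np.linalg.LinAlgError:
            continue  # the three sampled pixels were collinear
        inliers = np.abs(A @ plane - d) < thresh
        score = w[inliers].sum()
        if score > best_score:
            best_score, best_plane = score, plane
    return best_plane


def refine_segment(disp, mask, **kwargs):
    """Replace the disparities inside `mask` with the fitted plane."""
    out = disp.astype(float).copy()
    plane = ransac_plane_disparity(disp, mask, **kwargs)
    if plane is None:
        return out
    ys, xs = np.nonzero(mask)
    out[ys, xs] = plane[0] * xs + plane[1] * ys + plane[2]
    return out
```

In practice `mask` would come from a segmentation model such as SAM and `disp` from a stereo matcher such as SGM, with `refine_segment` applied once per mask. A fuller version would generate several hypotheses per structure and constrain them with the predicted surface normals, as the summary describes.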