A Stereo Disparity Map Refinement Method Without Training Based on Monocular Segmentation and Surface Normal

Stereo disparity estimation is an essential component in computer vision and photogrammetry with many applications. However, there is a lack of real-world large datasets and large-scale models in the domain. Inspired by recent advances in the foundation model for image segmentation, we explore the R...

Bibliographic Details
Main Authors: Haoxuan Sun, Taoyang Wang
Format: Article
Language: English
Published: MDPI AG 2025-04-01
Series: Remote Sensing
Subjects: disparity estimation, segment anything model, surface normal, stereo matching, RANSAC
Online Access: https://www.mdpi.com/2072-4292/17/9/1587
author Haoxuan Sun
Taoyang Wang
collection DOAJ
description Stereo disparity estimation is an essential component of computer vision and photogrammetry, with many applications. However, the domain lacks large real-world datasets and large-scale models. Inspired by recent advances in foundation models for image segmentation, we explore RANSAC-based disparity refinement that uses zero-shot monocular surface normal prediction and SAM segmentation masks, combining stereo matching models with advanced large-scale monocular vision models. The disparity refinement problem is formulated in three steps: extracting geometric structures from SAM masks and surface normal predictions, building disparity map hypotheses for those geometric structures, and selecting among the hypotheses with a weighted RANSAC method. We argue that once a geometric structure has been obtained, even if only part of the disparity inside it is correct, the entire structure can be reconstructed correctly from the geometric prior. Our method is most effective at refining the results of traditional models such as SGM or deep learning models such as MC-CNN. Without any training, the method obtains a 15.48% D1-error on the US3D dataset, and a 6.09% bad 2.0 error and a 3.65% bad 4.0 error on the Middlebury dataset. The research helps to promote scene and geometric structure understanding in stereo disparity estimation, as well as the combination of advanced large-scale monocular vision models with stereo matching methods.
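The description above outlines hypothesis building and weighted RANSAC selection inside a segmented structure. The following is only a minimal sketch of that general idea, not the authors' implementation: it assumes each segment is planar in disparity space (d = a*x + b*y + c), draws plane hypotheses from random three-pixel samples rather than from surface normal predictions, and scores them by summed per-pixel weights. The names `fit_plane`, `ransac_plane_refine`, `bad_pixel_rate`, and the `conf` weight map are hypothetical. The last function computes a bad-threshold error in the spirit of the "bad 2.0" figure quoted above (percentage of pixels whose error exceeds the threshold).

```python
import numpy as np


def fit_plane(xs, ys, ds, w):
    """Weighted least-squares fit of the plane d = a*x + b*y + c."""
    A = np.stack([xs, ys, np.ones_like(xs)], axis=1)
    sw = np.sqrt(w)
    coef, *_ = np.linalg.lstsq(A * sw[:, None], ds * sw, rcond=None)
    return coef


def ransac_plane_refine(disp, mask, conf=None, iters=200, tol=1.0, seed=0):
    """Replace the disparities inside a boolean mask by a RANSAC-fitted plane.

    conf is an optional per-pixel weight map (hypothetical); a hypothesis
    is scored by the summed weights of its inliers.
    """
    rng = np.random.default_rng(seed)
    ys, xs = np.nonzero(mask)
    xs, ys = xs.astype(float), ys.astype(float)
    ds = disp[mask].astype(float)
    w = np.ones_like(ds) if conf is None else conf[mask].astype(float)
    out = disp.astype(float).copy()
    if ds.size < 3:
        return out

    best_score, best_inl = -1.0, None
    for _ in range(iters):
        # Hypothesis: the plane through three randomly chosen pixels.
        idx = rng.choice(ds.size, size=3, replace=False)
        a, b, c = fit_plane(xs[idx], ys[idx], ds[idx], np.ones(3))
        inl = np.abs(a * xs + b * ys + c - ds) < tol
        score = w[inl].sum()
        if score > best_score:
            best_score, best_inl = score, inl

    if best_inl.sum() < 3:
        return out
    # Refit on all inliers of the best hypothesis, then fill the whole mask.
    a, b, c = fit_plane(xs[best_inl], ys[best_inl], ds[best_inl], w[best_inl])
    out[mask] = a * xs + b * ys + c
    return out


def bad_pixel_rate(pred, gt, thresh):
    """Percentage of pixels whose absolute disparity error exceeds thresh."""
    return float(np.mean(np.abs(pred - gt) > thresh) * 100.0)
```

In this sketch the mask would come from a segmentation model and the weights from whatever confidence the stereo matcher provides; both are left as plain arrays so the example stays self-contained.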
format Article
id doaj-art-87880d63f33c4c8a94df9cdd80ffbb01
institution Kabale University
issn 2072-4292
language English
publishDate 2025-04-01
publisher MDPI AG
record_format Article
series Remote Sensing
spelling doaj-art-87880d63f33c4c8a94df9cdd80ffbb01; 2025-08-20T03:49:22Z; eng; MDPI AG; Remote Sensing; 2072-4292; 2025-04-01; 17; 9; 1587; 10.3390/rs17091587; A Stereo Disparity Map Refinement Method Without Training Based on Monocular Segmentation and Surface Normal; Haoxuan Sun (School of Remote Sensing and Information Engineering, Wuhan University, Wuhan 430079, China); Taoyang Wang (School of Remote Sensing and Information Engineering, Wuhan University, Wuhan 430079, China); https://www.mdpi.com/2072-4292/17/9/1587; disparity estimation; segment anything model; surface normal; stereo matching; RANSAC
title A Stereo Disparity Map Refinement Method Without Training Based on Monocular Segmentation and Surface Normal
topic disparity estimation
segment anything model
surface normal
stereo matching
RANSAC
url https://www.mdpi.com/2072-4292/17/9/1587