An Effective 3D Instance Map Reconstruction Method Based on RGBD Images for Indoor Scene

Bibliographic Details
Main Authors: Heng Wu, Yanjie Liu, Chao Wang, Yanlong Wei
Format: Article
Language: English
Published: MDPI AG 2025-01-01
Series: Remote Sensing
Online Access: https://www.mdpi.com/2072-4292/17/1/139
Description
Summary: To enhance the intelligence of robots, constructing accurate object-level instance maps is essential. However, the diversity and clutter of objects in indoor scenes present significant challenges for instance map construction. To tackle this issue, we propose a method for constructing object-level instance maps based on RGBD images. First, we use the visual odometry system ORB-SLAM3 to estimate the poses of image frames and extract keyframes. Next, we perform semantic segmentation on the color images and geometric segmentation on the depth images of these keyframes, using the semantic segmentation to refine the geometric segmentation results and to correct inaccurate target segmentation caused by small depth variations. The segmented depth images are then projected into point cloud segments, which are assigned the corresponding semantic information. We integrate these point cloud segments into a global voxel map, updating each voxel’s class using color, distance constraints, and Bayesian methods to create an object-level instance map. Finally, we construct an ellipsoid scene from this map to test the robot’s localization capabilities in indoor environments using semantic information. Our experiments demonstrate that this method reconstructs the environment accurately and robustly, facilitating precise object-level scene segmentation. Furthermore, in comparison with manually labeled ellipsoidal maps, ellipsoidal maps generated from the extracted objects enable accurate global localization.
ISSN: 2072-4292
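
Illustrative sketch: two of the steps named in the summary, projecting a segmented depth image into a point cloud segment and updating a voxel's class with a Bayesian rule, might look roughly like the Python below. This is not the authors' implementation. The function names (`backproject`, `bayes_update`), the pinhole intrinsics (`fx`, `fy`, `cx`, `cy`), and the per-class likelihood vector are assumptions made for illustration, and the color and distance constraints mentioned in the summary are omitted.

```python
import numpy as np

def backproject(depth, mask, fx, fy, cx, cy):
    """Back-project the masked pixels of a depth image into 3D points.

    Assumes a pinhole camera model and depth in metres (hypothetical
    conventions, not taken from the article). Returns an (N, 3) array
    of points in the camera frame.
    """
    # Keep only pixels inside the segment that have a valid depth reading.
    v, u = np.nonzero(mask & (depth > 0))
    z = depth[v, u]
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    return np.stack([x, y, z], axis=1)

def bayes_update(prior, likelihood):
    """One recursive Bayesian update of a voxel's class distribution.

    prior      : current class probabilities stored in the voxel
    likelihood : per-class likelihood of the new observation (assumed to
                 come from the semantic segmentation of a keyframe)
    """
    posterior = prior * likelihood
    return posterior / posterior.sum()

# Toy depth image: a flat surface 2 m away, with a 2x2 segment.
depth = np.full((4, 4), 2.0)
mask = np.zeros((4, 4), dtype=bool)
mask[1:3, 1:3] = True
points = backproject(depth, mask, fx=2.0, fy=2.0, cx=1.5, cy=1.5)

# A voxel starts with a uniform belief over three classes and is
# observed twice with the same likelihood vector.
belief = np.full(3, 1.0 / 3.0)
observation = np.array([0.7, 0.2, 0.1])
for _ in range(2):
    belief = bayes_update(belief, observation)

print(points.shape, int(np.argmax(belief)))
```

In this toy run the repeated observations concentrate the voxel's belief on the first class, which is the behaviour a multi-keyframe fusion step relies on to suppress single-frame segmentation errors.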