Querying 3D point clouds exploiting open-vocabulary semantic segmentation of images


Bibliographic Details
Main Authors: A. Alami, F. Remondino
Format: Article
Language:English
Published: Copernicus Publications 2024-12-01
Series:The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences
Online Access:https://isprs-archives.copernicus.org/articles/XLVIII-2-W8-2024/1/2024/isprs-archives-XLVIII-2-W8-2024-1-2024.pdf
Description
Summary:While deep models have advanced 3D data analysis and demonstrated impressive results, they often struggle to generalize to new classes absent from the training dataset. Open-vocabulary and zero-shot models have recently addressed this problem; however, they still rely on some data for training and fine-tuning on specific tasks, which limits their use in real-world applications. In this research, we propose an open-vocabulary method for point cloud segmentation that requires no training data beyond the images and point cloud of the surveyed scene. By leveraging the capabilities of 2D open-vocabulary models and geometric features from the 3D data, combined with an XGBoost-guided region-growing algorithm, our approach segments the queried objects directly in 3D scenes. We evaluate our method on 3D benchmark datasets such as Replica and ScanNet, demonstrating its practicality and scalability to real-world scenarios with limited data.
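The abstract describes lifting 2D open-vocabulary segmentation masks onto the 3D point cloud before the region-growing step. As a minimal sketch of that projection stage, assuming a pinhole camera model with known pose (rotation `R`, translation `t`) and intrinsics `K` (function names are illustrative, not from the paper; the open-vocabulary mask is taken as a given boolean image):

```python
import numpy as np

def project_points(points, R, t, K):
    """Project Nx3 world points into the image plane of a pinhole camera."""
    cam = points @ R.T + t            # world -> camera coordinates
    z = cam[:, 2]                     # depth along the optical axis
    uv = cam @ K.T                    # homogeneous pixel coordinates
    uv = uv[:, :2] / uv[:, 2:3]       # perspective division
    return uv, z

def label_points_from_mask(points, R, t, K, mask):
    """Mark 3D points whose projection lands inside a 2D segmentation mask.

    `mask` is a boolean HxW image, e.g. the output of a 2D open-vocabulary
    segmenter for the queried class.
    """
    uv, z = project_points(points, R, t, K)
    h, w = mask.shape
    u = np.round(uv[:, 0]).astype(int)
    v = np.round(uv[:, 1]).astype(int)
    # Keep only points in front of the camera and inside the image bounds.
    visible = (z > 0) & (u >= 0) & (u < w) & (v >= 0) & (v < h)
    labels = np.zeros(len(points), dtype=bool)
    idx = np.where(visible)[0]
    labels[idx] = mask[v[idx], u[idx]]
    return labels
```

In a multi-view survey, these per-view labels would then be aggregated across images (e.g. by voting) and refined with the geometric features and region growing described above; occlusion handling (a depth test against a rendered depth map) is omitted here for brevity.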
ISSN:1682-1750
ISSN:2194-9034