InCrowd-VI: A Realistic Visual–Inertial Dataset for Evaluating Simultaneous Localization and Mapping in Indoor Pedestrian-Rich Spaces for Human Navigation

Bibliographic Details
Main Authors: Marziyeh Bamdad, Hans-Peter Hutter, Alireza Darvishy
Format: Article
Language: English
Published: MDPI AG 2024-12-01
Series: Sensors
Subjects:
Online Access: https://www.mdpi.com/1424-8220/24/24/8164
collection DOAJ
description Simultaneous localization and mapping (SLAM) techniques can be used to navigate the visually impaired, but the development of robust SLAM solutions for crowded spaces is limited by the lack of realistic datasets. To address this, we introduce InCrowd-VI, a novel visual–inertial dataset specifically designed for human navigation in indoor pedestrian-rich environments. Recorded using Meta Aria Project glasses, it captures realistic scenarios without environmental control. InCrowd-VI features 58 sequences totaling 5 km of trajectory and 1.5 h of recording time, including RGB images, stereo images, and IMU measurements. The dataset captures important challenges such as pedestrian occlusions, varying crowd densities, complex layouts, and lighting changes. Ground-truth trajectories, accurate to approximately 2 cm, are provided in the dataset, originating from the Meta Aria Project machine perception SLAM service. In addition, a semi-dense 3D point cloud of scenes is provided for each sequence. The evaluation of state-of-the-art visual odometry (VO) and SLAM algorithms on InCrowd-VI revealed severe performance limitations in these realistic scenarios. Under challenging conditions, the systems' errors exceeded the required localization accuracy of 0.5 m and the 1% drift threshold, with classical methods showing drift of up to 5–10%. While deep learning-based approaches maintained high pose estimation coverage (>90%), they failed to achieve the real-time processing speeds necessary for walking-pace navigation. These results demonstrate the need for, and value of, a new dataset to advance SLAM research for visually impaired navigation in complex indoor environments.
id doaj-art-23dc72e54d5a43cc8fa83bf6787ddbe0
institution Kabale University
issn 1424-8220
spelling doaj-art-23dc72e54d5a43cc8fa83bf6787ddbe0; 2024-12-27T14:53:11Z; eng; MDPI AG; Sensors; ISSN 1424-8220; 2024-12-01; vol. 24, no. 24, art. 8164; DOI 10.3390/s24248164
Marziyeh Bamdad, Hans-Peter Hutter, Alireza Darvishy (all: Institute of Computer Science, Zurich University of Applied Sciences, 8400 Winterthur, Switzerland)
topic visual SLAM
blind and visually impaired navigation
crowded indoor environments
dataset