InCrowd-VI: A Realistic Visual–Inertial Dataset for Evaluating Simultaneous Localization and Mapping in Indoor Pedestrian-Rich Spaces for Human Navigation
Simultaneous localization and mapping (SLAM) techniques can be used to navigate the visually impaired, but the development of robust SLAM solutions for crowded spaces is limited by the lack of realistic datasets. To address this, we introduce InCrowd-VI, a novel visual–inertial dataset specifically...
| Main Authors: | Marziyeh Bamdad, Hans-Peter Hutter, Alireza Darvishy |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | MDPI AG, 2024-12-01 |
| Series: | Sensors |
| Subjects: | visual SLAM; blind and visually impaired navigation; crowded indoor environments; dataset |
| Online Access: | https://www.mdpi.com/1424-8220/24/24/8164 |
| _version_ | 1846102694341640192 |
|---|---|
| author | Marziyeh Bamdad; Hans-Peter Hutter; Alireza Darvishy |
| author_sort | Marziyeh Bamdad |
| collection | DOAJ |
| description | Simultaneous localization and mapping (SLAM) techniques can be used to navigate the visually impaired, but the development of robust SLAM solutions for crowded spaces is limited by the lack of realistic datasets. To address this, we introduce InCrowd-VI, a novel visual–inertial dataset specifically designed for human navigation in indoor pedestrian-rich environments. Recorded using Meta Aria Project glasses, it captures realistic scenarios without environmental control. InCrowd-VI features 58 sequences totaling a 5 km trajectory length and 1.5 h of recording time, including RGB, stereo images, and IMU measurements. The dataset captures important challenges such as pedestrian occlusions, varying crowd densities, complex layouts, and lighting changes. Ground-truth trajectories, accurate to approximately 2 cm, are provided in the dataset, originating from the Meta Aria project machine perception SLAM service. In addition, a semi-dense 3D point cloud of scenes is provided for each sequence. The evaluation of state-of-the-art visual odometry (VO) and SLAM algorithms on InCrowd-VI revealed severe performance limitations in these realistic scenarios. Under challenging conditions, systems exceeded the required localization accuracy of 0.5 m and the 1% drift threshold, with classical methods showing drift up to 5–10%. While deep learning-based approaches maintained high pose estimation coverage (>90%), they failed to achieve real-time processing speeds necessary for walking pace navigation. These results demonstrate the need and value of a new dataset to advance SLAM research for visually impaired navigation in complex indoor environments. |
| format | Article |
| id | doaj-art-23dc72e54d5a43cc8fa83bf6787ddbe0 |
| institution | Kabale University |
| issn | 1424-8220 |
| language | English |
| publishDate | 2024-12-01 |
| publisher | MDPI AG |
| record_format | Article |
| series | Sensors |
| spelling | doaj-art-23dc72e54d5a43cc8fa83bf6787ddbe0; 2024-12-27T14:53:11Z; eng; MDPI AG; Sensors; 1424-8220; 2024-12-01; 24; 24; 8164; 10.3390/s24248164; InCrowd-VI: A Realistic Visual–Inertial Dataset for Evaluating Simultaneous Localization and Mapping in Indoor Pedestrian-Rich Spaces for Human Navigation; Marziyeh Bamdad, Hans-Peter Hutter, Alireza Darvishy (all: Institute of Computer Science, Zurich University of Applied Sciences, 8400 Winterthur, Switzerland); Simultaneous localization and mapping (SLAM) techniques can be used to navigate the visually impaired, but the development of robust SLAM solutions for crowded spaces is limited by the lack of realistic datasets. To address this, we introduce InCrowd-VI, a novel visual–inertial dataset specifically designed for human navigation in indoor pedestrian-rich environments. Recorded using Meta Aria Project glasses, it captures realistic scenarios without environmental control. InCrowd-VI features 58 sequences totaling a 5 km trajectory length and 1.5 h of recording time, including RGB, stereo images, and IMU measurements. The dataset captures important challenges such as pedestrian occlusions, varying crowd densities, complex layouts, and lighting changes. Ground-truth trajectories, accurate to approximately 2 cm, are provided in the dataset, originating from the Meta Aria project machine perception SLAM service. In addition, a semi-dense 3D point cloud of scenes is provided for each sequence. The evaluation of state-of-the-art visual odometry (VO) and SLAM algorithms on InCrowd-VI revealed severe performance limitations in these realistic scenarios. Under challenging conditions, systems exceeded the required localization accuracy of 0.5 m and the 1% drift threshold, with classical methods showing drift up to 5–10%. While deep learning-based approaches maintained high pose estimation coverage (>90%), they failed to achieve real-time processing speeds necessary for walking pace navigation. These results demonstrate the need and value of a new dataset to advance SLAM research for visually impaired navigation in complex indoor environments.; https://www.mdpi.com/1424-8220/24/24/8164; visual SLAM; blind and visually impaired navigation; crowded indoor environments; dataset |
| title | InCrowd-VI: A Realistic Visual–Inertial Dataset for Evaluating Simultaneous Localization and Mapping in Indoor Pedestrian-Rich Spaces for Human Navigation |
| topic | visual SLAM; blind and visually impaired navigation; crowded indoor environments; dataset |
| url | https://www.mdpi.com/1424-8220/24/24/8164 |