Validation of LLM-Generated Object Co-Occurrence Information for Understanding Three-Dimensional Scenes

Bibliographic Details
Main Authors:	Kenta Gunji, Kazunori Ohno, Shuhei Kurita, Ken Sakurada, Ranulfo Bezerra, Shotaro Kojima, Yoshito Okada, Masashi Konyo, Satoshi Tadokoro
Format:	Article
Language:	English
Published:	IEEE 2024-01-01
Series:	IEEE Access
Subjects:	Semantic scene understanding, large language models, co-occurrence validation, prompt engineering
Online Access:	https://ieeexplore.ieee.org/document/10786984/

author	Kenta Gunji, Kazunori Ohno, Shuhei Kurita, Ken Sakurada, Ranulfo Bezerra, Shotaro Kojima, Yoshito Okada, Masashi Konyo, Satoshi Tadokoro
author_facet	Kenta Gunji, Kazunori Ohno, Shuhei Kurita, Ken Sakurada, Ranulfo Bezerra, Shotaro Kojima, Yoshito Okada, Masashi Konyo, Satoshi Tadokoro
author_sort	Kenta Gunji
collection	DOAJ
description	This study examines whether object co-occurrence information generated by a large language model (LLM) can enhance a robot’s spatial understanding of objects in the real world. Co-occurrence information is crucial in enabling robots to perceive and navigate their surroundings. An LLM can generate object co-occurrence information from the representations it acquires during training. However, whether co-occurrence gleaned from linguistic data effectively translates to real-world object relationships has yet to be thoroughly examined. Given category information about a specific situation, this paper compares the co-occurrence degree (co-occurrence coefficient) output by gpt-4-turbo-2024-04-09 (GPT-4) against the object pair data from the ScanNet v2 dataset. GPT-4 achieved a high recall of 0.78 across various situation categories annotated by ScanNet v2, although its precision was relatively low, averaging 0.29. The root mean square error of the co-occurrence coefficient was 0.31. While GPT-4 tends to output slightly higher co-occurrence coefficients, it effectively captures the overall co-occurrence patterns observed in the ScanNet v2 dataset. GPT-4 also produced co-occurrence information for more objects than are available in ScanNet v2 while still covering co-occurrences among the objects within ScanNet v2. These results suggest that integrating co-occurrence data from different sources could enhance the ability to recognize real-world objects and potentially strengthen robot intelligence.
format	Article
id	doaj-art-f6acd24f2ee54513a1f4ba798f767b6b
institution	Kabale University
issn	2169-3536
language	English
publishDate	2024-01-01
publisher	IEEE
record_format	Article
series	IEEE Access
spelling	doaj-art-f6acd24f2ee54513a1f4ba798f767b6b; 2024-12-18T00:01:50Z; eng; IEEE; IEEE Access; ISSN 2169-3536; 2024-01-01; vol. 12, pp. 186573-186585; DOI 10.1109/ACCESS.2024.3514473; 10786984
	Validation of LLM-Generated Object Co-Occurrence Information for Understanding Three-Dimensional Scenes
	Kenta Gunji (https://orcid.org/0000-0002-7924-2437), Graduate School of Information Sciences, Tohoku University, Sendai, Japan
	Kazunori Ohno (https://orcid.org/0000-0003-3958-2901), Graduate School of Information Sciences, Tohoku University, Sendai, Japan
	Shuhei Kurita (https://orcid.org/0000-0001-7415-3120), National Institute of Informatics, Tokyo, Japan
	Ken Sakurada (https://orcid.org/0000-0003-3386-1547), Graduate School of Informatics, Kyoto University, Kyoto, Japan
	Ranulfo Bezerra (https://orcid.org/0000-0001-6017-5724), Graduate School of Information Sciences, Tohoku University, Sendai, Japan
	Shotaro Kojima (https://orcid.org/0000-0003-0042-1764), Graduate School of Information Sciences, Tohoku University, Sendai, Japan
	Yoshito Okada (https://orcid.org/0000-0003-3830-079X), Graduate School of Information Sciences, Tohoku University, Sendai, Japan
	Masashi Konyo (https://orcid.org/0000-0002-6826-9722), Graduate School of Information Sciences, Tohoku University, Sendai, Japan
	Satoshi Tadokoro (https://orcid.org/0000-0002-5571-4276), Graduate School of Information Sciences, Tohoku University, Sendai, Japan
	https://ieeexplore.ieee.org/document/10786984/
	Semantic scene understanding, large language models, co-occurrence validation, prompt engineering
title	Validation of LLM-Generated Object Co-Occurrence Information for Understanding Three-Dimensional Scenes
title_sort	validation of llm generated object co occurrence information for understanding three dimensional scenes
topic	Semantic scene understanding, large language models, co-occurrence validation, prompt engineering
url	https://ieeexplore.ieee.org/document/10786984/

Similar Items