Retail commodity detection method based on location learnable visual center mechanism

Bibliographic Details
Main Authors: Xiaohua LYU, Mingchen WEI, Libo LIU
Format: Article
Language: zho
Published: China InfoCom Media Group 2023-12-01
Series: 物联网学报 (Chinese Journal on Internet of Things)
Subjects:
Online Access: http://www.wlwxb.com.cn/zh/article/doi/10.11959/j.issn.2096-3750.2023.00366/
collection DOAJ
description To address the low detection accuracy caused by the difficulty of capturing salient and diverse feature information for products with deformed or overlapping packaging, a location learnable visual center (LLVC) mechanism was designed to improve YOLOX-s. First, global context information was captured by a lightweight multi-layer perceptron to help the model better understand the spatial information in product features. Second, the designed LLVC enhanced local feature representation and used this spatial information to assign learnable weights to local features, increasing the attention paid to discriminative local features. Finally, the intersection over union (IoU) loss function was replaced with complete intersection over union (CIoU), and power parameters were introduced on this basis to reduce the missed detection rate. Experimental results show that the proposed method achieves an accuracy of 91.3% on the retail product checkout (RPC) dataset, which is 2.2% higher than YOLOX-s and better than current mainstream lightweight object detection algorithms. The method runs at 97 frames per second (FPS) with a model size of 9.48 MB, so it can detect retail products accurately and in real time in scenarios where computing resources are limited.
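The loss-function step can be illustrated with a minimal sketch. The record does not give the formula, so the code below assumes one common way of adding a power parameter to CIoU: raising the IoU term, the center-distance term, and the aspect-ratio term to a power `alpha`. The function name `alpha_ciou_loss`, the box layout `(x1, y1, x2, y2)`, and the default `alpha=3.0` are illustrative assumptions, not details taken from the article.

```python
import math


def alpha_ciou_loss(box_pred, box_gt, alpha=3.0, eps=1e-9):
    """CIoU loss with a power parameter (assumed form, not the paper's exact one).

    Boxes are (x1, y1, x2, y2) with x2 > x1 and y2 > y1.
    With alpha = 1 this reduces to the standard CIoU loss.
    """
    px1, py1, px2, py2 = box_pred
    gx1, gy1, gx2, gy2 = box_gt

    # Intersection over union of the two boxes.
    inter_w = max(0.0, min(px2, gx2) - max(px1, gx1))
    inter_h = max(0.0, min(py2, gy2) - max(py1, gy1))
    inter = inter_w * inter_h
    area_p = (px2 - px1) * (py2 - py1)
    area_g = (gx2 - gx1) * (gy2 - gy1)
    iou = inter / (area_p + area_g - inter + eps)

    # Squared diagonal of the smallest box enclosing both boxes.
    enclose_w = max(px2, gx2) - min(px1, gx1)
    enclose_h = max(py2, gy2) - min(py1, gy1)
    diag_sq = enclose_w ** 2 + enclose_h ** 2 + eps

    # Squared distance between the two box centers.
    center_sq = ((px1 + px2) - (gx1 + gx2)) ** 2 / 4.0 \
        + ((py1 + py2) - (gy1 + gy2)) ** 2 / 4.0

    # Aspect-ratio consistency term v and its trade-off weight beta.
    v = (4.0 / math.pi ** 2) * (
        math.atan((gx2 - gx1) / (gy2 - gy1 + eps))
        - math.atan((px2 - px1) / (py2 - py1 + eps))
    ) ** 2
    beta = v / (1.0 - iou + v + eps)

    # Each term is raised to the power alpha (the assumed power parameter).
    return 1.0 - iou ** alpha + (center_sq / diag_sq) ** alpha + (beta * v) ** alpha


if __name__ == "__main__":
    same = alpha_ciou_loss((0, 0, 10, 10), (0, 0, 10, 10))
    apart = alpha_ciou_loss((0, 0, 10, 10), (20, 20, 30, 30))
    print(f"identical boxes: {same:.6f}, disjoint boxes: {apart:.6f}")
```

In this assumed form, a power greater than 1 increases the relative weight of high-IoU boxes in the loss and its gradient, which is one plausible reading of how the power parameters could help with overlapping products.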
id doaj-art-fcb8edf500b14afca7597588b9af605f
institution Kabale University
issn 2096-3750
topic retail commodity detection
YOLOX-s
central learning mechanism
loss function
lightweight