GLClick: Interactive Segmentation Combining Global and Local Features

Convolutional neural networks (CNNs) are the backbone of most modern interactive segmentation algorithms. However, the limited receptive field of CNNs restricts their ability to capture long-range semantic relationships. Recently, transformers have gained significant attention for their capacity to...

Bibliographic Details
Main Authors:	Jiaying Tang, Hongyuan Wang, Zongyuan Ding, Zihao Xin
Format:	Article
Language:	English
Published:	MDPI AG 2024-12-01
Series:	Applied Sciences
Subjects:	ResNet50; Transformer; global–local feature fusion module; interactive segmentation; MLP
Online Access:	https://www.mdpi.com/2076-3417/15/1/186

collection	DOAJ
description	Convolutional neural networks (CNNs) are the backbone of most modern interactive segmentation algorithms. However, the limited receptive field of CNNs restricts their ability to capture long-range semantic relationships. Recently, Transformers have gained significant attention for their capacity to capture long-range dependencies. Nevertheless, CNNs still outperform Transformers at extracting local information. An effective interactive segmentation algorithm should accurately capture fine-grained local details alongside global semantic relationships. Therefore, we propose GLClick, a global–local click-based interactive image segmentation model that integrates local and global information through a novel fusion mechanism. We design an efficient global–local feature fusion module (GLFM) that integrates fine-grained features from various layers of ResNet50 with those from the Transformer feature pyramid. This approach maintains ResNet50’s ability to extract local features while leveraging the Transformer to capture global context. Additionally, we enhance the multi-layer perceptron (MLP) to improve performance. Extensive experiments on diverse benchmark datasets demonstrate significant improvements in interactive image segmentation, confirming the effectiveness of our approach. Moreover, we conduct experiments on medical image datasets, further illustrating the model’s versatility and effectiveness across different domains.
id	doaj-art-714681e2554b46c493f0f92db97187f7
institution	Kabale University
issn	2076-3417
timestamp	2025-01-10T13:14:44Z
volume	15
issue	1
article_number	186
doi	10.3390/app15010186
affiliation	Jiaying Tang, Hongyuan Wang, Zongyuan Ding: School of Computer and Artificial Intelligence, Changzhou University, No. 1, Gehu Road, Changzhou 213164, China
affiliation	Zihao Xin: School of Computer Science and Technology, Nanjing University of Aeronautics and Astronautics, No. 29, Jiangjun Road, Nanjing 211106, China

Similar Items