Cross-Modal Feature Fusion for Field Weed Mapping Using RGB and Near-Infrared Imagery
Main Authors: Xijian Fan, Chunlei Ge, Xubing Yang, Weice Wang
Format: Article
Language: English
Published: MDPI AG, 2024-12-01
Series: Agriculture
Subjects: weed mapping; multi-modal fusion; self-attention; semantic segmentation
Online Access: https://www.mdpi.com/2077-0472/14/12/2331
author | Xijian Fan; Chunlei Ge; Xubing Yang; Weice Wang |
description | The accurate mapping of weeds in agricultural fields is essential for effective weed control and enhanced crop productivity. Moving beyond the limitations of RGB imagery alone, this study presents a cross-modal feature fusion network (CMFNet) designed for precise weed mapping by integrating RGB and near-infrared (NIR) imagery. CMFNet first applies color space enhancement and adaptive histogram equalization to improve brightness and contrast in both the RGB and NIR images. Building on a Transformer-based segmentation framework, a cross-modal multi-scale feature enhancement module is then introduced, featuring spatial and channel feature interaction to automatically capture complementary information across the two modalities. The enhanced features are further fused and refined through an attention mechanism, which reduces background interference and improves segmentation accuracy. Extensive experiments on two public datasets, Sugar Beets 2016 and Sunflower, demonstrate that CMFNet significantly outperforms CNN-based segmentation models on weed and crop segmentation, achieving Intersection over Union (IoU) scores of 90.86% and 90.77% and mean accuracies (mAcc) of 93.8% and 94.35%, respectively. Ablation studies further confirm that the proposed cross-modal fusion provides substantial improvements over basic feature fusion, effectively localizing weed and crop regions across diverse field conditions. These findings underscore CMFNet's potential as a robust solution for precise and adaptive weed mapping in complex agricultural landscapes. |
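The preprocessing step named in the abstract is adaptive histogram equalization. The record does not specify the exact variant, so as a rough illustration only, here is the simpler *global* histogram equalization for one uint8 channel in NumPy (the adaptive version would apply the same CDF remapping per image tile):

```python
import numpy as np

def equalize_histogram(img: np.ndarray) -> np.ndarray:
    """Global histogram equalization for a single-channel uint8 image.

    Illustrative sketch only: the paper uses an adaptive (tile-based)
    variant; this shows the underlying CDF remapping.
    """
    hist = np.bincount(img.ravel(), minlength=256)
    cdf = hist.cumsum()
    cdf_min = cdf[cdf > 0].min()       # first nonzero cumulative count
    total = img.size
    if total == cdf_min:               # constant image: nothing to stretch
        return img.copy()
    # Standard equalization formula: remap each gray level through the CDF.
    lut = np.round((cdf - cdf_min) * 255.0 / (total - cdf_min))
    return lut.clip(0, 255).astype(np.uint8)[img]
```

Applied per channel, this stretches a low-contrast image so its intensities span the full 0 to 255 range, which is the brightness/contrast improvement the abstract describes.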
issn | 2077-0472 |
affiliations | Xijian Fan, Chunlei Ge, Xubing Yang: College of Information Science and Technology & Artificial Intelligence, Nanjing Forestry University, Nanjing 210037, China; Weice Wang: Fujian Key Laboratory of Spatial Information Perception and Intelligent Processing, Yango University, Fuzhou 350015, China |
doi | 10.3390/agriculture14122331 |
citation | Agriculture, vol. 14, no. 12, art. 2331 (2024-12-01) |
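The abstract describes the cross-modal module only at a high level: spatial and channel feature interaction plus an attention mechanism. The following is a minimal, hypothetical channel-gated fusion of RGB and NIR feature maps, not the paper's actual module, sketched in NumPy to show how per-channel gates can weight the two modalities before summing:

```python
import numpy as np

def sigmoid(x: np.ndarray) -> np.ndarray:
    return 1.0 / (1.0 + np.exp(-x))

def channel_attention_fusion(feat_rgb: np.ndarray, feat_nir: np.ndarray) -> np.ndarray:
    """Fuse two (C, H, W) feature maps with simple per-channel gates.

    Hypothetical sketch: global-average-pool each modality into a
    per-channel descriptor, squash it into a (0, 1) gate, and take
    the gated sum of the two modalities.
    """
    assert feat_rgb.shape == feat_nir.shape
    c = feat_rgb.shape[0]
    # Per-channel descriptors via global average pooling.
    d_rgb = feat_rgb.reshape(c, -1).mean(axis=1)
    d_nir = feat_nir.reshape(c, -1).mean(axis=1)
    # Broadcast each gate over its modality's spatial dimensions.
    g_rgb = sigmoid(d_rgb)[:, None, None]
    g_nir = sigmoid(d_nir)[:, None, None]
    return g_rgb * feat_rgb + g_nir * feat_nir
```

A learned version would replace the fixed sigmoid gating with trainable projections, and the paper additionally interleaves spatial interaction at multiple scales, which this sketch omits.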
topic | weed mapping; multi-modal fusion; self-attention; semantic segmentation |