SDMA-Net: Swin Transformer-Based Dynamic Memory-Attention Network for Endoscopic Navigation
| Main Authors: | , , , |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | IEEE, 2025-01-01 |
| Series: | IEEE Access |
| Subjects: | |
| Online Access: | https://ieeexplore.ieee.org/document/11025809/ |
| Summary: | Accurate endoscopic motion navigation is crucial for minimally invasive surgical procedures. Nevertheless, endoscopic video data often exhibit low texture, variable lighting, and dynamic motion patterns, which pose significant challenges to existing methods. To address these issues, we propose a novel deep learning framework, the Swin Transformer-based Dynamic Memory-Attention Network (SDMA-Net). SDMA-Net integrates a Swin Transformer for multiscale feature extraction, a Dynamic Channel Attention (DCA) module for frequency-aware feature refinement, and a Channel-Level Masked AutoEncoder (CL-MAE) for self-supervised learning. Temporal dependencies are modeled using a Long Short-Term Memory (LSTM) network. Additionally, a Dynamic Memory Augmentation Module (DMAM) adaptively updates and retrieves motion patterns to enhance robustness against noise and occlusions. Experiments on a colonoscopy dataset of over 12,000 images demonstrate that SDMA-Net achieves higher classification accuracy and Area Under the Curve (AUC) than existing baselines. In conclusion, SDMA-Net provides an effective and efficient solution for endoscopic motion detection and classification. |
|---|---|
| ISSN: | 2169-3536 |
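
The summary says that the DMAM "adaptively updates and retrieves motion patterns" but this record gives no implementation detail. As a rough illustration of what an update-and-retrieve memory can look like in general, the sketch below uses a generic attention-style memory bank. It is not the paper's DMAM: the class `ToyMemory`, the cosine-similarity addressing, the softmax read, and the moving-average write with a `momentum` parameter are all assumptions chosen for illustration.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


class ToyMemory:
    """Hypothetical memory bank of feature vectors; NOT the DMAM from the paper."""

    def __init__(self, slots, momentum=0.9):
        self.slots = [list(s) for s in slots]  # one stored pattern per slot
        self.momentum = momentum               # assumed update rate

    def read(self, query):
        """Retrieve a softmax-weighted mix of slots, addressed by cosine similarity."""
        weights = softmax([cosine(query, s) for s in self.slots])
        out = [
            sum(w * s[i] for w, s in zip(weights, self.slots))
            for i in range(len(query))
        ]
        return out, weights

    def write(self, query):
        """Blend the query into its most similar slot (exponential moving average)."""
        sims = [cosine(query, s) for s in self.slots]
        k = sims.index(max(sims))
        m = self.momentum
        self.slots[k] = [m * old + (1 - m) * new
                         for old, new in zip(self.slots[k], query)]
        return k
```

In a pipeline like the one summarized above, the query would presumably be a per-frame feature vector and the retrieved vector would be combined with it before classification; how SDMA-Net actually does this is not stated in this record.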