Improving BI-RADS Mammographic Classification With Self-Supervised Vision Transformers and Cascade Learning

Bibliographic Details
Main Authors: Abdelrahman Abdallah, Mahmoud Salaheldin Kasem, Ibrahim Abdelhalim, Norah Saleh Alghamdi, Ayman El-Baz
Format: Article
Language: English
Published: IEEE 2025-01-01
Series: IEEE Access
Online Access: https://ieeexplore.ieee.org/document/11045361/
Description
Summary: Accurate and early breast cancer detection is critical for improving patient outcomes. In this study, we propose PatchCascade-ViT, a novel self-supervised Vision Transformer (ViT) framework for automated BI-RADS classification of mammographic images. Unlike conventional deep learning approaches that rely heavily on annotated datasets, PatchCascade-ViT leverages Self Patch-level Supervision (SPS) to learn meaningful mammographic representations from unlabeled data, significantly enhancing classification performance. Our framework operates through a two-stage cascade classification process. In the first stage, the model differentiates non-cancerous from potentially cancerous mammograms using SelfPatch, an innovative self-supervised learning task that enhances patch-level feature learning by enforcing consistency among spatially correlated patches. The second stage refines the classification by distinguishing Scattered Fibroglandular from Heterogeneously and Extremely Dense breast tissue categories, enabling more precise breast cancer risk assessment. To validate the effectiveness of PatchCascade-ViT, we conducted extensive evaluations on a dataset of 4,368 mammograms across three BI-RADS classes. Our method achieved a system sensitivity of 85.01% and an F1-score of 84.90%, outperforming existing deep learning-based approaches. By integrating self-supervised learning with a cascade vision transformer architecture, PatchCascade-ViT reduces reliance on annotated datasets while maintaining high classification accuracy. These findings demonstrate its potential for enhancing breast cancer screening, aiding radiologists in early detection, and improving clinical decision-making.
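The two-stage cascade described in the summary can be sketched as a simple routing function: stage one screens non-cancerous mammograms out, and only potentially cancerous cases reach the stage-two density classifier. This is a minimal illustrative sketch, assuming each stage exposes a plain callable interface; the stand-in models, threshold, and function names here are hypothetical and not taken from the paper's code (in the paper, each stage is a self-supervised ViT).

```python
def cascade_classify(image, stage1, stage2, threshold=0.5):
    """Route a mammogram through a two-stage cascade (illustrative sketch).

    stage1(image) -> probability the mammogram is potentially cancerous
    stage2(image) -> density-based label for potentially cancerous cases
    The 0.5 threshold is an assumption, not a value from the paper.
    """
    # Stage 1: separate non-cancerous from potentially cancerous mammograms.
    if stage1(image) < threshold:
        return "Non-cancerous"
    # Stage 2: refine potentially cancerous cases by breast-density category.
    return stage2(image)


# Toy stand-in "models" showing only the control flow, not trained networks.
def stage1(img):
    return 0.9 if img["suspicious"] else 0.1

def stage2(img):
    if img["dense"]:
        return "Heterogeneously/Extremely Dense"
    return "Scattered Fibroglandular"


print(cascade_classify({"suspicious": False, "dense": False}, stage1, stage2))
print(cascade_classify({"suspicious": True, "dense": True}, stage1, stage2))
```

A practical benefit of this structure, as the summary notes, is that each stage solves a simpler decision than a single three-way classifier would, and easy negatives never invoke the second model.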
ISSN: 2169-3536