PW-BALFC, a clinical dataset for detection and instance segmentation of bronchoalveolar lavage fluid cell

Bibliographic Details
Main Authors: Xin Shi, Qing Huang, Teng Xu, Hongwen Mei, Tingwei Quan, Xiuli Wang, Yinghan Shi, Ye Hu, Zhimei Duan, Fei Xie, Sifan Li, Lixin Xie, Kaifei Wang
Format: Article
Language: English
Published: Nature Portfolio 2025-07-01
Series: Scientific Data
Online Access: https://doi.org/10.1038/s41597-025-05452-4
Description
Summary: Bronchoalveolar lavage fluid (BALF) cytology provides an important basis for the diagnosis and treatment of lung diseases. Current cytological analysis of BALF relies on manual microscopic examination, which is time-consuming, laborious, and dependent on the examiner's experience. Automated identification of BALF cells can increase the accuracy and speed of both screening for qualified samples and subsequent cytomorphological analysis. However, public clinical BALF cell datasets for detecting different cell types are lacking, as are pixel-level annotations for cytomorphological analysis. In this work, high-resolution cell images were collected from clinical bronchoalveolar lavage samples obtained at the Chinese PLA General Hospital from 2018 to 2024, and high-quality pixel-level instance annotations were produced for seven cell types. In total, the dataset comprises 2,105 clinical images containing 13,263 cells from seven distinct classes, annotated with both fine contour labels and bounding box labels. A YOLOv8 instance segmentation network was trained and tested on the dataset. The results indicate that the dataset and the accompanying model are beneficial for the study of automated cell identification in BALF.
ISSN: 2052-4463
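
The summary mentions two kinds of annotation per cell, a fine contour and a bounding box. As an illustration only, the sketch below shows how a bounding box can be derived from a contour, and how a contour could be written as one label line in the normalised polygon format commonly used for YOLO-style segmentation training. The record does not state the dataset's actual file format, so the function names, the class index, the image size, and the sample contour are all hypothetical.

```python
from typing import List, Tuple

Point = Tuple[float, float]  # (x, y) in pixel coordinates


def contour_to_bbox(contour: List[Point]) -> Tuple[float, float, float, float]:
    """Return the tight axis-aligned box (x_min, y_min, x_max, y_max) around a contour."""
    xs = [x for x, _ in contour]
    ys = [y for _, y in contour]
    return min(xs), min(ys), max(xs), max(ys)


def contour_to_yolo_seg_line(
    class_id: int, contour: List[Point], width: int, height: int
) -> str:
    """Format one cell instance as 'class x1 y1 x2 y2 ...' with coordinates scaled to [0, 1]."""
    coords = []
    for x, y in contour:
        coords.append(f"{x / width:.6f}")
        coords.append(f"{y / height:.6f}")
    return " ".join([str(class_id)] + coords)


# Hypothetical contour for a single cell in a 1000 x 500 pixel image.
cell_contour = [(100.0, 50.0), (140.0, 60.0), (150.0, 100.0), (110.0, 120.0), (90.0, 80.0)]

bbox = contour_to_bbox(cell_contour)
label_line = contour_to_yolo_seg_line(0, cell_contour, width=1000, height=500)

print(bbox)
print(label_line)
```

Deriving the box from the contour keeps the two annotation types consistent with each other; whether the dataset's boxes were produced this way or labeled independently is not stated in the record.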