A deep multiple instance learning framework improves microsatellite instability detection from tumor next generation sequencing

Bibliographic Details
Main Authors: John Ziegler, Jaclyn F. Hechtman, Satshil Rana, Ryan N. Ptashkin, Gowtham Jayakumaran, Sumit Middha, Shweta S. Chavan, Chad Vanderbilt, Deborah DeLair, Jacklyn Casanova, Jinru Shia, Nicole DeGroat, Ryma Benayed, Marc Ladanyi, Michael F. Berger, Thomas J. Fuchs, A. Rose Brannon, Ahmet Zehir
Format: Article
Language: English
Published: Nature Portfolio 2025-01-01
Series: Nature Communications
Online Access: https://doi.org/10.1038/s41467-024-54970-z
Description
Summary: Microsatellite instability (MSI) is a critical phenotype of cancer genomes and an FDA-recognized biomarker that can guide treatment with immune checkpoint inhibitors. Previous work has demonstrated that next-generation sequencing data can be used to identify samples with an MSI-high phenotype. However, low tumor purity, as frequently observed in routine clinical samples, poses a challenge to the sensitivity of existing algorithms. To overcome this critical issue, we developed MiMSI, an MSI classifier based on deep neural networks and trained using a dataset that included low tumor purity MSI cases in a multiple instance learning framework. On a challenging yet representative set of cases, MiMSI showed higher sensitivity (0.895) and auROC (0.971) than MSISensor (sensitivity: 0.67; auROC: 0.907), an open-source software tool previously validated for clinical use at our institution using MSK-IMPACT large panel targeted NGS data. In a separate, prospective cohort, MiMSI was confirmed to outperform MSISensor in low purity cases (P = 8.244e-07).
ISSN: 2041-1723
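
Note on the metrics in the summary: sensitivity is the fraction of truly MSI-high samples that a classifier calls positive, and auROC is the probability that a randomly chosen positive sample receives a higher score than a randomly chosen negative one. The following is a minimal sketch of both definitions in Python, not the authors' evaluation code. The function names, the 0.5 threshold, and the toy labels and scores are all assumptions made for illustration and are not taken from the article.

```python
def sensitivity(labels, scores, threshold=0.5):
    """Fraction of positive samples (label 1) whose score is >= threshold.

    The threshold of 0.5 is an illustrative assumption.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    if not positives:
        raise ValueError("sensitivity is undefined without positive samples")
    return sum(s >= threshold for s in positives) / len(positives)


def auroc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    Counts the share of (positive, negative) pairs in which the positive
    sample scores higher; ties contribute one half.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("auROC needs at least one positive and one negative")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Toy, made-up data: 1 = MSI-high, 0 = microsatellite stable.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.6, 0.3, 0.4, 0.2, 0.1]

print(f"sensitivity = {sensitivity(labels, scores):.3f}")
print(f"auROC       = {auroc(labels, scores):.3f}")
```

The pairwise form of auROC is quadratic in the number of samples, which is fine for a small illustration; a rank-based computation would be the usual choice for large cohorts.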