Small variant benchmark from a complete assembly of X and Y chromosomes

Abstract The sex chromosomes contain complex, important genes impacting medical phenotypes, but differ from the autosomes in their ploidy and large repetitive regions. To enable technology developers along with research and clinical laboratories to evaluate variant detection on male sex chromosomes...

Bibliographic Details
Main Authors: Justin Wagner, Nathan D. Olson, Jennifer McDaniel, Lindsay Harris, Brendan J. Pinto, David Jáspez, Adrián Muñoz-Barrera, Luis A. Rubio-Rodríguez, José M. Lorenzo-Salazar, Carlos Flores, Sayed Mohammad Ebrahim Sahraeian, Giuseppe Narzisi, Marta Byrska-Bishop, Uday S. Evani, Chunlin Xiao, Juniper A. Lake, Peter Fontana, Craig Greenberg, Donald Freed, Mohammed Faizal Eeman Mootor, Paul C. Boutros, Lisa Murray, Kishwar Shafin, Andrew Carroll, Fritz J. Sedlazeck, Melissa Wilson, Justin M. Zook
Format: Article
Language: English
Published: Nature Portfolio 2025-01-01
Series: Nature Communications
Online Access: https://doi.org/10.1038/s41467-024-55710-z
collection DOAJ
description Abstract The sex chromosomes contain complex, important genes impacting medical phenotypes, but differ from the autosomes in their ploidy and large repetitive regions. To enable technology developers along with research and clinical laboratories to evaluate variant detection on male sex chromosomes X and Y, we create a small variant benchmark set with 111,725 variants for the Genome in a Bottle HG002 reference material. We develop an active evaluation approach to demonstrate the benchmark set reliably identifies errors in challenging genomic regions and across short and long read callsets. We show how complete assemblies can expand benchmarks to difficult regions, but highlight remaining challenges benchmarking variants in long homopolymers and tandem repeats, complex gene conversions, copy number variable gene arrays, and human satellites.
format Article
id doaj-art-2d384cdcf1274ec7b44aa9c241ff7ea0
institution Kabale University
issn 2041-1723
language English
publishDate 2025-01-01
publisher Nature Portfolio
record_format Article
series Nature Communications
spelling doaj-art-2d384cdcf1274ec7b44aa9c241ff7ea0 2025-01-12T12:30:55Z eng Nature Portfolio Nature Communications 2041-1723 2025-01-01 16117 10.1038/s41467-024-55710-z
Small variant benchmark from a complete assembly of X and Y chromosomes
Justin Wagner (0): Material Measurement Laboratory, National Institute of Standards and Technology, 100 Bureau Dr.
Nathan D. Olson (1): Material Measurement Laboratory, National Institute of Standards and Technology, 100 Bureau Dr.
Jennifer McDaniel (2): Material Measurement Laboratory, National Institute of Standards and Technology, 100 Bureau Dr.
Lindsay Harris (3): Material Measurement Laboratory, National Institute of Standards and Technology, 100 Bureau Dr.
Brendan J. Pinto (4): Center for Evolution & Medicine and School of Life Sciences, Arizona State University, Tempe, AZ 85281 USA - Department of Zoology, Milwaukee Public Museum
David Jáspez (5): Genomics Division, Instituto Tecnológico y de Energías Renovables (ITER)
Adrián Muñoz-Barrera (6): Genomics Division, Instituto Tecnológico y de Energías Renovables (ITER)
Luis A. Rubio-Rodríguez (7): Genomics Division, Instituto Tecnológico y de Energías Renovables (ITER)
José M. Lorenzo-Salazar (8): Genomics Division, Instituto Tecnológico y de Energías Renovables (ITER)
Carlos Flores (9): Genomics Division, Instituto Tecnológico y de Energías Renovables (ITER)
Sayed Mohammad Ebrahim Sahraeian (10): Roche Sequencing Solutions
Giuseppe Narzisi (11): New York Genome Center
Marta Byrska-Bishop (12): New York Genome Center
Uday S. Evani (13): New York Genome Center
Chunlin Xiao (14): National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health
Juniper A. Lake (15): Pacific Biosciences
Peter Fontana (16): Information Technology Laboratory, National Institute of Standards and Technology, 100 Bureau Dr. Mailstop 8940
Craig Greenberg (17): Information Technology Laboratory, National Institute of Standards and Technology, 100 Bureau Dr. Mailstop 8940
Donald Freed (18): Sentieon Inc.
Mohammed Faizal Eeman Mootor (19): Department of Human Genetics, University of California Los Angeles
Paul C. Boutros (20): Department of Human Genetics, University of California Los Angeles
Lisa Murray (21): Illumina
Kishwar Shafin (22): Google Inc, 1600 Amphitheatre Pkwy
Andrew Carroll (23): Google Inc, 1600 Amphitheatre Pkwy
Fritz J. Sedlazeck (24): Baylor College of Medicine Human Genome Sequencing Center
Melissa Wilson (25): Center for Evolution & Medicine and School of Life Sciences, Arizona State University
Justin M. Zook (26): Material Measurement Laboratory, National Institute of Standards and Technology, 100 Bureau Dr.
title Small variant benchmark from a complete assembly of X and Y chromosomes
url https://doi.org/10.1038/s41467-024-55710-z