Pre-processing of paleogenomes: mitigating reference bias and postmortem damage in ancient genome data

Abstract We investigate alternative strategies against reference bias and postmortem damage in low coverage paleogenomes. Compared to alignment to the linear reference genome, we show that masking known polymorphic sites and graph alignment effectively remove reference bias, but only starting from r...

Bibliographic Details
Main Authors: Dilek Koptekin, Etka Yapar, Kıvılcım Başak Vural, Ekin Sağlıcan, N. Ezgi Altınışık, Anna-Sapfo Malaspinas, Can Alkan, Mehmet Somel
Format: Article
Language: English
Published: BMC 2025-01-01
Series: Genome Biology
Online Access: https://doi.org/10.1186/s13059-024-03462-w
collection DOAJ
description Abstract We investigate alternative strategies against reference bias and postmortem damage in low coverage paleogenomes. Compared to alignment to the linear reference genome, we show that masking known polymorphic sites and graph alignment effectively remove reference bias, but only starting from raw read files. We next study approaches to overcome postmortem damage: trimming, rescaling, and our newly developed algorithm, bamRefine (github.com/etkayapar/bamRefine and zenodo.org/records/14234666), masking reads only at positions possibly affected by PMD. We propose graph alignment coupled with bamRefine as a simple strategy to minimize data loss and bias, and urge the community to publish FASTQ files.
id doaj-art-8cede3d9c7034bb6b65af3eaeda1a6ac
institution Kabale University
issn 1474-760X
Author Affiliations:
Dilek Koptekin: Department of Biological Sciences, Middle East Technical University
Etka Yapar: Department of Biological Sciences, Middle East Technical University
Kıvılcım Başak Vural: Department of Biological Sciences, Middle East Technical University
Ekin Sağlıcan: Department of Biological Sciences, Middle East Technical University
N. Ezgi Altınışık: Human-G Laboratory, Department of Anthropology, Hacettepe University
Anna-Sapfo Malaspinas: Department of Computational Biology, University of Lausanne
Can Alkan: Department of Computer Engineering, Bilkent University
Mehmet Somel: Department of Biological Sciences, Middle East Technical University
Volume: 26
Issue: 1
Pages: 1-23
Keywords: Ancient DNA; Reference bias; Graph-reference genome; Post-mortem damage; Masking