Pre-processing of paleogenomes: mitigating reference bias and postmortem damage in ancient genome data
Abstract: We investigate alternative strategies against reference bias and postmortem damage in low coverage paleogenomes. Compared to alignment to the linear reference genome, we show that masking known polymorphic sites and graph alignment effectively remove reference bias, but only starting from r...
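Main Authors: | Dilek Koptekin, Etka Yapar, Kıvılcım Başak Vural, Ekin Sağlıcan, N. Ezgi Altınışık, Anna-Sapfo Malaspinas, Can Alkan, Mehmet Somel |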
---|---|
Format: | Article |
Language: | English |
Published: | BMC 2025-01-01 |
Series: | Genome Biology |
Subjects: | Ancient DNA, Reference bias, Graph-reference genome, Post-mortem damage, Masking |
Online Access: | https://doi.org/10.1186/s13059-024-03462-w |
_version_ | 1841544599496032256 |
---|---|
author | Dilek Koptekin; Etka Yapar; Kıvılcım Başak Vural; Ekin Sağlıcan; N. Ezgi Altınışık; Anna-Sapfo Malaspinas; Can Alkan; Mehmet Somel |
author_facet | Dilek Koptekin; Etka Yapar; Kıvılcım Başak Vural; Ekin Sağlıcan; N. Ezgi Altınışık; Anna-Sapfo Malaspinas; Can Alkan; Mehmet Somel |
author_sort | Dilek Koptekin |
collection | DOAJ |
description | Abstract: We investigate alternative strategies against reference bias and postmortem damage in low coverage paleogenomes. Compared to alignment to the linear reference genome, we show that masking known polymorphic sites and graph alignment effectively remove reference bias, but only starting from raw read files. We next study approaches to overcome postmortem damage: trimming, rescaling, and our newly developed algorithm, bamRefine (github.com/etkayapar/bamRefine and zenodo.org/records/14234666), masking reads only at positions possibly affected by PMD. We propose graph alignment coupled with bamRefine as a simple strategy to minimize data loss and bias, and urge the community to publish FASTQ files. |
format | Article |
id | doaj-art-8cede3d9c7034bb6b65af3eaeda1a6ac |
institution | Kabale University |
issn | 1474-760X |
language | English |
publishDate | 2025-01-01 |
publisher | BMC |
record_format | Article |
series | Genome Biology |
spelling | doaj-art-8cede3d9c7034bb6b65af3eaeda1a6ac; 2025-01-12T12:25:59Z; eng; BMC; Genome Biology; 1474-760X; 2025-01-01; 261123; 10.1186/s13059-024-03462-w; Pre-processing of paleogenomes: mitigating reference bias and postmortem damage in ancient genome data; Dilek Koptekin (Department of Biological Sciences, Middle East Technical University); Etka Yapar (Department of Biological Sciences, Middle East Technical University); Kıvılcım Başak Vural (Department of Biological Sciences, Middle East Technical University); Ekin Sağlıcan (Department of Biological Sciences, Middle East Technical University); N. Ezgi Altınışık (Human-G Laboratory, Department of Anthropology, Hacettepe University); Anna-Sapfo Malaspinas (Department of Computational Biology, University of Lausanne); Can Alkan (Department of Computer Engineering, Bilkent University); Mehmet Somel (Department of Biological Sciences, Middle East Technical University); Abstract: We investigate alternative strategies against reference bias and postmortem damage in low coverage paleogenomes. Compared to alignment to the linear reference genome, we show that masking known polymorphic sites and graph alignment effectively remove reference bias, but only starting from raw read files. We next study approaches to overcome postmortem damage: trimming, rescaling, and our newly developed algorithm, bamRefine (github.com/etkayapar/bamRefine and zenodo.org/records/14234666), masking reads only at positions possibly affected by PMD. We propose graph alignment coupled with bamRefine as a simple strategy to minimize data loss and bias, and urge the community to publish FASTQ files.; https://doi.org/10.1186/s13059-024-03462-w; Ancient DNA; Reference bias; Graph-reference genome; Post-mortem damage; Masking |
spellingShingle | Dilek Koptekin Etka Yapar Kıvılcım Başak Vural Ekin Sağlıcan N. Ezgi Altınışık Anna-Sapfo Malaspinas Can Alkan Mehmet Somel Pre-processing of paleogenomes: mitigating reference bias and postmortem damage in ancient genome data Genome Biology Ancient DNA Reference bias Graph-reference genome Post-mortem damage Masking |
title | Pre-processing of paleogenomes: mitigating reference bias and postmortem damage in ancient genome data |
title_full | Pre-processing of paleogenomes: mitigating reference bias and postmortem damage in ancient genome data |
title_fullStr | Pre-processing of paleogenomes: mitigating reference bias and postmortem damage in ancient genome data |
title_full_unstemmed | Pre-processing of paleogenomes: mitigating reference bias and postmortem damage in ancient genome data |
title_short | Pre-processing of paleogenomes: mitigating reference bias and postmortem damage in ancient genome data |
title_sort | pre processing of paleogenomes mitigating reference bias and postmortem damage in ancient genome data |
topic | Ancient DNA; Reference bias; Graph-reference genome; Post-mortem damage; Masking |
url | https://doi.org/10.1186/s13059-024-03462-w |
work_keys_str_mv | AT dilekkoptekin preprocessingofpaleogenomesmitigatingreferencebiasandpostmortemdamageinancientgenomedata AT etkayapar preprocessingofpaleogenomesmitigatingreferencebiasandpostmortemdamageinancientgenomedata AT kıvılcımbasakvural preprocessingofpaleogenomesmitigatingreferencebiasandpostmortemdamageinancientgenomedata AT ekinsaglıcan preprocessingofpaleogenomesmitigatingreferencebiasandpostmortemdamageinancientgenomedata AT nezgialtınısık preprocessingofpaleogenomesmitigatingreferencebiasandpostmortemdamageinancientgenomedata AT annasapfomalaspinas preprocessingofpaleogenomesmitigatingreferencebiasandpostmortemdamageinancientgenomedata AT canalkan preprocessingofpaleogenomesmitigatingreferencebiasandpostmortemdamageinancientgenomedata AT mehmetsomel preprocessingofpaleogenomesmitigatingreferencebiasandpostmortemdamageinancientgenomedata |
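
The description above mentions masking reads only at positions possibly affected by PMD. The following is a minimal sketch of that general idea and not the bamRefine implementation; the function name, the `snps` dictionary layout, and the `window` parameter are all hypothetical. It assumes a double-stranded library, a read sequence in reference orientation (as stored in a BAM record), and an alignment without indels, so that C/T polymorphisms are treated as susceptible near the left end of the read and G/A polymorphisms near the right end.

```python
def mask_pmd_susceptible(seq, ref_start, snps, window=10):
    """Return seq with PMD-susceptible SNP positions replaced by 'N'.

    Illustrative assumptions (not taken from the bamRefine source):
      - seq is in reference orientation, as stored in a BAM record
      - the read aligns without indels, starting at ref_start (0-based)
      - snps maps a reference position to a pair of alleles, e.g. ("C", "T")
      - window is the number of bases at each read end considered at risk
    """
    masked = list(seq)
    n = len(seq)
    for offset in range(n):
        alleles = snps.get(ref_start + offset)
        if alleles is None:
            continue  # not a known polymorphic site, leave the base untouched
        near_left = offset < window
        near_right = offset >= n - window
        # Deamination shows up as C->T near the left end and G->A near the
        # right end, so only SNPs with those allele pairs are masked there.
        if (near_left and set(alleles) == {"C", "T"}) or (
            near_right and set(alleles) == {"G", "A"}
        ):
            masked[offset] = "N"
    return "".join(masked)


# Hypothetical example: a 10-base read starting at reference position 100.
example_snps = {
    100: ("C", "T"),  # left end, C/T SNP: masked
    101: ("G", "A"),  # left end, but G/A SNP: kept
    105: ("C", "T"),  # middle of the read: kept
    108: ("C", "T"),  # right end, but C/T SNP: kept
    109: ("G", "A"),  # right end, G/A SNP: masked
}
print(mask_pmd_susceptible("TACGTTGCAA", 100, example_snps, window=3))
# prints NACGTTGCAN
```

Compared with trimming a fixed number of bases from every read end, a position-specific mask of this kind discards a base only where damage could be mistaken for a real allele, which is the data-loss argument the description makes.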