RNA-seq reproducibility of Pseudomonas aeruginosa in laboratory models of cystic fibrosis

Bibliographic Details
Main Authors: Rebecca P. Duncan, Gina R. Lewin, Daniel M. Cornforth, Frances L. Diggle, Ananya Kapur, Dina A. Moustafa, Yasmin Hilliam, Jennifer M. Bomberger, Marvin Whiteley, Joanna B. Goldberg
Format: Article
Language: English
Published: American Society for Microbiology 2025-01-01
Series: Microbiology Spectrum
Subjects: RNA-seq; Pseudomonas aeruginosa; reproducibility; epithelial cell model; SCFM2; cystic fibrosis
Online Access: https://journals.asm.org/doi/10.1128/spectrum.01513-24
_version_ 1841556067563077632
author Rebecca P. Duncan
Gina R. Lewin
Daniel M. Cornforth
Frances L. Diggle
Ananya Kapur
Dina A. Moustafa
Yasmin Hilliam
Jennifer M. Bomberger
Marvin Whiteley
Joanna B. Goldberg
collection DOAJ
description ABSTRACT Reproducibility is a fundamental expectation in science and enables investigators to have confidence in their research findings and the ability to compare data from disparate sources, but evaluating reproducibility can be elusive. For example, generating RNA sequencing (RNA-seq) data includes multiple steps where variance can be introduced. Thus, it is unclear if RNA-seq data from different sources can be validly compared. While most studies on RNA-seq reproducibility focus on eukaryotes, we evaluate bias in bacteria using Pseudomonas aeruginosa gene expression data from five laboratory models of cystic fibrosis. We leverage a large data set that includes samples prepared in three different laboratories and paired data sets where the same sample was sequenced using at least two different sequencing pipelines. We report here that expression data are highly reproducible across laboratories. In addition, while samples sequenced with different sequencing pipelines showed significantly more variance in expression profiles than between labs, gene expression was still highly reproducible between sequencing pipelines. Further investigation of expression differences between two sequencing pipelines revealed that library preparation methods were the largest source of error, though analyses to identify the source of this variance were inconclusive. Consistent with the reproducibility of expression between sequencing pipelines, we found that different pipelines detected over 80% of the same differentially expressed genes with large expression differences between conditions. Thus, bacterial RNA-seq data from different sources can be validly compared, facilitating the ability to advance understanding of bacterial behavior and physiology using the wide array of publicly available RNA-seq data sets.
IMPORTANCE RNA sequencing (RNA-seq) has revolutionized biology, but many steps in RNA-seq workflows can introduce variance, potentially compromising reproducibility. While reproducibility in RNA-seq has been thoroughly investigated in eukaryotes, less is known about pipelines and workflows that introduce variance and biases in bacterial RNA-seq data. By leveraging Pseudomonas aeruginosa transcriptomes in cystic fibrosis models from different laboratories and sequenced with different sequencing pipelines, we directly assess sources of bacterial RNA-seq variance. RNA-seq data were highly reproducible, with the largest variance due to sequencing pipelines, specifically library preparation. Different sequencing pipelines detected overlapping differentially expressed genes, especially those with large expression differences between conditions. This study confirms that different approaches to preparing and sequencing bacterial RNA libraries capture comparable transcriptional profiles, supporting investigators’ ability to leverage diverse RNA-seq data sets to advance their science.
format Article
id doaj-art-096330e9d10f42f89c3092e62c00c6e0
institution Kabale University
issn 2165-0497
language English
publishDate 2025-01-01
publisher American Society for Microbiology
record_format Article
series Microbiology Spectrum
spelling doaj-art-096330e9d10f42f89c3092e62c00c6e0
2025-01-07T14:05:18Z
eng
American Society for Microbiology
Microbiology Spectrum
2165-0497
2025-01-01
Volume 13, Issue 1
10.1128/spectrum.01513-24
RNA-seq reproducibility of Pseudomonas aeruginosa in laboratory models of cystic fibrosis
Rebecca P. Duncan (Division of Pulmonary, Asthma, Cystic Fibrosis, and Sleep, Department of Pediatrics, Emory University School of Medicine, Atlanta, Georgia, USA)
Gina R. Lewin (Emory-Children’s Cystic Fibrosis Center, Atlanta, Georgia, USA)
Daniel M. Cornforth (Emory-Children’s Cystic Fibrosis Center, Atlanta, Georgia, USA)
Frances L. Diggle (Emory-Children’s Cystic Fibrosis Center, Atlanta, Georgia, USA)
Ananya Kapur (Department of Microbiology and Molecular Genetics, University of Pittsburgh, Pittsburgh, Pennsylvania, USA)
Dina A. Moustafa (Division of Pulmonary, Asthma, Cystic Fibrosis, and Sleep, Department of Pediatrics, Emory University School of Medicine, Atlanta, Georgia, USA)
Yasmin Hilliam (Department of Microbiology and Molecular Genetics, University of Pittsburgh, Pittsburgh, Pennsylvania, USA)
Jennifer M. Bomberger (Department of Microbiology and Molecular Genetics, University of Pittsburgh, Pittsburgh, Pennsylvania, USA)
Marvin Whiteley (Emory-Children’s Cystic Fibrosis Center, Atlanta, Georgia, USA)
Joanna B. Goldberg (Division of Pulmonary, Asthma, Cystic Fibrosis, and Sleep, Department of Pediatrics, Emory University School of Medicine, Atlanta, Georgia, USA)
https://journals.asm.org/doi/10.1128/spectrum.01513-24
RNA-seq
Pseudomonas aeruginosa
reproducibility
epithelial cell model
SCFM2
cystic fibrosis
title RNA-seq reproducibility of Pseudomonas aeruginosa in laboratory models of cystic fibrosis
topic RNA-seq
Pseudomonas aeruginosa
reproducibility
epithelial cell model
SCFM2
cystic fibrosis
url https://journals.asm.org/doi/10.1128/spectrum.01513-24