COMPARATIVE ANALYSIS OF REFERENCE-BASED CELL TYPE MAPPING AND MANUAL ANNOTATION IN SINGLE CELL RNA SEQUENCING ANALYSIS

Single-cell RNA sequencing (scRNA-seq) offers unprecedented insight into cellular diversity in complex tissues like peripheral blood mononuclear cells (PBMC). Furthermore, differential gene expression at a single-cell level can provide a basis for understanding the specialized roles of individual c...

Bibliographic Details
Main Authors: Larisa Goričan, Boris Gole, Gregor Jezernik, Gloria Krajnc, Uroš Potočnik, Mario Gorenjak
Format: Article
Language: English
Published: University of Ljubljana Press (Založba Univerze v Ljubljani) 2024-12-01
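Series: Slovenian Veterinary Research
Subjects: single-cell transcriptomics; peripheral blood mononuclear cells; reference mapping; cell-type annotation; immune system
Online Access: https://www.slovetres.si/index.php/SVR/article/view/1920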
author Larisa Goričan
Boris Gole
Gregor Jezernik
Gloria Krajnc
Uroš Potočnik
Mario Gorenjak
collection DOAJ
description Single-cell RNA sequencing (scRNA-seq) offers unprecedented insight into cellular diversity in complex tissues like peripheral blood mononuclear cells (PBMCs). Furthermore, differential gene expression at the single-cell level can provide a basis for understanding the specialized roles of individual cells and cell types in biological processes and disease mechanisms. Accurate annotation of cell types in scRNA-seq datasets is, however, challenging because of the high complexity of the data. Here, we compare two cell-type annotation strategies applied to PBMCs in scRNA-seq datasets: the automated reference-based tool Azimuth, and unsupervised Shared Nearest Neighbor (SNN) clustering followed by manual annotation. Our results highlight the strengths and limitations of the two approaches. Azimuth easily processed large-scale scRNA-seq datasets and reliably identified even relatively rare cell populations. It struggled, however, with cell types outside its reference range. In contrast, unsupervised SNN clustering clearly delineated all the different cell populations in a sample. This makes it well suited for identifying rare or novel cell types, but the method requires time-consuming and bias-prone manual annotation. To minimize the bias, we used rigorous criteria and the collaborative expertise of multiple independent evaluators, which yielded a manual annotation that closely matched the automated one. Finally, pseudo-temporal analysis of the major cell types further confirmed the validity of the Azimuth and manual annotations. In conclusion, each annotation method has its merits and downsides. Our research thus highlights the need to combine different clustering and annotation approaches to manage the complexity of scRNA-seq data and to improve the reliability and depth of scRNA-seq analyses.
Slovenian title: Primerjalna analiza referenčno osnovanega mapiranja celičnih tipov in ročne anotacije pri analizi sekvenciranja RNA posamezne celice
Slovenian abstract (Izvleček), in English: Single-cell RNA sequencing (scRNA-seq) provides a unique insight into the cellular diversity of complex tissues such as peripheral blood mononuclear cells (PBMCs). In addition, differential gene expression at the level of individual cells can serve as a basis for understanding the specialized roles of individual cells and cell types in biological processes and disease mechanisms. Because of the high complexity of the data, however, accurate determination of cell types in scRNA-seq datasets is demanding. In this article we compare two strategies for determining cell types that are used for PBMCs in scRNA-seq datasets: the automated, reference-database-based tool Azimuth and unsupervised Shared Nearest Neighbour (SNN) clustering followed by manual determination of cell types. Our results highlight the strengths and limitations of both approaches. Azimuth easily processed large scRNA-seq datasets and reliably recognized even relatively rare cell populations. It had difficulties, however, with cell types outside its reference range. In contrast, unsupervised SNN clustering clearly delineated all the different cell populations in the sample. The SNN method is therefore very well suited to recognizing rare or novel cell types, but it requires lengthy manual determination of cell types that is prone to bias. We minimized this bias through strict criteria and the combined expertise of several independent evaluators. Our manual determination of cell types thus deviated only slightly from the automated one. Finally, the validity of the cell-type determination with Azimuth and with the manual method was further confirmed by pseudo-temporal analysis of the major cell types. Our research thus emphasizes the need to combine different approaches to clustering and cell-population determination in order to improve the reliability and depth of scRNA-seq analyses.
Keywords: single-cell transcriptomics; peripheral blood mononuclear cells; reference mapping; cell-type annotation; immune system
format Article
id doaj-art-a04603a8d96a4aab99d852be2351643d
institution Kabale University
issn 1580-4003
2385-8761
language English
publishDate 2024-12-01
publisher University of Ljubljana Press (Založba Univerze v Ljubljani)
record_format Article
series Slovenian Veterinary Research
spelling doaj-art-a04603a8d96a4aab99d852be2351643d
timestamp 2024-12-31T23:23:15Z
language eng
publisher University of Ljubljana Press (Založba Univerze v Ljubljani)
series Slovenian Veterinary Research
issn 1580-4003, 2385-8761
publishDate 2024-12-01
volume 61
issue 4
doi 10.26873/SVR-1920-2024
affiliations Larisa Goričan: Centre for Human Genetics and Pharmacogenomics, Faculty of Medicine, University of Maribor, Taborska ulica 8, SI-2000 Maribor, Slovenia
Boris Gole: Centre for Human Genetics and Pharmacogenomics, Faculty of Medicine, University of Maribor, Taborska ulica 8, SI-2000 Maribor, Slovenia
Gregor Jezernik: Centre for Human Genetics and Pharmacogenomics, Faculty of Medicine, University of Maribor, Taborska ulica 8, SI-2000 Maribor, Slovenia
Gloria Krajnc: Centre for Human Genetics and Pharmacogenomics, Faculty of Medicine, University of Maribor, Taborska ulica 8, SI-2000 Maribor; Department for Science and Research, University Medical Centre Maribor, Ljubljanska ulica 5, SI-2000 Maribor, Slovenia
Uroš Potočnik: Centre for Human Genetics and Pharmacogenomics, Faculty of Medicine, University of Maribor, Taborska ulica 8, SI-2000 Maribor; Laboratory for Biochemistry, Molecular Biology and Genomics, Faculty of Chemistry and Chemical Engineering, University of Maribor, Smetanova ulica 17, SI-2000 Maribor; Department for Science and Research, University Medical Centre Maribor, Ljubljanska ulica 5, SI-2000 Maribor, Slovenia
Mario Gorenjak: Centre for Human Genetics and Pharmacogenomics, Faculty of Medicine, University of Maribor, Taborska ulica 8, SI-2000 Maribor, Slovenia, mario.gorenjak@um.si
title COMPARATIVE ANALYSIS OF REFERENCE-BASED CELL TYPE MAPPING AND MANUAL ANNOTATION IN SINGLE CELL RNA SEQUENCING ANALYSIS
topic single-cell transcriptomics
peripheral blood mononuclear cells
reference mapping
cell-type annotation
immune system
url https://www.slovetres.si/index.php/SVR/article/view/1920