SpaCCC: Large Language Model-Based Cell-Cell Communication Inference for Spatially Resolved Transcriptomic Data

Drawing parallels between linguistic constructs and cellular biology, Large Language Models (LLMs) have achieved success in diverse downstream applications for single-cell data analysis. However, methods that take advantage of LLMs to infer Ligand-Receptor (LR)-mediated cell-ce...

Bibliographic Details
Main Authors: Boya Ji, Xiaoqi Wang, Debin Qiao, Liwen Xu, Shaoliang Peng
Format: Article
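Language: English
Published: Tsinghua University Press 2024-12-01
Series: Big Data Mining and Analytics
Subjects: large language models (LLM); spatial transcriptome data; cell-cell communications (CCCs); functional gene interaction networks; unified latent space
Online Access: https://www.sciopen.com/article/10.26599/BDMA.2024.9020056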
collection DOAJ
description Drawing parallels between linguistic constructs and cellular biology, Large Language Models (LLMs) have achieved success in diverse downstream applications for single-cell data analysis. However, methods that take advantage of LLMs to infer Ligand-Receptor (LR)-mediated cell-cell communications from spatially resolved transcriptomic data are still lacking. Here, we propose SpaCCC to facilitate the inference of spatially resolved cell-cell communications. SpaCCC relies on our fine-tuned single-cell LLM and a functional gene interaction network to embed ligand and receptor genes into a unified latent space. LR pairs that lie significantly closer together in the latent space are taken to be more likely to interact with each other. Molecular diffusion and permutation test strategies are then employed, respectively, to calculate the communication strength and to filter out communications with low specificity. SpaCCC is benchmarked on real single-cell spatial transcriptomic datasets, where it outperforms other methods. SpaCCC also infers known LR pairs concealed by existing aggregative methods and then identifies communication patterns for specific cell types and their signaling pathways. Furthermore, SpaCCC provides various cell-cell communication visualizations at both single-cell and cell-type resolution. In summary, SpaCCC provides a sophisticated and practical tool that allows researchers to decipher spatially resolved cell-cell communications, together with the related communication patterns and signaling pathways, from spatial transcriptome data. SpaCCC is free and publicly available at https://github.com/jiboyalab/SpaCCC.
format Article
id doaj-art-a63d02bf05654d77b7f9535f4dae8b35
institution Kabale University
issn 2096-0654
spelling doaj-art-a63d02bf05654d77b7f9535f4dae8b35
2024-12-29T15:36:22Z
eng
Tsinghua University Press
Big Data Mining and Analytics
2096-0654
2024-12-01
Volume 7, Issue 4, Pages 1129-1147
10.26599/BDMA.2024.9020056
SpaCCC: Large Language Model-Based Cell-Cell Communication Inference for Spatially Resolved Transcriptomic Data
Boya Ji (0), Xiaoqi Wang (1), Debin Qiao (2), Liwen Xu (3), Shaoliang Peng (4)
(0) College of Computer Science and Electronic Engineering, Hunan University, Changsha 410082, China
(1) School of Computer Science, Northwestern Polytechnical University, Xi’an 710000, China
(2) School of Computer and Artificial Intelligence and National Supercomputing Center in Zhengzhou, Zhengzhou University, Zhengzhou 450001, China
(3) College of Computer Science and Electronic Engineering, Hunan University, Changsha 410082, China
(4) College of Computer Science and Electronic Engineering, Hunan University, Changsha 410082, China
https://www.sciopen.com/article/10.26599/BDMA.2024.9020056
topic large language models (llm)
spatial transcriptome data
cell-cell communications (cccs)
functional gene interaction networks
unified latent space