SpaCCC: Large Language Model-Based Cell-Cell Communication Inference for Spatially Resolved Transcriptomic Data
Drawing parallels between linguistic constructs and cellular biology, Large Language Models (LLMs) have achieved success in diverse downstream applications for single-cell data analysis. However, to date, methods that take advantage of LLMs to infer Ligand-Receptor (LR)-mediated cell-ce...
| Main Authors: | Boya Ji, Xiaoqi Wang, Debin Qiao, Liwen Xu, Shaoliang Peng |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | Tsinghua University Press, 2024-12-01 |
| Series: | Big Data Mining and Analytics |
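| Subjects: | large language models (LLM); spatial transcriptome data; cell-cell communications (CCCs); functional gene interaction networks; unified latent space |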
| Online Access: | https://www.sciopen.com/article/10.26599/BDMA.2024.9020056 |
| _version_ | 1846100773383962624 |
|---|---|
| author | Boya Ji, Xiaoqi Wang, Debin Qiao, Liwen Xu, Shaoliang Peng |
| author_facet | Boya Ji, Xiaoqi Wang, Debin Qiao, Liwen Xu, Shaoliang Peng |
| author_sort | Boya Ji |
| collection | DOAJ |
| description | Drawing parallels between linguistic constructs and cellular biology, Large Language Models (LLMs) have achieved success in diverse downstream applications for single-cell data analysis. However, to date, methods that take advantage of LLMs to infer Ligand-Receptor (LR)-mediated cell-cell communications for spatially resolved transcriptomic data are still lacking. Here, we propose SpaCCC to facilitate the inference of spatially resolved cell-cell communications; it relies on our fine-tuned single-cell LLM and a functional gene interaction network to embed ligand and receptor genes into a unified latent space. LR pairs that are significantly closer in the latent space are taken to be more likely to interact with each other. Molecular diffusion and permutation test strategies are then employed, respectively, to calculate the communication strength and to filter out communications with low specificity. Benchmarked on real single-cell spatial transcriptomic datasets, SpaCCC shows superiority over other methods. SpaCCC also infers known LR pairs concealed by existing aggregative methods and then identifies communication patterns for specific cell types and their signaling pathways. Furthermore, SpaCCC provides various cell-cell communication visualization results at both single-cell and cell-type resolution. In summary, SpaCCC provides a sophisticated and practical tool allowing researchers to decipher spatially resolved cell-cell communications and related communication patterns and signaling pathways based on spatial transcriptome data. SpaCCC is free and publicly available at https://github.com/jiboyalab/SpaCCC. |
| format | Article |
| id | doaj-art-a63d02bf05654d77b7f9535f4dae8b35 |
| institution | Kabale University |
| issn | 2096-0654 |
| language | English |
| publishDate | 2024-12-01 |
| publisher | Tsinghua University Press |
| record_format | Article |
| series | Big Data Mining and Analytics |
| spelling | doaj-art-a63d02bf05654d77b7f9535f4dae8b35; 2024-12-29T15:36:22Z; eng; Tsinghua University Press; Big Data Mining and Analytics; ISSN 2096-0654; 2024-12-01; 7(4), 1129-1147; DOI 10.26599/BDMA.2024.9020056; SpaCCC: Large Language Model-Based Cell-Cell Communication Inference for Spatially Resolved Transcriptomic Data; Boya Ji (College of Computer Science and Electronic Engineering, Hunan University, Changsha 410082, China); Xiaoqi Wang (School of Computer Science, Northwestern Polytechnical University, Xi’an 710000, China); Debin Qiao (School of Computer and Artificial Intelligence and National Supercomputing Center in Zhengzhou, Zhengzhou University, Zhengzhou 450001, China); Liwen Xu (College of Computer Science and Electronic Engineering, Hunan University, Changsha 410082, China); Shaoliang Peng (College of Computer Science and Electronic Engineering, Hunan University, Changsha 410082, China); https://www.sciopen.com/article/10.26599/BDMA.2024.9020056; large language models (llm); spatial transcriptome data; cell-cell communications (cccs); functional gene interaction networks; unified latent space |
| title | SpaCCC: Large Language Model-Based Cell-Cell Communication Inference for Spatially Resolved Transcriptomic Data |
| topic | large language models (llm); spatial transcriptome data; cell-cell communications (cccs); functional gene interaction networks; unified latent space |
| url | https://www.sciopen.com/article/10.26599/BDMA.2024.9020056 |