Dual retrieving and ranking medical large language model with retrieval augmented generation

Bibliographic Details
Main Authors:	Qimin Yang, Huan Zuo, Runqi Su, Hanyinghong Su, Tangyi Zeng, Huimei Zhou, Rongsheng Wang, Jiexin Chen, Yijun Lin, Zhiyi Chen, Tao Tan
Format:	Article
Language:	English
Published:	Nature Portfolio 2025-05-01
Series:	Scientific Reports
Subjects:	Medical-large language model, Artificial intelligence (AI), Retrieval-augmented generation (RAG)
Online Access:	https://doi.org/10.1038/s41598-025-00724-w

Description
Summary:	Recent advancements in large language models (LLMs) have significantly enhanced text generation across various sectors; however, their medical application faces critical challenges regarding both accuracy and real-time responsiveness. To address these dual challenges, we propose a novel two-step retrieval and ranking retrieval-augmented generation (RAG) framework that combines embedding search with Elasticsearch. Built upon a dynamically updated medical knowledge base incorporating expert-reviewed documents from leading healthcare institutions, our hybrid architecture employs ColBERTv2 for context-aware result ranking while maintaining computational efficiency. Experimental results show a 10% improvement in accuracy for complex medical queries compared to standalone LLM and single-search RAG variants. We acknowledge that, in the experimental setting, latency remains a challenge for emergency situations requiring sub-second responses; real-time performance can be achieved with more powerful hardware in real-world deployments. This work establishes a new paradigm for reliable medical AI assistants that balances accuracy and practical deployment considerations.
ISSN:	2045-2322
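
The summary describes a two-step design: candidates are retrieved by both a keyword engine and an embedding search, then reranked by a late-interaction model. The following is a minimal sketch of that flow, not the authors' implementation. Every name in it is hypothetical: a shared-term count stands in for Elasticsearch, character-trigram counts stand in for a learned embedding model, and a simple MaxSim sum over those toy vectors stands in for ColBERTv2 scoring. The three documents are made-up illustration data.

```python
import math
from collections import Counter

# Made-up illustration corpus (not from the paper's knowledge base).
DOCS = {
    "d1": "aspirin reduces fever and relieves mild pain",
    "d2": "insulin therapy controls blood glucose in diabetes",
    "d3": "metformin is a first line drug for type 2 diabetes",
}


def tokenize(text):
    return text.lower().split()


def keyword_search(query, docs, k):
    """Assumed stand-in for the Elasticsearch step: count shared terms."""
    q_terms = set(tokenize(query))
    scores = {d: len(q_terms & set(tokenize(t))) for d, t in docs.items()}
    ranked = sorted(scores, key=lambda d: (-scores[d], d))
    return [d for d in ranked if scores[d] > 0][:k]


def trigram_vector(text):
    """Toy embedding (assumption): padded character-trigram counts."""
    s = f"  {text.lower()}  "
    return Counter(s[i:i + 3] for i in range(len(s) - 2))


def cosine(a, b):
    dot = sum(v * b[key] for key, v in a.items() if key in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def embedding_search(query, docs, k):
    """Assumed stand-in for the embedding step: cosine over toy vectors."""
    q_vec = trigram_vector(query)
    scores = {d: cosine(q_vec, trigram_vector(t)) for d, t in docs.items()}
    return sorted(scores, key=lambda d: (-scores[d], d))[:k]


def maxsim_score(query, doc_text):
    """Late-interaction score in the ColBERT style: for each query token,
    take its best similarity to any document token, then sum."""
    doc_vecs = [trigram_vector(t) for t in tokenize(doc_text)]
    return sum(
        max(cosine(trigram_vector(q), dv) for dv in doc_vecs)
        for q in tokenize(query)
    )


def dual_retrieve_and_rank(query, docs, k=2):
    # Step 1: union of both retrievers' candidates, order-preserving dedupe.
    candidates = list(dict.fromkeys(
        keyword_search(query, docs, k) + embedding_search(query, docs, k)
    ))
    # Step 2: rerank the merged candidates with the late-interaction score.
    return sorted(candidates, key=lambda d: (-maxsim_score(query, docs[d]), d))


if __name__ == "__main__":
    print(dual_retrieve_and_rank("drug for type 2 diabetes", DOCS))
```

In a real system the two retrievers would query an Elasticsearch index and a vector store, and the reranker would score token embeddings produced by a trained ColBERTv2 checkpoint; only the retrieve-merge-rerank shape carries over from this sketch.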
