Evaluating the Performance of Large Language Models in Predicting Diagnostics for Spanish Clinical Cases in Cardiology

Bibliographic Details
Main Authors:	Julien Delaunay, Jordi Cusido
Format:	Article
Language:	English
Published:	MDPI AG 2024-12-01
Series:	Applied Sciences
Subjects:	large language models; medical diagnosis; natural language processing; healthcare; Spanish language; prompt techniques
Online Access:	https://www.mdpi.com/2076-3417/15/1/61

Description
Summary:	This study explores the potential of large language models (LLMs) in predicting medical diagnoses from Spanish-language clinical case descriptions, offering an alternative to traditional machine learning (ML) and deep learning (DL) techniques. Unlike ML and DL models, which typically rely on extensive domain-specific training and complex data preprocessing, LLMs can process unstructured text data directly without the need for specialized training on medical datasets. This unique characteristic of LLMs allows for faster implementation and eliminates the risks associated with overfitting, which are common in ML and DL models that require tailored training for each new dataset. In this research, we investigate the capacities of several state-of-the-art LLMs in predicting medical diagnoses based on Spanish textual descriptions of clinical cases. We measured the impact of prompt techniques and temperatures on the quality of the diagnosis. Our results indicate that Gemini Pro and Mixtral 8x22b generally performed well across different temperatures and techniques, while Medichat Llama3 showed more variability, particularly with the few-shot prompting technique. Low temperatures and specific prompt techniques, such as zero-shot and Retrieval-Augmented Generation (RAG), tended to yield clearer and more accurate diagnoses. This study highlights the potential of LLMs as a disruptive alternative to traditional ML and DL approaches, offering a more efficient, scalable, and flexible solution for medical diagnostics, particularly in the non-English-speaking population.
ISSN:	2076-3417

Similar Items