Evaluating the Performance of Large Language Models in Predicting Diagnostics for Spanish Clinical Cases in Cardiology

Bibliographic Details
Main Authors:	Julien Delaunay, Jordi Cusido
Format:	Article
Language:	English
Published:	MDPI AG 2024-12-01
Series:	Applied Sciences
Subjects:	large language models medical diagnosis natural language processing healthcare Spanish language prompt techniques
Online Access:	https://www.mdpi.com/2076-3417/15/1/61

_version_	1841549461066612736
author	Julien Delaunay Jordi Cusido
author_facet	Julien Delaunay Jordi Cusido
author_sort	Julien Delaunay
collection	DOAJ
description	This study explores the potential of large language models (LLMs) in predicting medical diagnoses from Spanish-language clinical case descriptions, offering an alternative to traditional machine learning (ML) and deep learning (DL) techniques. Unlike ML and DL models, which typically rely on extensive domain-specific training and complex data preprocessing, LLMs can process unstructured text data directly without the need for specialized training on medical datasets. This unique characteristic of LLMs allows for faster implementation and eliminates the risks associated with overfitting, which are common in ML and DL models that require tailored training for each new dataset. In this research, we investigate the capacities of several state-of-the-art LLMs in predicting medical diagnoses based on Spanish textual descriptions of clinical cases. We measured the impact of prompt techniques and temperatures on the quality of the diagnosis. Our results indicate that Gemini Pro and Mixtral 8x22b generally performed well across different temperatures and techniques, while Medichat Llama3 showed more variability, particularly with the few-shot prompting technique. Low temperatures and specific prompt techniques, such as zero-shot and Retrieval-Augmented Generation (RAG), tended to yield clearer and more accurate diagnoses. This study highlights the potential of LLMs as a disruptive alternative to traditional ML and DL approaches, offering a more efficient, scalable, and flexible solution for medical diagnostics, particularly in the non-English-speaking population.
format	Article
id	doaj-art-adf10fba72234876b8b48a0378eedb5c
institution	Kabale University
issn	2076-3417
language	English
publishDate	2024-12-01
publisher	MDPI AG
record_format	Article
series	Applied Sciences
spelling	doaj-art-adf10fba72234876b8b48a0378eedb5c 2025-01-10T13:14:18Z eng MDPI AG Applied Sciences 2076-3417 2024-12-01 15 1 61 10.3390/app15010061 Evaluating the Performance of Large Language Models in Predicting Diagnostics for Spanish Clinical Cases in Cardiology Julien Delaunay 0 Jordi Cusido 1 Top Health Tech, 08021 Barcelona, Spain Top Health Tech, 08021 Barcelona, Spain This study explores the potential of large language models (LLMs) in predicting medical diagnoses from Spanish-language clinical case descriptions, offering an alternative to traditional machine learning (ML) and deep learning (DL) techniques. Unlike ML and DL models, which typically rely on extensive domain-specific training and complex data preprocessing, LLMs can process unstructured text data directly without the need for specialized training on medical datasets. This unique characteristic of LLMs allows for faster implementation and eliminates the risks associated with overfitting, which are common in ML and DL models that require tailored training for each new dataset. In this research, we investigate the capacities of several state-of-the-art LLMs in predicting medical diagnoses based on Spanish textual descriptions of clinical cases. We measured the impact of prompt techniques and temperatures on the quality of the diagnosis. Our results indicate that Gemini Pro and Mixtral 8x22b generally performed well across different temperatures and techniques, while Medichat Llama3 showed more variability, particularly with the few-shot prompting technique. Low temperatures and specific prompt techniques, such as zero-shot and Retrieval-Augmented Generation (RAG), tended to yield clearer and more accurate diagnoses. This study highlights the potential of LLMs as a disruptive alternative to traditional ML and DL approaches, offering a more efficient, scalable, and flexible solution for medical diagnostics, particularly in the non-English-speaking population. https://www.mdpi.com/2076-3417/15/1/61 large language models medical diagnosis natural language processing healthcare Spanish language prompt techniques
spellingShingle	Julien Delaunay Jordi Cusido Evaluating the Performance of Large Language Models in Predicting Diagnostics for Spanish Clinical Cases in Cardiology Applied Sciences large language models medical diagnosis natural language processing healthcare Spanish language prompt techniques
title	Evaluating the Performance of Large Language Models in Predicting Diagnostics for Spanish Clinical Cases in Cardiology
title_full	Evaluating the Performance of Large Language Models in Predicting Diagnostics for Spanish Clinical Cases in Cardiology
title_fullStr	Evaluating the Performance of Large Language Models in Predicting Diagnostics for Spanish Clinical Cases in Cardiology
title_full_unstemmed	Evaluating the Performance of Large Language Models in Predicting Diagnostics for Spanish Clinical Cases in Cardiology
title_short	Evaluating the Performance of Large Language Models in Predicting Diagnostics for Spanish Clinical Cases in Cardiology
title_sort	evaluating the performance of large language models in predicting diagnostics for spanish clinical cases in cardiology
topic	large language models medical diagnosis natural language processing healthcare Spanish language prompt techniques
url	https://www.mdpi.com/2076-3417/15/1/61
work_keys_str_mv	AT juliendelaunay evaluatingtheperformanceoflargelanguagemodelsinpredictingdiagnosticsforspanishclinicalcasesincardiology AT jordicusido evaluatingtheperformanceoflargelanguagemodelsinpredictingdiagnosticsforspanishclinicalcasesincardiology

Similar Items