Evaluating the Performance of Large Language Models in Predicting Diagnostics for Spanish Clinical Cases in Cardiology

Bibliographic Details
Main Authors: Julien Delaunay, Jordi Cusido
Format: Article
Language: English
Published: MDPI AG 2024-12-01
Series: Applied Sciences
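Subjects: large language models; medical diagnosis; natural language processing; healthcare; Spanish language; prompt techniques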
Online Access: https://www.mdpi.com/2076-3417/15/1/61
author Julien Delaunay
Jordi Cusido
collection DOAJ
description This study explores the potential of large language models (LLMs) in predicting medical diagnoses from Spanish-language clinical case descriptions, offering an alternative to traditional machine learning (ML) and deep learning (DL) techniques. Unlike ML and DL models, which typically rely on extensive domain-specific training and complex data preprocessing, LLMs can process unstructured text data directly without the need for specialized training on medical datasets. This unique characteristic of LLMs allows for faster implementation and eliminates the risks associated with overfitting, which are common in ML and DL models that require tailored training for each new dataset. In this research, we investigate the capacities of several state-of-the-art LLMs in predicting medical diagnoses based on Spanish textual descriptions of clinical cases. We measured the impact of prompt techniques and temperatures on the quality of the diagnosis. Our results indicate that Gemini Pro and Mixtral 8x22b generally performed well across different temperatures and techniques, while Medichat Llama3 showed more variability, particularly with the few-shot prompting technique. Low temperatures and specific prompt techniques, such as zero-shot and Retrieval-Augmented Generation (RAG), tended to yield clearer and more accurate diagnoses. This study highlights the potential of LLMs as a disruptive alternative to traditional ML and DL approaches, offering a more efficient, scalable, and flexible solution for medical diagnostics, particularly in the non-English-speaking population.
format Article
id doaj-art-adf10fba72234876b8b48a0378eedb5c
institution Kabale University
issn 2076-3417
language English
publishDate 2024-12-01
publisher MDPI AG
record_format Article
series Applied Sciences
spelling doaj-art-adf10fba72234876b8b48a0378eedb5c
2025-01-10T13:14:18Z
eng
MDPI AG
Applied Sciences
2076-3417
2024-12-01
15(1), 61
10.3390/app15010061
Evaluating the Performance of Large Language Models in Predicting Diagnostics for Spanish Clinical Cases in Cardiology
Julien Delaunay (Top Health Tech, 08021 Barcelona, Spain)
Jordi Cusido (Top Health Tech, 08021 Barcelona, Spain)
https://www.mdpi.com/2076-3417/15/1/61
large language models; medical diagnosis; natural language processing; healthcare; Spanish language; prompt techniques
title Evaluating the Performance of Large Language Models in Predicting Diagnostics for Spanish Clinical Cases in Cardiology
topic large language models
medical diagnosis
natural language processing
healthcare
Spanish language
prompt techniques