Evaluating the Performance of Large Language Models in Predicting Diagnostics for Spanish Clinical Cases in Cardiology
This study explores the potential of large language models (LLMs) in predicting medical diagnoses from Spanish-language clinical case descriptions, offering an alternative to traditional machine learning (ML) and deep learning (DL) techniques. Unlike ML and DL models, which typically rely on extensi...
Main Authors: | Julien Delaunay, Jordi Cusido |
---|---|
Format: | Article |
Language: | English |
Published: | MDPI AG, 2024-12-01 |
Series: | Applied Sciences |
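Subjects: | large language models, medical diagnosis, natural language processing, healthcare, Spanish language, prompt techniques |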
Online Access: | https://www.mdpi.com/2076-3417/15/1/61 |
_version_ | 1841549461066612736 |
---|---|
author | Julien Delaunay Jordi Cusido |
author_facet | Julien Delaunay Jordi Cusido |
author_sort | Julien Delaunay |
collection | DOAJ |
description | This study explores the potential of large language models (LLMs) in predicting medical diagnoses from Spanish-language clinical case descriptions, offering an alternative to traditional machine learning (ML) and deep learning (DL) techniques. Unlike ML and DL models, which typically rely on extensive domain-specific training and complex data preprocessing, LLMs can process unstructured text data directly without the need for specialized training on medical datasets. This unique characteristic of LLMs allows for faster implementation and eliminates the risks associated with overfitting, which are common in ML and DL models that require tailored training for each new dataset. In this research, we investigate the capacities of several state-of-the-art LLMs in predicting medical diagnoses based on Spanish textual descriptions of clinical cases. We measured the impact of prompt techniques and temperatures on the quality of the diagnosis. Our results indicate that Gemini Pro and Mixtral 8x22b generally performed well across different temperatures and techniques, while Medichat Llama3 showed more variability, particularly with the few-shot prompting technique. Low temperatures and specific prompt techniques, such as zero-shot and Retrieval-Augmented Generation (RAG), tended to yield clearer and more accurate diagnoses. This study highlights the potential of LLMs as a disruptive alternative to traditional ML and DL approaches, offering a more efficient, scalable, and flexible solution for medical diagnostics, particularly in the non-English-speaking population. |
format | Article |
id | doaj-art-adf10fba72234876b8b48a0378eedb5c |
institution | Kabale University |
issn | 2076-3417 |
language | English |
publishDate | 2024-12-01 |
publisher | MDPI AG |
record_format | Article |
series | Applied Sciences |
spelling | doaj-art-adf10fba72234876b8b48a0378eedb5c; 2025-01-10T13:14:18Z; eng; MDPI AG; Applied Sciences; 2076-3417; 2024-12-01; 15; 1; 61; 10.3390/app15010061; Evaluating the Performance of Large Language Models in Predicting Diagnostics for Spanish Clinical Cases in Cardiology; Julien Delaunay 0; Jordi Cusido 1; Top Health Tech, 08021 Barcelona, Spain; Top Health Tech, 08021 Barcelona, Spain; This study explores the potential of large language models (LLMs) in predicting medical diagnoses from Spanish-language clinical case descriptions, offering an alternative to traditional machine learning (ML) and deep learning (DL) techniques. Unlike ML and DL models, which typically rely on extensive domain-specific training and complex data preprocessing, LLMs can process unstructured text data directly without the need for specialized training on medical datasets. This unique characteristic of LLMs allows for faster implementation and eliminates the risks associated with overfitting, which are common in ML and DL models that require tailored training for each new dataset. In this research, we investigate the capacities of several state-of-the-art LLMs in predicting medical diagnoses based on Spanish textual descriptions of clinical cases. We measured the impact of prompt techniques and temperatures on the quality of the diagnosis. Our results indicate that Gemini Pro and Mixtral 8x22b generally performed well across different temperatures and techniques, while Medichat Llama3 showed more variability, particularly with the few-shot prompting technique. Low temperatures and specific prompt techniques, such as zero-shot and Retrieval-Augmented Generation (RAG), tended to yield clearer and more accurate diagnoses. This study highlights the potential of LLMs as a disruptive alternative to traditional ML and DL approaches, offering a more efficient, scalable, and flexible solution for medical diagnostics, particularly in the non-English-speaking population.; https://www.mdpi.com/2076-3417/15/1/61; large language models; medical diagnosis; natural language processing; healthcare; Spanish language; prompt techniques |
spellingShingle | Julien Delaunay Jordi Cusido Evaluating the Performance of Large Language Models in Predicting Diagnostics for Spanish Clinical Cases in Cardiology Applied Sciences large language models medical diagnosis natural language processing healthcare Spanish language prompt techniques |
title | Evaluating the Performance of Large Language Models in Predicting Diagnostics for Spanish Clinical Cases in Cardiology |
title_full | Evaluating the Performance of Large Language Models in Predicting Diagnostics for Spanish Clinical Cases in Cardiology |
title_fullStr | Evaluating the Performance of Large Language Models in Predicting Diagnostics for Spanish Clinical Cases in Cardiology |
title_full_unstemmed | Evaluating the Performance of Large Language Models in Predicting Diagnostics for Spanish Clinical Cases in Cardiology |
title_short | Evaluating the Performance of Large Language Models in Predicting Diagnostics for Spanish Clinical Cases in Cardiology |
title_sort | evaluating the performance of large language models in predicting diagnostics for spanish clinical cases in cardiology |
topic | large language models medical diagnosis natural language processing healthcare Spanish language prompt techniques |
url | https://www.mdpi.com/2076-3417/15/1/61 |
work_keys_str_mv | AT juliendelaunay evaluatingtheperformanceoflargelanguagemodelsinpredictingdiagnosticsforspanishclinicalcasesincardiology AT jordicusido evaluatingtheperformanceoflargelanguagemodelsinpredictingdiagnosticsforspanishclinicalcasesincardiology |