Establishing vocabulary tests as a benchmark for evaluating large language models.
Vocabulary tests, once a cornerstone of language modeling evaluation, have been largely overlooked in the current landscape of Large Language Models (LLMs) like Llama 2, Mistral, and GPT. While most LLM evaluation benchmarks focus on specific tasks or domain-specific knowledge, they often neglect th...
| Main Authors: | Gonzalo Martínez, Javier Conde, Elena Merino-Gómez, Beatriz Bermúdez-Margaretto, José Alberto Hernández, Pedro Reviriego, Marc Brysbaert |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | Public Library of Science (PLoS), 2024-01-01 |
| Series: | PLoS ONE |
| Online Access: | https://doi.org/10.1371/journal.pone.0308259 |
Similar Items
- Playing with words: Comparing the vocabulary and lexical diversity of ChatGPT and humans
  by: Pedro Reviriego, et al.
  Published: (2024-12-01)
- Benchmarking Large Language Models for News Summarization
  by: Tianyi Zhang, et al.
  Published: (2024-02-01)
- Survey of Different Large Language Model Architectures: Trends, Benchmarks, and Challenges
  by: Minghao Shao, et al.
  Published: (2024-01-01)
- Towards a benchmark dataset for large language models in the context of process automation
  by: Tejennour Tizaoui, et al.
  Published: (2024-12-01)
- Large Language Model-Driven Structured Output: A Comprehensive Benchmark and Spatial Data Generation Framework
  by: Diya Li, et al.
  Published: (2024-11-01)