Measurement of event data from text

We examine measurement concerns in state-of-the-art computer-aided political event data systems developed after 2015. The focus is on how to compare and quantify the mathematical and/or conceptual distance between what a machine codes or classifies from information describing an event and the actual circumstan...

Bibliographic Details
Main Authors: Patrick T. Brandt, Marcus Sianan
Format: Article
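Language: English
Published: Frontiers Media S.A. 2025-01-01
Series: Frontiers in Political Science
Subjects: event data; political methodology; natural language; political conflict; international relations
Online Access: https://www.frontiersin.org/articles/10.3389/fpos.2024.1453640/full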
author Patrick T. Brandt
Marcus Sianan
collection DOAJ
description We examine measurement concerns in state-of-the-art computer-aided political event data systems developed after 2015. The focus is on how to compare and quantify the mathematical and/or conceptual distance between what a machine codes or classifies from information describing an event and the actual circumstances of the event, or the ground truth. We make three primary arguments: (1) Users of event data need to understand the measurement side of these data to avoid faulty inferences and make better decisions. (2) Avant-garde event data systems are still not free from some of the fundamental problems that plague legacy systems; we investigate theoretical and real-world examples of measurement issues, why they are problematic, how they are dealt with, and what is left to be desired even with newer systems. (3) One of the most crucial goals of event data science is to attain congruence between what is machine-coded or classified and the ground truth. To support these arguments, we benchmark the literature against well-documented sources of measurement error. We provide guidance on how to make performance comparisons within and across language models, identify opportunities to improve event data systems, and discuss and present findings in this area of research more articulately.
format Article
id doaj-art-de20a43df2364cdf90c86b42fb3be325
institution Kabale University
issn 2673-3145
language English
publishDate 2025-01-01
publisher Frontiers Media S.A.
record_format Article
series Frontiers in Political Science
spelling doaj-art-de20a43df2364cdf90c86b42fb3be325 | 2025-01-06T06:59:44Z | eng | Frontiers Media S.A. | Frontiers in Political Science | 2673-3145 | 2025-01-01 | 6 | 10.3389/fpos.2024.1453640 | 1453640 | Measurement of event data from text | Patrick T. Brandt | Marcus Sianan | https://www.frontiersin.org/articles/10.3389/fpos.2024.1453640/full | event data | political methodology | natural language | political conflict | international relations
title Measurement of event data from text
topic event data
political methodology
natural language
political conflict
international relations
url https://www.frontiersin.org/articles/10.3389/fpos.2024.1453640/full