STEED: A data mining tool for automated extraction of experimental parameters and risk of bias items from in vivo publications.

Bibliographic Details
Main Authors: Wolfgang Emanuel Zurrer, Amelia Elaine Cannon, Ewoud Ewing, David Brüschweiler, Julia Bugajska, Bernard Friedrich Hild, Marianna Rosso, Daniel Salo Reich, Benjamin Victor Ineichen
Format: Article
Language: English
Published: Public Library of Science (PLoS) 2024-01-01
Series: PLoS ONE
Online Access: https://doi.org/10.1371/journal.pone.0311358
Description
Summary: Background and methods: Systematic reviews, i.e., research summaries that address focused questions in a structured and reproducible manner, are a cornerstone of evidence-based medicine and research. However, certain steps in systematic reviews, such as data extraction, are labour-intensive, which hampers their feasibility, especially with the rapidly expanding body of biomedical literature. To bridge this gap, we aimed to develop a data mining tool in the R programming environment to automate data extraction from neuroscience in vivo publications. The function was trained on a literature corpus (n = 45 publications) of animal motor neuron disease studies and tested on two validation corpora (motor neuron diseases, n = 31 publications; multiple sclerosis, n = 244 publications). Results: Our data mining tool, STEED (STructured Extraction of Experimental Data), successfully extracted key experimental parameters, such as animal models and species, as well as risk of bias items, such as randomization or blinding, from in vivo studies. Sensitivity and specificity were over 85% and 80%, respectively, for most items in both validation corpora. Accuracy and F1-score were above 90% and 0.9, respectively, for most items in the validation corpora. Time savings were above 99%. Conclusions: Our text mining tool, STEED, can extract key experimental parameters and risk of bias items from the neuroscience in vivo literature. This enables the tool's deployment for probing a field in a research improvement context or for replacing one human reader during data extraction, resulting in substantial time savings and contributing towards the automation of systematic reviews.
ISSN:1932-6203
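
The summary reports sensitivity, specificity, accuracy and F1-score for each extracted item. As a minimal sketch of how such metrics are conventionally derived from a confusion matrix (tool output compared against a human reference), the Python snippet below may help. It is not taken from STEED, which is written in R; the names `ConfusionCounts` and `evaluate` and the example counts are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class ConfusionCounts:
    """Hypothetical container: tool output versus human reference for one item."""
    tp: int  # item reported by tool and by human reader
    fp: int  # item reported by tool only
    tn: int  # item reported by neither
    fn: int  # item reported by human reader only


def _ratio(numerator: float, denominator: float) -> float:
    """Return numerator / denominator, or NaN when the denominator is zero."""
    return numerator / denominator if denominator else float("nan")


def evaluate(c: ConfusionCounts) -> Dict[str, float]:
    """Compute standard binary classification metrics from confusion counts."""
    sensitivity = _ratio(c.tp, c.tp + c.fn)   # recall / true positive rate
    specificity = _ratio(c.tn, c.tn + c.fp)   # true negative rate
    precision = _ratio(c.tp, c.tp + c.fp)     # positive predictive value
    accuracy = _ratio(c.tp + c.tn, c.tp + c.fp + c.tn + c.fn)
    # F1 is the harmonic mean of precision and sensitivity.
    f1 = _ratio(2 * precision * sensitivity, precision + sensitivity)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "accuracy": accuracy,
        "f1": f1,
    }


# Illustrative counts only; these are NOT results from the publication.
example = ConfusionCounts(tp=40, fp=5, tn=45, fn=10)
metrics = evaluate(example)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Computing the metrics per item (for example, once for randomization and once for blinding) rather than pooled across items matches how the summary reports them "for most items".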