Natural Language Processing Tools and Workflows for Improving Research Processes

Bibliographic Details
Main Authors: Noel Khan, David Elizondo, Lipika Deka, Miguel A. Molina-Cabello
Format: Article
Language: English
Published: MDPI AG 2024-12-01
Series: Applied Sciences
Online Access: https://www.mdpi.com/2076-3417/14/24/11731
Description
Summary: The modern research process involves refining a set of keywords until sufficiently pertinent results are obtained from acceptable sources. References and citations from the most relevant results can then be traced to related works. This process iteratively develops a set of keywords to find the most relevant literature. However, because a keyword-based search essentially samples a corpus, it may be inadequate for capturing a broad or exhaustive understanding of a topic. Further, a keyword-based search depends on the underlying storage and retrieval technology and is essentially a syntactic search rather than a semantic one. To overcome these limitations, this paper explores the use of well-known natural language processing (NLP) techniques to support semantic search, identifying where specific NLP techniques can be employed and what their primary benefits are, and thereby opening further opportunities to improve the research process. The proposed NLP methods were tested through different workflows on different datasets; each workflow was designed to exploit latent relationships within the data to refine the keywords. The results of these tests demonstrated an improvement in the identified literature compared with the literature retrieved using the keywords supplied by the end user. For example, one of the defined workflows reduced the number of search results by two orders of magnitude, yet the reduced result set contained a larger percentage of pertinent results.
ISSN: 2076-3417