KnowVID-19: A Knowledge-Based System to Extract Targeted COVID-19 Information from Online Medical Repositories

We present KnowVID-19, a knowledge-based system that assists medical researchers and scientists in extracting targeted information quickly and efficiently from online medical literature repositories, such as PubMed, PubMed Central, and other biomedical sources. The system utilizes various open-sourc...
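The full abstract (reproduced in the description field of this record) states that KnowVID-19 extracts keywords with RAKE, YAKE, and KeyBERT and builds networks of the most relevant terms with NetworkX. The sketch below illustrates that general idea only and is not the authors' implementation: it uses a simplified RAKE-style score (word degree divided by word frequency, summed per phrase) and counts how often top-ranked phrases share a document. The stopword list, the function names, the `top_k` parameter, and the sample sentences are all hypothetical, and only the Python standard library is used.

```python
import re
from collections import Counter
from itertools import combinations

# Tiny illustrative stopword list (an assumption; real RAKE uses a much larger one).
STOPWORDS = {"a", "an", "and", "the", "of", "in", "to", "with",
             "for", "is", "are", "after", "on", "by"}


def candidate_phrases(text):
    """Split text into candidate phrases at punctuation and stopwords."""
    phrases = []
    for fragment in re.split(r"[^\w\s-]+", text.lower()):
        current = []
        for word in fragment.split():
            if word in STOPWORDS:
                if current:
                    phrases.append(tuple(current))
                current = []
            else:
                current.append(word)
        if current:
            phrases.append(tuple(current))
    return phrases


def rake_scores(text):
    """Score each phrase as the sum of degree/frequency over its words."""
    phrases = candidate_phrases(text)
    freq = Counter()
    degree = Counter()
    for phrase in phrases:
        for word in phrase:
            freq[word] += 1
            degree[word] += len(phrase)  # phrase length, counting the word itself
    word_score = {w: degree[w] / freq[w] for w in freq}
    return {" ".join(p): sum(word_score[w] for w in p) for p in set(phrases)}


def cooccurrence_edges(documents, top_k=3):
    """Count how often two top-ranked phrases appear in the same document."""
    edges = Counter()
    for doc in documents:
        scores = rake_scores(doc)
        # Rank by descending score, breaking ties alphabetically.
        top = sorted(scores, key=lambda p: (-scores[p], p))[:top_k]
        for a, b in combinations(sorted(top), 2):
            edges[(a, b)] += 1
    return edges


# Hypothetical sample sentences, not taken from the KnowVID-19 corpus.
docs = [
    "Long COVID is associated with chronic fatigue.",
    "Chronic fatigue after long COVID.",
]
for (a, b), weight in cooccurrence_edges(docs, top_k=2).items():
    print(a, "--", b, weight)
```

If NetworkX is available, the counted pairs could be converted to (u, v, weight) triples and loaded with `Graph.add_weighted_edges_from`, after which the graph could be exported for a viewer such as Cytoscape. How KnowVID-19 itself weights and filters edges is not stated in this record.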

Bibliographic Details
Main Authors: Muzzamil Aziz, Ioana Popa, Amjad Zia, Andreas Fischer, Sabih Ahmed Khan, Amirreza Fazely Hamedani, Abdul R. Asif
Format: Article
Language: English
Published: MDPI AG 2024-11-01
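Series: Biomolecules
Subjects: knowledge-based system; natural language processing; web crawling and scraping; COVID-19; long COVID; artificial intelligence
Online Access: https://www.mdpi.com/2218-273X/14/11/1411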
_version_ 1846154189817774080
author Muzzamil Aziz
Ioana Popa
Amjad Zia
Andreas Fischer
Sabih Ahmed Khan
Amirreza Fazely Hamedani
Abdul R. Asif
collection DOAJ
description We present KnowVID-19, a knowledge-based system that assists medical researchers and scientists in extracting targeted information quickly and efficiently from online medical literature repositories, such as PubMed, PubMed Central, and other biomedical sources. The system utilizes various open-source machine learning tools, such as GROBID, S2ORC, and BioC, to streamline the processes of data extraction and data mining. Central to the functionality of KnowVID-19 is its keyword-based text classification process, which organizes and categorizes the extracted information. By employing machine learning techniques for keyword extraction (specifically RAKE, YAKE, and KeyBERT), KnowVID-19 systematically categorizes publication data into distinct topics and subtopics. This topic structuring enhances the system’s ability to match user queries with relevant research, improving both the accuracy and efficiency of the search results. In addition, KnowVID-19 leverages the NetworkX Python library to construct networks of the most relevant terms within publications. These networks are then visualized using Cytoscape software, providing a graphical representation of the relationships between key terms. This network visualization allows researchers to easily track emerging trends and developments related to COVID-19, long COVID, and associated topics, facilitating more informed and user-centered exploration of the scientific literature. KnowVID-19 also provides an interactive web application with an intuitive, user-centered interface. This platform supports seamless keyword searching and filtering, as well as a visual network of term associations to help users quickly identify emerging research trends. The responsive design and network visualization enable efficient navigation and access to targeted COVID-19 literature, enhancing both the user experience and the accuracy of data-driven insights.
format Article
id doaj-art-a9b3b165ff624d4fbda79da79ac5fdc5
institution Kabale University
issn 2218-273X
language English
publishDate 2024-11-01
publisher MDPI AG
record_format Article
series Biomolecules
spelling doaj-art-a9b3b165ff624d4fbda79da79ac5fdc5
2024-11-26T17:54:11Z
eng
MDPI AG
Biomolecules
2218-273X
2024-11-01
volume 14, issue 11, article 1411
DOI 10.3390/biom14111411
KnowVID-19: A Knowledge-Based System to Extract Targeted COVID-19 Information from Online Medical Repositories
Muzzamil Aziz: Future Networks, eScience Group, Gesellschaft für Wissenschaftliche Datenverarbeitung mbH Göttingen (GWDG), 37077 Göttingen, Germany
Ioana Popa: Institute for Clinical Chemistry, University Medical Center Göttingen, George-August-University, 37073 Göttingen, Germany
Amjad Zia: Institute for Clinical Chemistry, University Medical Center Göttingen, George-August-University, 37073 Göttingen, Germany
Andreas Fischer: Institute for Clinical Chemistry, University Medical Center Göttingen, George-August-University, 37073 Göttingen, Germany
Sabih Ahmed Khan: Future Networks, eScience Group, Gesellschaft für Wissenschaftliche Datenverarbeitung mbH Göttingen (GWDG), 37077 Göttingen, Germany
Amirreza Fazely Hamedani: Future Networks, eScience Group, Gesellschaft für Wissenschaftliche Datenverarbeitung mbH Göttingen (GWDG), 37077 Göttingen, Germany
Abdul R. Asif: Institute for Clinical Chemistry, University Medical Center Göttingen, George-August-University, 37073 Göttingen, Germany
https://www.mdpi.com/2218-273X/14/11/1411
knowledge-based system
natural language processing
web crawling and scraping
COVID-19
long COVID
artificial intelligence
title KnowVID-19: A Knowledge-Based System to Extract Targeted COVID-19 Information from Online Medical Repositories
topic knowledge-based system
natural language processing
web crawling and scraping
COVID-19
long COVID
artificial intelligence
url https://www.mdpi.com/2218-273X/14/11/1411