Automatically Constructing Multi-Dimensional Resource Space by Extracting Class Trees From Texts for Operating and Analyzing Texts From Multiple Abstraction Dimensions

Abstraction is a key part of understanding and representation. Discovering different abstraction dimensions in a large set of texts helps in understanding the texts from multiple dimensions and therefore supports the multi-dimensional operations required by advanced applications. This paper proposes a low-cost...

Bibliographic Details
Main Authors: Jian Zhou, Jiazheng Li, Sirui Zhuge, Hai Zhuge
Format: Article
Language: English
Published: IEEE 2025-01-01
Series: IEEE Access
Subjects: Abstraction; dimension; natural language processing; pattern; resource space; subclass relation
Online Access: https://ieeexplore.ieee.org/document/10798421/
_version_ 1841550786863038464
author Jian Zhou
Jiazheng Li
Sirui Zhuge
Hai Zhuge
author_sort Jian Zhou
collection DOAJ
description Abstraction is a key part of understanding and representation. Discovering different abstraction dimensions in a large set of texts helps in understanding the texts from multiple dimensions and therefore supports the multi-dimensional operations required by advanced applications. This paper proposes a low-cost approach to automatically discovering abstraction dimensions, represented as class trees, in texts. The approach consists of three steps: 1) extract subclass relations from the input texts based on modifier and syntactic patterns; 2) construct class trees from the extracted subclass relations; and 3) select independent class trees with high coverage of the texts as abstraction dimensions. The correctness and feasibility of the approach are validated on seven data sets of different types. The average precision, recall, and F1-score of the subclass relations extracted by the proposed approach all exceed 85%. Applying the approach to managing GitHub projects demonstrates that searching on the class trees ensures strong relevance between queries and results, quickly reduces the search space, and supports effective project management. The proposed approach not only greatly extends the pattern-based approach to finding abstraction relations in texts with high coverage but also verifies the feasibility of automatically extracting abstraction dimensions from texts. It can be applied to efficiently manage large-scale text resources from different dimensions to support advanced applications.
format Article
id doaj-art-a004f08cb24a4b218b64c42837813d24
institution Kabale University
issn 2169-3536
language English
publishDate 2025-01-01
publisher IEEE
record_format Article
series IEEE Access
spelling doaj-art-a004f08cb24a4b218b64c42837813d24
2025-01-10T00:01:22Z
eng
IEEE
IEEE Access
2169-3536
2025-01-01
13
4737
4758
10.1109/ACCESS.2024.3516872
10798421
Automatically Constructing Multi-Dimensional Resource Space by Extracting Class Trees From Texts for Operating and Analyzing Texts From Multiple Abstraction Dimensions
Jian Zhou (https://orcid.org/0000-0001-8674-6062), Key Laboratory of Intelligent Information Processing, Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China
Jiazheng Li (https://orcid.org/0009-0008-6451-4787), Key Laboratory of Intelligent Information Processing, Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China
Sirui Zhuge, King’s College London, London, U.K.
Hai Zhuge, Great Bay University, Dongguan, China
https://ieeexplore.ieee.org/document/10798421/
Abstraction
dimension
natural language processing
pattern
resource space
subclass relation
title Automatically Constructing Multi-Dimensional Resource Space by Extracting Class Trees From Texts for Operating and Analyzing Texts From Multiple Abstraction Dimensions
title_sort automatically constructing multi dimensional resource space by extracting class trees from texts for operating and analyzing texts from multiple abstraction dimensions
topic Abstraction
dimension
natural language processing
pattern
resource space
subclass relation
url https://ieeexplore.ieee.org/document/10798421/