Automatically Constructing Multi-Dimensional Resource Space by Extracting Class Trees From Texts for Operating and Analyzing Texts From Multiple Abstraction Dimensions
Abstraction is a key part of understanding and representation. Discovering different abstraction dimensions on a large set of texts can help understand the texts from multiple dimensions and therefore support the multi-dimensional operations required by advanced applications. This paper proposes a low-cost...
Main Authors: | Jian Zhou, Jiazheng Li, Sirui Zhuge, Hai Zhuge |
---|---|
Format: | Article |
Language: | English |
Published: | IEEE, 2025-01-01 |
Series: | IEEE Access |
Subjects: | Abstraction; dimension; natural language processing; pattern; resource space; subclass relation |
Online Access: | https://ieeexplore.ieee.org/document/10798421/ |
_version_ | 1841550786863038464 |
---|---|
author | Jian Zhou, Jiazheng Li, Sirui Zhuge, Hai Zhuge |
author_facet | Jian Zhou, Jiazheng Li, Sirui Zhuge, Hai Zhuge |
author_sort | Jian Zhou |
collection | DOAJ |
description | Abstraction is a key part of understanding and representation. Discovering different abstraction dimensions on a large set of texts can help understand the texts from multiple dimensions and therefore support the multi-dimensional operations required by advanced applications. This paper proposes a low-cost approach to automatically discovering abstraction dimensions, represented as class trees, on texts. The approach consists of three steps: 1) extract subclass relations from input texts based on modifier patterns and syntactic patterns; 2) construct class trees based on the extracted subclass relations; and 3) select independent class trees with high coverage on texts as abstraction dimensions. The correctness and feasibility of the approach are validated on seven data sets of different types. The average precision, recall and F1-score of the subclass relations extracted by the proposed approach are all greater than 85%. The application of the proposed approach to managing GitHub projects demonstrates that searching on the class trees ensures strong relevance between query and return, can quickly reduce the search space, and supports effective management of projects. The proposed approach not only greatly extends the pattern-based approach to finding abstraction relations from texts with high coverage but also verifies the feasibility of automatically extracting abstraction dimensions from texts. It can be applied to efficiently manage large-scale text resources from different dimensions to support advanced applications. |
format | Article |
id | doaj-art-a004f08cb24a4b218b64c42837813d24 |
institution | Kabale University |
issn | 2169-3536 |
language | English |
publishDate | 2025-01-01 |
publisher | IEEE |
record_format | Article |
series | IEEE Access |
spelling | doaj-art-a004f08cb24a4b218b64c42837813d24; 2025-01-10T00:01:22Z; eng; IEEE; IEEE Access; 2169-3536; 2025-01-01; 13; 4737; 4758; 10.1109/ACCESS.2024.3516872; 10798421; Jian Zhou (0), https://orcid.org/0000-0001-8674-6062; Jiazheng Li (1), https://orcid.org/0009-0008-6451-4787; Sirui Zhuge (2); Hai Zhuge (3); Key Laboratory of Intelligent Information Processing, Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China; Key Laboratory of Intelligent Information Processing, Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China; King’s College London, London, U.K.; Great Bay University, Dongguan, China; https://ieeexplore.ieee.org/document/10798421/ |
title | Automatically Constructing Multi-Dimensional Resource Space by Extracting Class Trees From Texts for Operating and Analyzing Texts From Multiple Abstraction Dimensions |
topic | Abstraction; dimension; natural language processing; pattern; resource space; subclass relation |
url | https://ieeexplore.ieee.org/document/10798421/ |
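The three steps named in the description (extract subclass relations, construct class trees, select independent trees with high coverage) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the record does not specify the patterns or the selection criteria, so everything below is an assumption made for illustration. In particular, the modifier pattern is assumed to mean "dropping the leading modifier of a noun phrase yields its superclass", the syntactic pattern is reduced to a single "X is a Y" regular expression, independence is approximated as "no shared class names", and coverage is a plain substring match. All function names, the `min_coverage` parameter, and the sample texts are hypothetical.

```python
import re
from collections import defaultdict

def modifier_relations(phrase):
    """Assumed modifier pattern: 'deep neural network' -> 'neural network' -> 'network'."""
    words = phrase.lower().split()
    return [(" ".join(words[i:]), " ".join(words[i + 1:]))
            for i in range(len(words) - 1)]

# Assumed syntactic pattern: "X is a Y" / "X is a kind of Y" (one sentence per string).
IS_A = re.compile(
    r"^(?:(?:an?|the)\s+)?(.+?)\s+is\s+an?\s+(?:(?:kind|type)\s+of\s+)?(.+?)\.?$")

def syntactic_relation(sentence):
    """Return (subclass, superclass) if the sentence matches the pattern, else None."""
    m = IS_A.match(sentence.strip().lower())
    return (m.group(1), m.group(2)) if m else None

def build_forest(relations):
    """Step 2: index children by superclass; roots are classes with no superclass."""
    children, has_parent = defaultdict(set), set()
    for sub, sup in relations:
        if sub != sup:
            children[sup].add(sub)
            has_parent.add(sub)
    roots = sorted(n for n in children if n not in has_parent)
    return children, roots

def tree_nodes(children, root):
    """Collect every class reachable from root (the seen set guards against cycles)."""
    seen, stack = set(), [root]
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(children.get(node, ()))
    return seen

def coverage(nodes, texts):
    """Fraction of texts mentioning at least one class of the tree (substring match)."""
    hits = sum(any(n in t.lower() for n in nodes) for t in texts)
    return hits / len(texts) if texts else 0.0

def select_dimensions(children, roots, texts, min_coverage=0.5):
    """Step 3: greedily keep high-coverage trees that share no class with kept trees."""
    ranked = sorted(((coverage(tree_nodes(children, r), texts), r) for r in roots),
                    reverse=True)
    chosen, used = [], set()
    for cov, root in ranked:
        nodes = tree_nodes(children, root)
        if cov >= min_coverage and not (nodes & used):
            chosen.append(root)
            used |= nodes
    return chosen

# Hypothetical input: texts plus noun phrases assumed to be already extracted.
texts = ["A transformer is a neural network.",
         "This repo trains a deep neural network.",
         "A convolutional neural network classifies images.",
         "Flask is a web framework."]
phrases = ["deep neural network", "convolutional neural network", "web framework"]

relations = {r for p in phrases for r in modifier_relations(p)}
relations |= {r for r in map(syntactic_relation, texts) if r}
children, roots = build_forest(relations)
dimensions = select_dimensions(children, roots, texts, min_coverage=0.2)
print(roots, dimensions)
```

A real pipeline would need a parser or chunker to find noun phrases and a richer pattern set. A class may also end up with several superclasses, so the result here is strictly a forest-like graph rather than guaranteed trees.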