Adaptive Hierarchical Text Classification Using ERNIE and Dynamic Threshold Pruning

Hierarchical Text Classification (HTC) is a challenging task where labels are structured in a tree or Directed Acyclic Graph (DAG) format. Current approaches often struggle with data imbalance and fail to fully capture rich semantic information. This paper proposes an Adaptive Hierarchical Text Classification method, EDTPA (ERNIE and Dynamic Threshold Pruning-based Adaptive classification), which leverages Large Language Models (LLMs) for data augmentation to mitigate imbalanced datasets. The model first uses Graph Attention Networks (GAT) to capture hierarchical dependencies among labels, effectively modeling structured relationships. ERNIE enhances the semantic representation of both the text and hierarchical labels, optimizing the model's ability to process Chinese text. An attention mechanism strengthens the alignment between text and labels, improving accuracy. The model combines global and local information flows, while dynamic threshold pruning prunes low-probability branches, improving interpretability. Results on the Chinese Scientific Literature (CSL) dataset show EDTPA significantly outperforms baseline models in both Micro-F1 and Macro-F1 scores, effectively addressing data imbalance and improving classification performance.
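The abstract's dynamic threshold pruning step can be pictured as a top-down walk of the label hierarchy that discards low-probability branches before their children are ever scored. The paper's exact pruning rule is not given in this record; the sketch below is a minimal illustration of threshold-based pruning over a label tree, where the tree, the label probabilities, and the fixed threshold are all invented for demonstration.

```python
# Illustrative sketch of threshold-based pruning over a label hierarchy.
# The actual EDTPA rule is not specified in this record; the tree, the
# probabilities, and the threshold below are hypothetical.

def prune_labels(tree, probs, threshold=0.5, root="ROOT"):
    """Top-down traversal: a label is kept only if its probability clears
    the threshold and its parent was kept; pruned branches are never
    expanded, so their descendants are skipped entirely."""
    kept = []
    stack = [root]
    while stack:
        node = stack.pop()
        if node != root and probs.get(node, 0.0) < threshold:
            continue  # prune this branch: children are not pushed
        if node != root:
            kept.append(node)
        stack.extend(tree.get(node, []))
    return kept

# Hypothetical two-level hierarchy and model scores
tree = {"ROOT": ["Science", "Arts"],
        "Science": ["Physics", "Biology"],
        "Arts": ["Music"]}
probs = {"Science": 0.9, "Physics": 0.8, "Biology": 0.3,
         "Arts": 0.2, "Music": 0.95}

print(sorted(prune_labels(tree, probs, threshold=0.5)))
# → ['Physics', 'Science']
```

Note that "Music" scores 0.95 but is never considered, because its parent "Arts" (0.2) is pruned first; this parent-gated behavior is what distinguishes hierarchical pruning from flat per-label thresholding.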

Bibliographic Details
Main Authors: Han Chen, Yangsen Zhang, Yuru Jiang, Ruixue Duan
Format: Article
Language: English
Published: IEEE, 2024-01-01
Series: IEEE Access
Subjects: Hierarchical text classification; data augmentation; large language model; graph attention network; ERNIE
Online Access: https://ieeexplore.ieee.org/document/10807255/
collection DOAJ
format Article
id doaj-art-60f18b9d756b4fa0a666b98e4b02362f
institution Kabale University
issn 2169-3536
language English
publishDate 2024-01-01
publisher IEEE
record_format Article
series IEEE Access
spelling doaj-art-60f18b9d756b4fa0a666b98e4b02362f (record updated 2025-01-16T00:02:02Z)
Han Chen, Yangsen Zhang, Yuru Jiang, and Ruixue Duan, "Adaptive Hierarchical Text Classification Using ERNIE and Dynamic Threshold Pruning," IEEE Access, vol. 12, pp. 193641-193652, 2024-01-01. ISSN 2169-3536. DOI: 10.1109/ACCESS.2024.3519954 (IEEE document 10807255).
ORCID: Han Chen https://orcid.org/0009-0008-9408-5979; Yuru Jiang https://orcid.org/0000-0002-0947-2640; Ruixue Duan https://orcid.org/0000-0002-4478-1692
Affiliation (all authors): Institute of Intelligent Information Processing, Beijing Information Science and Technology University, Beijing, China
Online Access: https://ieeexplore.ieee.org/document/10807255/
Keywords: Hierarchical text classification; data augmentation; large language model; graph attention network; ERNIE
title Adaptive Hierarchical Text Classification Using ERNIE and Dynamic Threshold Pruning
topic Hierarchical text classification
data augmentation
large language model
graph attention network
ERNIE
url https://ieeexplore.ieee.org/document/10807255/