Semantic aware enhanced event causality identification

Bibliographic Details
Main Authors: Xinfang Liu, Wenzhong Yang, Fuyuan Wei, Zhonghua Wu
Format: Article
Language: English
Published: Nature Portfolio 2024-12-01
Series: Scientific Reports
Online Access: https://doi.org/10.1038/s41598-024-83678-9
Description
Summary: Event Causality Identification (ECI) aims to predict causal relations between events in a text. Existing research primarily leverages external knowledge, such as knowledge graphs and dependency trees, to construct explicit structured features that enrich event representations. However, this approach underuses the semantic features of the original input sentences and performs poorly in capturing implicit causal relations. To address these issues, this paper proposes a new framework based on Hierarchical Feature Extraction and Prompt-aware Attention (HFEPA). First, we introduce a Hierarchical Feature Extraction (HFE) module that extracts two kinds of features from the input sentences, at the event mention level and at the segment level, enriching the semantic information of events through the interaction between event pairs and different segments. Second, we design a Prompt-aware Attention (PAA) module that uses implicit causal knowledge in pre-trained language models to capture potential relationship information between events. This information is then combined with the contextual information of the text sequence to enhance the model’s ability to identify implicit causal relations between events. In addition, research on this task has progressed relatively slowly for Chinese because annotated datasets are limited in scale. To ease this data scarcity, we propose a new Chinese ECI dataset, Chinese News Causality (CNC). It contains 25,629 event mentions and 5,569 causal event pairs, making it, to our knowledge, the largest Chinese dataset for this task to date. We evaluate HFEPA on both the EventStoryLine and CNC datasets, and experimental results show that it significantly outperforms previous methods. The CNC dataset is available at https://github.com/twinkle121/CNC.
ISSN: 2045-2322
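
Illustrative note: the summary describes segment-level features and a prompt built around an event pair, but this record gives no implementation details. The sketch below is only one plausible reading, not the authors' method. It assumes that "segments" are the token spans before, between, and after the two event mentions, and that the prompt is a cloze-style template with a mask token. The names `EventPairInput`, `split_segments`, and `build_prompt`, as well as the template wording, are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class EventPairInput:
    """A tokenized sentence plus two event mentions (hypothetical structure)."""
    tokens: List[str]
    e1: Tuple[int, int]  # [start, end) token span of the first event mention
    e2: Tuple[int, int]  # [start, end) token span of the second event mention


def mention_text(x: EventPairInput, span: Tuple[int, int]) -> str:
    """Return the surface text of an event mention."""
    return " ".join(x.tokens[span[0]:span[1]])


def split_segments(x: EventPairInput) -> Dict[str, List[str]]:
    """Assumed segmentation: tokens before, between, and after the mentions."""
    (s1, t1), (s2, t2) = sorted([x.e1, x.e2])
    return {
        "before": x.tokens[:s1],
        "between": x.tokens[t1:s2],
        "after": x.tokens[t2:],
    }


def build_prompt(x: EventPairInput, mask_token: str = "[MASK]") -> str:
    """Assumed cloze template: a masked LM would fill the relation slot."""
    sentence = " ".join(x.tokens)
    e1, e2 = mention_text(x, x.e1), mention_text(x, x.e2)
    return f"{sentence}. In this sentence, {e1} {mask_token} {e2}."


example = EventPairInput(
    tokens="Heavy rain caused severe flooding in the valley".split(),
    e1=(1, 2),  # "rain"
    e2=(4, 5),  # "flooding"
)
segments = split_segments(example)
prompt = build_prompt(example)
```

In a setup like this, the segment lists would feed a segment-level encoder, and the prompt would be passed to a pre-trained masked language model whose prediction at the mask position signals whether the pair is causal. Both steps are outside this sketch.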