Contextual semantics graph attention network model for entity resolution
| Main Authors: | , , , |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | Nature Portfolio, 2025-07-01 |
| Series: | Scientific Reports |
| Subjects: | |
| Online Access: | https://doi.org/10.1038/s41598-025-11932-9 |
| Summary: | Entity resolution is the process of determining whether data from different knowledge bases refer to the same real-world entity. Existing research takes entity pairs as input and makes judgments based on the characteristics of those pairs. However, contextual semantics are underused: existing methods fail to effectively model the token-attribute associations within data sources and the cross-attribute semantic hierarchical relationships, which weakens the discriminative power of key attributes. Moreover, they fail to handle polysemous ambiguities, because conventional graph neural networks adopt rigid node representations that cannot dynamically adjust word meanings according to attribute-specific contexts. To address these issues, this paper proposes the Contextual Semantics Graph Attention Network (CSGAT), which extracts contextual information at the token and attribute levels to generate semantically fused embeddings. The advantages of CSGAT are: 1) leveraging the Transformer self-attention mechanism to extract feature vectors of words, model sequence relationships, and calculate each word's degree of relevance to other words; 2) applying an attention mechanism to attribute-level contextual information to extract semantic embeddings that enrich the attribute embeddings, making them more discriminative; 3) using a graph attention network to generate residual vectors for the final entity resolution decision. Experiments on the Amazon-Google and BeerAdvo-RateBeer datasets show that, compared with competing methods, CSGAT achieves a significantly improved F1-score with good Precision and Recall. Code is available at https://github.com/xhtech2024/csgat . |
|---|---|
| ISSN: | 2045-2322 |