A weight clustering algorithm based on sliding window model for stream data
| Main Authors: | , , |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | Nature Portfolio, 2025-07-01 |
| Series: | Scientific Reports |
| Subjects: | |
| Online Access: | https://doi.org/10.1038/s41598-025-96696-y |
| Summary: | Streaming data, characterized by its temporal variations and large volumes, presents unique challenges for clustering tasks. To address these challenges, this paper proposes a novel weighted clustering approach specifically designed for streaming data. The proposed method begins with an in-depth analysis of concept drift features in streaming data, followed by the development of a weight parameter calculation technique. Building on this, we introduce a sliding window model clustering algorithm, which incorporates detailed threshold calculation processes to enhance clustering accuracy. The algorithm operates in two key stages: (1) constructing a sliding window tailored to the characteristics of streaming data to perform intra-window clustering, and (2) merging clusters within the landmark window to achieve global clustering. Extensive experiments are conducted on diverse datasets to validate the algorithm’s effectiveness. Results on static datasets reveal that while the algorithm struggles with precise clustering, it achieves low runtime and misclassification rates. In contrast, experiments on concept-drifting datasets demonstrate that the algorithm, when combined with appropriate weight parameters, achieves accurate clustering with minimal misclassification rates. These findings highlight the algorithm’s adaptability to dynamic data environments and its potential for real-world applications in streaming data analysis. |
|---|---|
| ISSN: | 2045-2322 |