Unsupervised feature selection in binarization of real attributes for conceptual clustering

Bibliographic Details
Main Authors: Shkaberina Guzel, Masich Igor, Markushin Egor, Kraeva Ekaterina
Format: Article
Language: English
Published: EDP Sciences 2025-01-01
Series: ITM Web of Conferences
Online Access: https://www.itm-conferences.org/articles/itmconf/pdf/2025/03/itmconf_hmmocs-III2024_04004.pdf
Description
Summary: This paper proposes an approach, based on Formal Concept Analysis (FCA), for processing noisy data to form homogeneous subgroups of objects. The approach involves binary encoding of heterogeneous features and unsupervised feature selection using the Laplacian Score. The selected feature set is then used to generate formal concepts. The main idea of our research is to use the concepts derived through FCA as new features for clustering. This process transforms the original feature space into a concept-driven space, where each feature corresponds to the extent of a derived concept. The proposed approach enhances clustering performance in the presence of noise, outperforming the traditional K-means clustering algorithm in terms of cluster coherence and accuracy. By using concept-based features, the method better captures the underlying structure of the data, leading to more robust and meaningful groupings than conventional attribute-based clustering techniques.
ISSN: 2271-2097
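
The pipeline outlined in the summary (binary encoding, Laplacian Score selection, concept extents as new features) can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation. Everything specific in it is an assumption made for illustration: the function names, the hand-picked thresholds in `cuts`, the fully connected heat-kernel graph with parameter `t` (the record does not say how the similarity graph is built), and the toy data. Lower Laplacian Scores are treated as better, following the usual definition of the score.

```python
import math

def binarize(rows, cuts):
    """Binary-encode real-valued rows: one attribute 'x[j] >= c' per (j, c) in cuts.
    The thresholds in `cuts` are hypothetical; the paper's encoding may differ."""
    return [[int(row[j] >= c) for j, c in cuts] for row in rows]

def laplacian_scores(B, t=1.0):
    """Laplacian Score of each column of the binary matrix B (lower is better).
    Assumption: fully connected graph with weights exp(-||xi - xj||^2 / t)."""
    n, m = len(B), len(B[0])
    S = [[0.0 if i == j else
          math.exp(-sum((a - b) ** 2 for a, b in zip(B[i], B[j])) / t)
          for j in range(n)] for i in range(n)]
    d = [sum(row) for row in S]          # vertex degrees (diagonal of D)
    scores = []
    for r in range(m):
        f = [B[i][r] for i in range(n)]
        mean = sum(fi * di for fi, di in zip(f, d)) / sum(d)
        f = [fi - mean for fi in f]      # remove the degree-weighted mean
        # f^T L f with L = D - S equals 1/2 * sum_ij S_ij (f_i - f_j)^2
        num = 0.5 * sum(S[i][j] * (f[i] - f[j]) ** 2
                        for i in range(n) for j in range(n))
        den = sum(di * fi * fi for fi, di in zip(f, d))   # f^T D f
        scores.append(num / den if den > 1e-12 else math.inf)
    return scores

def select_features(B, k, t=1.0):
    """Indices of the k columns with the lowest Laplacian Score."""
    scores = laplacian_scores(B, t)
    return sorted(sorted(range(len(scores)), key=scores.__getitem__)[:k])

def concept_extents(B):
    """Extents of all formal concepts of the context B (objects x attributes).
    They are the intersections of attribute extents, plus the full object set."""
    n = len(B)
    extents = {frozenset(range(n))}
    for a in range(len(B[0])):
        ext = frozenset(i for i in range(n) if B[i][a])
        extents |= {e & ext for e in extents}
    return sorted(extents, key=lambda e: (-len(e), sorted(e)))

def concept_features(B):
    """Concept-driven feature space: column k is the indicator of extent k."""
    extents = concept_extents(B)
    return [[int(i in e) for e in extents] for i in range(len(B))]

# Toy data (hypothetical): four objects, two real attributes, three thresholds.
rows = [[0.1, 5.0], [0.2, 5.1], [0.9, 1.0], [1.0, 1.2]]
cuts = [(0, 0.5), (1, 3.0), (0, 0.15)]
B = binarize(rows, cuts)
keep = select_features(B, k=2)
B_selected = [[row[j] for j in keep] for row in B]
F = concept_features(B_selected)   # input for any clustering algorithm
```

The matrix `F` can then be passed to a clustering algorithm of choice. Note that the number of formal concepts can grow exponentially with the number of attributes, which is one practical reason to select a small attribute subset before generating concepts.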