Label-Informed Outlier Detection Based on Granule Density
By: Baiyang Chen, Zhong Yuan, Dezhong Peng and more
Outlier detection, crucial for identifying unusual patterns with significant implications across numerous applications, has drawn considerable research interest. Existing semi-supervised methods typically treat data as purely numerical and in a deterministic manner, thereby neglecting the heterogeneity and uncertainty inherent in complex, real-world datasets. This paper introduces a label-informed outlier detection method for heterogeneous data based on Granular Computing and Fuzzy Sets, namely Granule Density-based Outlier Factor (GDOF). Specifically, GDOF first employs label-informed fuzzy granulation to effectively represent various data types and develops granule density for precise density estimation. Subsequently, granule densities from individual attributes are integrated for outlier scoring by assessing attribute relevance with a limited number of labeled outliers. Experimental results on various real-world datasets show that GDOF stands out in detecting outliers in heterogeneous data with a minimal number of labeled outliers. The integration of Fuzzy Sets and Granular Computing in GDOF offers a practical framework for outlier detection in complex and diverse data types. All relevant datasets and source code are publicly available for further research. This is the author's accepted manuscript of a paper published in IEEE Transactions on Fuzzy Systems. The final version is available at https://doi.org/10.1109/TFUZZ.2024.3514853
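The pipeline outlined in the abstract (per-attribute fuzzy granulation, a density estimate per granule, and relevance-weighted integration guided by a few labeled outliers) can be illustrated with a small sketch. This is not the paper's formulation: the similarity functions, the density definition (mean fuzzy membership), the relevance weighting, and all function names below are assumptions chosen only to make the three steps concrete.

```python
from typing import List, Sequence, Union

Value = Union[float, str]


def fuzzy_similarity(column: Sequence[Value]) -> List[List[float]]:
    """Pairwise fuzzy similarity for one attribute.

    Assumed forms (not taken from the paper): range-normalized distance
    for numerical values, exact match for categorical values.
    """
    n = len(column)
    if all(isinstance(v, (int, float)) for v in column):
        span = (max(column) - min(column)) or 1.0  # avoid division by zero
        return [[1.0 - abs(column[i] - column[j]) / span for j in range(n)]
                for i in range(n)]
    return [[1.0 if column[i] == column[j] else 0.0 for j in range(n)]
            for i in range(n)]


def granule_density(sim: List[List[float]]) -> List[float]:
    """Density of each sample's fuzzy granule: mean membership over all samples."""
    n = len(sim)
    return [sum(row) / n for row in sim]


def gdof_like_scores(columns: Sequence[Sequence[Value]],
                     labeled_outliers: Sequence[int]) -> List[float]:
    """Relevance-weighted outlier scores in [0, 1]; higher means more outlying.

    Assumption: an attribute is more relevant when the known outliers
    have low granule density on it.
    """
    if not labeled_outliers:
        raise ValueError("at least one labeled outlier is required")
    densities = [granule_density(fuzzy_similarity(col)) for col in columns]
    raw = [1.0 - sum(d[i] for i in labeled_outliers) / len(labeled_outliers)
           for d in densities]
    total = sum(raw) or 1.0
    weights = [r / total for r in raw]
    n = len(columns[0])
    return [sum(w * (1.0 - d[i]) for w, d in zip(weights, densities))
            for i in range(n)]


# Toy heterogeneous data: one numerical and one categorical attribute.
columns = [
    [1.0, 1.1, 0.9, 1.2, 9.0],
    ["a", "a", "b", "a", "z"],
]
scores = gdof_like_scores(columns, labeled_outliers=[4])
most_outlying = max(range(len(scores)), key=scores.__getitem__)
```

In this toy setup the labeled sample is isolated on both attributes, so both attributes receive similar weight. With real data, an attribute on which the labeled outliers look ordinary would be down-weighted under the same assumption.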
Similar Papers
Fuzzy Granule Density-Based Outlier Detection with Multi-Scale Granular Balls
Machine Learning (CS)
Finds weird data points in many ways.
Finding Time Series Anomalies using Granular-ball Vector Data Description
Machine Learning (CS)
Finds weird patterns in changing information.
GBFRS: Robust Fuzzy Rough Sets via Granular-ball Computing
Artificial Intelligence
Makes computers better at finding important information.