An experimental study of existing tools for outlier detection and cleaning in trajectories
By: Mariana M Garcez Duarte, Mahmoud Sakr
Potential Business Impact:
Finds and removes bad data points in movement paths.
Outlier detection and cleaning are essential data preprocessing steps that protect the integrity and validity of downstream analyses. This paper focuses on outlier points within individual trajectories, i.e., points that deviate significantly from the rest of the trajectory they belong to. We experiment with ten open-source libraries, comparing their efficiency and accuracy in identifying and cleaning outliers. The experiment evaluates the libraries as they are offered to end users, so the results reflect real-world applicability. We also introduce a method for establishing ground truth and aim to guide users in choosing the most appropriate tool for their specific outlier detection needs. Furthermore, we survey state-of-the-art algorithms for outlier detection and classify them into five types: statistic-based methods, sliding-window algorithms, clustering-based methods, graph-based methods, and heuristic-based methods. Our research provides insights into these libraries' performance and contributes to the development of data preprocessing and outlier detection methodologies.
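To make the notion of an outlier point inside a single trajectory concrete, the following is a minimal sketch of one heuristic-based approach: flag any point that could only be reached from the last accepted point at an implausible speed. It is an illustration only, not the method of the paper or of any of the ten libraries it evaluates. The function names, the `(t_seconds, lat, lon)` tuple layout, and the 50 m/s threshold are all assumptions chosen for the example, and the first point is assumed to be valid.

```python
import math


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two WGS84 coordinates."""
    earth_radius_m = 6371000.0  # mean Earth radius
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * earth_radius_m * math.asin(math.sqrt(a))


def speed_outliers(points, max_speed_mps=50.0):
    """Return indices of points flagged as outliers by a speed heuristic.

    points: list of (t_seconds, lat, lon) tuples sorted by time
            (hypothetical layout, assumed for this sketch).
    max_speed_mps: assumed plausibility threshold in metres per second.
    """
    flagged = []
    last = None  # last point accepted as valid
    for i, (t, lat, lon) in enumerate(points):
        if last is None:
            last = (t, lat, lon)  # assumption: the first point is valid
            continue
        dt = t - last[0]
        if dt <= 0:
            # Duplicate or out-of-order timestamp: speed is undefined.
            flagged.append(i)
            continue
        speed = haversine_m(last[1], last[2], lat, lon) / dt
        if speed > max_speed_mps:
            flagged.append(i)
        else:
            last = (t, lat, lon)
    return flagged


def clean(points, max_speed_mps=50.0):
    """Return a copy of the trajectory without the flagged points."""
    bad = set(speed_outliers(points, max_speed_mps))
    return [p for i, p in enumerate(points) if i not in bad]
```

Comparing each point against the last accepted point, rather than against its immediate predecessor, keeps a single spike from also causing the following valid point to be flagged. The trade-off is that an invalid first point would cause every later point to be rejected, which is one reason a ground truth is needed when comparing tools.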