
A New Single Linkage Robust Clustering Outlier Detection Procedures for Multivariate Data

Outliers are abnormal observations, and detecting them in multivariate data has long been of interest. Unlike in the univariate case, visual inspection is not sufficient for detecting outliers in multivariate data. In this study, we developed a new single linkage robust clustering outlier detection procedure for multivariate data. A robust estimator, Test on Covariance (TOC), is used to robustify the similarity distance measure, producing a robust single linkage clustering. The performance of the new procedure is investigated via a simulation study using three outlier scenarios, with historical multivariate datasets as illustrative examples. Three performance measures are used: pout, pmask, and pswamp. The new procedure is also compared with single linkage clustering using Euclidean and Mahalanobis distances as similarity distance measures, as well as with TOC. The new procedure performs well in Outlier Scenario 3, where the mean and covariance matrix are shifted. It also performs well on the datasets: it successfully detects all outliers and shows no masking effect in two of the five datasets, and it shows no swamping effect in any dataset. In conclusion, the new single linkage robust clustering outlier detection procedure is a practical and promising approach for simultaneously identifying multiple outliers in multivariate data.
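The general idea of clustering-based outlier detection with a covariance-scaled distance can be sketched in Python as follows. This is only an illustration under stated assumptions, not the procedure from the study: the TOC estimator is not reproduced here, so the covariance matrix is supplied by the caller, and the rule for reading outliers off the tree (cut into a fixed number of clusters and treat the largest one as the clean data) is an assumption made for this example. The function name `single_linkage_outliers` and its parameters are hypothetical.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import pdist


def single_linkage_outliers(X, cov, n_clusters=2):
    """Flag observations that fall outside the largest single linkage cluster.

    X          : (n, p) data matrix.
    cov        : (p, p) covariance estimate used in the Mahalanobis distance.
                 A robust estimate would go here; the choice is left to the caller.
    n_clusters : where to cut the tree (an assumption of this sketch).
    """
    VI = np.linalg.inv(cov)                     # inverse covariance for the distance
    d = pdist(X, metric="mahalanobis", VI=VI)   # condensed pairwise distances
    Z = linkage(d, method="single")             # single linkage hierarchy
    labels = fcluster(Z, t=n_clusters, criterion="maxclust")
    main = np.bincount(labels).argmax()         # label of the largest cluster
    return labels != main                       # True marks a suspected outlier


# Toy data: a 5 x 5 grid of clean points plus two distant points.
grid = [(0.5 * i, 0.5 * j) for i in range(5) for j in range(5)]
X = np.array(grid + [(10.0, 10.0), (10.3, 10.1)])

# Placeholder covariance: the identity reduces the distance to Euclidean.
cov = np.eye(2)

flags = single_linkage_outliers(X, cov)
print(np.flatnonzero(flags))  # indices of the flagged observations
```

Passing the classical sample covariance gives the ordinary Mahalanobis variant, while passing a robust covariance estimate gives a robustified distance in the spirit of the approach described above.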

Related Results

Investigating Outlier Detection Techniques Based on Kernel Rough Clustering
Background: Data quality is crucial to the success of big data analytics. However, the presence of outliers affects data quality and data analysis. Employing effective outlier dete...
A Monte Carlo-Based Outlier Diagnosis Method for Sensitivity Analysis
An iterative outlier elimination procedure based on hypothesis testing, commonly known as Iterative Data Snooping (IDS) among geodesists, is often used for the quality control of t...
A Monte Carlo-Based Outlier Diagnosis Method for Sensitivity Analysis
An iterative outlier elimination procedure based on hypothesis testing, commonly known as Iterative Data Snooping (IDS) among geodesists, is often used for the quality control of m...
Outlier Detection Method based on Adaptive Clustering Method and Density Peak
The outlier detection technique is widely used in the data analysis for the clustering of data. Many techniques have been applied in the outlier detection to increase the efficienc...
Optimasi Algoritma K-Nearest Neighbors Berdasarkan Perbandingan Analisis Outlier (Berbasis Jarak, Kepadatan, LOF) [Optimization of the K-Nearest Neighbors Algorithm Based on a Comparison of Outlier Analyses (Distance-Based, Density-Based, LOF)]
The growth of data occurring today affects data analysis in various fields, such as astronomy, business, medicine, education, and finance. The collected data ...
Outlier Detection and Correction for the Deviations of Tooth Profiles of Gears
To decrease the influence of outlier on the measurement of tooth profiles, this paper proposes a method of outlier detection and correction based on the grey system theory. After s...
