Streaming Hierarchical Clustering Based on Point-Set Kernel
Abstract
Hierarchical clustering produces a cluster tree with different granularities. As a result, it provides richer information and insight into a dataset than partitioning clustering. However, hierarchical clustering algorithms often have two weaknesses: poor scalability and a limited capacity to handle clusters of varying densities. This is because they rely on pairwise point-based similarity calculations, and the similarity measure is independent of the data distribution. In this paper, we aim to overcome these weaknesses and propose a novel, efficient hierarchical clustering algorithm called StreaKHC that enables massive streaming data to be mined. The enabling factor is the use of a scalable point-set kernel to measure the similarity between an existing cluster in the cluster tree and a new point in the data stream. StreaKHC also has an efficient mechanism to update the hierarchical structure so that a high-quality cluster tree can be maintained in real time. Our extensive empirical evaluation shows that StreaKHC is more accurate and more efficient than existing hierarchical clustering algorithms.
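To illustrate why a point-set kernel can avoid pairwise point-to-point comparisons, here is a minimal sketch. It is not the StreaKHC implementation. The abstract does not define the kernel, so the sketch assumes one common formulation: the similarity between a point x and a cluster is the dot product of a feature map phi(x) with the mean feature vector of the cluster. The feature map used here (random Fourier features approximating a Gaussian kernel) is a stand-in chosen only for brevity; unlike the kernel the abstract describes, it does not adapt to the data distribution. The names make_feature_map, ClusterSummary, and most_similar are hypothetical.

```python
import math
import random


def make_feature_map(dim, n_features=256, bandwidth=1.0, seed=0):
    """Build phi(x) from random Fourier features (an assumed stand-in map)."""
    rng = random.Random(seed)
    weights = [[rng.gauss(0.0, 1.0 / bandwidth) for _ in range(dim)]
               for _ in range(n_features)]
    offsets = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(n_features)]
    scale = math.sqrt(2.0 / n_features)

    def phi(x):
        return [scale * math.cos(sum(w_i * x_i for w_i, x_i in zip(w, x)) + b)
                for w, b in zip(weights, offsets)]

    return phi


class ClusterSummary:
    """Running sum of feature vectors for one cluster (hypothetical helper)."""

    def __init__(self, n_features):
        self.total = [0.0] * n_features
        self.count = 0

    def add(self, features):
        # Absorbing a point costs O(n_features), independent of cluster size.
        self.total = [t + f for t, f in zip(self.total, features)]
        self.count += 1

    def similarity(self, features):
        # Point-to-set similarity: <phi(x), mean of phi(y) over the cluster>.
        if self.count == 0:
            return 0.0
        return sum(f * t for f, t in zip(features, self.total)) / self.count


def most_similar(clusters, features):
    """Return the index of the cluster most similar to the new point."""
    return max(range(len(clusters)),
               key=lambda i: clusters[i].similarity(features))


phi = make_feature_map(dim=2)
left, right = ClusterSummary(256), ClusterSummary(256)
for p in [(0.0, 0.0), (0.2, -0.1), (-0.1, 0.1)]:
    left.add(phi(p))
for p in [(5.0, 5.0), (5.2, 4.9), (4.8, 5.1)]:
    right.add(phi(p))

# A new point arriving from the stream is compared with each cluster summary,
# not with every point stored in the cluster.
new_point = phi((0.1, 0.0))
nearest = most_similar([left, right], new_point)
```

Under these assumptions, each similarity evaluation touches only the cluster summary, so its cost does not grow with the number of points already in the cluster. How StreaKHC actually routes a point through the tree and restructures it is not described in the abstract and is not modelled here.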
Related Results
The Kernel Rough K-Means Algorithm
Background: Clustering is one of the most important data mining methods. The k-means (c-means) and its derivative methods are the hotspot in the field of clustering research in re...
Genetic Variation in Potential Kernel Size Affects Kernel Growth and Yield of Sorghum
Large‐seededness can increase grain yield in sorghum [Sorghum bicolor (L.) Moench] if larger kernel size more than compensates for the associated reduction in kernel number. The ai...
A Field Streaming-Potential Experiment
Streaming-potential experiments were conducted within the Muddy- and Dakota-sandstone interval of a Denver basin well. Analysis of the data shows that, f...
Sorghum Kernel Weight
The influence of genotype and panicle position on sorghum [Sorghum bicolor (L.) Moench] kernel growth is poorly understood. In the present study, sorghum kernel weight (KW) differe...
Physicochemical Properties of Wheat Fractionated by Wheat Kernel Thickness and Separated by Kernel Specific Density
Two wheat cultivars, soft white winter wheat Yang‐mai 11 and hard white winter wheat Zheng‐mai 9023, were fractionated by kernel thickness into five sections; the fractiona...
Clustering Analysis of Data with High Dimensionality
Clustering analysis has been widely applied in diverse fields such as data mining, access structures, knowledge discovery, software engineering, organization of information systems...
Image clustering using exponential discriminant analysis
Local learning based image clustering models are usually employed to deal with images sampled from the non‐linear manifold. Recently, linear discriminant analysis (LDA) based vario...
A COMPARATIVE ANALYSIS OF K-MEANS AND HIERARCHICAL CLUSTERING
Clustering is the process of arranging comparable data elements into groups. One of the most frequent data mining analytical techniques is clustering analysis; the clustering algor...

