Javascript must be enabled to continue!
Adaptive Graph Partitioning for Clustering Datasets with Heterogeneous Density
View through CrossRef
In recent years, graph-partition-based clustering algorithms have attracted increasing attention. These algorithms first construct a graph over the data points and then partition this graph, regarding each connected subgraph in the partitioned graph as a cluster. However, traditional graph-partition-based clustering algorithms face challenges when clustering datasets with highly imbalanced density distributions. This is because, during the process of graph partitioning, they mainly rely on edge lengths and largely ignore local density variations. To address this issue, we propose the
Adaptive Graph Partitioning (AGP)
clustering algorithm. AGP integrates local density information into the partitioning process and adaptively normalizes the magnitudes of edge weights in both sparse and dense regions. This enhancement allows AGP to avoid the over-partitioning of dense clusters, effectively addressing a common problem in traditional graph-partition-based clustering algorithms. Additionally, the mechanism of connectivity domain differences is introduced into AGP, further enhancing the algorithm’s ability to discriminate between neighboring clusters that are difficult to separate. Extensive experiments on 13 benchmark datasets show that AGP achieves the best clustering performance on 7 datasets and performs competitively on the others, especially when density imbalance is severe. Moreover, scalability experiments on datasets with up to 100,000 data points, together with complexity analysis, demonstrate that AGP enjoys favorable time and memory efficiency compared with representative baselines.
Association for Computing Machinery (ACM)
Title: Adaptive Graph Partitioning for Clustering Datasets with Heterogeneous Density
Description:
In recent years, graph-partition-based clustering algorithms have attracted increasing attention.
These algorithms first construct a graph over the data points and then partition this graph, regarding each connected subgraph in the partitioned graph as a cluster.
However, traditional graph-partition-based clustering algorithms face challenges when clustering datasets with highly imbalanced density distributions.
This is because, during the process of graph partitioning, they mainly rely on edge lengths and largely ignore local density variations.
To address this issue, we propose the
Adaptive Graph Partitioning (AGP)
clustering algorithm.
AGP integrates local density information into the partitioning process and adaptively normalizes the magnitudes of edge weights in both sparse and dense regions.
This enhancement allows AGP to avoid the over-partitioning of dense clusters, effectively addressing a common problem in traditional graph-partition-based clustering algorithms.
Additionally, the mechanism of connectivity domain differences is introduced into AGP, further enhancing the algorithm’s ability to discriminate between neighboring clusters that are difficult to separate.
Extensive experiments on 13 benchmark datasets show that AGP achieves the best clustering performance on 7 datasets and performs competitively on the others, especially when density imbalance is severe.
Moreover, scalability experiments on datasets with up to 100,000 data points, together with complexity analysis, demonstrate that AGP enjoys favorable time and memory efficiency compared with representative baselines.
Related Results
Linking White‐Tailed Deer Density, Nutrition, and Vegetation in a Stochastic Environment
Linking White‐Tailed Deer Density, Nutrition, and Vegetation in a Stochastic Environment
ABSTRACT
Density‐dependent behavior underpins white‐tailed deer (
Odocoileus virginianus
) theory and...
Bootstrapping a Biodiversity Knowledge Graph
Bootstrapping a Biodiversity Knowledge Graph
The "biodiversity knowledge graph" is a nice metaphor for connecting biodiversity data sources, but can we actually build it? Do we have sufficient linked data available? Given tha...
Near-Data Source Graph Partitioning
Near-Data Source Graph Partitioning
Recently, numerous graph partitioning approaches have been proposed to distribute a big graph to machines in a cluster for distributed computing. Due to heavy communication overhea...
The Kernel Rough K-Means Algorithm
The Kernel Rough K-Means Algorithm
Background:
Clustering is one of the most important data mining methods. The k-means
(c-means ) and its derivative methods are the hotspot in the field of clustering research in re...
Adaptive Graph Convolution Using Heat Kernel for Attributed Graph Clustering
Adaptive Graph Convolution Using Heat Kernel for Attributed Graph Clustering
Attributed graphs contain a lot of node features and structural relationships, and how to utilize their inherent information sufficiently to improve graph clustering performance ha...
Parallel density clustering algorithm based on MapReduce and optimized cuckoo algorithm
Parallel density clustering algorithm based on MapReduce and optimized cuckoo algorithm
In the process of parallel density clustering, the boundary points of clusters with different densities are blurred and there is data noise, which affects the clustering performanc...
Uncertain data density peak clustering algorithm based on JS divergence
Uncertain data density peak clustering algorithm based on JS divergence
Aiming at the defects of traditional density-based uncertainty clustering algorithms, such as parameter sensitivity and poor clustering results for complex manifold uncertain data ...
Data Analytics on Graphs Part I: Graphs and Spectra on Graphs
Data Analytics on Graphs Part I: Graphs and Spectra on Graphs
The area of Data Analytics on graphs promises a paradigm shift, as we approach information processing of new classes of data which are typically acquired on irregular but structure...

