Search engine for discovering works of Art, research articles, and books related to Art and Culture
ShareThis
Javascript must be enabled to continue!

Adaptive Graph Partitioning for Clustering Datasets with Heterogeneous Density

View through CrossRef
In recent years, graph-partition-based clustering algorithms have attracted increasing attention. These algorithms first construct a graph over the data points and then partition this graph, regarding each connected subgraph in the partitioned graph as a cluster. However, traditional graph-partition-based clustering algorithms face challenges when clustering datasets with highly imbalanced density distributions. This is because, during the process of graph partitioning, they mainly rely on edge lengths and largely ignore local density variations. To address this issue, we propose the Adaptive Graph Partitioning (AGP) clustering algorithm. AGP integrates local density information into the partitioning process and adaptively normalizes the magnitudes of edge weights in both sparse and dense regions. This enhancement allows AGP to avoid the over-partitioning of dense clusters, effectively addressing a common problem in traditional graph-partition-based clustering algorithms. Additionally, the mechanism of connectivity domain differences is introduced into AGP, further enhancing the algorithm’s ability to discriminate between neighboring clusters that are difficult to separate. Extensive experiments on 13 benchmark datasets show that AGP achieves the best clustering performance on 7 datasets and performs competitively on the others, especially when density imbalance is severe. Moreover, scalability experiments on datasets with up to 100,000 data points, together with complexity analysis, demonstrate that AGP enjoys favorable time and memory efficiency compared with representative baselines.
Title: Adaptive Graph Partitioning for Clustering Datasets with Heterogeneous Density
Description:
In recent years, graph-partition-based clustering algorithms have attracted increasing attention.
These algorithms first construct a graph over the data points and then partition this graph, regarding each connected subgraph in the partitioned graph as a cluster.
However, traditional graph-partition-based clustering algorithms face challenges when clustering datasets with highly imbalanced density distributions.
This is because, during the process of graph partitioning, they mainly rely on edge lengths and largely ignore local density variations.
To address this issue, we propose the Adaptive Graph Partitioning (AGP) clustering algorithm.
AGP integrates local density information into the partitioning process and adaptively normalizes the magnitudes of edge weights in both sparse and dense regions.
This enhancement allows AGP to avoid the over-partitioning of dense clusters, effectively addressing a common problem in traditional graph-partition-based clustering algorithms.
Additionally, the mechanism of connectivity domain differences is introduced into AGP, further enhancing the algorithm’s ability to discriminate between neighboring clusters that are difficult to separate.
Extensive experiments on 13 benchmark datasets show that AGP achieves the best clustering performance on 7 datasets and performs competitively on the others, especially when density imbalance is severe.
Moreover, scalability experiments on datasets with up to 100,000 data points, together with complexity analysis, demonstrate that AGP enjoys favorable time and memory efficiency compared with representative baselines.

Related Results

Graph convolutional neural networks for 3D data analysis
Graph convolutional neural networks for 3D data analysis
(English) Deep Learning allows the extraction of complex features directly from raw input data, eliminating the need for hand-crafted features from the classical Machine Learning p...
Bootstrapping a Biodiversity Knowledge Graph
Bootstrapping a Biodiversity Knowledge Graph
The "biodiversity knowledge graph" is a nice metaphor for connecting biodiversity data sources, but can we actually build it? Do we have sufficient linked data available? Given tha...
Near-Data Source Graph Partitioning
Near-Data Source Graph Partitioning
Recently, numerous graph partitioning approaches have been proposed to distribute a big graph to machines in a cluster for distributed computing. Due to heavy communication overhea...
The Kernel Rough K-Means Algorithm
The Kernel Rough K-Means Algorithm
Background: Clustering is one of the most important data mining methods. The k-means (c-means ) and its derivative methods are the hotspot in the field of clustering research in re...
Adaptive Graph Convolution Using Heat Kernel for Attributed Graph Clustering
Adaptive Graph Convolution Using Heat Kernel for Attributed Graph Clustering
Attributed graphs contain a lot of node features and structural relationships, and how to utilize their inherent information sufficiently to improve graph clustering performance ha...
A Comparative Study of Graph Kernels and Clustering Algorithms
A Comparative Study of Graph Kernels and Clustering Algorithms
Graph kernels have evolved as a promising and popular method for graph clustering over the last decade. In this work, comparative study on the five standard graph kernel techniques...

Back to Top