Javascript must be enabled to continue!

Adaptive Graph Partitioning for Clustering Datasets with Heterogeneous Density

In recent years, graph-partition-based clustering algorithms have attracted increasing attention. These algorithms first construct a graph over the data points and then partition this graph, regarding each connected subgraph in the partitioned graph as a cluster. However, traditional graph-partition-based clustering algorithms face challenges when clustering datasets with highly imbalanced density distributions. This is because, during the process of graph partitioning, they mainly rely on edge lengths and largely ignore local density variations. To address this issue, we propose the Adaptive Graph Partitioning (AGP) clustering algorithm. AGP integrates local density information into the partitioning process and adaptively normalizes the magnitudes of edge weights in both sparse and dense regions. This enhancement allows AGP to avoid the over-partitioning of dense clusters, effectively addressing a common problem in traditional graph-partition-based clustering algorithms. Additionally, the mechanism of connectivity domain differences is introduced into AGP, further enhancing the algorithm’s ability to discriminate between neighboring clusters that are difficult to separate. Extensive experiments on 13 benchmark datasets show that AGP achieves the best clustering performance on 7 datasets and performs competitively on the others, especially when density imbalance is severe. Moreover, scalability experiments on datasets with up to 100,000 data points, together with complexity analysis, demonstrate that AGP enjoys favorable time and memory efficiency compared with representative baselines.

Association for Computing Machinery (ACM)

Jing Geng Shangxian Zhao Wang Weizhe Qi Li

ACM Transactions on Knowledge Discovery from Data

2025

Title: Adaptive Graph Partitioning for Clustering Datasets with Heterogeneous Density

Description:

In recent years, graph-partition-based clustering algorithms have attracted increasing attention.

These algorithms first construct a graph over the data points and then partition this graph, regarding each connected subgraph in the partitioned graph as a cluster.

However, traditional graph-partition-based clustering algorithms face challenges when clustering datasets with highly imbalanced density distributions.

This is because, during the process of graph partitioning, they mainly rely on edge lengths and largely ignore local density variations.

To address this issue, we propose the Adaptive Graph Partitioning (AGP) clustering algorithm.

AGP integrates local density information into the partitioning process and adaptively normalizes the magnitudes of edge weights in both sparse and dense regions.

This enhancement allows AGP to avoid the over-partitioning of dense clusters, effectively addressing a common problem in traditional graph-partition-based clustering algorithms.

Additionally, the mechanism of connectivity domain differences is introduced into AGP, further enhancing the algorithm’s ability to discriminate between neighboring clusters that are difficult to separate.

Extensive experiments on 13 benchmark datasets show that AGP achieves the best clustering performance on 7 datasets and performs competitively on the others, especially when density imbalance is severe.

Moreover, scalability experiments on datasets with up to 100,000 data points, together with complexity analysis, demonstrate that AGP enjoys favorable time and memory efficiency compared with representative baselines.

Back

ABSTRACT Density‐dependent behavior underpins white‐tailed deer ( Odocoileus virginianus ) theory and...

Graph convolutional neural networks for 3D data analysis

(English) Deep Learning allows the extraction of complex features directly from raw input data, eliminating the need for hand-crafted features from the classical Machine Learning p...

Bootstrapping a Biodiversity Knowledge Graph

The "biodiversity knowledge graph" is a nice metaphor for connecting biodiversity data sources, but can we actually build it? Do we have sufficient linked data available? Given tha...

Near-Data Source Graph Partitioning

Recently, numerous graph partitioning approaches have been proposed to distribute a big graph to machines in a cluster for distributed computing. Due to heavy communication overhea...

Bilangan Terhubung Titik Pelangi pada Graf Garis dan Graf Tengah dari Hasil Operasi Comb Graf Bintang C<sub>3</sub> dan Graf Bintang S<sub>n</sub>

Penelitian ini bertujuan menentukan bilangan terhubung titik pelangi (rainbow vertex connection number) pada graf garis dan graf tengah yang diperoleh dari hasil operasi comb antar...

The Kernel Rough K-Means Algorithm

Background: Clustering is one of the most important data mining methods. The k-means (c-means ) and its derivative methods are the hotspot in the field of clustering research in re...

Adaptive Graph Convolution Using Heat Kernel for Attributed Graph Clustering

Attributed graphs contain a lot of node features and structural relationships, and how to utilize their inherent information sufficiently to improve graph clustering performance ha...

A Comparative Study of Graph Kernels and Clustering Algorithms

Graph kernels have evolved as a promising and popular method for graph clustering over the last decade. In this work, comparative study on the five standard graph kernel techniques...

Email:
Password:

Email:

Adaptive Graph Partitioning for Clustering Datasets with Heterogeneous Density

Related Results