Search engine for discovering works of Art, research articles, and books related to Art and Culture
ShareThis
Javascript must be enabled to continue!

Hamiltonian formulations of centroid-based clustering

View through CrossRef
Clustering is a fundamental task in data science that aims to group data based on their similarities. However, defining similarity is often ambiguous, making it challenging to determine the most appropriate objective function for a given dataset. Traditional clustering methods, such as the k-means algorithm and weighted maximum k-cut, focus on specific objectives—typically relying on average or pairwise characteristics of the data—leading to performance that is highly data-dependent. Moreover, incorporating practical constraints into clustering objectives is not straightforward, and these problems are known to be NP-hard. In this study, we formulate the clustering problem as a search for the ground state of a Hamiltonian, providing greater flexibility in defining clustering objectives and incorporating constraints. This approach enables the application of various quantum simulation techniques, including both circuit-based quantum computation and quantum annealing, thereby opening a path toward quantum advantage in solving clustering problems. We propose various Hamiltonians to accommodate different clustering objectives, including the ability to combine multiple objectives and incorporate constraints. We evaluate the clustering performance through numerical simulations and implementations on the D-Wave quantum annealer. The results demonstrate the broad applicability of our approach to a variety of clustering problems on current quantum devices. Furthermore, we find that Hamiltonians designed for specific clustering objectives and constraints impose different requirements for qubit connectivity, indicating that certain clustering tasks are better suited to specific quantum hardware. Our experimental results highlight this by identifying the Hamiltonian that optimally utilizes the physical qubits available in the D-Wave System.
Title: Hamiltonian formulations of centroid-based clustering
Description:
Clustering is a fundamental task in data science that aims to group data based on their similarities.
However, defining similarity is often ambiguous, making it challenging to determine the most appropriate objective function for a given dataset.
Traditional clustering methods, such as the k-means algorithm and weighted maximum k-cut, focus on specific objectives—typically relying on average or pairwise characteristics of the data—leading to performance that is highly data-dependent.
Moreover, incorporating practical constraints into clustering objectives is not straightforward, and these problems are known to be NP-hard.
In this study, we formulate the clustering problem as a search for the ground state of a Hamiltonian, providing greater flexibility in defining clustering objectives and incorporating constraints.
This approach enables the application of various quantum simulation techniques, including both circuit-based quantum computation and quantum annealing, thereby opening a path toward quantum advantage in solving clustering problems.
We propose various Hamiltonians to accommodate different clustering objectives, including the ability to combine multiple objectives and incorporate constraints.
We evaluate the clustering performance through numerical simulations and implementations on the D-Wave quantum annealer.
The results demonstrate the broad applicability of our approach to a variety of clustering problems on current quantum devices.
Furthermore, we find that Hamiltonians designed for specific clustering objectives and constraints impose different requirements for qubit connectivity, indicating that certain clustering tasks are better suited to specific quantum hardware.
Our experimental results highlight this by identifying the Hamiltonian that optimally utilizes the physical qubits available in the D-Wave System.

Related Results

Roles for Spectral Centroid and Other Factors in Determining "Blended" Instrument Pairings in Orchestration
Roles for Spectral Centroid and Other Factors in Determining "Blended" Instrument Pairings in Orchestration
Three perceptual experiments using natural-sounding instrument tones arranged in concurrently sounding pairs investigate a problem of orchestration: what factors determine selectio...
The Kernel Rough K-Means Algorithm
The Kernel Rough K-Means Algorithm
Background: Clustering is one of the most important data mining methods. The k-means (c-means ) and its derivative methods are the hotspot in the field of clustering research in re...
Image clustering using exponential discriminant analysis
Image clustering using exponential discriminant analysis
Local learning based image clustering models are usually employed to deal with images sampled from the non‐linear manifold. Recently, linear discriminant analysis (LDA) based vario...
IDCUP Algorithm to Classifying Arbitrary Shapes and Densities for Center-based Clustering Performance Analysis
IDCUP Algorithm to Classifying Arbitrary Shapes and Densities for Center-based Clustering Performance Analysis
Aim/Purpose: The clustering techniques are normally considered to determine the significant and meaningful subclasses purposed in datasets. It is an unsupervised type of Machine Le...
Clustering Analysis of Data with High Dimensionality
Clustering Analysis of Data with High Dimensionality
Clustering analysis has been widely applied in diverse fields such as data mining, access structures, knowledge discovery, software engineering, organization of information systems...
The kinematics modeling and parameter optimization of six-wheel lunar exploration robot
The kinematics modeling and parameter optimization of six-wheel lunar exploration robot
This article proposes a six-wheel lunar exploration robot which will move on the lunar surface. It is known that lunar surface is mostly rugged. When the six-wheel lunar exploratio...
Parallel density clustering algorithm based on MapReduce and optimized cuckoo algorithm
Parallel density clustering algorithm based on MapReduce and optimized cuckoo algorithm
In the process of parallel density clustering, the boundary points of clusters with different densities are blurred and there is data noise, which affects the clustering performanc...
Extended Jaccard Indexive Buffalo Optimized Clustering on Geo-social Networks with Big Data
Extended Jaccard Indexive Buffalo Optimized Clustering on Geo-social Networks with Big Data
Clustering is a general task of data mining where partitioning a large dataset into dissimilar groups is done. The enormous growth of Geo-Social Networks (GeoSNs) includes users, w...

Back to Top