Search engine for discovering works of Art, research articles, and books related to Art and Culture
ShareThis
Javascript must be enabled to continue!

GRAPH BASED CLUSTERING WITH CONSTRAINTS AND ACTIVE LEARNING

View through CrossRef
During the past few years, semi-supervised clustering has emerged as a new interesting direction in machine learning research. In a semi-supervised clustering algorithm, the clustering results can be significantly improved by using side information, which is available or collected from users. There are two main kinds of side information that can be learned in semi-supervised clustering algorithms: the class labels - called seeds or the pairwise constraints. The first semi-supervised clustering was introduced in 2000, and since that, many algorithms have been presented in literature. However, it is not easy to use both types of side information in the same algorithm. To address the problem, this paper proposes a semi-supervised graph based clustering algorithm that tries to use seeds and constraints in the clustering process, called MCSSGC. Moreover, we introduces a simple but efficient active learning method to collect the constraints that can boost the performance of MCSSGC, named KMMFFQS. In order to verify effectiveness of the proposed algorithm, we conducted a series of experiments not only on real data sets from UCI, but also on a document data set applied in an Information Extraction of Vietnamese documents. These obtained results show that the proposed algorithm can significantly improve the clustering process compared to some recent algorithms.
Publishing House for Science and Technology, Vietnam Academy of Science and Technology (Publications)
Title: GRAPH BASED CLUSTERING WITH CONSTRAINTS AND ACTIVE LEARNING
Description:
During the past few years, semi-supervised clustering has emerged as a new interesting direction in machine learning research.
In a semi-supervised clustering algorithm, the clustering results can be significantly improved by using side information, which is available or collected from users.
There are two main kinds of side information that can be learned in semi-supervised clustering algorithms: the class labels - called seeds or the pairwise constraints.
The first semi-supervised clustering was introduced in 2000, and since that, many algorithms have been presented in literature.
However, it is not easy to use both types of side information in the same algorithm.
To address the problem, this paper proposes a semi-supervised graph based clustering algorithm that tries to use seeds and constraints in the clustering process, called MCSSGC.
Moreover, we introduces a simple but efficient active learning method to collect the constraints that can boost the performance of MCSSGC, named KMMFFQS.
In order to verify effectiveness of the proposed algorithm, we conducted a series of experiments not only on real data sets from UCI, but also on a document data set applied in an Information Extraction of Vietnamese documents.
These obtained results show that the proposed algorithm can significantly improve the clustering process compared to some recent algorithms.

Related Results

CREATING LEARNING MEDIA IN TEACHING ENGLISH AT SMP MUHAMMADIYAH 2 PAGELARAN ACADEMIC YEAR 2020/2021
CREATING LEARNING MEDIA IN TEACHING ENGLISH AT SMP MUHAMMADIYAH 2 PAGELARAN ACADEMIC YEAR 2020/2021
The pandemic Covid-19 currently demands teachers to be able to use technology in teaching and learning process. But in reality there are still many teachers who have not been able ...
Graph convolutional neural networks for 3D data analysis
Graph convolutional neural networks for 3D data analysis
(English) Deep Learning allows the extraction of complex features directly from raw input data, eliminating the need for hand-crafted features from the classical Machine Learning p...
The Kernel Rough K-Means Algorithm
The Kernel Rough K-Means Algorithm
Background: Clustering is one of the most important data mining methods. The k-means (c-means ) and its derivative methods are the hotspot in the field of clustering research in re...
A Comparative Study of Graph Kernels and Clustering Algorithms
A Comparative Study of Graph Kernels and Clustering Algorithms
Graph kernels have evolved as a promising and popular method for graph clustering over the last decade. In this work, comparative study on the five standard graph kernel techniques...
Image clustering using exponential discriminant analysis
Image clustering using exponential discriminant analysis
Local learning based image clustering models are usually employed to deal with images sampled from the non‐linear manifold. Recently, linear discriminant analysis (LDA) based vario...
Data Analytics on Graphs Part I: Graphs and Spectra on Graphs
Data Analytics on Graphs Part I: Graphs and Spectra on Graphs
The area of Data Analytics on graphs promises a paradigm shift, as we approach information processing of new classes of data which are typically acquired on irregular but structure...
Bootstrapping a Biodiversity Knowledge Graph
Bootstrapping a Biodiversity Knowledge Graph
The "biodiversity knowledge graph" is a nice metaphor for connecting biodiversity data sources, but can we actually build it? Do we have sufficient linked data available? Given tha...

Back to Top