
Mixtures of Variational Autoencoders for Cluster Analysis in Latent Space

Deep generative models have greatly advanced the field of artificial intelligence by learning the distribution of unlabelled datasets. In this thesis, we develop a novel variational autoencoder (VAE)-based deep generative model that learns meaningful representations for effective unsupervised clustering in the latent space of high-dimensional datasets.

Existing VAE-based deep clustering algorithms assume a mixture distribution for either the variational posterior over the latent variables or the prior, in order to approximate the distribution of unlabelled datasets. The proposed mixtures of variational autoencoders (MVAE) model assumes a mixture distribution for both the prior and the variational posterior over the latent variables within the VAE framework.

We apply variational inference to construct the evidence lower bound (ELBO) of the marginal log-likelihood for the proposed MVAE model and then derive the analytical form of the clustering assignment estimate, also called the posterior probability estimate. This estimate is associated with each component of the VAE framework and enables the optimization of two distinct ELBO functions, one based on soft assignment and one on hard assignment. Training MVAE jointly optimizes the clustering assignment probabilities and the model parameters. We evaluate clustering performance on the latent embeddings of a dataset using both the posterior probability estimate and a Gaussian mixture model (GMM).

We propose two variants of the expectation-maximization (EM) algorithm to optimize the parameters of the MVAE model. The MVAE(EM) algorithm provides two distinct, numerically stable solutions for the posterior probability estimate: MVAE(EM-V1), where the estimate is associated only with the prior distribution, and MVAE(EM-V2), where the estimate incorporates both the variational posterior and the prior distribution. Among the proposed models, MVAE(EM-V2) requires less computation time for training. This model also achieves better clustering performance than the baseline models on most datasets.

We modify the MVAE(EM-V2) algorithm by weighting the regularizer term with a coefficient (β). The trained regularized mixtures VAE model with a small regularization coefficient (β < 1) achieves good unsupervised clustering performance on the test datasets. We further propose a variant of the regularized mixtures VAE model in which the regularization coefficient follows an annealing schedule from β > 1 to β < 1. The scheduled regularized model achieves higher clustering accuracy than state-of-the-art deep clustering algorithms on most benchmark datasets.

In this thesis, we show that assuming a mixture distribution for both the prior and the variational posterior over the latent variables within the VAE framework enhances unsupervised clustering in the latent space. Our proposed models outperform state-of-the-art deep clustering algorithms, including VADE and k-DVAE, as well as the standard VAE model, in cluster analysis of latent representations on most benchmark datasets. The proposed models also demonstrate reasonable reconstruction performance and generate realistic examples from the latent space.
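
The abstract does not give the formula for the posterior probability estimate or the shape of the β schedule, so the following is only a minimal sketch under stated assumptions. It assumes each data point comes with one log-score per mixture component (a hypothetical input; the thesis may define these quantities differently), normalizes the scores into soft assignments with the log-sum-exp trick for numerical stability, takes the arg-max for hard assignments, and anneals β linearly. The function names, the linear schedule, and the endpoint values 2.0 and 0.5 are illustrative choices, not the author's.

```python
import math


def soft_assignment(log_scores):
    """Normalize per-component log-scores into probabilities that sum to 1.

    Subtracting the maximum before exponentiating (the log-sum-exp trick)
    keeps the computation stable when the log-scores are very negative.
    The log-scores themselves are an assumed input, not the thesis's formula.
    """
    m = max(log_scores)
    weights = [math.exp(s - m) for s in log_scores]
    total = sum(weights)
    return [w / total for w in weights]


def hard_assignment(log_scores):
    """Return the index of the component with the highest log-score."""
    return max(range(len(log_scores)), key=lambda k: log_scores[k])


def beta_schedule(epoch, num_epochs, beta_start=2.0, beta_end=0.5):
    """Linearly anneal beta from beta_start (> 1) to beta_end (< 1).

    A linear ramp and these endpoint values are assumptions for illustration.
    """
    if num_epochs <= 1:
        return beta_end
    # Clamp the epoch into [0, num_epochs - 1] and map it to t in [0, 1].
    t = min(max(epoch, 0), num_epochs - 1) / (num_epochs - 1)
    return beta_start + t * (beta_end - beta_start)


if __name__ == "__main__":
    scores = [-1000.0, -1001.0, -1005.0]  # naive exp() would underflow to 0
    print(soft_assignment(scores))
    print(hard_assignment(scores))
    print([beta_schedule(e, 4) for e in range(4)])
```

In a soft-assignment objective each component's term would be weighted by its probability, whereas a hard-assignment objective would keep only the selected component; which log-scores enter the estimate is what distinguishes the EM-V1 and EM-V2 variants described above.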
Victoria University of Wellington Library

