Weonyoung Joo, Pathwise Gradient Estimators for Various Probability Distributions in Deep Generative Models, PhD Dissertation, Department of Industrial and Systems Engineering, KAIST, 2020
- File
- dissertation_WeonyoungJoo_aailab.pdf (40.7M)
Abstract
Estimating gradients of stochastic nodes is one of the crucial research questions in the deep generative modeling community, since it allows model parameters to be optimized through gradient descent. This dissertation discusses two types of pathwise gradient estimators: one for the Dirichlet distribution, and the other for generic discrete distributions.

In our first work, we propose the Dirichlet Variational Autoencoder (DirVAE), which uses a Dirichlet prior. To infer the parameters of DirVAE, we develop a pathwise gradient estimator by approximating the inverse cumulative distribution function of the Gamma distribution, a building block of the Dirichlet distribution. This new prior prompted an investigation of component collapsing, and DirVAE revealed that component collapsing originates from two problem sources: decoder weight collapsing and latent value collapsing. By resolving the component collapsing problem with the Dirichlet prior, we show that DirVAE produces disentangled latent representations, which lead to significant performance gains.

Compared to the continuous case, the gradient estimation problem becomes more complex when the stochastic nodes are discrete, because pathwise derivative techniques cannot be applied directly. Gradient estimation then requires score function methods or a continuous relaxation of the discrete random variables. In our second work, we propose a generalized version of the Gumbel-Softmax estimator based on continuous relaxation, which can relax a broader class of discrete probability distributions than current practice allows. In detail, we utilize truncation of discrete random variables together with the Gumbel-Softmax trick and a linear transformation for the relaxation. The proposed approach enables the relaxed discrete random variables to be reparameterized and to backpropagate through large-scale stochastic neural networks.
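The first estimator builds on the fact that a Dirichlet sample can be composed from independent Gamma draws, so a differentiable (pathwise) Gamma sampler yields a differentiable Dirichlet sampler. Below is a minimal PyTorch sketch of that construction, using the small-shape approximation F^{-1}(u; α) ≈ (uαΓ(α))^{1/α} of the inverse Gamma CDF as a stand-in for the approximation developed in the dissertation; the function names and the specific approximation here are illustrative assumptions, not the thesis code.

```python
import torch

def gamma_icdf_approx(u, alpha):
    # Approximate inverse CDF of Gamma(alpha, 1) for small alpha:
    #   F^{-1}(u; alpha) ~= (u * alpha * Gamma(alpha))^(1 / alpha).
    # Computed in log space for stability; differentiable in alpha.
    return torch.exp((torch.log(u) + torch.log(alpha) + torch.lgamma(alpha)) / alpha)

def dirichlet_rsample(alpha):
    # Reparameterized Dirichlet sample: normalize independent Gamma draws,
    # so gradients flow from the sample back to the concentration alpha.
    u = torch.rand_like(alpha)            # base noise, independent of alpha
    g = gamma_icdf_approx(u, alpha)       # pathwise Gamma(alpha_k, 1) samples
    return g / g.sum(dim=-1, keepdim=True)

alpha = torch.tensor([0.3, 0.5, 0.2], requires_grad=True)
d = dirichlet_rsample(alpha)
d[0].backward()                           # pathwise gradient reaches alpha
print(alpha.grad)
```

The second estimator relaxes a generic discrete distribution by truncating its support to finitely many values, applying the Gumbel-Softmax trick over the renormalized truncated probabilities, and mapping the relaxed one-hot vector back to a value with a linear transformation over the support points. The sketch below illustrates this recipe on a truncated Poisson; the truncation level K, the temperature tau, and the helper names are hypothetical choices for illustration.

```python
import torch
import torch.nn.functional as F

def gumbel_softmax(logits, tau):
    # Gumbel-Softmax trick: perturb logits with Gumbel(0, 1) noise and soften
    # the argmax into a differentiable softmax at temperature tau.
    u = torch.rand_like(logits)
    gumbel = -torch.log(-torch.log(u + 1e-20) + 1e-20)
    return F.softmax((logits + gumbel) / tau, dim=-1)

def relaxed_poisson_sample(rate, K=20, tau=0.5):
    # Truncate the Poisson support to {0, ..., K-1}, renormalize the pmf,
    # draw a relaxed one-hot over the truncated support via Gumbel-Softmax,
    # and map it to a value with the linear transform sum_k k * y_k.
    k = torch.arange(K, dtype=rate.dtype)
    log_pmf = k * torch.log(rate) - rate - torch.lgamma(k + 1.0)
    log_pmf = log_pmf - torch.logsumexp(log_pmf, dim=-1, keepdim=True)
    y = gumbel_softmax(log_pmf, tau)      # relaxed one-hot, differentiable in rate
    return (y * k).sum(dim=-1)            # differentiable surrogate for the draw

rate = torch.tensor(4.0, requires_grad=True)
x = relaxed_poisson_sample(rate)
x.backward()                              # pathwise gradient reaches the rate
print(rate.grad)
```

As tau approaches zero, the relaxed sample approaches a hard draw from the truncated distribution, recovering discreteness at the cost of higher gradient variance.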
@phdthesis{Joo:2020,
  author  = {Weonyoung Joo},
  advisor = {Il-Chul Moon},
  title   = {Pathwise Gradient Estimators for Various Probability Distributions in Deep Generative Models},
  school  = {KAIST},
  year    = {2020}
}