Training Unbiased Diffusion Models From Biased Dataset
- Category: Machine Learning
- Conference Name: International Conference on Learning Representations (ICLR 2024)
- Presentation Date: May 7-11, 2024
- City: Vienna
- Country: Austria
Yeongmin Kim, Byeonghu Na, JoonHo Jang, Minsang Park, Dongjun Kim, Wanmo Kang, and Il-Chul Moon, Training Unbiased Diffusion Models From Biased Dataset, International Conference on Learning Representations (ICLR 2024), Vienna, Austria, May 7-11, 2024
Abstract
With significant advancements in diffusion models, addressing the potential risks of dataset bias becomes increasingly important. Since generated outputs directly suffer from dataset bias, mitigating latent bias is a key factor in improving sample quality and proportion. This paper proposes time-dependent importance reweighting to mitigate bias in diffusion models. We demonstrate that the time-dependent density ratio is more precise than the ratios used in previous approaches, thereby minimizing error propagation in generative learning. While directly applying it to score-matching is intractable, we find that using the time-dependent density ratio both for reweighting and for score correction leads to a tractable form of the objective function that regenerates the unbiased data density. Furthermore, we theoretically establish a connection with traditional score-matching and demonstrate convergence to an unbiased distribution. The experimental evidence supports the usefulness of the proposed method, which outperforms baselines, including time-independent importance reweighting, on CIFAR-10, CIFAR-100, FFHQ, and CelebA with various bias settings.
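The idea of using a time-dependent density ratio for both reweighting and score correction can be illustrated with a toy example. The sketch below is not the paper's implementation. It assumes one-dimensional Gaussian data (unbiased N(0, 1), biased N(mu_bias, 1)) and a hypothetical variance-preserving noise schedule alpha(t) = exp(-t), so every quantity has a closed form. All function names are made up for this illustration. In this setting, adding the gradient of the log density ratio to the biased score recovers the unbiased score, and the ratio flattens toward 1 as noise grows.

```python
import math

def alpha(t):
    # Hypothetical variance-preserving schedule: alpha(t)^2 + sigma(t)^2 = 1,
    # so both perturbed marginals keep unit variance in this toy setup.
    return math.exp(-t)

def log_ratio(x, t, mu_bias=1.0):
    # log w_t(x) = log N(x; 0, 1) - log N(x; alpha(t) * mu_bias, 1)
    m = alpha(t) * mu_bias
    return -0.5 * x * x + 0.5 * (x - m) ** 2

def grad_log_ratio(x, t, mu_bias=1.0):
    # d/dx log w_t(x); constant in x for this equal-variance Gaussian pair
    return -alpha(t) * mu_bias

def score_bias(x, t, mu_bias=1.0):
    # Score of the perturbed biased marginal N(alpha(t) * mu_bias, 1)
    return -(x - alpha(t) * mu_bias)

def score_data(x, t):
    # Score of the perturbed unbiased marginal N(0, 1)
    return -x

def corrected_score(x, t, mu_bias=1.0):
    # Score correction: biased score plus gradient of the log density ratio
    return score_bias(x, t, mu_bias) + grad_log_ratio(x, t, mu_bias)

if __name__ == "__main__":
    for t in (0.0, 0.5, 2.0):
        x = 0.3
        print(t, corrected_score(x, t), score_data(x, t), math.exp(log_ratio(x, t)))
```

In practice the density ratio is not known in closed form and has to be estimated; this toy case only checks the identity that makes the correction work.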