We release the source code for our paper "ControlVAE: Controllable Variational Autoencoder", published at ICML 2020. It can be used for disentangled representation learning, text generation, and image generation.
If you use our source code, please cite our paper:
```bibtex
@article{shao2020controlvae,
  title={ControlVAE: Controllable Variational Autoencoder},
  author={Shao, Huajie and Yao, Shuochao and Sun, Dachun and Zhang, Aston and Liu, Shengzhong and Liu, Dongxin and Wang, Jun and Abdelzaher, Tarek},
  journal={Proceedings of the 37th International Conference on Machine Learning (ICML)},
  year={2020},
  address={Online}
}
```
The PI controller can serve as a plug-in component that tunes the hyperparameter (weight) on the KL term or other loss terms. You can directly import P_PID.py and set Kp and Ki to small values, such as 0.001 and -0.0001. In general, the PI hyperparameters need to be small to guarantee the stability of the control system.
To choose the desired KL value, one approach is to run the basic VAE and record its KL value; you can then slightly increase or decrease that value as needed.
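The plug-in idea described above can be sketched as follows. This is an illustrative PI controller, not the repository's P_PID.py: the class and parameter names, the sigmoid-bounded P term, and the simple anti-windup check are all assumptions for the sketch.

```python
import math

class PIController:
    """Illustrative PI controller for tuning the KL weight (beta).

    Sketch only -- P_PID.py in this repository is the reference
    implementation; names and details here are assumptions.
    """

    def __init__(self, kp=0.001, ki=-0.0001, beta_min=0.0, beta_max=1.0):
        self.kp, self.ki = kp, ki
        self.beta_min, self.beta_max = beta_min, beta_max
        self.i_term = 0.0  # accumulated integral contribution

    def update(self, desired_kl, actual_kl):
        error = desired_kl - actual_kl
        # P term: a sigmoid bounds the proportional response to (0, kp),
        # so a large error cannot blow up the weight
        p_term = self.kp / (1.0 + math.exp(error))
        # I term: accumulate only while beta is inside its bounds
        # (a simple anti-windup guard)
        if self.beta_min <= p_term + self.i_term <= self.beta_max:
            self.i_term += self.ki * error
        beta = p_term + self.i_term
        return min(max(beta, self.beta_min), self.beta_max)
```

At each training step you would compute the batch KL, call `update(desired_kl, actual_kl)`, and multiply the KL term of the loss by the returned beta: when the actual KL is far below the target, beta shrinks so the KL term is penalized less and the KL can grow toward the target.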
If you have any questions, please feel free to contact me ([email protected]).
- dSprites data for disentanglement
  - Step 1: enter the path "./Disentangling/"
  - Step 2: run sh prepare_data.sh dsprites
- CelebA data for the image generation application
  - Step 1: download the public data from http://mmlab.ie.cuhk.edu.hk/projects/CelebA.html
  - Step 2: download img_align_celeba.zip and put it in the data directory, i.e., ./data/img_align_celeba.zip
  - Step 3: enter the folder "Image_generation" and run the script files:
    $ bash prepare_data.sh CelebA
    $ python3 data_split.py  # split the data into training and testing data
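As a rough idea of what the split step does, here is a minimal sketch of a train/test file split. The repository's data_split.py is the reference implementation; the function name, split ratio, seed, and directory layout below are assumptions for illustration.

```python
import os
import random
import shutil

def split_dataset(src_dir, train_dir, test_dir, test_ratio=0.1, seed=0):
    """Shuffle the files in src_dir and move them into train/test dirs.

    Illustrative sketch only; the repository's data_split.py may use a
    different ratio, ordering, or output format.
    """
    os.makedirs(train_dir, exist_ok=True)
    os.makedirs(test_dir, exist_ok=True)
    files = sorted(os.listdir(src_dir))
    random.Random(seed).shuffle(files)  # fixed seed for reproducibility
    n_test = int(len(files) * test_ratio)
    for name in files[:n_test]:
        shutil.move(os.path.join(src_dir, name), os.path.join(test_dir, name))
    for name in files[n_test:]:
        shutil.move(os.path.join(src_dir, name), os.path.join(train_dir, name))
```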
- PTB data for the language modeling application
  - Step 1: enter the path "./Language_modeling/Text_gen_PTB"
  - Step 2: install Texar-PyTorch first, then run: python prepare_data.py --data ptb
- Switchboard (SW) data for dialog generation
  - Please download GloVe word embeddings from http://nlp.stanford.edu/data/glove.twitter.27B.zip. The default setting uses 200-dimensional word embeddings trained on Twitter. Unzip and save the data to the path "./glove/glove.twitter.27B.200d.txt".
  - We have already downloaded the SW data and zipped it in the path "./Language_modeling/NeuralDialog/data".
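For reference, GloVe files are plain text with one word per line followed by its vector components. A minimal loader sketch is below; the function name and float32 dtype are assumptions, and the repository's code may load the embeddings differently.

```python
import numpy as np

def load_glove(path):
    """Read a GloVe .txt file into a dict {word: float32 vector}.

    Works for files like glove.twitter.27B.200d.txt, where each line is
    "<word> v1 v2 ... vN" separated by single spaces.
    """
    embeddings = {}
    with open(path, encoding="utf-8") as f:
        for line in f:
            word, *values = line.rstrip().split(" ")
            embeddings[word] = np.asarray(values, dtype=np.float32)
    return embeddings
```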
In order to reproduce the experimental results, please follow the instructions below to run the source code.
- Before running the models, please first look at "data_download.txt" to download the public data
- Please install the packages in "requirements.txt"
- Disentangled representation learning on dSprites:
  - run the visdom server: $ bash run_server.sh
  - $ bash run_dsprites_pid_c18.sh
- Image generation on CelebA:
  - $ bash run_server.sh
  - $ bash run_celeba_PID_z500_KL200_d128.sh
- Text generation on PTB data: $ bash run_vae_transform_ptb_pid.sh
- Dialog generation on SW data: $ bash run_pid.sh
❤️ We thank the following users, whose open-source GitHub repositories we built on for our experiments:
- Texar-pytorch https://github.com/asyml/texar-pytorch
- 1Konny (Beta-VAE) https://github.com/1Konny/Beta-VAE
- snakeztc (NeuralDialog-CVAE) https://github.com/snakeztc/NeuralDialog-CVAE