
Training Hyperparameters of PixelSnail for VQ-VAE experiments #49

Open

fostiropoulos opened this issue Oct 8, 2020 · 3 comments

@fostiropoulos

I am using 4x Nvidia V100 GPUs and cannot fit a batch size larger than 32 with this paper's hyperparameters when training on the top-level codes. I have also changed the loss to a discretized mixture of logistics, as in the original PixelCNN++ and PixelSnail implementations. The authors report a batch size of 1024, which seems unreachable. Does this implementation of PixelSnail use more layers than the one reported in the VQ-VAE-2 paper?
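For context, a simplified single-component version of the discretized logistic likelihood might look like the sketch below. This is not code from this repo: the function name and bin convention are my own, targets are assumed to be scaled to [-1, 1] with 256 bins, and PixelCNN++ actually mixes several such components via log-sum-exp.

```python
import torch


def discretized_logistic_nll(x, mean, log_scale, num_bins=256):
    """Single-component discretized logistic NLL (simplified sketch).

    x: targets in [-1, 1]; mean, log_scale: same shape as x.
    The full PixelCNN++ loss combines a mixture of these components
    with log-sum-exp; one component is kept here for clarity.
    """
    centered = x - mean
    inv_s = torch.exp(-log_scale)
    half_bin = 1.0 / (num_bins - 1)
    cdf_plus = torch.sigmoid(inv_s * (centered + half_bin))
    cdf_minus = torch.sigmoid(inv_s * (centered - half_bin))
    # Edge bins absorb the tails so probabilities sum to one over all bins.
    prob = torch.where(
        x < -1 + half_bin, cdf_plus,
        torch.where(x > 1 - half_bin, 1 - cdf_minus, cdf_plus - cdf_minus),
    )
    return -torch.log(prob.clamp(min=1e-12)).mean()
```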

I am not able to map this implementation onto the one described in the appendix of VQ-VAE-2 in order to configure it to replicate their results. Any help appreciated.

[Screenshot: hyperparameter table from the VQ-VAE-2 appendix]

@rosinality (Owner) commented Oct 8, 2020

Actually, the network used in the paper is much larger than the default model in this implementation.

@fostiropoulos (Author)
Yes, that was my initial thought as well. I can only imagine training such a large model on a TPU. Do you have any insight into how it might have been done?

@rosinality (Owner)
Maybe they used TPUs or a large number of GPUs. In any case, replicating the model training in the paper will be very hard (practically impossible) with only a few GPUs.
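If the goal is only to approximate the paper's effective batch size of 1024 rather than its wall-clock throughput, one partial workaround is gradient accumulation. A minimal PyTorch sketch, assuming a placeholder model, optimizer, and random data rather than this repo's actual training loop:

```python
import torch

torch.manual_seed(0)
model = torch.nn.Linear(8, 4)                    # stand-in for the PixelSnail prior
opt = torch.optim.SGD(model.parameters(), lr=1e-3)

# 32 micro-batches of 32 samples emulate an effective batch of 1024.
accum_steps, micro_batch = 32, 32
data = [(torch.randn(micro_batch, 8), torch.randn(micro_batch, 4))
        for _ in range(accum_steps)]

opt.zero_grad()
for x, y in data:
    loss = torch.nn.functional.mse_loss(model(x), y)
    # Scale each micro-batch loss so the summed gradients equal
    # the gradient of the mean loss over the full 1024-sample batch.
    (loss / accum_steps).backward()
opt.step()
```

This trades extra training steps for memory; it matches the gradient of the large batch but not the batch-statistics or speed benefits of truly running 1024 samples at once.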

2 participants