Add support for Classifier Free Guidance in Network

The paper says, "We follow classifier-free guidance and train our models with conditioning dropout: conditional inputs are set to 0 for 10% of training time."

As I understand it, this means that for 10% of the training time only the Person UNet is trained, without cross-attention, self-attention, or any other conditional input.

Maybe I am not familiar with the concept, but how would this work without the RGB-agnostic images, and how would the 6 channels be passed? Do we set the RGB-agnostic images to 0? Any comments are welcome.
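
To make the question concrete, here is a minimal sketch of one possible reading of "conditional inputs are set to 0": the tensors keep their shapes, so the UNet still receives 6 channels, but the conditional half is all zeros for the dropped samples. Everything here is an assumption rather than something the paper states: the function name `apply_conditioning_dropout`, the split of the 6 channels into 3 noisy-person channels plus 3 RGB-agnostic channels, the per-sample (rather than per-step) dropout, and the use of NumPy instead of the project's actual framework.

```python
import numpy as np


def apply_conditioning_dropout(noisy_person, agnostic_rgb, other_cond,
                               p_drop=0.1, rng=None):
    """Zero the conditional inputs for roughly p_drop of the samples.

    noisy_person: (B, 3, H, W) noisy target image, never dropped.
    agnostic_rgb: (B, 3, H, W) RGB-agnostic image (assumed conditional).
    other_cond:   dict of name -> array with a leading batch dimension
                  (hypothetical container for garment, pose, etc.).
    """
    rng = np.random.default_rng() if rng is None else rng
    batch = noisy_person.shape[0]

    # True means "keep conditioning" for that sample.
    keep = rng.random(batch) >= p_drop

    def mask(x):
        # Broadcast the per-sample flag over all non-batch dimensions.
        m = keep.reshape((batch,) + (1,) * (x.ndim - 1)).astype(x.dtype)
        return x * m

    agnostic = mask(agnostic_rgb)
    cond = {name: mask(value) for name, value in other_cond.items()}

    # The input stays 6-channel; dropped samples carry zeros in channels 3..5.
    unet_input = np.concatenate([noisy_person, agnostic], axis=1)
    return unet_input, cond, keep
```

Under this reading the network architecture does not change during the unconditional 10%: the cross-attention layers still run, they just attend to features computed from zeroed inputs. Whether that matches what the authors did is exactly what I would like confirmed.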



Add support for Classifier Free Guidance in Network #21
