
ResBlock with inplace relu? #60

Open
rfeinman opened this issue Jan 22, 2021 · 5 comments

@rfeinman commented Jan 22, 2021

I noticed in the ResBlock class of vqvae.py that you put a ReLU activation at the start of the residual stack, and no ReLU at the end:

class ResBlock(nn.Module):
    def __init__(self, in_channel, channel):
        super().__init__()

        self.conv = nn.Sequential(
            nn.ReLU(inplace=True),
            nn.Conv2d(in_channel, channel, 3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channel, in_channel, 1),
        )

    def forward(self, input):
        out = self.conv(input)
        out += input

        return out

I think the point is to pass forward the pre-activation values from the previous layer as the identity (the skip connection), rather than the post-activation values. However, since inplace=True is set on the first ReLU, you actually end up with the latter. Is that intentional, or is it a bug?

@rosinality
Owner

Oh... it is definitely a bug. Thank you! Fixed at ef5f67c.
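
For readers following along, the intended behavior can be recovered by making the first ReLU non-inplace so the original input is never mutated. This is a sketch of that idea, not necessarily the exact diff in ef5f67c:

from torch import nn

class ResBlock(nn.Module):
    def __init__(self, in_channel, channel):
        super().__init__()

        self.conv = nn.Sequential(
            # non-inplace, so `input` is left untouched for the skip connection
            nn.ReLU(),
            nn.Conv2d(in_channel, channel, 3, padding=1),
            # inplace is safe here: it only overwrites the intermediate conv output
            nn.ReLU(inplace=True),
            nn.Conv2d(channel, in_channel, 1),
        )

    def forward(self, input):
        out = self.conv(input)
        out += input

        return out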

@SURABHI-GUPTA

@rfeinman How does this affect the result? Can you please explain in detail?

@rfeinman
Author

@SURABHI-GUPTA Assume that "input" is the pre-ReLU activation from the previous layer. The target computation is the following:

output = self.conv(input)
output += input

However, the way the code was written, the variable input is modified in place by a ReLU when self.conv(input) is called. So you are actually getting

output = self.conv(input)
output += F.relu(input)

But this has been fixed now with the new commit.
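
To see the aliasing concretely, here is a small self-contained check (shapes and seed are arbitrary, not taken from the repo) that compares a stack with an inplace first ReLU against an otherwise identical stack without it:

import torch
from torch import nn

torch.manual_seed(0)

def make_stack(first_inplace):
    # same layout as ResBlock.conv: ReLU, 3x3 conv, ReLU, 1x1 conv
    return nn.Sequential(
        nn.ReLU(inplace=first_inplace),
        nn.Conv2d(4, 8, 3, padding=1),
        nn.ReLU(inplace=True),
        nn.Conv2d(8, 4, 1),
    )

x = torch.randn(1, 4, 16, 16)              # pre-ReLU input with negative entries

buggy = make_stack(first_inplace=True)
fixed = make_stack(first_inplace=False)
fixed.load_state_dict(buggy.state_dict())  # same weights in both stacks

x_buggy = x.clone()
out_buggy = buggy(x_buggy) + x_buggy       # x_buggy was overwritten by the first ReLU

x_fixed = x.clone()
out_fixed = fixed(x_fixed) + x_fixed       # x_fixed is still the original input

print(torch.equal(x_buggy, torch.relu(x)))   # True: the input tensor was mutated
print(torch.allclose(out_buggy, out_fixed))  # False: the skip term differs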

@fostiropoulos commented Jan 23, 2021

@rfeinman @SURABHI-GUPTA I am also trying to figure out what the implications of this additional operation could be. It could work as an additional skip connection. The only degradation I can see is in training performance (in terms of speed).

Any further insights?

@rfeinman
Author

@fostiropoulos I can't speak to whether it is better to pass forward the pre-activation or the post-activation values for the skip connection. Popular residual architectures for classification (e.g., ResNet) do the latter; this repository chose the former.
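
For anyone comparing the two conventions, here is a stripped-down illustration of both patterns (normalization layers omitted and kernel sizes simplified; this is a sketch, not the exact block from either codebase):

import torch.nn.functional as F
from torch import nn

class PostActResBlock(nn.Module):
    # classic ResNet-style ordering: the ReLU is applied after the residual sum
    def __init__(self, channel):
        super().__init__()
        self.conv1 = nn.Conv2d(channel, channel, 3, padding=1)
        self.conv2 = nn.Conv2d(channel, channel, 3, padding=1)

    def forward(self, x):
        out = self.conv2(F.relu(self.conv1(x)))
        return F.relu(out + x)       # the block's output is post-activation

class PreActResBlock(nn.Module):
    # pre-activation ordering, as in the ResBlock above: ReLU first, no activation after the sum
    def __init__(self, channel):
        super().__init__()
        self.conv1 = nn.Conv2d(channel, channel, 3, padding=1)
        self.conv2 = nn.Conv2d(channel, channel, 3, padding=1)

    def forward(self, x):
        out = self.conv2(F.relu(self.conv1(F.relu(x))))
        return out + x               # the identity path carries the raw input forward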
