Are you even allowed to do these ops inplace? #254

Open · mayank31398 opened this issue Sep 18, 2024 · 1 comment

mayank31398 commented Sep 18, 2024

tl.store(a_ptr + col_offsets, da_row, mask=mask)
tl.store(b_ptr + col_offsets, db_row, mask=mask)
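
For context, the two stores write the gradient rows da_row and db_row back into the buffers behind a_ptr and b_ptr, i.e. over the tensors that served as inputs to the forward pass. Below is a minimal sketch of a backward kernel with that shape; it only illustrates the in-place pattern being asked about and is not the actual kernel source (the kernel name and the assumption that the pointers are already offset to the current row are mine):

import triton
import triton.language as tl

@triton.jit
def swiglu_backward_row(dc_ptr, a_ptr, b_ptr, n_cols, BLOCK_SIZE: tl.constexpr):
    # hypothetical per-row kernel: pointers are assumed to already point at the current row
    col_offsets = tl.arange(0, BLOCK_SIZE)
    mask = col_offsets < n_cols

    dc_row = tl.load(dc_ptr + col_offsets, mask=mask, other=0.0)
    a_row = tl.load(a_ptr + col_offsets, mask=mask, other=0.0)
    b_row = tl.load(b_ptr + col_offsets, mask=mask, other=0.0)

    # forward is c = silu(a) * b, with silu(a) = a * sigmoid(a)
    sig_a = 1.0 / (1.0 + tl.exp(-a_row))
    db_row = dc_row * a_row * sig_a
    da_row = dc_row * b_row * sig_a * (1.0 + a_row * (1.0 - sig_a))

    # the stores in question: the gradients overwrite the forward inputs a and b
    tl.store(a_ptr + col_offsets, da_row, mask=mask)
    tl.store(b_ptr + col_offsets, db_row, mask=mask)

Because these writes go through raw pointers, PyTorch's autograd bookkeeping (e.g. the saved-tensor version counter) never sees that the inputs were mutated.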

Let's take a custom autograd function:

import torch


class Exponential(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x):
        out = torch.exp(x)
        # note: the *output* is saved for backward, since d/dx exp(x) = exp(x)
        ctx.save_for_backward(out)
        return out

    @staticmethod
    def backward(ctx, out_grad):
        (out,) = ctx.saved_tensors  # saved_tensors is always a tuple
        x_grad = out_grad * out
        return x_grad

and suppose we have an op like swiglu whose backward modifies its inputs in place:

x = some_tensor                 # requires_grad=True
x_exp = Exponential.apply(x)
y = swiglu(x_exp, x_exp)        # swiglu's backward overwrites x_exp in place
loss = some_loss(y, target)

Now during backprop we would see incorrect behaviour, right? The custom autograd function Exponential saves its output for backward (rather than its input), and that saved tensor is exactly the buffer that swiglu's backward has already overwritten in place by the time Exponential's backward runs.
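
To make the concern concrete, here is a minimal repro sketch that reuses the Exponential class above; the ClobberingMul function and the writes through .data are hypothetical stand-ins for a backward that writes its gradients into its input buffers through raw pointers, which autograd's version-counter check cannot see:

import torch


class ClobberingMul(torch.autograd.Function):
    """Toy stand-in for an op whose backward overwrites its inputs in place."""

    @staticmethod
    def forward(ctx, a, b):
        ctx.save_for_backward(a, b)
        return a * b

    @staticmethod
    def backward(ctx, grad_out):
        a, b = ctx.saved_tensors
        da = grad_out * b
        db = grad_out * a
        # emulate a raw kernel store: write through .data so autograd's
        # version-counter check does not notice the mutation
        a.data.copy_(da)
        b.data.copy_(db)
        return da, db


x = torch.randn(4, dtype=torch.float64, requires_grad=True)
x_exp = Exponential.apply(x)           # Exponential saves exp(x) for backward
y = ClobberingMul.apply(x_exp, x_exp)  # its backward clobbers x_exp in place
loss = (3.0 * y).sum()
loss.backward()

print(x.grad)                             # silently wrong
print(6.0 * torch.exp(2.0 * x.detach()))  # analytic d/dx [3 * exp(2x)]

In this sketch x.grad comes out a factor of 3 too large, because Exponential's backward multiplies by the clobbered buffer (which now holds 3 * exp(x)) instead of the saved exp(x).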

mayank31398 (Author) commented

Hey guys, any clarification regarding this?
