make torch use flash-attn #2189


Closed · wants to merge 2 commits

Conversation

@pass-lin (Contributor) commented Apr 2, 2025

Follow-up from #2143.
I found that I can't use flash-attn under torch, which doesn't seem like a reasonable restriction. According to some earlier reports, this may be because of errors under torch, such as #2145 and keras-team/keras#20459.
So I am first submitting a change to see where the tests fail, and will then adjust the code.
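
For context, here is a minimal sketch of how one can check whether torch's flash attention kernel accepts a given set of inputs, independent of Keras. It assumes PyTorch >= 2.2 (for `torch.nn.attention.sdpa_kernel`) and a CUDA GPU; it illustrates the failure mode, and is not the dispatch path keras-hub actually uses.

```python
import torch
from torch.nn.attention import SDPBackend, sdpa_kernel

# Half-precision Q/K/V on GPU: the setup the flash kernel expects.
q = torch.randn(1, 8, 128, 64, device="cuda", dtype=torch.float16)
k = torch.randn(1, 8, 128, 64, device="cuda", dtype=torch.float16)
v = torch.randn(1, 8, 128, 64, device="cuda", dtype=torch.float16)

# Restrict SDPA to the flash backend only. If flash attention cannot
# handle these inputs, this raises instead of silently falling back,
# which makes the failure visible.
with sdpa_kernel(SDPBackend.FLASH_ATTENTION):
    out = torch.nn.functional.scaled_dot_product_attention(q, k, v)
print(out.shape)  # torch.Size([1, 8, 128, 64])
```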

@mattdangerw (Member) commented

@divyashreepathihalli any thoughts on the overall goal with this PR?

In terms of implementation, let's wait until #2184 goes in and add this on top. We should add some unit tests to, at the very least, make sure the fused attention op is hit only when it should be. In general we are trying to make the flash attention code less spaghetti-like, since there are so many cases to handle.
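
A sketch of the kind of unit test meant here, using `unittest.mock` to assert the fused op is reached. The patch target assumes Keras's torch backend reaches `torch.nn.functional.scaled_dot_product_attention` at call time; the layer and config calls are illustrative, not the actual keras-hub test:

```python
from unittest import mock

import keras
import numpy as np


def test_fused_attention_is_hit():
    # Illustrative layer under test; the real test would use the
    # keras-hub attention layer this PR touches.
    layer = keras.layers.MultiHeadAttention(num_heads=2, key_dim=16)
    x = np.random.rand(2, 8, 32).astype("float32")
    layer(x, x)  # build the layer with the real op first

    keras.config.enable_flash_attention()
    try:
        # Patch torch's fused op with a shape-preserving stub and
        # verify the attention layer routes through it.
        with mock.patch(
            "torch.nn.functional.scaled_dot_product_attention",
            side_effect=lambda q, k, v, **kwargs: q,
        ) as fused_op:
            layer(x, x)
            fused_op.assert_called()
    finally:
        keras.config.disable_flash_attention()
```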

@divyashreepathihalli (Collaborator) commented

Can you please also add a mock test similar to [Here]?

@pass-lin (Author) commented Apr 5, 2025

> Can you please also add a mock test similar to [Here]?

Of course, no problem. But before that, please run a GPU test so we can see where the current code fails.

@divyashreepathihalli divyashreepathihalli added the kokoro:force-run Runs Tests on GPU label Apr 10, 2025
@divyashreepathihalli (Collaborator) commented

The branch has conflicts; please resolve them so the tests can run.

@pass-lin (Author) commented Apr 12, 2025

@divyashreepathihalli
I can run the tests locally, so I don't quite understand what went wrong here.
[screenshot of the local test run]

@pass-lin (Author) commented

I've been quite busy lately, and I think I should focus on these PRs for now:

keras-team/keras#21160
#2192
#2177

So I'll close this PR for now; I will reopen and resubmit it in the near future.

cc @divyashreepathihalli

pass-lin closed this Apr 13, 2025
Labels: kokoro:force-run (Runs Tests on GPU)