@ck-intel
Contributor

Rename the parameter "softmax_sink" to "sinks" in the flash_attn_with_kvcache kernel to match the name used in the sglang framework.
Reference file: sglang/python/sglang/srt/layers/attention/xpu_backend.py

Run the tests:

python -m pytest tests/test_flash_attention.py::test_flash_attn_kvcache -v
96 passed, 102 skipped 
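
For callers that still pass the old keyword, a rename like this can be bridged with a small compatibility shim. The following is only a minimal sketch and is not part of this change: `rename_kwarg` and `attn_stub` are hypothetical names introduced for illustration, and `attn_stub` is a stand-in that does not reflect the real flash_attn_with_kvcache signature.

```python
import functools
import warnings


def rename_kwarg(old_name, new_name):
    """Accept a deprecated keyword name and forward it under the new name."""

    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            if old_name in kwargs:
                # Passing both spellings is ambiguous, so reject it outright.
                if new_name in kwargs:
                    raise TypeError(
                        f"{func.__name__}() got both '{old_name}' and '{new_name}'"
                    )
                warnings.warn(
                    f"'{old_name}' is deprecated, use '{new_name}'",
                    DeprecationWarning,
                    stacklevel=2,
                )
                # Move the value from the old keyword to the new one.
                kwargs[new_name] = kwargs.pop(old_name)
            return func(*args, **kwargs)

        return wrapper

    return decorator


# Hypothetical stand-in; NOT the real flash_attn_with_kvcache signature.
@rename_kwarg("softmax_sink", "sinks")
def attn_stub(q, sinks=None):
    return {"q": q, "sinks": sinks}
```

With this in place, `attn_stub(q, softmax_sink=x)` would behave like `attn_stub(q, sinks=x)` while emitting a `DeprecationWarning`, so out-of-tree callers could migrate gradually.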

@adityachatter adityachatter left a comment

LGTM.

@deepvars deepvars added the run-ci label Nov 3, 2025
@sunjiweiswift
Collaborator

sunjiweiswift commented Nov 4, 2025

"benchmark/bench_flash_attn.py" and "python/sgl_kernel/flash_attn.py" also need the sink parameter renamed.

@ck-intel
Contributor Author

ck-intel commented Nov 4, 2025

"benchmark/bench_flash_attn.py" and "python/sgl_kernel/flash_attn.py" also need the sink parameter renamed.

Hi @sunjiweiswift, I already did that. In case I missed anything, could you point to the specific lines of code in these files?

…e kernel, as "sinks" is used in the sglang framework
@ck-intel ck-intel force-pushed the rename_sfotmax_sink branch from d0646ad to f468bbf on November 4, 2025 07:34
@sunjiweiswift
Collaborator

Sorry. You've already corrected it. I made a mistake.

@ck-intel ck-intel requested a review from deepvars November 4, 2025 07:37
@deepvars deepvars merged commit 5c88329 into sgl-project:main Nov 4, 2025
2 checks passed

5 participants