[feat](PA): support pa_decode_gluon and refactor attention ops #42

PerryZhang01 · 2025-12-11T03:11:42Z

This PR integrates a new paged attention Triton kernel that supports the sliding_window and sink parameters. It also refactors the attention layers to use attention backend dispatch.
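For readers unfamiliar with the two parameters, here is a minimal pure-Python reference sketch. It is not the Triton kernel from this PR. It assumes that sliding_window restricts attention to the most recent cached tokens and that sink is an extra per-head logit that joins the softmax denominator without contributing a value vector (the attention-sink style used by gpt-oss). The function name, signature, and the unpaged list-based KV layout are hypothetical and chosen only for illustration.

```python
import math


def decode_attention_reference(q, keys, values, sliding_window=None, sink=None):
    """Single-head, single-query decode attention over a contiguous KV history.

    Hypothetical reference, not the kernel's real interface.
    q:              list[float] of length d (the current query)
    keys, values:   one list[float] per cached token, oldest first
    sliding_window: if set, attend only to the last `sliding_window` tokens (assumption)
    sink:           optional extra logit added to the softmax denominator only (assumption)
    """
    start = 0
    if sliding_window is not None:
        start = max(0, len(keys) - sliding_window)

    scale = 1.0 / math.sqrt(len(q))
    logits = [scale * sum(a * b for a, b in zip(q, k)) for k in keys[start:]]

    # Subtract the running maximum (including the sink) for numerical stability.
    max_logit = max(logits + ([sink] if sink is not None else []))
    weights = [math.exp(l - max_logit) for l in logits]
    denom = sum(weights)
    if sink is not None:
        # The sink absorbs probability mass but has no value vector.
        denom += math.exp(sink - max_logit)

    out = [0.0] * len(values[0])
    for w, v in zip(weights, values[start:]):
        for i, x in enumerate(v):
            out[i] += (w / denom) * x
    return out
```

With all logits equal, a window of 2 averages only the last two values, and adding a sink logit of 0.0 scales that result down because the sink takes an equal share of the softmax mass.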

Accuracy of gpt-oss on the gsm8k dataset:

valarLip · 2025-12-11T10:20:40Z

atom/model_ops/attention_mha.py

-                        output_zeros=False,
-                    )
-            else:
-                if self.rotary_emb is not None:


this pass missed?

PerryZhang01 · 2025-12-18

Yes. Do we currently have a model that uses reshape_and_cache_with_pertoken_quant? If not, we don't want to introduce a new dispatch or if/else branch; we can add it when it becomes necessary.

valarLip · 2025-12-17T08:25:52Z

atom/utils/forward_context.py

        self.reduce_indptr = reduce_indptr
        self.reduce_final_map = reduce_final_map
        self.reduce_partial_map = reduce_partial_map
-        if block_tables_converted is not None:


Keep these.

PerryZhang01 · 2025-12-18

Someone deleted it while rebasing onto main; I will restore it.

zgplvyou and others added 6 commits December 11, 2025 02:57

[feat](PA): support pa_decode_gluon and refactor attention ops

1237dc5

[fix](PA): change zeros to empty

054049b

[fix](PA): init fake block table on device

629a712

Merge branch 'main' into pa_gluon

d82b4ba

update

f5ad24d

[fix](attn): support rotary_emb none

493fbae

valarLip reviewed Dec 17, 2025

zgplvyou added 2 commits December 18, 2025 02:33

[fix](PA): add one param for pa

0e98b34

[fix](pa): recover block tables convert

13dd91b

5 participants
