Conversation

vermavis

No description provided.

block_list = attn_metadata.block_list
block_groups = attn_metadata.block_groups
block_mapping = attn_metadata.block_mapping
attn_bias = attn_metadata.attn_bias
Collaborator

I don't think we should rename the attributes here. You can do

if not self.sliding_window or attn_metadata.window_block_list is None:
    block_list = attn_metadata.block_list
    block_groups = attn_metadata.block_groups
    block_mapping = attn_metadata.block_mapping
    attn_bias = attn_metadata.attn_bias
else:
    block_list = attn_metadata.window_block_list
    block_groups = attn_metadata.window_block_groups
    block_mapping = attn_metadata.window_block_mapping
    attn_bias = attn_metadata.window_attn_bias

Author

Yeah, the problem was this:

(Worker_TP7 pid=102175) ERROR 09-17 19:58:46 [multiproc_executor.py:671] AttributeError: 'TrimmedAttentionMetadata' object has no attribute 'window_block_list'
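For reference, a defensive variant of the selection suggested above would sidestep this error when the trimmed metadata was built without the window_* fields. This is only a sketch; the getattr fallback is an assumption, not the fix that was eventually applied:

# Hypothetical fallback: use the regular block tables whenever the trimmed
# metadata does not carry the sliding-window attributes.
window_block_list = getattr(attn_metadata, 'window_block_list', None)
if not self.sliding_window or window_block_list is None:
    block_list = attn_metadata.block_list
    block_groups = attn_metadata.block_groups
    block_mapping = attn_metadata.block_mapping
    attn_bias = attn_metadata.attn_bias
else:
    block_list = window_block_list
    block_groups = attn_metadata.window_block_groups
    block_mapping = attn_metadata.window_block_mapping
    attn_bias = attn_metadata.window_attn_bias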

Author

OK, never mind. I see the code in vllm-fork does have this missing attribute. Let me fix it here.
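For illustration, the shape of that fix: whatever builds the trimmed metadata has to copy the window_* attributes alongside the existing ones. The dataclass below is only a sketch with assumed field types, not the actual TrimmedAttentionMetadata definition in vllm-fork:

from dataclasses import dataclass
from typing import Optional

import torch


@dataclass
class TrimmedAttentionMetadataSketch:
    # Fields already consumed by the non-sliding-window path (see snippet above).
    block_list: Optional[torch.Tensor] = None
    block_groups: Optional[torch.Tensor] = None
    block_mapping: Optional[torch.Tensor] = None
    attn_bias: Optional[torch.Tensor] = None
    # Sliding-window counterparts whose absence raised the AttributeError.
    window_block_list: Optional[torch.Tensor] = None
    window_block_groups: Optional[torch.Tensor] = None
    window_block_mapping: Optional[torch.Tensor] = None
    window_attn_bias: Optional[torch.Tensor] = None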

Author

@vermavis, Sep 18, 2025

@xuechendi I tried PR #150, which has sliding window support, with the following change on top of it, but unfortunately the accuracy still looks off.

diff --git a/vllm_gaudi/attention/backends/hpu_attn.py b/vllm_gaudi/attention/backends/hpu_attn.py
index a558079..1206cbe 100644
--- a/vllm_gaudi/attention/backends/hpu_attn.py
+++ b/vllm_gaudi/attention/backends/hpu_attn.py
@@ -351,8 +351,10 @@ class HPUAttentionImpl(AttentionImpl, torch.nn.Module):
         attn_type: str = AttentionType.DECODER,
         kv_sharing_target_layer_name: Optional[str] = None,
         use_irope: bool = False,
+        sinks: Optional[int] = None,
     ) -> None:
         super(AttentionImpl, self).__init__()
+        self._sinks = sinks
         if kv_sharing_target_layer_name is not None:
             raise NotImplementedError("KV sharing is not currently supported on HPU.")
         if use_irope:
diff --git a/vllm_gaudi/extension/ops.py b/vllm_gaudi/extension/ops.py
index 4e01ec8..905d93d 100644
--- a/vllm_gaudi/extension/ops.py
+++ b/vllm_gaudi/extension/ops.py
@@ -484,7 +484,7 @@ class VllmMixtureOfExpertsOp(torch.nn.Module):
                                                     w12=w1_list,
                                                     w3=w2_list,
                                                     permuted_weights=permuted_weights,
-                                                    activation=activation,
+                                                    activation="silu",
                                                     experts_min=self.experts_min,
                                                     experts_max=self.experts_max)
         for i in range(self.moe_n_slice):

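Side note on the first hunk: it only stores sinks without applying them anywhere. For reference, one common formulation of gpt-oss-style attention sinks folds a per-head sink logit into the softmax denominator, roughly as in the sketch below; the helper name and tensor shapes are illustrative assumptions, not the HPU backend's API:

import torch


def sdpa_with_sinks(q, k, v, sinks, scale):
    # q, k, v: [heads, q_len/k_len, head_dim]; sinks: [heads] learned logits.
    scores = torch.matmul(q, k.transpose(-1, -2)) * scale
    sink = sinks.view(-1, 1, 1).expand(scores.shape[0], scores.shape[1], 1)
    # The sink column absorbs probability mass but contributes no value.
    probs = torch.softmax(torch.cat([scores, sink], dim=-1), dim=-1)
    return torch.matmul(probs[..., :-1], v)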
Collaborator

If silu is necessary, pass it through config instead of hard-coding it.
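A sketch of that suggestion, assuming the activation can be taken at construction time; the class below is a stand-in for illustration, not the existing VllmMixtureOfExpertsOp API:

import torch


class MixtureOfExpertsOpSketch(torch.nn.Module):
    # Illustrative only: carry the activation name as config instead of a literal.
    def __init__(self, activation: str = "silu"):
        super().__init__()
        self.activation = activation

    def forward(self, hidden_states: torch.Tensor) -> torch.Tensor:
        # The real op dispatches to a fused HPU kernel; this only shows the
        # config-driven selection the comment asks for.
        if self.activation == "silu":
            return torch.nn.functional.silu(hidden_states)
        raise NotImplementedError(f"unsupported activation: {self.activation}")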

Author

Looks like upstream vLLM has this hardcoded to swigluoai:
https://github.com/vllm-project/vllm/blob/main/vllm/model_executor/models/gpt_oss.py#L158
