Skip to content

[Snippets][CPU] Introduce jit_binary_call emitter on ARM, fix large offsets issue in Gemm #31553

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Conversation

aobolensk
Copy link
Contributor

@aobolensk aobolensk commented Aug 1, 2025

Details:

  • Introduce jit_binary_call emitter on ARM for calling external functions in a more unified way and switch Gemm and GemmCopyB emitters to use it
  • Fix large offsets issue in Gemm
  • Fix register allocation and splilling
  • Enable all MatMulBias tests

Tickets:

  • 169427
  • required for 167005

@aobolensk aobolensk requested review from a team as code owners August 1, 2025 08:06
@aobolensk aobolensk added the platform: arm OpenVINO on ARM / ARM64 label Aug 1, 2025
@github-actions github-actions bot added the category: CPU OpenVINO CPU plugin label Aug 1, 2025
@aobolensk aobolensk force-pushed the snippets-arm-jit-binary-call-emitter branch 2 times, most recently from 5d584fa to b0328ac Compare August 1, 2025 09:53
@a-sidorova a-sidorova modified the milestones: 2025.2, 2025.3 Aug 1, 2025
Copy link
Contributor

@a-sidorova a-sidorova left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed everything except jit_load_store_emitter parts

@aobolensk aobolensk force-pushed the snippets-arm-jit-binary-call-emitter branch 3 times, most recently from 1b05f56 to e163aef Compare August 4, 2025 09:23
@aobolensk aobolensk requested a review from a-sidorova August 4, 2025 11:30
@aobolensk aobolensk requested a review from v-Golubev August 5, 2025 06:18
Comment on lines 105 to 106
const bool is_dynamic_offset = ov::snippets::utils::is_dynamic_value(m_memory_offsets[i]);
const bool is_valid_buffer_id = !ov::snippets::utils::is_dynamic_value(m_buffer_ids[i]);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. Please compare m_buffer_id with SIZE_MAX because this value is returned from ov::intel_cpu::utils::get_buffer_cluster_id. If we change dynamic value from size_max to any another (for example, size_max-1), this check will be invalid. So let's align check with with get_buffer_cluster_id here

  2. Should we call assert if the offset is dynamic and id is size_max? As it's done in memory emitters

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added these checks

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OPENVINO_ASSERT is still missed here. Please double-check

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Returned it back

@aobolensk aobolensk force-pushed the snippets-arm-jit-binary-call-emitter branch 3 times, most recently from 4bb0d29 to e02ae65 Compare August 5, 2025 13:00
@aobolensk aobolensk requested a review from a-sidorova August 5, 2025 13:09
Comment on lines 105 to 106
const bool is_dynamic_offset = ov::snippets::utils::is_dynamic_value(m_memory_offsets[i]);
const bool is_valid_buffer_id = !ov::snippets::utils::is_dynamic_value(m_buffer_ids[i]);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OPENVINO_ASSERT is still missed here. Please double-check

if (off_mod == 0 && off_mul_vl >= 0 && off_mul_vl <= 4095) {
h->ldr(QReg(dst.getIdx()), ptr(src, static_cast<uint32_t>(offset)));
} else {
h->add_imm(h->X_DEFAULT_ADDR, src, offset, h->X_TMP_0);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a question for better understanding: is it safe to use h->X_DEFAULT_ADDR and h->X_TMP_0 here? How do we guarantee that src or dst are not X_DEFAULT_ADDR and X_TMP_0?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

YEs, we do. They are basically not used in code generation as inputs or outputs

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@v-Golubev I had the same question but these registers are in blacklist for AssignRegister pass:

// Note: more details on the usage of reserved registers in aarch64/jit_kernel_emitter.cpp
if (none_of(i, Operand::SP, Operand::X18, Operand::X23, Operand::X24, Operand::X28, Operand::X29)) {
reg_pool.emplace_back(snippets::RegType::gpr, i);
}
}
return reg_pool;

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it possible to create a static vector with reserved registers somewhere? Just to avoid such questions in the future: we will have one place (vec initialization) where we will explain why they are in a blacklist. And we will be able to use them in the places we need using this vec. @a-sidorova @aobolensk what do you think?

@aobolensk aobolensk force-pushed the snippets-arm-jit-binary-call-emitter branch from 4b9f771 to a20f240 Compare August 6, 2025 18:36
if (off_mod == 0 && off_mul_vl >= 0 && off_mul_vl <= 4095) {
h->ldr(QReg(dst.getIdx()), ptr(src, static_cast<uint32_t>(offset)));
} else {
h->add_imm(h->X_DEFAULT_ADDR, src, offset, h->X_TMP_0);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@v-Golubev I had the same question but these registers are in blacklist for AssignRegister pass:

// Note: more details on the usage of reserved registers in aarch64/jit_kernel_emitter.cpp
if (none_of(i, Operand::SP, Operand::X18, Operand::X23, Operand::X24, Operand::X28, Operand::X29)) {
reg_pool.emplace_back(snippets::RegType::gpr, i);
}
}
return reg_pool;

@aobolensk aobolensk force-pushed the snippets-arm-jit-binary-call-emitter branch from d0e26c7 to a8fec5c Compare August 7, 2025 05:47
@aobolensk aobolensk force-pushed the snippets-arm-jit-binary-call-emitter branch 2 times, most recently from e7031da to 34483bb Compare August 7, 2025 07:54
Copy link
Contributor

@a-sidorova a-sidorova left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@a-sidorova a-sidorova requested a review from v-Golubev August 8, 2025 11:56
if (off_mod == 0 && off_mul_vl >= 0 && off_mul_vl <= 4095) {
h->ldr(QReg(dst.getIdx()), ptr(src, static_cast<uint32_t>(offset)));
} else {
h->add_imm(h->X_DEFAULT_ADDR, src, offset, h->X_TMP_0);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it possible to create a static vector with reserved registers somewhere? Just to avoid such questions in the future: we will have one place (vec initialization) where we will explain why they are in a blacklist. And we will be able to use them in the places we need using this vec. @a-sidorova @aobolensk what do you think?

@aobolensk aobolensk requested a review from v-Golubev August 8, 2025 14:12
Copy link
Contributor

@v-Golubev v-Golubev left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@aobolensk aobolensk enabled auto-merge August 8, 2025 16:56
@aobolensk aobolensk added this pull request to the merge queue Aug 8, 2025
Merged via the queue into openvinotoolkit:master with commit 3195c6f Aug 8, 2025
229 of 232 checks passed
@aobolensk aobolensk deleted the snippets-arm-jit-binary-call-emitter branch August 8, 2025 19:31
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
category: CPU OpenVINO CPU plugin platform: arm OpenVINO on ARM / ARM64
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants