DO NOT COMMIT test for u55 + mv2 #9830
base: main
Conversation
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/9830
Note: Links to docs will display an error until the docs builds have been completed.
❌ 2 New Failures
As of commit e466669 with merge base bcf4b46.
NEW FAILURES - The following jobs have failed:
This comment was automatically generated by Dr. CI and updates every 15 minutes.
This PR needs a
Force-pushed from 8a305e0 to 70d1da7
Force-pushed from 70d1da7 to e466669
@zingo - If I run MV2 locally I get a different number for the Ethos PMU cycles; any clue on how to start debugging? I believe George also confirmed that something odd is going on after comparing two runs.
When you run it locally, do you use run.sh or test_model.py? (--memory_mode Sram_Only seems to be missing from at least this version of the patch.) But I get the same big number, 8M, with it, so looking in the log there is probably a bug in the backends/arm/test/test_model.py test flow: I can spot that Vela is using Sram_Only but the elf is built with Shared_Sram. So until we fix this, run.sh would work best for testing (or running all the sub-scripts directly).
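To make the mismatch concrete, here is a minimal sketch of the kind of consistency check that would catch this divergence early. This is not code from this PR or from test_model.py; the function and parameter names are hypothetical.

```python
# Hypothetical sketch only - not part of this PR or of test_model.py.
# Idea: fail fast if the memory mode passed to Vela differs from the memory
# mode the runtime elf is built with (e.g. Sram_Only vs Shared_Sram).

def check_memory_mode_consistency(vela_memory_mode: str, elf_memory_mode: str) -> None:
    """Raise if the compile-time (Vela) and build-time (elf) memory modes diverge."""
    if vela_memory_mode != elf_memory_mode:
        raise RuntimeError(
            f"Memory mode mismatch: Vela compiled for {vela_memory_mode}, "
            f"but the elf was built for {elf_memory_mode}"
        )

# The reported bug, expressed with this check, would raise immediately:
check_memory_mode_consistency("Sram_Only", "Shared_Sram")
```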
Found the bug and a PR is on its way :) With it I now see 5.5M NPU cycles on mv2 when using Sram_Only with test_model.py as well, so it should match run.sh testing. Thanks for testing and reporting this :)
I hope this PR will fix the problem you hit.
Thanks @zingo @digantdesai!
For Shared_Sram, we see a difference in performance between the Ethos_U55_Deep_Embedded and the Ethos_U55_High_End_Embedded system configs, and that is expected. This is because the memory bandwidth and latency differ between the two system configurations. With the deeply embedded timing adapter settings, the NPU perceives a bus bandwidth of 4 bits/clock cycle (= 250 MB/s if you are at 500 MHz) on the external memory, whereas for High_End_Embedded the bus bandwidth on the external memory is 8 bits/cc (or 500 MB/s if the NPU is clocked at 500 MHz).

Note that even though the performance differs for Shared_Sram (8.3M vs 6.7M cycles), the PMU counters for the number of beats read on the AXI0/AXI1 interfaces are the same. In other words, the NPU reads the same amount of data in both the deeply embedded and high end embedded system configs. This is again expected because we run the same model; in the deeply embedded case the NPU simply has to wait longer for the data to arrive due to the lower bandwidth on the Flash.

As for Sram_Only, we place everything in the SRAM, and the SRAM bandwidth is the same between the Deeply Embedded and High End systems, therefore we see identical performance.
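For reference, the bandwidth figures quoted above follow directly from the bits-per-cycle numbers. A quick back-of-the-envelope check, assuming a 500 MHz NPU clock and MB meaning 10^6 bytes:

```python
# Sanity check of the quoted external-memory bandwidths (assumes a 500 MHz NPU clock).
def bandwidth_mb_per_s(bits_per_cycle: int, clock_hz: float = 500e6) -> float:
    """Convert a timing-adapter bus width in bits per NPU clock cycle to MB/s."""
    return bits_per_cycle * clock_hz / 8 / 1e6  # bits/s -> bytes/s -> MB/s

print(bandwidth_mb_per_s(4))  # Deep_Embedded external memory:     250.0 MB/s
print(bandwidth_mb_per_s(8))  # High_End_Embedded external memory: 500.0 MB/s
```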
Summary
[PLEASE REMOVE] See CONTRIBUTING.md's Pull Requests for ExecuTorch PR guidelines.
[PLEASE REMOVE] If this PR closes an issue, please add a Fixes #<issue-id> line.
[PLEASE REMOVE] If this PR introduces a fix or feature that should be in the upcoming release notes, please add a "Release notes: " label. For a list of available release notes labels, check out CONTRIBUTING.md's Pull Requests.
Test plan
[PLEASE REMOVE] How did you test this PR? Please write down any manual commands you used and note down tests that you have written if applicable.