[SYCL] Optimize getUrEvents #20895
Closed
+123
−176
Conversation
Contributor
PatKamin
commented
Dec 15, 2025
- Avoid increasing the reference count of a `shared_ptr` by retrieving the handle directly.
- Reserve memory for the whole vector of handles beforehand, avoiding possible reallocations.

This reverts commit 666693d.
Force-pushed 78a3a26 to a4b8002 (Compare)
Comment on lines 175 to 276:
```yaml
    run: |
      # Build and run benchmarks
      echo "::group::install_python_deps"
      echo "Installing python dependencies..."
      # Using --break-system-packages because:
      # - venv is not installed
      # - unable to install anything via pip, as python packages in the docker
      #   container are managed by apt
      # - apt is unable to install anything due to unresolved dpkg dependencies,
      #   as a result of how the sycl nightly images are created
      pip install --user --break-system-packages -r ./devops/scripts/benchmarks/requirements.txt
      echo "::endgroup::"
  - name: Run sycl-ls
    shell: bash
    run: |
      # Run sycl-ls
      echo "::group::establish_parameters_and_vars"
      export CMPLR_ROOT=./toolchain
      # By default, the benchmark scripts forceload level_zero
      FORCELOAD_ADAPTER="${ONEAPI_DEVICE_SELECTOR%%:*}"
      echo "Adapter: $FORCELOAD_ADAPTER"
      case "$ONEAPI_DEVICE_SELECTOR" in
        level_zero:*) SAVE_SUFFIX="L0" ;;
        level_zero_v2:*)
          SAVE_SUFFIX="L0v2"
          export ONEAPI_DEVICE_SELECTOR="level_zero:gpu" # "level_zero_v2:gpu" not supported anymore
          export SYCL_UR_USE_LEVEL_ZERO_V2=1
          ;;
        opencl:*) SAVE_SUFFIX="OCL" ;;
        *) SAVE_SUFFIX="${ONEAPI_DEVICE_SELECTOR%%:*}" ;;
      esac
      case "$RUNNER_TAG" in
        '["PVC_PERF"]') MACHINE_TYPE="PVC" ;;
        '["BMG_PERF"]') MACHINE_TYPE="BMG" ;;
        # Best effort at matching
        *)
          MACHINE_TYPE="${RUNNER_TAG#[\"}"
          MACHINE_TYPE="${MACHINE_TYPE%_PERF\"]}"
          ;;
      esac
      SAVE_NAME="${SAVE_PREFIX}_${MACHINE_TYPE}_${SAVE_SUFFIX}"
      echo "SAVE_NAME=$SAVE_NAME" >> $GITHUB_ENV
      SAVE_TIMESTAMP="$(date -u +'%Y%m%d_%H%M%S')" # Timestamps are in UTC time
      # Cache the compute_runtime version from dependencies.json, but perform a
      # check with L0 version before using it: This value is not guaranteed to
      # accurately reflect the current compute_runtime version used, as the
      # docker images are built nightly.
      export COMPUTE_RUNTIME_TAG_CACHE="$(cat ./devops/dependencies.json | jq -r .linux.compute_runtime.github_tag)"

      echo "::endgroup::"
      echo "::group::sycl_ls"
      sycl-ls --verbose
  - name: Build and run benchmarks
    shell: bash
    env:
      BENCH_WORKDIR: ${{ steps.establish_outputs.outputs.BENCH_WORKDIR }}
      BENCHMARK_RESULTS_REPO_PATH: ${{ steps.establish_outputs.outputs.BENCHMARK_RESULTS_REPO_PATH }}
    run: |
      # Build and run benchmarks
      echo "::endgroup::"
      echo "::group::run_benchmarks"

      echo "::group::setup_workdir"
      if [ -n "$BENCH_WORKDIR" ] && [ -d "$BENCH_WORKDIR" ] && [[ "$BENCH_WORKDIR" == *llvm_test_workdir* ]]; then rm -rf "$BENCH_WORKDIR" ; fi
      WORKDIR="$(realpath ./llvm_test_workdir)"
      if [ -n "$WORKDIR" ] && [ -d "$WORKDIR" ] && [[ "$WORKDIR" == *llvm_test_workdir* ]]; then rm -rf "$WORKDIR" ; fi

      # Clean up potentially existing, old summary files
      [ -f "github_summary_exe.md" ] && rm github_summary_exe.md
      [ -f "github_summary_reg.md" ] && rm github_summary_reg.md

      echo "::endgroup::"
      echo "::group::run_benchmarks"
      numactl --cpunodebind "$NUMA_NODE" --membind "$NUMA_NODE" \
        ./devops/scripts/benchmarks/main.py "$BENCH_WORKDIR" \
        --sycl "$(realpath $CMPLR_ROOT)" \
        ./devops/scripts/benchmarks/main.py "$WORKDIR" \
        --sycl "$(realpath ./toolchain)" \
        --ur "$(realpath ./ur/install)" \
        --adapter "$FORCELOAD_ADAPTER" \
        --save "$SAVE_NAME" \
        --output-html remote \
        --results-dir "${BENCHMARK_RESULTS_REPO_PATH}/" \
        --output-dir "${BENCHMARK_RESULTS_REPO_PATH}/" \
        --results-dir "./llvm-ci-perf-results/" \
        --output-dir "./llvm-ci-perf-results/" \
        --preset "$PRESET" \
        --timestamp-override "$SAVE_TIMESTAMP" \
        --detect-version sycl,compute_runtime \
        --produce-github-summary \
        ${{ inputs.exit_on_failure == 'true' && '--exit-on-failure --iterations 1' || '' }}
      # TODO: add back: "--flamegraph inclusive" once works properly

      echo "::endgroup::"
      echo "::group::compare_results"
      python3 ./devops/scripts/benchmarks/compare.py to_hist \
        --avg-type EWMA \
        --cutoff "$(date -u -d '7 days ago' +'%Y%m%d_%H%M%S')" \
        --name "$SAVE_NAME" \
        --compare-file "${BENCHMARK_RESULTS_REPO_PATH}/results/${SAVE_NAME}_${SAVE_TIMESTAMP}.json" \
        --results-dir "${BENCHMARK_RESULTS_REPO_PATH}/results/" \
        --compare-file "./llvm-ci-perf-results/results/${SAVE_NAME}_${SAVE_TIMESTAMP}.json" \
        --results-dir "./llvm-ci-perf-results/results/" \
        --regression-filter '^[a-z_]+_sycl .* CPU count' \
        --regression-filter-type 'SYCL benchmark (measured using CPU cycle count)' \
        --verbose \
        --produce-github-summary \
        ${{ inputs.dry_run == 'true' && '--dry-run' || '' }} \

      echo "::endgroup::"
  - name: Run benchmarks integration tests
    shell: bash
    if: ${{ github.event_name == 'pull_request' }}
    env:
      BENCH_WORKDIR: ${{ steps.establish_outputs.outputs.BENCH_WORKDIR }}
      LLVM_BENCHMARKS_UNIT_TESTING: 1
      COMPUTE_BENCHMARKS_BUILD_PATH: ${{ steps.establish_outputs.outputs.BENCH_WORKDIR }}/compute-benchmarks-build
    run: |
      # Run benchmarks' integration tests

      # Run benchmarks' integration tests
      # NOTE: Each integration test prints its own group name as part of test script
      python3 ./devops/scripts/benchmarks/tests/test_integration.py
  - name: Upload github summaries and cache changes
      if [ '${{ github.event_name == 'pull_request' }}' = 'true' ]; then
        export LLVM_BENCHMARKS_UNIT_TESTING=1
        export COMPUTE_BENCHMARKS_BUILD_PATH=$WORKDIR/compute-benchmarks-build
        python3 ./devops/scripts/benchmarks/tests/test_integration.py
      fi
  - name: Cache changes and upload github summary
    if: always()
    shell: bash
    env:
      BENCHMARK_RESULTS_REPO_PATH: ${{ steps.establish_outputs.outputs.BENCHMARK_RESULTS_REPO_PATH }}
```
Check failure (Code scanning / zizmor): dangerous use of environment file
Comment on lines +164 to +168:
```yaml
- name: Checkout results repo
  uses: actions/checkout@v5
  with:
    ref: ${{ env.BENCHMARK_RESULTS_BRANCH }}
    path: llvm-ci-perf-results
```
Check warning (Code scanning / zizmor): credential persistence through GitHub Actions artifacts
Contributor (Author)
No perf improvement gained.