2 changes: 2 additions & 0 deletions projects/rocprofiler-compute/CHANGELOG.md
Expand Up @@ -27,6 +27,8 @@ Full documentation for ROCm Compute Profiler is available at [https://rocm.docs.

* Option ``--rocprofiler-sdk-library-path`` has been changed to ``--rocprofiler-tool-library-path`` to better reflect the fact that we provide flexibility in choosing the path to ROCprofiler-SDK tool and not the library.

* Standalone roofline (``--roof-only`` option) in profile mode now creates HTML output instead of PDF output for roofline charts.

### Resolved issues

* Fixed the meaning of the ``--dispatch`` option in profile mode in argparser to convey that it controls which iterations of the kernel to profile, not which dispatch IDs to profile.
Expand Down
2 changes: 0 additions & 2 deletions projects/rocprofiler-compute/README.md
Expand Up @@ -83,8 +83,6 @@ To build the binary we follow these steps:
NOTE: Since RHEL 8 ships with glibc version 2.28, this standalone binary can only be run on environment with glibc version greater than 2.28.
glibc version can be checked using `ldd --version` command.

NOTE: libnss3.so shared library is required when using --roof-only option which generates roofline data in PDF format

To test the standalone binary provide the `--call-binary` option to pytest.

## How to Cite
Expand Down
Expand Up @@ -74,8 +74,8 @@ application's profiling data:
#. Memory Chart Analysis
#. Empirical Roofline Analysis

Use ``--roofline-data-type`` option to specify which data type(s) you would like displayed on the roofline PDFs in the standalone analysis GUI.
Data types can be stacked- for example, "--roofline-data-type FP32 FP64 I32" would display one PDF with FP32 and FP64 stacked, and one PDF with INT32.
Use ``--roofline-data-type`` option to specify which data type(s) you would like displayed on the roofline HTMLs in the standalone analysis GUI.
Data types can be stacked- for example, "--roofline-data-type FP32 FP64 I32" would display one HTML with FP32 and FP64 stacked, and one HTML with INT32.
Default roofline data type plotted is FP32.

#. Top Stats (Top Kernel Statistics)
Expand Down
18 changes: 9 additions & 9 deletions projects/rocprofiler-compute/docs/how-to/profile/mode.rst
Expand Up @@ -208,8 +208,8 @@ an Instinct MI210 vs an Instinct MI250.
Additionally, you will notice a few extra files. An SoC parameters file,
``sysinfo.csv``, is created to reflect the target device settings. All
profiling output is stored in ``log.txt``. Roofline-specific benchmark
results are stored in ``roofline.csv`` and roofline plots are outputted into PDFs as
``empirRoof_gpu-0_[datatype1]_..._[datatypeN].pdf`` where data types requested through
results are stored in ``roofline.csv`` and roofline plots are outputted into HTMLs as
``empirRoof_gpu-0_[datatype1]_..._[datatypeN].html`` where data types requested through
``--roofline-data-type`` option are listed in the file name.

.. code-block:: shell-session
Expand Down Expand Up @@ -556,7 +556,7 @@ Roofline analysis occurs on any profile mode run, provided ``--no-roof`` option
You don't need to include any additional roofline-specific options for roofline analysis.
If you want to focus only on roofline-specific performance data and reduce the time it takes to profile, you can use the ``--roof-only`` option.
This option checks if there is existing profiling data in the workload directory (``pmc_perf.csv`` and ``roofline.csv``):
a) If found, uses the data files with the provided arguments to create another roofline PDF output; otherwise,
a) If found, uses the data files with the provided arguments to create another roofline HTML output; otherwise,
b) Profile mode runs but is limited to collecting only roofline performance counters.
Note that ``--roof-only`` cannot be used with ``--block`` or ``--set`` options.
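
The reuse-or-profile decision described above can be sketched as a small check on the workload directory. This is a minimal illustration, not the tool's actual implementation: `roof_only_plan` and `ROOFLINE_DATA_FILES` are hypothetical names, and only the two file names come from the documentation text.

```python
from pathlib import Path

# Data files that the documentation says --roof-only looks for.
ROOFLINE_DATA_FILES = ("pmc_perf.csv", "roofline.csv")


def roof_only_plan(workload_dir):
    """Return "reuse" when both data files exist, otherwise "profile".

    Hypothetical helper for illustration; not part of rocprofiler-compute.
    """
    root = Path(workload_dir)
    if all((root / name).is_file() for name in ROOFLINE_DATA_FILES):
        # Case a): existing data can be used to create another roofline HTML.
        return "reuse"
    # Case b): profile, limited to the roofline performance counters.
    return "profile"
```

Under this sketch, a directory missing either file falls through to case b), which matches the "otherwise" wording in the text.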

Expand All @@ -580,13 +580,13 @@ Roofline options
utility. See :ref:`profiling-kernel-filtering`.

``--roofline-data-type <datatype>``
Allows you to specify data types that you want plotted in the roofline PDF output(s). Selecting more than one data type will overlay the results onto the same plot. Default: FP32
Allows you to specify data types that you want plotted in the roofline HTML output(s). Selecting more than one data type will overlay the results onto the same plot. Default: FP32

.. note::

For more information on data types supported based on the GPU architecture, see :doc:`../../conceptual/performance-model`

Each kernel in your ``.pdf`` roofline plot is automatically distinguished with a unique marker identifiable from the plot's key. The roofline PDF includes an integrated multi-subplot layout with:
Each kernel in your ``.html`` roofline plot is automatically distinguished with a unique marker identifiable from the plot's key. The roofline HTML includes an integrated multi-subplot layout with:

1. **Roofline Plot** - Shows performance ceilings and kernel arithmetic intensity points
2. **Plot Points & Values Table** - Displays AI values, performance metrics, memory/compute bound status, and cache levels for each kernel
Expand Down Expand Up @@ -633,14 +633,14 @@ The following example demonstrates profiling roofline data only:
GPU Device 0 (gfx942) with 304 CUs: Profiling...
99% [||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| ]
...
An inspection of our workload output folder shows ``.pdf`` plots were generated
An inspection of our workload output folder shows ``.html`` plots were generated
successfully.

.. code-block:: shell-session

$ ls workloads/occupancy/MI300X_A1
total 48
-rw-r--r-- 1 auser agroup 13331 Oct 29 10:33 empirRoof_gpu-0_FP32.pdf
-rw-r--r-- 1 auser agroup 13331 Oct 29 10:33 empirRoof_gpu-0_FP32.html
drwxr-xr-x 1 auser agroup 0 Oct 29 10:33 perfmon
-rw-r--r-- 1 auser agroup 1101 Oct 29 10:33 pmc_perf.csv
-rw-r--r-- 1 auser agroup 1715 Oct 29 10:33 roofline.csv
Expand All @@ -649,9 +649,9 @@ successfully.

.. note::

* ROCm Compute Profiler currently captures roofline profiling for all data types, and you can reduce the clutter in the PDF outputs by filtering the data type(s). Selecting multiple data types will overlay the results into the same PDF. To generate results in separate PDFs for each data type from the same workload run, you can re-run the profiling command with each data type as long as the ``roofline.csv`` file still exists in the workload folder.
* ROCm Compute Profiler currently captures roofline profiling for all data types, and you can reduce the clutter in the HTML outputs by filtering the data type(s). Selecting multiple data types will overlay the results in the same HTML file. To generate results in separate HTML files for each data type from the same workload run, you can re-run the profiling command with each data type as long as the ``roofline.csv`` file still exists in the workload folder.

The following image is a sample ``empirRoof_gpu-0_FP32.pdf`` roofline
The following image is a sample ``empirRoof_gpu-0_FP32.html`` roofline
plot.

.. image:: ../../data/profile/sample-roof-plot.jpg
Expand Down
4 changes: 2 additions & 2 deletions projects/rocprofiler-compute/src/argparser.py
Expand Up @@ -503,7 +503,7 @@ def omniarg_parser(
type=str,
default=["FP32"],
help=(
"\t\t\tChoose datatypes to view roofline PDFs for: (DEFAULT: FP32)\n"
"\t\t\tChoose datatypes to view roofline HTMLs for: (DEFAULT: FP32)\n"
"\t\t\t FP4\n"
"\t\t\t FP6\n"
"\t\t\t FP8\n"
Expand Down Expand Up @@ -694,7 +694,7 @@ def omniarg_parser(
type=str,
default=["FP32"],
help=(
"\t\tChoose datatypes to view roofline PDFs for: (DEFAULT: FP32)\n"
"\t\tChoose datatypes to view roofline HTMLs for: (DEFAULT: FP32)\n"
"\t\t\t FP4\n"
"\t\t\t FP6\n"
"\t\t\t FP8\n"
Expand Down
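For readers unfamiliar with how a multi-value option like this behaves, the following is a minimal `argparse` sketch. It assumes the option accepts one or more values (`nargs="+"`), which the hunks above do not show; the parser name and `build_parser` helper are hypothetical, and only the option name, `type=str`, the default, and the help wording come from the diff.

```python
import argparse


def build_parser():
    """Build a tiny parser with only the roofline data type option.

    Illustrative only; the real omniarg_parser defines many more options.
    """
    parser = argparse.ArgumentParser(prog="roofline-example")
    parser.add_argument(
        "--roofline-data-type",
        nargs="+",  # assumption: one or more data types per flag
        type=str,
        default=["FP32"],
        help="Choose datatypes to view roofline HTMLs for: (DEFAULT: FP32)",
    )
    return parser
```

With this sketch, `build_parser().parse_args([])` yields `roofline_data_type == ["FP32"]`, and passing `--roofline-data-type FP32 FP64` yields a two-element list.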
24 changes: 6 additions & 18 deletions projects/rocprofiler-compute/src/roofline.py
Expand Up @@ -376,7 +376,7 @@ def empirical_roofline(
all_flops_ceiling_data[str(dt)] = self.__ceiling_data

# Output will be different depending on interaction type:
# Save PDFs if we're in "standalone roofline" mode,
# Save HTMLs if we're in "standalone roofline" mode,
# otherwise return HTML to be used in GUI output

if self.__run_parameters["is_standalone"]:
Expand All @@ -386,28 +386,16 @@ def empirical_roofline(
kernel_list += "_" + name

if ops_figure:
actual_height = int(ops_figure.layout.height)
# minimum height of 1000 to avoid cutting off content
pdf_height = max(actual_height, 1000)

ops_figure.write_image(
f"{self.__run_parameters['workload_dir']}/empirRoof_gpu-{dev_id}{ops_dt_list}{kernel_list}.pdf",
width=1000,
height=pdf_height,
ops_figure.write_html(
f"{self.__run_parameters['workload_dir']}/empirRoof_gpu-{dev_id}{ops_dt_list}{kernel_list}.html"
)

if flops_figure:
actual_height = int(flops_figure.layout.height)
# minimum height of 1000 to avoid cutting off content
pdf_height = max(actual_height, 1000)

flops_figure.write_image(
f"{self.__run_parameters['workload_dir']}/empirRoof_gpu-{dev_id}{flops_dt_list}{kernel_list}.pdf",
width=1000,
height=pdf_height,
flops_figure.write_html(
f"{self.__run_parameters['workload_dir']}/empirRoof_gpu-{dev_id}{flops_dt_list}{kernel_list}.html"
)

console_log("roofline", "Empirical Roofline PDFs saved!")
console_log("roofline", "Empirical Roofline HTML file saved!")
else:
# Create HTML output for GUI mode.
ops_graph = (
Expand Down
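The f-strings in this hunk produce file names of the form `empirRoof_gpu-{dev_id}{dt_list}{kernel_list}.html`. The sketch below shows how such a name could be assembled. It assumes that each data type is appended with a leading underscore in the same way the hunk appends kernel names; `roofline_html_name` is a hypothetical helper, not a function in `roofline.py`.

```python
def roofline_html_name(dev_id, data_types, kernel_names=()):
    """Build a name such as empirRoof_gpu-0_FP32_FP64.html.

    Hypothetical helper; assumes every data type and kernel name is
    appended with a leading underscore, in the order given.
    """
    dt_suffix = "".join(f"_{dt}" for dt in data_types)
    kernel_suffix = "".join(f"_{name}" for name in kernel_names)
    return f"empirRoof_gpu-{dev_id}{dt_suffix}{kernel_suffix}.html"
```

For device 0 and a single `FP32` data type, this returns `empirRoof_gpu-0_FP32.html`, the file name the updated tests expect.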
2 changes: 1 addition & 1 deletion projects/rocprofiler-compute/src/utils/roofline_calc.py
Expand Up @@ -471,7 +471,7 @@ def calc_ai_profile(
"""Given counter data, calculate arithmetic intensity for each kernel
in the application. Leverage hard-coded equations to calculate AI values.

Used during profiling stage to generate roofline PDF, since Roofline yamls
Used during profiling stage to generate roofline HTML, since Roofline yamls
are not available in the profiling stage."""

console_debug(
Expand Down
38 changes: 19 additions & 19 deletions projects/rocprofiler-compute/tests/test_profile_general.py
Expand Up @@ -89,7 +89,7 @@
])

ROOF_ONLY_FILES = sorted([
"empirRoof_gpu-0_FP32.pdf",
"empirRoof_gpu-0_FP32.html",
"pmc_perf.csv",
"roofline.csv",
"sysinfo.csv",
Expand Down Expand Up @@ -813,9 +813,9 @@ def test_path_csv(
@pytest.mark.roofline_1
def test_roof_basic_validation(binary_handler_profile_rocprof_compute):
"""
Test basic roofline PDF generation with full validation pipeline.
Test basic roofline HTML generation with full validation pipeline.
This test runs the complete validation flow including counter logging
and metric comparison (if enabled in config). Validates that roofline PDFs
and metric comparison (if enabled in config). Validates that roofline HTMLs
are generated with the integrated multi-subplot layout (roofline plot +
plot points table + kernel names table).
"""
Expand Down Expand Up @@ -1134,9 +1134,9 @@ def test_roofline_empty_kernel_names_handling(binary_handler_profile_rocprof_com

assert returncode == 1, f"Expected error (returncode=1), got {returncode}"

pdf_files = list(Path(workload_dir).glob("empirRoof_*.pdf"))
assert len(pdf_files) == 0, (
"No roofline PDF should be generated when no kernels match"
html_files = list(Path(workload_dir).glob("empirRoof_*.html"))
assert len(html_files) == 0, (
"No roofline HTML should be generated when no kernels match"
)

test_utils.clean_output_dir(config["cleanup"], workload_dir)
Expand Down Expand Up @@ -1212,12 +1212,12 @@ def test_roofline_unsupported_datatype_error(binary_handler_profile_rocprof_comp
[
(
["--device", "0", "--roof-only", "--roofline-data-type", "FP32"],
["empirRoof_gpu-0_FP32.pdf"],
["empirRoof_gpu-0_FP32.html"],
"FP32_datatype",
),
(
["--device", "0", "--roof-only", "--roofline-data-type", "FP16"],
["empirRoof_gpu-0_FP16.pdf"],
["empirRoof_gpu-0_FP16.html"],
"FP16_datatype",
),
(
Expand Down Expand Up @@ -1605,12 +1605,12 @@ def __init__(self):
@pytest.mark.roofline_2
def test_roofline_many_kernels_dynamic_height(binary_handler_profile_rocprof_compute):
"""
Test roofline PDF generation with many kernels (10+) to verify:
Test roofline HTML generation with many kernels (10+) to verify:
- Dynamic height calculation works
- PDF is generated successfully
- HTML is generated successfully
- File size is reasonable

Note: This test uses a regular workload but validates the PDF structure
Note: This test uses a regular workload but validates the HTML structure
can handle the multi-subplot layout properly.
"""
if soc in ("MI100"):
Expand All @@ -1626,19 +1626,19 @@ def test_roofline_many_kernels_dynamic_height(binary_handler_profile_rocprof_com

assert returncode == 0, "Roofline profiling should succeed"

pdf_files = list(Path(workload_dir).glob("empirRoof_*.pdf"))
assert len(pdf_files) > 0, "At least one roofline PDF should be generated"
html_files = list(Path(workload_dir).glob("empirRoof_*.html"))
assert len(html_files) > 0, "At least one roofline HTML should be generated"

for pdf_file in pdf_files:
assert pdf_file.exists(), f"PDF file {pdf_file} should exist"
file_size = pdf_file.stat().st_size
for html_file in html_files:
assert html_file.exists(), f"HTML file {html_file} should exist"
file_size = html_file.stat().st_size

# PDF should be larger than 10KB (has content) but less than 50MB (reasonable)
# HTML should be larger than 10KB (has content) but less than 50MB (reasonable)
assert file_size > 10000, (
f"PDF {pdf_file} too small ({file_size} bytes), may be malformed"
f"HTML {html_file} too small ({file_size} bytes), may be malformed"
)
assert file_size < 50000000, (
f"PDF {pdf_file} too large ({file_size} bytes), may have issues"
f"HTML {html_file} too large ({file_size} bytes), may have issues"
)

file_dict = test_utils.check_csv_files(workload_dir, 1, num_kernels)
Expand Down
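The size check in the test above can also be run outside pytest as a quick sanity pass over a workload directory. This is a sketch that reuses the same glob pattern and byte thresholds as the test; the function name and the returned tuple shape are assumptions made for illustration.

```python
from pathlib import Path

# Thresholds taken from the test: larger than 10 KB, smaller than 50 MB.
MIN_BYTES = 10_000
MAX_BYTES = 50_000_000


def find_suspect_roofline_html(workload_dir):
    """Return (file name, size) pairs for roofline HTML files outside the bounds.

    Hypothetical helper; an empty list means every file passed the check.
    """
    suspects = []
    for html_file in sorted(Path(workload_dir).glob("empirRoof_*.html")):
        size = html_file.stat().st_size
        if not MIN_BYTES < size < MAX_BYTES:
            suspects.append((html_file.name, size))
    return suspects
```

Note that an empty result does not prove a roofline file exists, so a caller would still need the separate "at least one HTML file" check the test performs.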
4 changes: 2 additions & 2 deletions projects/rocprofiler-compute/tests/test_utils.py
Expand Up @@ -209,8 +209,8 @@ def check_csv_files(output_dir, num_devices, num_kernels):
assert len(file_dict[file].index) >= num_devices
elif "sysinfo" not in file and "ps_file" not in file:
assert len(file_dict[file].index) >= num_kernels
elif file.endswith(".pdf"):
file_dict[file] = "pdf"
elif file.endswith(".html"):
file_dict[file] = "html"
elif file.endswith(".json"):
file_dict[file] = "json"
return file_dict
Expand Down