diff --git a/projects/rocprofiler-compute/CHANGELOG.md b/projects/rocprofiler-compute/CHANGELOG.md index 4bc6e2c4384..5b4c63c36a3 100644 --- a/projects/rocprofiler-compute/CHANGELOG.md +++ b/projects/rocprofiler-compute/CHANGELOG.md @@ -27,6 +27,8 @@ Full documentation for ROCm Compute Profiler is available at [https://rocm.docs. * Option ``--rocprofiler-sdk-library-path`` has been changed to ``--rocprofiler-tool-library-path`` to better reflect the fact that we provide flexibility in choosing the path to ROCprofiler-SDK tool and not the library. +* Standalone roofline (--roof-only option) in profile mode now creates HTML file output instead of PDF file output for roofline charts + ### Resolved issues * Fixed the meaning of --dispatch option in profile mode in argparser to convey the fact that it control which iterations of the kernel to profile and not which dispatch ids to profile. diff --git a/projects/rocprofiler-compute/README.md b/projects/rocprofiler-compute/README.md index 69552116a14..01f5fb504bd 100644 --- a/projects/rocprofiler-compute/README.md +++ b/projects/rocprofiler-compute/README.md @@ -83,8 +83,6 @@ To build the binary we follow these steps: NOTE: Since RHEL 8 ships with glibc version 2.28, this standalone binary can only be run on environment with glibc version greater than 2.28. glibc version can be checked using `ldd --version` command. -NOTE: libnss3.so shared library is required when using --roof-only option which generates roofline data in PDF format - To test the standalone binary provide the `--call-binary` option to pytest. ## How to Cite diff --git a/projects/rocprofiler-compute/docs/how-to/analyze/standalone-gui.rst b/projects/rocprofiler-compute/docs/how-to/analyze/standalone-gui.rst index 7f7bf38f40f..e7cedb3fa12 100644 --- a/projects/rocprofiler-compute/docs/how-to/analyze/standalone-gui.rst +++ b/projects/rocprofiler-compute/docs/how-to/analyze/standalone-gui.rst @@ -74,8 +74,8 @@ application's profiling data: #. Memory Chart Analysis #. Empirical Roofline Analysis - Use ``--roofline-data-type`` option to specify which data type(s) you would like displayed on the roofline PDFs in the standalone analysis GUI. - Data types can be stacked- for example, "--roofline-data-type FP32 FP64 I32" would display one PDF with FP32 and FP64 stacked, and one PDF with INT32. + Use ``--roofline-data-type`` option to specify which data type(s) you would like displayed on the roofline HTMLs in the standalone analysis GUI. + Data types can be stacked- for example, "--roofline-data-type FP32 FP64 I32" would display one HTML with FP32 and FP64 stacked, and one HTML with INT32. Default roofline data type plotted is FP32. #. Top Stats (Top Kernel Statistics) diff --git a/projects/rocprofiler-compute/docs/how-to/profile/mode.rst b/projects/rocprofiler-compute/docs/how-to/profile/mode.rst index 95cf6a1b6de..230da34b67c 100644 --- a/projects/rocprofiler-compute/docs/how-to/profile/mode.rst +++ b/projects/rocprofiler-compute/docs/how-to/profile/mode.rst @@ -208,8 +208,8 @@ an Instinct MI210 vs an Instinct MI250. Additionally, you will notice a few extra files. An SoC parameters file, ``sysinfo.csv``, is created to reflect the target device settings. All profiling output is stored in ``log.txt``. Roofline-specific benchmark - results are stored in ``roofline.csv`` and roofline plots are outputted into PDFs as - ``empirRoof_gpu-0_[datatype1]_..._[datatypeN].pdf`` where data types requested through + results are stored in ``roofline.csv`` and roofline plots are outputted into HTMLs as + ``empirRoof_gpu-0_[datatype1]_..._[datatypeN].html`` where data types requested through ``--roofline-data-type`` option are listed in the file name. .. code-block:: shell-session @@ -556,7 +556,7 @@ Roofline analysis occurs on any profile mode run, provided ``--no-roof`` option You don't need to include any additional roofline-specific options for roofline analysis. If you want to focus only on roofline-specific performance data and reduce the time it takes to profile, you can use the ``--roof-only`` option. This option checks if there is existing profiling data in the workload directory (``pmc_perf.csv`` and ``roofline.csv``): - a) If found, uses the data files with the provided arguments to create another roofline PDF output; otherwise, + a) If found, uses the data files with the provided arguments to create another roofline HTML output; otherwise, b) Profile mode runs but is limited to collecting only roofline performance counters. Note that ``--roof-only`` cannot be used with ``--block`` or ``--set`` options. @@ -580,13 +580,13 @@ Roofline options utility. See :ref:`profiling-kernel-filtering`. ``--roofline-data-type `` - Allows you to specify data types that you want plotted in the roofline PDF output(s). Selecting more than one data type will overlay the results onto the same plot. Default: FP32 + Allows you to specify data types that you want plotted in the roofline HTML output(s). Selecting more than one data type will overlay the results onto the same plot. Default: FP32 .. note:: For more information on data types supported based on the GPU architecture, see :doc:`../../conceptual/performance-model` -Each kernel in your ``.pdf`` roofline plot is automatically distinguished with a unique marker identifiable from the plot's key. The roofline PDF includes an integrated multi-subplot layout with: +Each kernel in your ``.html`` roofline plot is automatically distinguished with a unique marker identifiable from the plot's key. The roofline HTML includes an integrated multi-subplot layout with: 1. **Roofline Plot** - Shows performance ceilings and kernel arithmetic intensity points 2. **Plot Points & Values Table** - Displays AI values, performance metrics, memory/compute bound status, and cache levels for each kernel @@ -633,14 +633,14 @@ The following example demonstrates profiling roofline data only: GPU Device 0 (gfx942) with 304 CUs: Profiling... 99% [||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| ] ... -An inspection of our workload output folder shows ``.pdf`` plots were generated +An inspection of our workload output folder shows ``.html`` plots were generated successfully. .. code-block:: shell-session $ ls workloads/occupancy/MI300X_A1 total 48 - -rw-r--r-- 1 auser agroup 13331 Oct 29 10:33 empirRoof_gpu-0_FP32.pdf + -rw-r--r-- 1 auser agroup 13331 Oct 29 10:33 empirRoof_gpu-0_FP32.html drwxr-xr-x 1 auser agroup 0 Oct 29 10:33 perfmon -rw-r--r-- 1 auser agroup 1101 Oct 29 10:33 pmc_perf.csv -rw-r--r-- 1 auser agroup 1715 Oct 29 10:33 roofline.csv @@ -649,9 +649,9 @@ successfully. .. note:: - * ROCm Compute Profiler currently captures roofline profiling for all data types, and you can reduce the clutter in the PDF outputs by filtering the data type(s). Selecting multiple data types will overlay the results into the same PDF. To generate results in separate PDFs for each data type from the same workload run, you can re-run the profiling command with each data type as long as the ``roofline.csv`` file still exists in the workload folder. + * ROCm Compute Profiler currently captures roofline profiling for all data types, and you can reduce the clutter in the HTML outputs by filtering the data type(s). Selecting multiple data types will overlay the results into the same HTML. To generate results in separate HTML for each data type from the same workload run, you can re-run the profiling command with each data type as long as the ``roofline.csv`` file still exists in the workload folder. -The following image is a sample ``empirRoof_gpu-0_FP32.pdf`` roofline +The following image is a sample ``empirRoof_gpu-0_FP32.html`` roofline plot. .. image:: ../../data/profile/sample-roof-plot.jpg diff --git a/projects/rocprofiler-compute/src/argparser.py b/projects/rocprofiler-compute/src/argparser.py index 3b89f87dbff..df56c69b197 100644 --- a/projects/rocprofiler-compute/src/argparser.py +++ b/projects/rocprofiler-compute/src/argparser.py @@ -503,7 +503,7 @@ def omniarg_parser( type=str, default=["FP32"], help=( - "\t\t\tChoose datatypes to view roofline PDFs for: (DEFAULT: FP32)\n" + "\t\t\tChoose datatypes to view roofline HTMLs for: (DEFAULT: FP32)\n" "\t\t\t FP4\n" "\t\t\t FP6\n" "\t\t\t FP8\n" @@ -694,7 +694,7 @@ def omniarg_parser( type=str, default=["FP32"], help=( - "\t\tChoose datatypes to view roofline PDFs for: (DEFAULT: FP32)\n" + "\t\tChoose datatypes to view roofline HTMLs for: (DEFAULT: FP32)\n" "\t\t\t FP4\n" "\t\t\t FP6\n" "\t\t\t FP8\n" diff --git a/projects/rocprofiler-compute/src/roofline.py b/projects/rocprofiler-compute/src/roofline.py index 749aa9caf31..15fec9da7ce 100644 --- a/projects/rocprofiler-compute/src/roofline.py +++ b/projects/rocprofiler-compute/src/roofline.py @@ -376,7 +376,7 @@ def empirical_roofline( all_flops_ceiling_data[str(dt)] = self.__ceiling_data # Output will be different depending on interaction type: - # Save PDFs if we're in "standalone roofline" mode, + # Save HTMLs if we're in "standalone roofline" mode, # otherwise return HTML to be used in GUI outputif flops_figure: if self.__run_parameters["is_standalone"]: @@ -386,28 +386,16 @@ def empirical_roofline( kernel_list += "_" + name if ops_figure: - actual_height = int(ops_figure.layout.height) - # minimum height of 1000 to avoid cutting off content - pdf_height = max(actual_height, 1000) - - ops_figure.write_image( - f"{self.__run_parameters['workload_dir']}/empirRoof_gpu-{dev_id}{ops_dt_list}{kernel_list}.pdf", - width=1000, - height=pdf_height, + ops_figure.write_html( + f"{self.__run_parameters['workload_dir']}/empirRoof_gpu-{dev_id}{ops_dt_list}{kernel_list}.html" ) if flops_figure: - actual_height = int(flops_figure.layout.height) - # minimum height of 1000 to avoid cutting off content - pdf_height = max(actual_height, 1000) - - flops_figure.write_image( - f"{self.__run_parameters['workload_dir']}/empirRoof_gpu-{dev_id}{flops_dt_list}{kernel_list}.pdf", - width=1000, - height=pdf_height, + flops_figure.write_html( + f"{self.__run_parameters['workload_dir']}/empirRoof_gpu-{dev_id}{flops_dt_list}{kernel_list}.html" ) - console_log("roofline", "Empirical Roofline PDFs saved!") + console_log("roofline", "Empirical Roofline HTML file saved!") else: # Create HTML output for GUI mode. ops_graph = ( diff --git a/projects/rocprofiler-compute/src/utils/roofline_calc.py b/projects/rocprofiler-compute/src/utils/roofline_calc.py index 851d62a4d0e..14229e5cf9e 100644 --- a/projects/rocprofiler-compute/src/utils/roofline_calc.py +++ b/projects/rocprofiler-compute/src/utils/roofline_calc.py @@ -471,7 +471,7 @@ def calc_ai_profile( """Given counter data, calculate arithmetic intensity for each kernel in the application. Leverage hard-coded equations to calculate AI values. - Used during profiling stage to generate roofline PDF, since Roofline yamls + Used during profiling stage to generate roofline HTML, since Roofline yamls are not available in the profiling stage.""" console_debug( diff --git a/projects/rocprofiler-compute/tests/test_profile_general.py b/projects/rocprofiler-compute/tests/test_profile_general.py index ff95607af34..4b685e6ddec 100644 --- a/projects/rocprofiler-compute/tests/test_profile_general.py +++ b/projects/rocprofiler-compute/tests/test_profile_general.py @@ -89,7 +89,7 @@ ]) ROOF_ONLY_FILES = sorted([ - "empirRoof_gpu-0_FP32.pdf", + "empirRoof_gpu-0_FP32.html", "pmc_perf.csv", "roofline.csv", "sysinfo.csv", @@ -813,9 +813,9 @@ def test_path_csv( @pytest.mark.roofline_1 def test_roof_basic_validation(binary_handler_profile_rocprof_compute): """ - Test basic roofline PDF generation with full validation pipeline. + Test basic roofline HTML generation with full validation pipeline. This test runs the complete validation flow including counter logging - and metric comparison (if enabled in config). Validates that roofline PDFs + and metric comparison (if enabled in config). Validates that roofline HTMLs are generated with the integrated multi-subplot layout (roofline plot + plot points table + kernel names table). """ @@ -1134,9 +1134,9 @@ def test_roofline_empty_kernel_names_handling(binary_handler_profile_rocprof_com assert returncode == 1, f"Expected error (returncode=1), got {returncode}" - pdf_files = list(Path(workload_dir).glob("empirRoof_*.pdf")) - assert len(pdf_files) == 0, ( - "No roofline PDF should be generated when no kernels match" + html_files = list(Path(workload_dir).glob("empirRoof_*.html")) + assert len(html_files) == 0, ( + "No roofline HTML should be generated when no kernels match" ) test_utils.clean_output_dir(config["cleanup"], workload_dir) @@ -1212,12 +1212,12 @@ def test_roofline_unsupported_datatype_error(binary_handler_profile_rocprof_comp [ ( ["--device", "0", "--roof-only", "--roofline-data-type", "FP32"], - ["empirRoof_gpu-0_FP32.pdf"], + ["empirRoof_gpu-0_FP32.html"], "FP32_datatype", ), ( ["--device", "0", "--roof-only", "--roofline-data-type", "FP16"], - ["empirRoof_gpu-0_FP16.pdf"], + ["empirRoof_gpu-0_FP16.html"], "FP16_datatype", ), ( @@ -1605,12 +1605,12 @@ def __init__(self): @pytest.mark.roofline_2 def test_roofline_many_kernels_dynamic_height(binary_handler_profile_rocprof_compute): """ - Test roofline PDF generation with many kernels (10+) to verify: + Test roofline HTML generation with many kernels (10+) to verify: - Dynamic height calculation works - - PDF is generated successfully + - HTML is generated successfully - File size is reasonable - Note: This test uses a regular workload but validates the PDF structure + Note: This test uses a regular workload but validates the HTML structure can handle the multi-subplot layout properly. """ if soc in ("MI100"): @@ -1626,19 +1626,19 @@ def test_roofline_many_kernels_dynamic_height(binary_handler_profile_rocprof_com assert returncode == 0, "Roofline profiling should succeed" - pdf_files = list(Path(workload_dir).glob("empirRoof_*.pdf")) - assert len(pdf_files) > 0, "At least one roofline PDF should be generated" + html_files = list(Path(workload_dir).glob("empirRoof_*.html")) + assert len(html_files) > 0, "At least one roofline HTML should be generated" - for pdf_file in pdf_files: - assert pdf_file.exists(), f"PDF file {pdf_file} should exist" - file_size = pdf_file.stat().st_size + for html_file in html_files: + assert html_file.exists(), f"HTML file {html_file} should exist" + file_size = html_file.stat().st_size - # PDF should be larger than 10KB (has content) but less than 50MB (reasonable) + # HTML should be larger than 10KB (has content) but less than 50MB (reasonable) assert file_size > 10000, ( - f"PDF {pdf_file} too small ({file_size} bytes), may be malformed" + f"HTML {html_file} too small ({file_size} bytes), may be malformed" ) assert file_size < 50000000, ( - f"PDF {pdf_file} too large ({file_size} bytes), may have issues" + f"HTML {html_file} too large ({file_size} bytes), may have issues" ) file_dict = test_utils.check_csv_files(workload_dir, 1, num_kernels) diff --git a/projects/rocprofiler-compute/tests/test_utils.py b/projects/rocprofiler-compute/tests/test_utils.py index ef44e08b5cc..9ce9c98e46d 100644 --- a/projects/rocprofiler-compute/tests/test_utils.py +++ b/projects/rocprofiler-compute/tests/test_utils.py @@ -209,8 +209,8 @@ def check_csv_files(output_dir, num_devices, num_kernels): assert len(file_dict[file].index) >= num_devices elif "sysinfo" not in file and "ps_file" not in file: assert len(file_dict[file].index) >= num_kernels - elif file.endswith(".pdf"): - file_dict[file] = "pdf" + elif file.endswith(".html"): + file_dict[file] = "html" elif file.endswith(".json"): file_dict[file] = "json" return file_dict