./tools/profiler/cutlass_profiler --dist=uniform,min:-2.3,max:2.3,scale:-1 --kernels=cutlass3x_sm90_tensorop_s64x128x16gemm_bf16_bf16_f32_bf16_bf16_128x128x64_4x1x1_0_tnn_align8_stream_k_warpspecialized_cooperative_epi_nosmem --m=8 --n=8192 --k=8192 --verification-enabled=false

=============================
Problem ID: 1
Provider: CUTLASS
OperationKind: gemm
Operation: cutlass3x_sm90_tensorop_s64x128x16gemm_bf16_bf16_f32_bf16_bf16_128x128x64_4x1x1_0_tnn_align8_stream_k_warpspecialized_cooperative_epi_nosmem
Status: Success
Verification: OFF
Disposition: Failed
Arguments: --gemm_kind=universal --m=8 --n=8192 --k=8192 --A=bf16:row --B=bf16:column --C=bf16:column --D=bf16:column \
--alpha=1 --beta=0 --split_k_mode=serial --split_k_slices=1 --batch_count=1 --raster_order=heuristic \
--runtime_input_datatype_a=invalid --runtime_input_datatype_b=invalid --use_pdl=false --enable_sm90_mixed_dtype_shuffle_test=false \
--swizzle_size=1 --op_class=tensorop --accum=f32 --cta_m=128 --cta_n=128 --cta_k=64 --cluster_m=1 --cluster_n=1 \
--cluster_k=1 --cluster_m_fallback=0 --cluster_n_fallback=0 --cluster_k_fallback=0 --stages=7 --warps_m=4 \
--warps_n=2 --warps_k=1 --inst_m=64 --inst_n=128 --inst_k=16 --min_cc=90 --max_cc=90
Bytes: 134479872 bytes
FLOPs: 1073872896 flops
FLOPs/Byte: 7
./tools/profiler/cutlass_profiler --dist=uniform,min:-2.3,max:2.3,scale:-1 --kernels=cutlass3x_sm90_tensorop_s64x128x16gemm_bf16_bf16_f32_bf16_bf16_128x128x64_4x1x1_0_tnn_align8_stream_k_warpspecialized_cooperative_epi_nosmem --m=8 --n=8192 --k=128 --verification-enabled=false

=============================
Problem ID: 1
Provider: CUTLASS
OperationKind: gemm
Operation: cutlass3x_sm90_tensorop_s64x128x16gemm_bf16_bf16_f32_bf16_bf16_128x128x64_4x1x1_0_tnn_align8_stream_k_warpspecialized_cooperative_epi_nosmem
Status: Success
Verification: OFF
Disposition: Not verified
Arguments: --gemm_kind=universal --m=8 --n=8192 --k=128 --A=bf16:row --B=bf16:column --C=bf16:column --D=bf16:column \
--alpha=1 --beta=0 --split_k_mode=serial --split_k_slices=1 --batch_count=1 --raster_order=heuristic \
--runtime_input_datatype_a=invalid --runtime_input_datatype_b=invalid --use_pdl=false --enable_sm90_mixed_dtype_shuffle_test=false \
--swizzle_size=1 --op_class=tensorop --accum=f32 --cta_m=128 --cta_n=128 --cta_k=64 --cluster_m=1 --cluster_n=1 \
--cluster_k=1 --cluster_m_fallback=0 --cluster_n_fallback=0 --cluster_k_fallback=0 --stages=7 --warps_m=4 \
--warps_n=2 --warps_k=1 --inst_m=64 --inst_n=128 --inst_k=16 --min_cc=90 --max_cc=90
Bytes: 2230272 bytes
FLOPs: 16908288 flops
FLOPs/Byte: 7
Runtime: 0.0130992 ms
Memory: 158.567 GiB/s
Math: 1290.79 GFLOP/s
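As a sanity check on the two reports above, the Bytes, FLOPs, Math, and Memory figures can be reproduced with simple arithmetic. The formulas below are an assumption inferred from the reported numbers, not taken from the profiler source: FLOPs as `2*m*n*k + 2*m*n`, and bytes as one pass over A, B, and a single m-by-n output, with 2-byte bf16 elements. The helper names are hypothetical.

```python
BF16_BYTES = 2  # bf16 elements are 2 bytes wide


def gemm_stats(m, n, k, elem_bytes=BF16_BYTES):
    """Return (flops, bytes) for one GEMM under the assumed formulas."""
    # Assumption: 2*m*n*k for the multiply-adds plus 2*m*n for the epilogue.
    flops = 2 * m * n * k + 2 * m * n
    # Assumption: A (m x k), B (k x n), and one m x n output tensor are counted.
    bytes_moved = elem_bytes * (m * k + k * n + m * n)
    return flops, bytes_moved


def throughput(flops, bytes_moved, runtime_ms):
    """Return (GFLOP/s, GiB/s) for a measured runtime in milliseconds."""
    seconds = runtime_ms * 1e-3
    return flops / seconds / 1e9, bytes_moved / seconds / 2**30


if __name__ == "__main__":
    for k in (8192, 128):
        flops, nbytes = gemm_stats(8, 8192, k)
        print(f"k={k}: FLOPs={flops} Bytes={nbytes} FLOPs/Byte={flops // nbytes}")

    # Only the k=128 run reported a runtime.
    gflops, gibs = throughput(*gemm_stats(8, 8192, 128), runtime_ms=0.0130992)
    print(f"Math: {gflops:.2f} GFLOP/s  Memory: {gibs:.3f} GiB/s")
```

Under these assumptions both shapes match the reported Bytes and FLOPs exactly, so the problem sizes in the two runs are what the command lines say they are; only the k=8192 run lacks Runtime, Memory, and Math lines.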
Problem ID: 1
Provider: CUTLASS
OperationKind: gemm
Operation: cutlass3x_sm90_tensorop_s64x128x16gemm_bf16_bf16_f32_bf16_bf16_128x128x64_4x1x1_0_tnn_align8_stream_k_warpspecialized_cooperative_epi_tma
Status: Success
Verification: OFF
Disposition: Failed
Another one that failed. The common pattern among these is cluster=4x1x1, stream_k, and ws (warp-specialized).
Problem ID: 1
Provider: CUTLASS
OperationKind: gemm
Operation: cutlass3x_sm90_tensorop_s64x128x16gemm_bf16_bf16_f32_bf16_bf16_128x128x64_4x2x1_0_tnn_align8_stream_k_warpspecialized_cooperative_epi_nosmem
Status: Success
Verification: OFF
Disposition: Failed
Problem ID: 1
Provider: CUTLASS
OperationKind: gemm
Operation: cutlass3x_sm90_tensorop_s64x128x16gemm_bf16_bf16_f32_bf16_bf16_128x128x64_4x4x1_0_tnn_align8_stream_k_warpspecialized_cooperative_epi_nosmem
Status: Error: internal
Verification: OFF
Disposition: Failed
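To make the shared traits of the failing kernels easier to compare, here is a rough sketch that pulls the tile shape, cluster shape, and scheduler/epilogue tags out of the kernel names quoted in this thread. The field layout assumed by the regular expression (`<tile>_<cluster>_<stages>_<layout>_align<N>_<suffix>`) is inferred from these four names only, and `describe` is a hypothetical helper, not part of CUTLASS.

```python
import re

# Assumed layout, inferred from the names in this issue:
#   ..._<tileMxNxK>_<clusterMxNxK>_<stages>_<layout>_align<N>_<suffix>
KERNEL_RE = re.compile(
    r"_(\d+x\d+x\d+)_(\d+x\d+x\d+)_\d+_[a-z]{3}_align\d+_(.*)$"
)

FAILING = [
    "cutlass3x_sm90_tensorop_s64x128x16gemm_bf16_bf16_f32_bf16_bf16_128x128x64_4x1x1_0_tnn_align8_stream_k_warpspecialized_cooperative_epi_nosmem",
    "cutlass3x_sm90_tensorop_s64x128x16gemm_bf16_bf16_f32_bf16_bf16_128x128x64_4x1x1_0_tnn_align8_stream_k_warpspecialized_cooperative_epi_tma",
    "cutlass3x_sm90_tensorop_s64x128x16gemm_bf16_bf16_f32_bf16_bf16_128x128x64_4x2x1_0_tnn_align8_stream_k_warpspecialized_cooperative_epi_nosmem",
    "cutlass3x_sm90_tensorop_s64x128x16gemm_bf16_bf16_f32_bf16_bf16_128x128x64_4x4x1_0_tnn_align8_stream_k_warpspecialized_cooperative_epi_nosmem",
]


def describe(kernel):
    """Split a kernel name into the fields relevant to this report."""
    match = KERNEL_RE.search(kernel)
    if match is None:
        raise ValueError(f"unrecognized kernel name: {kernel}")
    tile, cluster, suffix = match.groups()
    return {
        "tile": tile,
        "cluster": cluster,
        "stream_k": "stream_k" in suffix,
        "warpspecialized": "warpspecialized" in suffix,
        "epilogue": suffix.rsplit("epi_", 1)[-1],
    }


if __name__ == "__main__":
    for name in FAILING:
        print(describe(name))
```

Across the four names, the tile shape, `stream_k`, and `warpspecialized` tags are identical, and every cluster shape has 4 in its first dimension; the second cluster dimension and the epilogue vary.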
@jackkosaian
GEMM Problem Shape --m=8 --n=8192 --k=8192 Does NOT Work
GEMM Problem Shape --m=8 --n=8192 --k=128 Works
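Since k=128 works and k=8192 does not, one way to narrow down the failure is to sweep k between the two values. The sketch below only builds the command lines, reusing the flags from the report, and extracts the `Disposition:` field from profiler output. It assumes the profiler prints one field per line; the power-of-two sweep and the helper names are hypothetical choices, not something the report prescribes.

```python
import re

PROFILER = "./tools/profiler/cutlass_profiler"
KERNEL = (
    "cutlass3x_sm90_tensorop_s64x128x16gemm_bf16_bf16_f32_bf16_bf16_"
    "128x128x64_4x1x1_0_tnn_align8_stream_k_warpspecialized_cooperative_epi_nosmem"
)


def build_command(k, m=8, n=8192, kernel=KERNEL):
    """Build the argv list for one profiler run, using the flags from the report."""
    return [
        PROFILER,
        "--dist=uniform,min:-2.3,max:2.3,scale:-1",
        f"--kernels={kernel}",
        f"--m={m}",
        f"--n={n}",
        f"--k={k}",
        "--verification-enabled=false",
    ]


def parse_disposition(output):
    """Return the text after 'Disposition:' or None if the field is absent."""
    match = re.search(r"^\s*Disposition:\s*(.+?)\s*$", output, re.MULTILINE)
    return match.group(1) if match else None


def k_sweep(lo=128, hi=8192):
    """Yield k values doubling from lo up to and including hi."""
    k = lo
    while k <= hi:
        yield k
        k *= 2


if __name__ == "__main__":
    for k in k_sweep():
        print(" ".join(build_command(k)))
```

Each argv list can be passed to `subprocess.run(cmd, capture_output=True, text=True)` on a machine with the profiler built, and `parse_disposition(result.stdout)` then gives the smallest k at which the disposition changes from `Not verified` to `Failed`.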