Add pass to convert atomic rmw to non-atomic ops when legal #1782
Merged
EnzymeJAX Benchmarks
| Benchmark suite | Current: 16b7be2 | Previous: 0d06322 | Ratio |
|---|---|---|---|
| actmtch / JaXPipe / cuda / Primal | 0.000002016 s | 0.000002016 s | 1 |
| actmtch / Jax / cuda / Primal | 0.000002016 s | 0.000002016 s | 1 |
| actmtch / HLOOpt / cuda / Primal | 0.000002016 s | 0.000002016 s | 1 |
| actmtch / PartOpt / cuda / Primal | 0.000002016 s | 0.000002016 s | 1 |
| actmtch / IPartOpt / cuda / Primal | 0.000002015 s | 0.000002016 s | 1.00 |
| actmtch / DefOpt / cuda / Primal | 0.000002015 s | 0.000002016 s | 1.00 |
| actmtch / IDefOpt / cuda / Primal | 0.000002015 s | 0.000002015 s | 1 |
| actmtch / JaXPipe / cuda / Forward | 0.00001008 s | 0.000010944 s | 0.92 |
| actmtch / Jax / cuda / Forward | 0.000010176 s | 0.000010464 s | 0.97 |
| actmtch / HLOOpt / cuda / Forward | 0.00001024 s | 0.000011552 s | 0.89 |
| actmtch / PartOpt / cuda / Forward | 0.000010016 s | 0.000010912 s | 0.92 |
| actmtch / IPartOpt / cuda / Forward | 0.000010304 s | 0.000011520000000000002 s | 0.89 |
| actmtch / DefOpt / cuda / Forward | 0.000010335 s | 0.000011424 s | 0.90 |
| actmtch / IDefOpt / cuda / Forward | 0.000010144 s | 0.00001184 s | 0.86 |
| actmtch / JaXPipe / cuda / PreRev | 0.000009824 s | 0.000010624 s | 0.92 |
| actmtch / JaXPipe / cuda / PostRev | 0.000010463 s | 0.000010336 s | 1.01 |
| actmtch / JaXPipe / cuda / BothRev | 0.0000104 s | 0.0000104 s | 1 |
| actmtch / Jax / cuda / BothRev | 0.000010016 s | 0.000010688 s | 0.94 |
| actmtch / HLOOpt / cuda / PreRev | 0.000010336 s | 0.000010336 s | 1 |
| actmtch / HLOOpt / cuda / PostRev | 0.000010207 s | 0.000010208 s | 1.00 |
| actmtch / HLOOpt / cuda / BothRev | 0.000010016 s | 0.000010528 s | 0.95 |
| actmtch / PartOpt / cuda / PreRev | 0.000010433 s | 0.00001024 s | 1.02 |
| actmtch / PartOpt / cuda / PostRev | 0.00001024 s | 0.000010847 s | 0.94 |
| actmtch / PartOpt / cuda / BothRev | 0.0000104 s | 0.000010496 s | 0.99 |
| actmtch / IPartOpt / cuda / PreRev | 0.000010336 s | 0.000010592 s | 0.98 |
| actmtch / IPartOpt / cuda / PostRev | 0.0000104 s | 0.000010464 s | 0.99 |
| actmtch / IPartOpt / cuda / BothRev | 0.0000104 s | 0.000010591 s | 0.98 |
| actmtch / DefOpt / cuda / PreRev | 0.000010527 s | 0.000010689 s | 0.98 |
| actmtch / DefOpt / cuda / PostRev | 0.000010144 s | 0.000010304 s | 0.98 |
| actmtch / DefOpt / cuda / BothRev | 0.000010112 s | 0.000010208 s | 0.99 |
| actmtch / IDefOpt / cuda / PreRev | 0.000010496 s | 0.000010528 s | 1.00 |
| actmtch / IDefOpt / cuda / PostRev | 0.000010208 s | 0.000010208 s | 1 |
| actmtch / IDefOpt / cuda / BothRev | 0.00001024 s | 0.000010272 s | 1.00 |
| actmtch / JaXPipe / tpu / Primal | 5.63325e-7 s | 5.631e-7 s | 1.00 |
| actmtch / Jax / tpu / Primal | 5.96775e-7 s | 5.96675e-7 s | 1.00 |
| actmtch / HLOOpt / tpu / Primal | 0.00000209765 s | 0.000002098725 s | 1.00 |
| actmtch / PartOpt / tpu / Primal | 5.96475e-7 s | 5.9665e-7 s | 1.00 |
| actmtch / IPartOpt / tpu / Primal | 5.52525e-7 s | 5.52425e-7 s | 1.00 |
| actmtch / DefOpt / tpu / Primal | 0.000002154225 s | 0.000002163725 s | 1.00 |
| actmtch / IDefOpt / tpu / Primal | 0.0000021087 s | 0.00000210695 s | 1.00 |
| actmtch / JaXPipe / tpu / Forward | 0.000003817525 s | 0.000003815725 s | 1.00 |
| actmtch / Jax / tpu / Forward | 0.000001209375 s | 0.0000012058000000000002 s | 1.00 |
| actmtch / HLOOpt / tpu / Forward | 0.0000039481500000000005 s | 0.0000039318 s | 1.00 |
| actmtch / PartOpt / tpu / Forward | 0.000003911225 s | 0.000003918225 s | 1.00 |
| actmtch / IPartOpt / tpu / Forward | 0.00000395565 s | 0.000003930175 s | 1.01 |
| actmtch / DefOpt / tpu / Forward | 0.0000039117 s | 0.000003920199999999999 s | 1.00 |
| actmtch / IDefOpt / tpu / Forward | 0.0000039437 s | 0.000003930175 s | 1.00 |
| actmtch / JaXPipe / tpu / PreRev | 0.00000347665 s | 0.0000034626750000000003 s | 1.00 |
| actmtch / JaXPipe / tpu / PostRev | 0.00000163625 s | 0.000001652625 s | 0.99 |
| actmtch / JaXPipe / tpu / BothRev | 0.00000347965 s | 0.0000034682 s | 1.00 |
| actmtch / Jax / tpu / BothRev | 0.00000163515 s | 0.0000016323 s | 1.00 |
| actmtch / HLOOpt / tpu / PreRev | 0.0000034804 s | 0.000003474925 s | 1.00 |
| actmtch / HLOOpt / tpu / PostRev | 0.0000034102 s | 0.00000340775 s | 1.00 |
| actmtch / HLOOpt / tpu / BothRev | 0.0000034808 s | 0.000003469675 s | 1.00 |
| actmtch / PartOpt / tpu / PreRev | 0.000003424 s | 0.0000034048 s | 1.01 |
| actmtch / PartOpt / tpu / PostRev | 0.000001593625 s | 0.0000016085 s | 0.99 |
| actmtch / PartOpt / tpu / BothRev | 0.000003418675 s | 0.0000034112 s | 1.00 |
| actmtch / IPartOpt / tpu / PreRev | 0.000003470125 s | 0.0000034773 s | 1.00 |
| actmtch / IPartOpt / tpu / PostRev | 0.0000016353750000000002 s | 0.000001637625 s | 1.00 |
| actmtch / IPartOpt / tpu / BothRev | 0.000003476225 s | 0.0000034829500000000003 s | 1.00 |
| actmtch / DefOpt / tpu / PreRev | 0.0000034158 s | 0.00000341875 s | 1.00 |
| actmtch / DefOpt / tpu / PostRev | 0.00000341495 s | 0.00000341295 s | 1.00 |
| actmtch / DefOpt / tpu / BothRev | 0.0000034144000000000003 s | 0.00000342575 s | 1.00 |
| actmtch / IDefOpt / tpu / PreRev | 0.00000348695 s | 0.0000034794750000000003 s | 1.00 |
| actmtch / IDefOpt / tpu / PostRev | 0.0000034130500000000003 s | 0.0000034012 s | 1.00 |
| actmtch / IDefOpt / tpu / BothRev | 0.000003473625 s | 0.000003471725 s | 1.00 |
| actmtch / JaXPipe / cpu / Primal | 0.000013372 s | 0.00000702896004440845 s | 1.90 |
| actmtch / Jax / cpu / Primal | 0.000013422 s | 0.000007090619947121013 s | 1.89 |
| actmtch / HLOOpt / cpu / Primal | 0.000013898 s | 0.00000910397997358814 s | 1.53 |
| actmtch / PartOpt / cpu / Primal | 0.000013384 s | 0.000007699900033912854 s | 1.74 |
| actmtch / IPartOpt / cpu / Primal | 0.000012885 s | 0.000008126540051307529 s | 1.59 |
| actmtch / DefOpt / cpu / Primal | 0.000013938 s | 0.000008300439876620658 s | 1.68 |
| actmtch / IDefOpt / cpu / Primal | 0.000014018 s | 0.000008272720024251611 s | 1.69 |
| actmtch / JaXPipe / cpu / Forward | 0.000019152 s | 0.000012635160037461902 s | 1.52 |
| actmtch / Jax / cpu / Forward | 0.000018323 s | 0.000010485599959793037 s | 1.75 |
| actmtch / HLOOpt / cpu / Forward | 0.00001908 s | 0.000012656719991355204 s | 1.51 |
| actmtch / PartOpt / cpu / Forward | 0.000019069 s | 0.000011976340065302791 s | 1.59 |
| actmtch / IPartOpt / cpu / Forward | 0.000018627 s | 0.000012473439983295976 s | 1.49 |
| actmtch / DefOpt / cpu / Forward | 0.000018997 s | 0.000012138799938838929 s | 1.56 |
| actmtch / IDefOpt / cpu / Forward | 0.000019386 s | 0.000011991440060228342 s | 1.62 |
| actmtch / JaXPipe / cpu / PreRev | 0.00001949 s | 0.000012268239970580908 s | 1.59 |
| actmtch / JaXPipe / cpu / PostRev | 0.000017723 s | 0.000010976959965773856 s | 1.61 |
| actmtch / JaXPipe / cpu / BothRev | 0.00001896 s | 0.000012177059979876504 s | 1.56 |
| actmtch / Jax / cpu / BothRev | 0.000017857000000000002 s | 0.00001028339993354166 s | 1.74 |
| actmtch / HLOOpt / cpu / PreRev | 0.000019524 s | 0.00001290521991904825 s | 1.51 |
| actmtch / HLOOpt / cpu / PostRev | 0.000019221 s | 0.000014558360035152874 s | 1.32 |
| actmtch / HLOOpt / cpu / BothRev | 0.000019206 s | 0.000011904499988304451 s | 1.61 |
| actmtch / PartOpt / cpu / PreRev | 0.00001939 s | 0.000012542979893623851 s | 1.55 |
| actmtch / PartOpt / cpu / PostRev | 0.000017353 s | 0.000011923499969270778 s | 1.46 |
| actmtch / PartOpt / cpu / BothRev | 0.000019404000000000003 s | 0.000013031420112383784 s | 1.49 |
| actmtch / IPartOpt / cpu / PreRev | 0.000019296 s | 0.000012351440091151744 s | 1.56 |
| actmtch / IPartOpt / cpu / PostRev | 0.000017395000000000002 s | 0.000010943919951387216 s | 1.59 |
| actmtch / IPartOpt / cpu / BothRev | 0.000019646 s | 0.000012477419968490722 s | 1.57 |
| actmtch / DefOpt / cpu / PreRev | 0.000019149 s | 0.000012202039979456458 s | 1.57 |
| actmtch / DefOpt / cpu / PostRev | 0.000019233 s | 0.000012676960086537292 s | 1.52 |
| actmtch / DefOpt / cpu / BothRev | 0.000019185 s | 0.00001223014001880074 s | 1.57 |
| actmtch / IDefOpt / cpu / PreRev | 0.000019215 s | 0.00001243941997017828 s | 1.54 |
| actmtch / IDefOpt / cpu / PostRev | 0.000018949 s | 0.000012198299955343828 s | 1.55 |
| actmtch / IDefOpt / cpu / BothRev | 0.000019742 s | 0.000011905400006071432 s | 1.66 |
| add_one / JaXPipe / cuda / Primal | 0.0000019200000000000003 s | 0.0000019200000000000003 s | 1 |
| add_one / Jax / cuda / Primal | 0.0000019200000000000003 s | 0.0000019200000000000003 s | 1 |
| add_one / HLOOpt / cuda / Primal | 0.0000019200000000000003 s | 0.0000019200000000000003 s | 1 |
| add_one / PartOpt / cuda / Primal | 0.0000019200000000000003 s | 0.0000019200000000000003 s | 1 |
| add_one / IPartOpt / cuda / Primal | 0.0000019200000000000003 s | 0.0000019200000000000003 s | 1 |
| add_one / DefOpt / cuda / Primal | 0.0000019200000000000003 s | 0.0000019200000000000003 s | 1 |
| add_one / IDefOpt / cuda / Primal | 0.0000019200000000000003 s | 0.0000019200000000000003 s | 1 |
| add_one / JaXPipe / cuda / Forward | 0.000010304 s | 0.000010272 s | 1.00 |
| add_one / Jax / cuda / Forward | 0.000010432 s | 0.000010272 s | 1.02 |
| add_one / HLOOpt / cuda / Forward | 0.000010336 s | 0.000010336 s | 1 |
| add_one / PartOpt / cuda / Forward | 0.000010271 s | 0.000010112 s | 1.02 |
| add_one / IPartOpt / cuda / Forward | 0.00001024 s | 0.000010272 s | 1.00 |
| add_one / DefOpt / cuda / Forward | 0.000009984 s | 0.000010304 s | 0.97 |
| add_one / IDefOpt / cuda / Forward | 0.00001008 s | 0.000010016 s | 1.01 |
| add_one / JaXPipe / cuda / PreRev | 0.000025888 s | 0.000025568 s | 1.01 |
| add_one / JaXPipe / cuda / PostRev | 0.000025536 s | 0.000025023 s | 1.02 |
| add_one / JaXPipe / cuda / BothRev | 0.000025696 s | 0.000025056 s | 1.03 |
| add_one / Jax / cuda / BothRev | 0.000025888 s | 0.000025312 s | 1.02 |
| add_one / HLOOpt / cuda / PreRev | 0.00002576 s | 0.000025216 s | 1.02 |
| add_one / HLOOpt / cuda / PostRev | 0.000025632 s | 0.000024768 s | 1.03 |
| add_one / HLOOpt / cuda / BothRev | 0.000025408 s | 0.000024992 s | 1.02 |
| add_one / PartOpt / cuda / PreRev | 0.000025569 s | 0.000025408 s | 1.01 |
| add_one / PartOpt / cuda / PostRev | 0.000024992 s | 0.000024896 s | 1.00 |
| add_one / PartOpt / cuda / BothRev | 0.000025729 s | 0.000025087 s | 1.03 |
| add_one / IPartOpt / cuda / PreRev | 0.00002544 s | 0.000025376 s | 1.00 |
| add_one / IPartOpt / cuda / PostRev | 0.000025472000000000003 s | 0.000025408 s | 1.00 |
| add_one / IPartOpt / cuda / BothRev | 0.000024448 s | 0.000025312 s | 0.97 |
| add_one / DefOpt / cuda / PreRev | 0.000025504 s | 0.000025824 s | 0.99 |
| add_one / DefOpt / cuda / PostRev | 0.000026144 s | 0.0000256 s | 1.02 |
| add_one / DefOpt / cuda / BothRev | 0.000025504 s | 0.000026368 s | 0.97 |
| add_one / IDefOpt / cuda / PreRev | 0.000025184 s | 0.000025471 s | 0.99 |
| add_one / IDefOpt / cuda / PostRev | 0.00002592 s | 0.00002544 s | 1.02 |
| add_one / IDefOpt / cuda / BothRev | 0.000025664 s | 0.000025792 s | 1.00 |
| add_one / JaXPipe / tpu / Primal | 0.0000014295749999999998 s | 0.0000014340249999999998 s | 1.00 |
| add_one / Jax / tpu / Primal | 0.000001404625 s | 0.000001401425 s | 1.00 |
| add_one / HLOOpt / tpu / Primal | 0.0000014306 s | 0.000001425525 s | 1.00 |
| add_one / PartOpt / tpu / Primal | 0.0000014030000000000002 s | 0.000001415625 s | 0.99 |
| add_one / IPartOpt / tpu / Primal | 0.0000014320250000000005 s | 0.00000142825 s | 1.00 |
| add_one / DefOpt / tpu / Primal | 0.0000014068 s | 0.0000014157 s | 0.99 |
| add_one / IDefOpt / tpu / Primal | 0.000001429825 s | 0.000001426675 s | 1.00 |
| add_one / JaXPipe / tpu / Forward | 0.000001856925 s | 0.0000018621 s | 1.00 |
| add_one / Jax / tpu / Forward | 0.00000184955 s | 0.00000184395 s | 1.00 |
| add_one / HLOOpt / tpu / Forward | 0.0000018464 s | 0.000001853375 s | 1.00 |
| add_one / PartOpt / tpu / Forward | 0.000001838275 s | 0.000001845425 s | 1.00 |
| add_one / IPartOpt / tpu / Forward | 0.00000184925 s | 0.000001850575 s | 1.00 |
| add_one / DefOpt / tpu / Forward | 0.00000184045 s | 0.00000185095 s | 0.99 |
| add_one / IDefOpt / tpu / Forward | 0.000001851225 s | 0.00000184935 s | 1.00 |
| add_one / JaXPipe / tpu / PreRev | 0.0000022357 s | 0.000002234425 s | 1.00 |
| add_one / JaXPipe / tpu / PostRev | 0.00000224335 s | 0.000002245875 s | 1.00 |
| add_one / JaXPipe / tpu / BothRev | 0.000002245675 s | 0.000002231825 s | 1.01 |
| add_one / Jax / tpu / BothRev | 0.000002232575 s | 0.000002238975 s | 1.00 |
| add_one / HLOOpt / tpu / PreRev | 0.00000223485 s | 0.000002237225 s | 1.00 |
| add_one / HLOOpt / tpu / PostRev | 0.0000022358 s | 0.00000223695 s | 1.00 |
| add_one / HLOOpt / tpu / BothRev | 0.000002238325 s | 0.0000022443250000000003 s | 1.00 |
| add_one / PartOpt / tpu / PreRev | 0.000002238525 s | 0.0000022425750000000003 s | 1.00 |
| add_one / PartOpt / tpu / PostRev | 0.000002236525 s | 0.00000224105 s | 1.00 |
| add_one / PartOpt / tpu / BothRev | 0.0000022431 s | 0.0000022365750000000003 s | 1.00 |
| add_one / IPartOpt / tpu / PreRev | 0.0000022303 s | 0.000002238025 s | 1.00 |
| add_one / IPartOpt / tpu / PostRev | 0.0000022403500000000005 s | 0.000002244225 s | 1.00 |
| add_one / IPartOpt / tpu / BothRev | 0.00000223075 s | 0.00000223555 s | 1.00 |
| add_one / DefOpt / tpu / PreRev | 0.00000224955 s | 0.000002244225 s | 1.00 |
| add_one / DefOpt / tpu / PostRev | 0.00000223095 s | 0.0000022363750000000003 s | 1.00 |
| add_one / DefOpt / tpu / BothRev | 0.000002254725 s | 0.0000022566750000000003 s | 1.00 |
| add_one / IDefOpt / tpu / PreRev | 0.000002235925 s | 0.0000022435250000000003 s | 1.00 |
| add_one / IDefOpt / tpu / PostRev | 0.0000022467 s | 0.000002242425 s | 1.00 |
| add_one / IDefOpt / tpu / BothRev | 0.00000223015 s | 0.000002233975 s | 1.00 |
| add_one / JaXPipe / cpu / Primal | 0.000013385 s | 0.000007482279979740269 s | 1.79 |
| add_one / Jax / cpu / Primal | 0.000013297 s | 0.000007295579998753965 s | 1.82 |
| add_one / HLOOpt / cpu / Primal | 0.000012982 s | 0.0000072367599568678995 s | 1.79 |
| add_one / PartOpt / cpu / Primal | 0.000012878 s | 0.000006717680007568561 s | 1.92 |
| add_one / IPartOpt / cpu / Primal | 0.00001299 s | 0.000007430179957736982 s | 1.75 |
| add_one / DefOpt / cpu / Primal | 0.000013048 s | 0.000006753920042683603 s | 1.93 |
| add_one / IDefOpt / cpu / Primal | 0.000012777 s | 0.000007092920041031902 s | 1.80 |
| add_one / JaXPipe / cpu / Forward | 0.000017578 s | 0.000010866239972529 s | 1.62 |
| add_one / Jax / cpu / Forward | 0.000017602 s | 0.000011250040042796172 s | 1.56 |
| add_one / HLOOpt / cpu / Forward | 0.000017697 s | 0.000011172939975949704 s | 1.58 |
| add_one / PartOpt / cpu / Forward | 0.000017697 s | 0.000011107540030934617 s | 1.59 |
| add_one / IPartOpt / cpu / Forward | 0.000017434999999999998 s | 0.000010974180022458312 s | 1.59 |
| add_one / DefOpt / cpu / Forward | 0.000017402 s | 0.000011392880023777253 s | 1.53 |
| add_one / IDefOpt / cpu / Forward | 0.000017668 s | 0.00001110596007492859 s | 1.59 |
| add_one / JaXPipe / cpu / PreRev | 0.000019806 s | 0.000013583340059994953 s | 1.46 |
| add_one / JaXPipe / cpu / PostRev | 0.000019673 s | 0.00001288485997065436 s | 1.53 |
| add_one / JaXPipe / cpu / BothRev | 0.000019475 s | 0.000013069859996903689 s | 1.49 |
| add_one / Jax / cpu / BothRev | 0.000019681 s | 0.000012902259913971648 s | 1.53 |
| add_one / HLOOpt / cpu / PreRev | 0.000019523 s | 0.000012942099983774824 s | 1.51 |
| add_one / HLOOpt / cpu / PostRev | 0.00001981 s | 0.000014932679914636537 s | 1.33 |
| add_one / HLOOpt / cpu / BothRev | 0.000019556 s | 0.000012773839953297284 s | 1.53 |
| add_one / PartOpt / cpu / PreRev | 0.000019586 s | 0.00001311142006670707 s | 1.49 |
| add_one / PartOpt / cpu / PostRev | 0.000019508 s | 0.000012460100024327402 s | 1.57 |
| add_one / PartOpt / cpu / BothRev | 0.000019421 s | 0.000013669600029970752 s | 1.42 |
| add_one / IPartOpt / cpu / PreRev | 0.000019492 s | 0.000012843979930039496 s | 1.52 |
| add_one / IPartOpt / cpu / PostRev | 0.000019463 s | 0.00001249579998329864 s | 1.56 |
| add_one / IPartOpt / cpu / BothRev | 0.000019325 s | 0.000012642160054383569 s | 1.53 |
| add_one / DefOpt / cpu / PreRev | 0.0000195 s | 0.000012758939992636445 s | 1.53 |
| add_one / DefOpt / cpu / PostRev | 0.000019511 s | 0.000012898780059913406 s | 1.51 |
| add_one / DefOpt / cpu / BothRev | 0.000019662 s | 0.000012417840061971218 s | 1.58 |
| add_one / IDefOpt / cpu / PreRev | 0.000019662 s | 0.000013013019943173276 s | 1.51 |
| add_one / IDefOpt / cpu / PostRev | 0.000019615 s | 0.000012688639926636823 s | 1.55 |
| add_one / IDefOpt / cpu / BothRev | 0.000019683 s | 0.000012303200019232463 s | 1.60 |
| add_two / JaXPipe / cuda / Primal | 0.0000019200000000000003 s | 0.0000019200000000000003 s | 1 |
| add_two / Jax / cuda / Primal | 0.0000019200000000000003 s | 0.0000019200000000000003 s | 1 |
| add_two / HLOOpt / cuda / Primal | 0.0000019200000000000003 s | 0.0000019200000000000003 s | 1 |
| add_two / PartOpt / cuda / Primal | 0.0000019200000000000003 s | 0.0000019200000000000003 s | 1 |
| add_two / IPartOpt / cuda / Primal | 0.0000019200000000000003 s | 0.0000019200000000000003 s | 1 |
| add_two / DefOpt / cuda / Primal | 0.0000019200000000000003 s | 0.0000019200000000000003 s | 1 |
| add_two / IDefOpt / cuda / Primal | 0.000001919 s | 0.0000019200000000000003 s | 1.00 |
| add_two / JaXPipe / cuda / Forward | 0.000009984 s | 0.00001024 s | 0.97 |
| add_two / Jax / cuda / Forward | 0.000010048 s | 0.000010048 s | 1 |
| add_two / HLOOpt / cuda / Forward | 0.000010144 s | 0.000009631 s | 1.05 |
| add_two / PartOpt / cuda / Forward | 0.000009888 s | 0.000010112 s | 0.98 |
| add_two / IPartOpt / cuda / Forward | 0.000009952 s | 0.000010143 s | 0.98 |
| add_two / DefOpt / cuda / Forward | 0.00001008 s | 0.000010176 s | 0.99 |
| add_two / IDefOpt / cuda / Forward | 0.000015296 s | 0.000009952 s | 1.54 |
| add_two / JaXPipe / cuda / PreRev | 0.000032448 s | 0.000031873 s | 1.02 |
| add_two / JaXPipe / cuda / PostRev | 0.000032096 s | 0.000032576 s | 0.99 |
| add_two / JaXPipe / cuda / BothRev | 0.000032224 s | 0.000033024 s | 0.98 |
| add_two / Jax / cuda / BothRev | 0.000033088 s | 0.000033152000000000004 s | 1.00 |
| add_two / HLOOpt / cuda / PreRev | 0.000032993 s | 0.000033343 s | 0.99 |
| add_two / HLOOpt / cuda / PostRev | 0.000031968 s | 0.000032606999999999995 s | 0.98 |
| add_two / HLOOpt / cuda / BothRev | 0.000032992 s | 0.000032928 s | 1.00 |
| add_two / PartOpt / cuda / PreRev | 0.000032544 s | 0.000032704 s | 1.00 |
| add_two / PartOpt / cuda / PostRev | 0.000031776 s | 0.000032767999999999995 s | 0.97 |
| add_two / PartOpt / cuda / BothRev | 0.000032544 s | 0.000032608 s | 1.00 |
| add_two / IPartOpt / cuda / PreRev | 0.000032385 s | 0.000033024 s | 0.98 |
| add_two / IPartOpt / cuda / PostRev | 0.000032032 s | 0.000033696 s | 0.95 |
| add_two / IPartOpt / cuda / BothRev | 0.000032608 s | 0.000033504 s | 0.97 |
| add_two / DefOpt / cuda / PreRev | 0.000033536000000000006 s | 0.00003344 s | 1.00 |
| add_two / DefOpt / cuda / PostRev | 0.000032064 s | 0.000033536000000000006 s | 0.96 |
| add_two / DefOpt / cuda / BothRev | 0.000032033 s | 0.000033408 s | 0.96 |
| add_two / IDefOpt / cuda / PreRev | 0.000033632 s | 0.000033024 s | 1.02 |
| add_two / IDefOpt / cuda / PostRev | 0.000032896000000000005 s | 0.000032736 s | 1.00 |
| add_two / IDefOpt / cuda / BothRev | 0.000032608 s | 0.000033376 s | 0.98 |
| add_two / JaXPipe / tpu / Primal | 0.000001438975 s | 0.0000014327 s | 1.00 |
| add_two / Jax / tpu / Primal | 0.0000014689 s | 0.00000148525 s | 0.99 |
| add_two / HLOOpt / tpu / Primal | 0.00000143015 s | 0.000001432125 s | 1.00 |
| add_two / PartOpt / tpu / Primal | 0.000001474925 s | 0.0000014746000000000002 s | 1.00 |
| add_two / IPartOpt / tpu / Primal | 0.000001432625 s | 0.000001427125 s | 1.00 |
| add_two / DefOpt / tpu / Primal | 0.0000014809 s | 0.000001486675 s | 1.00 |
| add_two / IDefOpt / tpu / Primal | 0.0000014341999999999998 s | 0.000001434 s | 1.00 |
| add_two / JaXPipe / tpu / Forward | 0.000001823225 s | 0.000001838025 s | 0.99 |
| add_two / Jax / tpu / Forward | 0.000001829025 s | 0.000001850425 s | 0.99 |
| add_two / HLOOpt / tpu / Forward | 0.0000018356 s | 0.000001825675 s | 1.01 |
| add_two / PartOpt / tpu / Forward | 0.000001820275 s | 0.00000183435 s | 0.99 |
| add_two / IPartOpt / tpu / Forward | 0.0000018284 s | 0.00000182775 s | 1.00 |
| add_two / DefOpt / tpu / Forward | 0.00000182515 s | 0.000001821225 s | 1.00 |
| add_two / IDefOpt / tpu / Forward | 0.00000182685 s | 0.000001830125 s | 1.00 |
| add_two / JaXPipe / tpu / PreRev | 0.0000028352 s | 0.000002833475 s | 1.00 |
| add_two / JaXPipe / tpu / PostRev | 0.00000274935 s | 0.000002741825 s | 1.00 |
| add_two / JaXPipe / tpu / BothRev | 0.00000282795 s | 0.0000028442 s | 0.99 |
| add_two / Jax / tpu / BothRev | 0.0000027508500000000005 s | 0.0000027538750000000004 s | 1.00 |
| add_two / HLOOpt / tpu / PreRev | 0.000002834325 s | 0.00000282865 s | 1.00 |
| add_two / HLOOpt / tpu / PostRev | 0.000002744625 s | 0.00000275385 s | 1.00 |
| add_two / HLOOpt / tpu / BothRev | 0.00000283115 s | 0.000002856775 s | 0.99 |
| add_two / PartOpt / tpu / PreRev | 0.00000275215 s | 0.0000027495750000000003 s | 1.00 |
| add_two / PartOpt / tpu / PostRev | 0.0000028335 s | 0.0000028459 s | 1.00 |
| add_two / PartOpt / tpu / BothRev | 0.00000274645 s | 0.00000275325 s | 1.00 |
| add_two / IPartOpt / tpu / PreRev | 0.000002841475 s | 0.00000283295 s | 1.00 |
| add_two / IPartOpt / tpu / PostRev | 0.00000275105 s | 0.000002750475 s | 1.00 |
| add_two / IPartOpt / tpu / BothRev | 0.0000028461999999999995 s | 0.0000028408 s | 1.00 |
| add_two / DefOpt / tpu / PreRev | 0.0000027508500000000005 s | 0.0000027521250000000004 s | 1.00 |
| add_two / DefOpt / tpu / PostRev | 0.000002838825 s | 0.00000284555 s | 1.00 |
| add_two / DefOpt / tpu / BothRev | 0.0000027479500000000005 s | 0.000002756675 s | 1.00 |
| add_two / IDefOpt / tpu / PreRev | 0.000002836525 s | 0.0000028440500000000003 s | 1.00 |
| add_two / IDefOpt / tpu / PostRev | 0.000002750825 s | 0.000002742525 s | 1.00 |
| add_two / IDefOpt / tpu / BothRev | 0.0000028325 s | 0.00000283935 s | 1.00 |
| add_two / JaXPipe / cpu / Primal | 0.00001378 s | 0.000007603080048284028 s | 1.81 |
| add_two / Jax / cpu / Primal | 0.00001369 s | 0.000007265580097737256 s | 1.88 |
| add_two / HLOOpt / cpu / Primal | 0.000013507999999999998 s | 0.000007252379946294241 s | 1.86 |
| add_two / PartOpt / cpu / Primal | 0.00001352 s | 0.000007664579989068443 s | 1.76 |
| add_two / IPartOpt / cpu / Primal | 0.000013443 s | 0.000007609659987792838 s | 1.77 |
| add_two / DefOpt / cpu / Primal | 0.000013431 s | 0.000007436000050802249 s | 1.81 |
| add_two / IDefOpt / cpu / Primal | 0.000013241 s | 0.000007664840049983468 s | 1.73 |
| add_two / JaXPipe / cpu / Forward | 0.000018459 s | 0.000011162979917571648 s | 1.65 |
| add_two / Jax / cpu / Forward | 0.00001794 s | 0.000011393439945095453 s | 1.57 |
| add_two / HLOOpt / cpu / Forward | 0.000018044 s | 0.000011876660046254983 s | 1.52 |
| add_two / PartOpt / cpu / Forward | 0.000017726 s | 0.000011154319981869775 s | 1.59 |
| add_two / IPartOpt / cpu / Forward | 0.000017871999999999998 s | 0.000011239279920118862 s | 1.59 |
| add_two / DefOpt / cpu / Forward | 0.000018431 s | 0.000011385859997972148 s | 1.62 |
| add_two / IDefOpt / cpu / Forward | 0.000018132 s | 0.000011184240047441565 s | 1.62 |
| add_two / JaXPipe / cpu / PreRev | 0.000023186 s | 0.000015534419944742694 s | 1.49 |
| add_two / JaXPipe / cpu / PostRev | 0.000023057 s | 0.000015442319981957552 s | 1.49 |
| add_two / JaXPipe / cpu / BothRev | 0.000023192 s | 0.00001556208002511994 s | 1.49 |
| add_two / Jax / cpu / BothRev | 0.000023047 s | 0.00001551052004288067 s | 1.49 |
| add_two / HLOOpt / cpu / PreRev | 0.000022954 s | 0.00001560625998536125 s | 1.47 |
| add_two / HLOOpt / cpu / PostRev | 0.000023582 s | 0.000017657759926805738 s | 1.34 |
| add_two / HLOOpt / cpu / BothRev | 0.000023197 s | 0.00001540101999125909 s | 1.51 |
| add_two / PartOpt / cpu / PreRev | 0.000022731 s | 0.000015437360052601433 s | 1.47 |
| add_two / PartOpt / cpu / PostRev | 0.000023254 s | 0.000015096120041562244 s | 1.54 |
| add_two / PartOpt / cpu / BothRev | 0.000023164 s | 0.00001628869993510307 s | 1.42 |
| add_two / IPartOpt / cpu / PreRev | 0.000022821 s | 0.000015847779923205963 s | 1.44 |
| add_two / IPartOpt / cpu / PostRev | 0.000023167 s | 0.000015286219950212397 s | 1.52 |
| add_two / IPartOpt / cpu / BothRev | 0.0000233 s | 0.000015069540022523144 s | 1.55 |
| add_two / DefOpt / cpu / PreRev | 0.000023057 s | 0.000015951179975672857 s | 1.45 |
| add_two / DefOpt / cpu / PostRev | 0.000022896 s | 0.000014918000033503633 s | 1.53 |
| add_two / DefOpt / cpu / BothRev | 0.000023424 s | 0.000015067820040712831 s | 1.55 |
| add_two / IDefOpt / cpu / PreRev | 0.000022848 s | 0.000015652199981559533 s | 1.46 |
| add_two / IDefOpt / cpu / PostRev | 0.000023058 s | 0.000014937559899408373 s | 1.54 |
| add_two / IDefOpt / cpu / BothRev | 0.000022987 s | 0.00001542553998660878 s | 1.49 |
| cache / JaXPipe / cuda / Primal | 0.000002304 s | 0.000002303 s | 1.00 |
| cache / Jax / cuda / Primal | 0.000002304 s | 0.000002303 s | 1.00 |
| cache / HLOOpt / cuda / Primal | 0.000002207 s | 0.00000224 s | 0.99 |
| cache / PartOpt / cuda / Primal | 0.00000224 s | 0.00000224 s | 1 |
| cache / IPartOpt / cuda / Primal | 0.000002303 s | 0.000002303 s | 1 |
| cache / DefOpt / cuda / Primal | 0.00000224 s | 0.000002272 s | 0.99 |
| cache / IDefOpt / cuda / Primal | 0.000002208 s | 0.000002208 s | 1 |
| cache / JaXPipe / cuda / Forward | 0.000002335 s | 0.000002303 s | 1.01 |
| cache / Jax / cuda / Forward | 0.000002336 s | 0.000002304 s | 1.01 |
| cache / HLOOpt / cuda / Forward | 0.000002304 s | 0.000002335 s | 0.99 |
| cache / PartOpt / cuda / Forward | 0.000002335 s | 0.000002335 s | 1 |
| cache / IPartOpt / cuda / Forward | 0.000002304 s | 0.000002336 s | 0.99 |
| cache / DefOpt / cuda / Forward | 0.000002273 s | 0.000002272 s | 1.00 |
| cache / IDefOpt / cuda / Forward | 0.000002335 s | 0.000002335 s | 1 |
| cache / JaXPipe / cuda / PreRev | 0.000011264 s | 0.00001072 s | 1.05 |
| cache / JaXPipe / cuda / PostRev | 0.000010975 s | 0.000010848 s | 1.01 |
| cache / JaXPipe / cuda / BothRev | 0.000010944 s | 0.000010784 s | 1.01 |
| cache / Jax / cuda / BothRev | 0.00001088 s | 0.000011008 s | 0.99 |
| cache / HLOOpt / cuda / PreRev | 0.000013472 s | 0.000013471 s | 1.00 |
| cache / HLOOpt / cuda / PostRev | 0.000013408 s | 0.000013408 s | 1 |
| cache / HLOOpt / cuda / BothRev | 0.000013472 s | 0.000013472 s | 1 |
| cache / PartOpt / cuda / PreRev | 0.000011104 s | 0.000011104 s | 1 |
| cache / PartOpt / cuda / PostRev | 0.000010688 s | 0.000010911 s | 0.98 |
| cache / PartOpt / cuda / BothRev | 0.000010848 s | 0.000010976 s | 0.99 |
| cache / IPartOpt / cuda / PreRev | 0.000010976 s | 0.000010304 s | 1.07 |
| cache / IPartOpt / cuda / PostRev | 0.00001104 s | 0.000010528 s | 1.05 |
| cache / IPartOpt / cuda / BothRev | 0.000011297 s | 0.000011008 s | 1.03 |
| cache / DefOpt / cuda / PreRev | 0.000011008 s | 0.00001088 s | 1.01 |
| cache / DefOpt / cuda / PostRev | 0.000010816 s | 0.000010912 s | 0.99 |
| cache / DefOpt / cuda / BothRev | 0.000011169 s | 0.000010912 s | 1.02 |
| cache / IDefOpt / cuda / PreRev | 0.00001072 s | 0.000011008 s | 0.97 |
| cache / IDefOpt / cuda / PostRev | 0.000011008 s | 0.000011136 s | 0.99 |
| cache / IDefOpt / cuda / BothRev | 0.000010688 s | 0.000011296 s | 0.95 |
| cache / JaXPipe / tpu / Primal | 0.0000024818 s | 0.00000247055 s | 1.00 |
| cache / Jax / tpu / Primal | 0.000002462625 s | 0.000002459375 s | 1.00 |
| cache / HLOOpt / tpu / Primal | 0.000002467375 s | 0.000002470125 s | 1.00 |
| cache / PartOpt / tpu / Primal | 0.0000024736 s | 0.000002456775 s | 1.01 |
| cache / IPartOpt / tpu / Primal | 0.00000245645 s | 0.00000246025 s | 1.00 |
| cache / DefOpt / tpu / Primal | 0.0000024712750000000003 s | 0.0000024752 s | 1.00 |
| cache / IDefOpt / tpu / Primal | 0.000002458875 s | 0.000002482 s | 0.99 |
| cache / JaXPipe / tpu / Forward | 0.000003554225 s | 0.000003567575 s | 1.00 |
| cache / Jax / tpu / Forward | 0.000003529175 s | 0.0000035388 s | 1.00 |
| cache / HLOOpt / tpu / Forward | 0.0000035462 s | 0.00000355415 s | 1.00 |
| cache / PartOpt / tpu / Forward | 0.0000035356 s | 0.0000035287 s | 1.00 |
| cache / IPartOpt / tpu / Forward | 0.000003544775 s | 0.0000035521000000000003 s | 1.00 |
| cache / DefOpt / tpu / Forward | 0.000003523875 s | 0.000003546825 s | 0.99 |
| cache / IDefOpt / tpu / Forward | 0.0000035322500000000004 s | 0.0000035494750000000004 s | 1.00 |
| cache / JaXPipe / tpu / PreRev | 0.0000049448 s | 0.0000049307 s | 1.00 |
| cache / JaXPipe / tpu / PostRev | 0.000004956975 s | 0.000004935399999999999 s | 1.00 |
| cache / JaXPipe / tpu / BothRev | 0.000004976775 s | 0.000004954025 s | 1.00 |
| cache / Jax / tpu / BothRev | 0.00000498915 s | 0.000004991775 s | 1.00 |
| cache / HLOOpt / tpu / PreRev | 0.0000039467 s | 0.000003940575 s | 1.00 |
| cache / HLOOpt / tpu / PostRev | 0.000004124050000000001 s | 0.0000041419 s | 1.00 |
| cache / HLOOpt / tpu / BothRev | 0.000003937574999999999 s | 0.0000039466 s | 1.00 |
| cache / PartOpt / tpu / PreRev | 0.0000049567 s | 0.00000498505 s | 0.99 |
| cache / PartOpt / tpu / PostRev | 0.00000496865 s | 0.0000049616 s | 1.00 |
| cache / PartOpt / tpu / BothRev | 0.00000496925 s | 0.000004985 s | 1.00 |
| cache / IPartOpt / tpu / PreRev | 0.0000049759 s | 0.00000495935 s | 1.00 |
| cache / IPartOpt / tpu / PostRev | 0.000004976175 s | 0.000004985075 s | 1.00 |
| cache / IPartOpt / tpu / BothRev | 0.0000049627 s | 0.000004959749999999999 s | 1.00 |
| cache / DefOpt / tpu / PreRev | 0.000004984075 s | 0.0000049680000000000005 s | 1.00 |
| cache / DefOpt / tpu / PostRev | 0.0000049547 s | 0.000004955975 s | 1.00 |
| cache / DefOpt / tpu / BothRev | 0.00000496015 s | 0.00000495815 s | 1.00 |
| cache / IDefOpt / tpu / PreRev | 0.000004952625 s | 0.00000497235 s | 1.00 |
| cache / IDefOpt / tpu / PostRev | 0.000004961775 s | 0.0000049673 s | 1.00 |
| cache / IDefOpt / tpu / BothRev | 0.0000049675 s | 0.0000049585 s | 1.00 |
| cache / JaXPipe / cpu / Primal | 0.000012766 s | 0.000007113319916243199 s | 1.79 |
| cache / Jax / cpu / Primal | 0.000012616 s | 0.000007572799877380019 s | 1.67 |
| cache / HLOOpt / cpu / Primal | 0.000012608 s | 0.000006496139994851546 s | 1.94 |
| cache / PartOpt / cpu / Primal | 0.000012696 s | 0.0000067282200507179366 s | 1.89 |
| cache / IPartOpt / cpu / Primal | 0.000012595 s | 0.000006539540081575979 s | 1.93 |
| cache / DefOpt / cpu / Primal | 0.000013088 s | 0.000007011559991951799 s | 1.87 |
| cache / IDefOpt / cpu / Primal | 0.000012757 s | 0.000006468759947892977 s | 1.97 |
| cache / JaXPipe / cpu / Forward | 0.000023196000000000003 s | 0.000014514539998344844 s | 1.60 |
| cache / Jax / cpu / Forward | 0.000025577 s | 0.00001441554000848555 s | 1.77 |
| cache / HLOOpt / cpu / Forward | 0.000023504 s | 0.000014784280047024368 s | 1.59 |
| cache / PartOpt / cpu / Forward | 0.000023067 s | 0.000014338260025397175 s | 1.61 |
| cache / IPartOpt / cpu / Forward | 0.000024474 s | 0.00001500011998359696 s | 1.63 |
| cache / DefOpt / cpu / Forward | 0.000023398 s | 0.000015464900061488153 s | 1.51 |
| cache / IDefOpt / cpu / Forward | 0.000023658 s | 0.000014286339974205475 s | 1.66 |
| cache / JaXPipe / cpu / PreRev | 0.000027277 s | 0.000016791880061646224 s | 1.62 |
| cache / JaXPipe / cpu / PostRev | 0.000040874 s | 0.0000222802400458022 s | 1.83 |
| cache / JaXPipe / cpu / BothRev | 0.000024903 s | 0.00001686876010353444 s | 1.48 |
| cache / Jax / cpu / BothRev | 0.000027881 s | 0.000021221759961918 s | 1.31 |
| cache / HLOOpt / cpu / PreRev | 0.000025826000000000003 s | 0.000017306019944953732 s | 1.49 |
| cache / HLOOpt / cpu / PostRev | 0.000018278 s | 0.00001918961999763269 s | 0.95 |
| cache / HLOOpt / cpu / BothRev | 0.000018195 s | 0.00001752650006892509 s | 1.04 |
| cache / PartOpt / cpu / PreRev | 0.000017261 s | 0.000016286500012938632 s | 1.06 |
| cache / PartOpt / cpu / PostRev | 0.000020198 s | 0.000020723759989778044 s | 0.97 |
| cache / PartOpt / cpu / BothRev | 0.000017698 s | 0.000017491439957666445 s | 1.01 |
| cache / IPartOpt / cpu / PreRev | 0.000017885999999999998 s | 0.0000170234999677632 s | 1.05 |
| cache / IPartOpt / cpu / PostRev | 0.000019997 s | 0.00002054747998045059 s | 0.97 |
| cache / IPartOpt / cpu / BothRev | 0.0000187 s | 0.000017071119982574602 s | 1.10 |
| cache / DefOpt / cpu / PreRev | 0.000017942 s | 0.000016248899992206134 s | 1.10 |
| cache / DefOpt / cpu / PostRev | 0.000017944000000000003 s | 0.00001709864003714756 s | 1.05 |
| cache / DefOpt / cpu / BothRev | 0.00002393 s | 0.000016460659990116254 s | 1.45 |
| cache / IDefOpt / cpu / PreRev | 0.000017306 s | 0.000016951420202531154 s | 1.02 |
| cache / IDefOpt / cpu / PostRev | 0.000017391 s | 0.000016882059899216984 s | 1.03 |
| cache / IDefOpt / cpu / BothRev | 0.000018095 s | 0.000016809840071800864 s | 1.08 |
Concat / JaXPipe / cuda / Primal |
0.0000019200000000000003 s |
0.0000019200000000000003 s |
1 |
Concat / Jax / cuda / Primal |
0.0000019200000000000003 s |
0.0000019200000000000003 s |
1 |
Concat / HLOOpt / cuda / Primal |
0.0000019200000000000003 s |
0.0000019200000000000003 s |
1 |
Concat / PartOpt / cuda / Primal |
0.000001919 s |
0.0000019200000000000003 s |
1.00 |
Concat / IPartOpt / cuda / Primal |
0.0000019200000000000003 s |
0.0000019200000000000003 s |
1 |
Concat / DefOpt / cuda / Primal |
0.0000019200000000000003 s |
0.0000019200000000000003 s |
1 |
Concat / IDefOpt / cuda / Primal |
0.0000019200000000000003 s |
0.0000019200000000000003 s |
1 |
Concat / JaXPipe / cuda / Forward |
0.00001008 s |
0.000010432 s |
0.97 |
Concat / Jax / cuda / Forward |
0.00001008 s |
0.000010432 s |
0.97 |
Concat / HLOOpt / cuda / Forward |
0.000010112 s |
0.000010272 s |
0.98 |
Concat / PartOpt / cuda / Forward |
0.000010304 s |
0.000010528 s |
0.98 |
Concat / IPartOpt / cuda / Forward |
0.000010048 s |
0.000010464 s |
0.96 |
Concat / DefOpt / cuda / Forward |
0.000009952 s |
0.000010207 s |
0.98 |
Concat / IDefOpt / cuda / Forward |
0.000010208 s |
0.000010303 s |
0.99 |
Concat / JaXPipe / cuda / PreRev |
0.000016672 s |
0.000017056 s |
0.98 |
Concat / JaXPipe / cuda / PostRev |
0.000017217 s |
0.000016864 s |
1.02 |
Concat / JaXPipe / cuda / BothRev |
0.000017024 s |
0.000016768000000000003 s |
1.02 |
Concat / Jax / cuda / BothRev |
0.000016864 s |
0.0000168 s |
1.00 |
Concat / HLOOpt / cuda / PreRev |
0.00001712 s |
0.000016896000000000002 s |
1.01 |
Concat / HLOOpt / cuda / PostRev |
0.000017216 s |
0.000016832 s |
1.02 |
Concat / HLOOpt / cuda / BothRev |
0.000016927000000000002 s |
0.000016896000000000002 s |
1.00 |
Concat / PartOpt / cuda / PreRev |
0.000016608 s |
0.000017888000000000002 s |
0.93 |
Concat / PartOpt / cuda / PostRev |
0.000016736 s |
0.000017216 s |
0.97 |
Concat / PartOpt / cuda / BothRev |
0.000016607 s |
0.000016063999999999997 s |
1.03 |
Concat / IPartOpt / cuda / PreRev |
0.000017375999999999998 s |
0.000017152 s |
1.01 |
Concat / IPartOpt / cuda / PostRev |
0.00001712 s |
0.00001648 s |
1.04 |
Concat / IPartOpt / cuda / BothRev |
0.000016672 s |
0.000016545 s |
1.01 |
Concat / DefOpt / cuda / PreRev |
0.000017024 s |
0.000017184 s |
0.99 |
Concat / DefOpt / cuda / PostRev |
0.000016927999999999998 s |
0.000016768000000000003 s |
1.01 |
Concat / DefOpt / cuda / BothRev |
0.000016736 s |
0.000016832 s |
0.99 |
Concat / IDefOpt / cuda / PreRev |
0.000017184 s |
0.000017665 s |
0.97 |
Concat / IDefOpt / cuda / PostRev |
0.000016544 s |
0.000016768000000000003 s |
0.99 |
Concat / IDefOpt / cuda / BothRev |
0.000016896000000000002 s |
0.000017344 s |
0.97 |
Concat / JaXPipe / tpu / Primal |
0.0000015334 s |
0.000001524275 s |
1.01 |
Concat / Jax / tpu / Primal |
0.000001536875 s |
0.000001537425 s |
1.00 |
Concat / HLOOpt / tpu / Primal |
0.000001536175 s |
0.000001522125 s |
1.01 |
Concat / PartOpt / tpu / Primal |
0.000001521825 s |
0.00000153655 s |
0.99 |
Concat / IPartOpt / tpu / Primal |
0.000001540875 s |
0.000001517 s |
1.02 |
Concat / DefOpt / tpu / Primal |
0.000001519575 s |
0.0000015408 s |
0.99 |
Concat / IDefOpt / tpu / Primal |
0.00000153305 s |
0.000001532075 s |
1.00 |
Concat / JaXPipe / tpu / Forward |
0.0000015795500000000002 s |
0.0000015805250000000002 s |
1.00 |
Concat / Jax / tpu / Forward |
0.0000015522250000000002 s |
0.00000155565 s |
1.00 |
Concat / HLOOpt / tpu / Forward |
0.0000015728249999999998 s |
0.000001594875 s |
0.99 |
Concat / PartOpt / tpu / Forward |
0.000001565925 s |
0.0000015500249999999998 s |
1.01 |
Concat / IPartOpt / tpu / Forward |
0.000001577925 s |
0.000001577275 s |
1.00 |
Concat / DefOpt / tpu / Forward |
0.0000015536 s |
0.00000153945 s |
1.01 |
Concat / IDefOpt / tpu / Forward |
0.0000015745 s |
0.0000015775 s |
1.00 |
Concat / JaXPipe / tpu / PreRev |
0.000002002275 s |
0.000001998375 s |
1.00 |
Concat / JaXPipe / tpu / PostRev |
0.0000020871500000000004 s |
0.0000020933 s |
1.00 |
Concat / JaXPipe / tpu / BothRev |
0.000002003325 s |
0.000001995775 s |
1.00 |
Concat / Jax / tpu / BothRev |
0.0000020753250000000003 s |
0.0000020737 s |
1.00 |
Concat / HLOOpt / tpu / PreRev |
0.000002013225 s |
0.000001997775 s |
1.01 |
Concat / HLOOpt / tpu / PostRev |
0.000002091225 s |
0.0000020863000000000003 s |
1.00 |
Concat / HLOOpt / tpu / BothRev |
0.00000202255 s |
0.0000019951 s |
1.01 |
Concat / PartOpt / tpu / PreRev |
0.0000020752 s |
0.000002087125 s |
0.99 |
Concat / PartOpt / tpu / PostRev |
0.000002015975 s |
0.000001998675 s |
1.01 |
Concat / PartOpt / tpu / BothRev |
0.000002083475 s |
0.000002087575 s |
1.00 |
Concat / IPartOpt / tpu / PreRev |
0.00000201125 s |
0.00000200505 s |
1.00 |
Concat / IPartOpt / tpu / PostRev |
0.000002072375 s |
0.0000020779 s |
1.00 |
Concat / IPartOpt / tpu / BothRev |
0.000002018075 s |
0.0000020009 s |
1.01 |
Concat / DefOpt / tpu / PreRev |
0.00000206945 s |
0.00000207885 s |
1.00 |
Concat / DefOpt / tpu / PostRev |
0.00000200075 s |
0.00000199995 s |
1.00 |
Concat / DefOpt / tpu / BothRev |
0.0000020683999999999995 s |
0.0000020805 s |
0.99 |
Concat / IDefOpt / tpu / PreRev |
0.000002007825 s |
0.0000019929 s |
1.01 |
Concat / IDefOpt / tpu / PostRev |
0.00000206635 s |
0.00000208795 s |
0.99 |
Concat / IDefOpt / tpu / BothRev |
0.0000020123750000000004 s |
0.000001993925 s |
1.01 |
Concat / JaXPipe / cpu / Primal |
0.000013001 s |
0.000007628220027982025 s |
1.70 |
Concat / Jax / cpu / Primal |
0.000013153 s |
0.000006918299950484652 s |
1.90 |
Concat / HLOOpt / cpu / Primal |
0.000012833 s |
0.000006906620074005332 s |
1.86 |
Concat / PartOpt / cpu / Primal |
0.000012929 s |
0.000007574280025437474 s |
1.71 |
Concat / IPartOpt / cpu / Primal |
0.000012904 s |
0.00000827180003398098 s |
1.56 |
Concat / DefOpt / cpu / Primal |
0.000012958 s |
0.000007524260072386824 s |
1.72 |
Concat / IDefOpt / cpu / Primal |
0.00001295 s |
0.000007567080010630889 s |
1.71 |
Concat / JaXPipe / cpu / Forward |
0.0000182 s |
0.000010493979989405488 s |
1.73 |
Concat / Jax / cpu / Forward |
0.000017488 s |
0.000010956819933198858 s |
1.60 |
Concat / HLOOpt / cpu / Forward |
0.000017589 s |
0.00001056330000210437 s |
1.67 |
Concat / PartOpt / cpu / Forward |
0.000017503999999999997 s |
0.000011186020019522404 s |
1.56 |
Concat / IPartOpt / cpu / Forward |
0.000017595 s |
0.00001060269996742136 s |
1.66 |
Concat / DefOpt / cpu / Forward |
0.000017607000000000003 s |
0.000010989679976773914 s |
1.60 |
Concat / IDefOpt / cpu / Forward |
0.000017643 s |
0.000011122739979327889 s |
1.59 |
Concat / JaXPipe / cpu / PreRev |
0.000019726 s |
0.000013004640004510291 s |
1.52 |
Concat / JaXPipe / cpu / PostRev |
0.000019657 s |
0.00001280795999264228 s |
1.53 |
Concat / JaXPipe / cpu / BothRev |
0.000019136 s |
0.000013130419956723928 s |
1.46 |
Concat / Jax / cpu / BothRev |
0.00001975 s |
0.00001225699990754947 s |
1.61 |
Concat / HLOOpt / cpu / PreRev |
0.000019731 s |
0.00001261576004253584 s |
1.56 |
Concat / HLOOpt / cpu / PostRev |
0.000019326 s |
0.000015385800033982377 s |
1.26 |
Concat / HLOOpt / cpu / BothRev |
0.000019266 s |
0.000012868580015492623 s |
1.50 |
Concat / PartOpt / cpu / PreRev |
0.000019798 s |
0.000012579099930007944 s |
1.57 |
Concat / PartOpt / cpu / PostRev |
0.000019496 s |
0.00001343272004305618 s |
1.45 |
Concat / PartOpt / cpu / BothRev |
0.000019719 s |
0.000013328659861144844 s |
1.48 |
Concat / IPartOpt / cpu / PreRev |
0.000019847 s |
0.000012759020046360093 s |
1.56 |
Concat / IPartOpt / cpu / PostRev |
0.000019652 s |
0.000012730760008707876 s |
1.54 |
Concat / IPartOpt / cpu / BothRev |
0.000019707 s |
0.000012841920033679344 s |
1.53 |
Concat / DefOpt / cpu / PreRev |
0.000019742 s |
0.00001291278002099716 s |
1.53 |
Concat / DefOpt / cpu / PostRev |
0.000019543 s |
0.00001354980011456064 s |
1.44 |
Concat / DefOpt / cpu / BothRev |
0.000019506 s |
0.00001339397997071501 s |
1.46 |
Concat / IDefOpt / cpu / PreRev |
0.00002005 s |
0.00001281239998206729 s |
1.56 |
Concat / IDefOpt / cpu / PostRev |
0.000019686 s |
0.000012565760098368628 s |
1.57 |
Concat / IDefOpt / cpu / BothRev |
0.000019597 s |
0.000013160980088287034 s |
1.49 |
const_scatter / JaXPipe / cuda / Primal |
0.000001887 s |
0.000001887 s |
1 |
const_scatter / Jax / cuda / Primal |
0.000001887 s |
0.000001887 s |
1 |
const_scatter / HLOOpt / cuda / Primal |
0.000001887 s |
0.000001888 s |
1.00 |
const_scatter / PartOpt / cuda / Primal |
0.000001887 s |
0.000001887 s |
1 |
const_scatter / IPartOpt / cuda / Primal |
0.000001887 s |
0.000001887 s |
1 |
const_scatter / DefOpt / cuda / Primal |
0.000001888 s |
0.000001887 s |
1.00 |
const_scatter / IDefOpt / cuda / Primal |
0.000001887 s |
0.000001888 s |
1.00 |
const_scatter / JaXPipe / cuda / Forward |
0.000010464 s |
0.000009792 s |
1.07 |
const_scatter / Jax / cuda / Forward |
0.000010272 s |
0.00001008 s |
1.02 |
const_scatter / HLOOpt / cuda / Forward |
0.000010336 s |
0.00001008 s |
1.03 |
const_scatter / PartOpt / cuda / Forward |
0.000010176 s |
0.00000976 s |
1.04 |
const_scatter / IPartOpt / cuda / Forward |
0.000009984 s |
0.000009857 s |
1.01 |
const_scatter / DefOpt / cuda / Forward |
0.000010112 s |
0.000009984 s |
1.01 |
const_scatter / IDefOpt / cuda / Forward |
0.000010272 s |
0.000009888 s |
1.04 |
const_scatter / JaXPipe / cuda / PreRev |
0.00001696 s |
0.000016929 s |
1.00 |
const_scatter / JaXPipe / cuda / PostRev |
0.00001696 s |
0.000016544 s |
1.03 |
const_scatter / JaXPipe / cuda / BothRev |
0.000016032999999999997 s |
0.000016832 s |
0.95 |
const_scatter / Jax / cuda / BothRev |
0.000016768000000000003 s |
0.00001728 s |
0.97 |
const_scatter / HLOOpt / cuda / PreRev |
0.000016704 s |
0.000016192 s |
1.03 |
const_scatter / HLOOpt / cuda / PostRev |
0.000016896000000000002 s |
0.000017087 s |
0.99 |
const_scatter / HLOOpt / cuda / BothRev |
0.000016193 s |
0.00001616 s |
1.00 |
const_scatter / PartOpt / cuda / PreRev |
0.00001648 s |
0.000017056 s |
0.97 |
const_scatter / PartOpt / cuda / PostRev |
0.000017088 s |
0.000015808 s |
1.08 |
const_scatter / PartOpt / cuda / BothRev |
0.000016447 s |
0.000018817 s |
0.87 |
const_scatter / IPartOpt / cuda / PreRev |
0.0000184 s |
0.000016864 s |
1.09 |
const_scatter / IPartOpt / cuda / PostRev |
0.00001632 s |
0.000016929 s |
0.96 |
const_scatter / IPartOpt / cuda / BothRev |
0.000016895 s |
0.000016224 s |
1.04 |
const_scatter / DefOpt / cuda / PreRev |
0.000018752000000000003 s |
0.000016288 s |
1.15 |
const_scatter / DefOpt / cuda / PostRev |
0.00001824 s |
0.000016512 s |
1.10 |
const_scatter / DefOpt / cuda / BothRev |
0.00001696 s |
0.000018752000000000003 s |
0.90 |
const_scatter / IDefOpt / cuda / PreRev |
0.00001664 s |
0.00001696 s |
0.98 |
const_scatter / IDefOpt / cuda / PostRev |
0.000017184 s |
0.000018337 s |
0.94 |
const_scatter / IDefOpt / cuda / BothRev |
0.000016544 s |
0.000016832 s |
0.98 |
const_scatter / JaXPipe / tpu / Primal |
0.000003783475 s |
0.000003809775 s |
0.99 |
const_scatter / Jax / tpu / Primal |
0.000003802475 s |
0.00000383075 s |
0.99 |
const_scatter / HLOOpt / tpu / Primal |
0.000003792025 s |
0.000003812875 s |
0.99 |
const_scatter / PartOpt / tpu / Primal |
0.00000381935 s |
0.000003816375 s |
1.00 |
const_scatter / IPartOpt / tpu / Primal |
0.00000378165 s |
0.000003810475 s |
0.99 |
const_scatter / DefOpt / tpu / Primal |
0.0000038116 s |
0.000003818025 s |
1.00 |
const_scatter / IDefOpt / tpu / Primal |
0.0000038054 s |
0.000003819575 s |
1.00 |
const_scatter / JaXPipe / tpu / Forward |
0.000006457525 s |
0.000006456925 s |
1.00 |
const_scatter / Jax / tpu / Forward |
0.00000648975 s |
0.0000064949 s |
1.00 |
const_scatter / HLOOpt / tpu / Forward |
0.0000064604 s |
0.000006442675 s |
1.00 |
const_scatter / PartOpt / tpu / Forward |
0.0000064826 s |
0.000006506075 s |
1.00 |
const_scatter / IPartOpt / tpu / Forward |
0.000006458775 s |
0.000006462024999999999 s |
1.00 |
const_scatter / DefOpt / tpu / Forward |
0.000006482725 s |
0.000006515 s |
1.00 |
const_scatter / IDefOpt / tpu / Forward |
0.000006464975 s |
0.000006475775000000001 s |
1.00 |
const_scatter / JaXPipe / tpu / PreRev |
0.000006604275 s |
0.00000669465 s |
0.99 |
const_scatter / JaXPipe / tpu / PostRev |
0.000006620075 s |
0.000006671425 s |
0.99 |
const_scatter / JaXPipe / tpu / BothRev |
0.00000663015 s |
0.000006659125 s |
1.00 |
const_scatter / Jax / tpu / BothRev |
0.0000066189 s |
0.0000066715250000000005 s |
0.99 |
const_scatter / HLOOpt / tpu / PreRev |
0.000006604425000000001 s |
0.000006665824999999999 s |
0.99 |
const_scatter / HLOOpt / tpu / PostRev |
0.000006648225 s |
0.000006661775 s |
1.00 |
const_scatter / HLOOpt / tpu / BothRev |
0.000006619825000000001 s |
0.000006681724999999999 s |
0.99 |
const_scatter / PartOpt / tpu / PreRev |
0.000006638199999999999 s |
0.0000066773 s |
0.99 |
const_scatter / PartOpt / tpu / PostRev |
0.000006605225 s |
0.00000667105 s |
0.99 |
const_scatter / PartOpt / tpu / BothRev |
0.000006621099999999999 s |
0.000006679924999999999 s |
0.99 |
const_scatter / IPartOpt / tpu / PreRev |
0.000006602550000000001 s |
0.00000667415 s |
0.99 |
const_scatter / IPartOpt / tpu / PostRev |
0.000006615299999999999 s |
0.000006661175 s |
0.99 |
const_scatter / IPartOpt / tpu / BothRev |
0.000006604149999999999 s |
0.00000665635 s |
0.99 |
const_scatter / DefOpt / tpu / PreRev |
0.000006615875 s |
0.0000066613 s |
0.99 |
const_scatter / DefOpt / tpu / PostRev |
0.000006617 s |
0.000006663225 s |
0.99 |
const_scatter / DefOpt / tpu / BothRev |
0.000006632625 s |
0.000006669075 s |
0.99 |
const_scatter / IDefOpt / tpu / PreRev |
0.0000066213500000000006 s |
0.000006667524999999999 s |
0.99 |
const_scatter / IDefOpt / tpu / PostRev |
0.000006633775 s |
0.000006665875 s |
1.00 |
const_scatter / IDefOpt / tpu / BothRev |
0.00000660775 s |
0.00000665495 s |
0.99 |
const_scatter / JaXPipe / cpu / Primal |
0.000012669 s |
0.000007149519988161046 s |
1.77 |
const_scatter / Jax / cpu / Primal |
0.000012623 s |
0.000007949459886731347 s |
1.59 |
const_scatter / HLOOpt / cpu / Primal |
0.000013334 s |
0.000008326620045409073 s |
1.60 |
const_scatter / PartOpt / cpu / Primal |
0.0000126 s |
0.000006578599986823974 s |
1.92 |
const_scatter / IPartOpt / cpu / Primal |
0.000012645 s |
0.000007133699964469997 s |
1.77 |
const_scatter / DefOpt / cpu / Primal |
0.000013197 s |
0.000007454360020346939 s |
1.77 |
const_scatter / IDefOpt / cpu / Primal |
0.000013228 s |
0.000007193499932327541 s |
1.84 |
const_scatter / JaXPipe / cpu / Forward |
0.000017942 s |
0.000011891360027220798 s |
1.51 |
const_scatter / Jax / cpu / Forward |
0.000016781 s |
0.00001069080000888789 s |
1.57 |
const_scatter / HLOOpt / cpu / Forward |
0.000017967 s |
0.000012667159935517704 s |
1.42 |
const_scatter / PartOpt / cpu / Forward |
0.000018010000000000002 s |
0.000011910540033568397 s |
1.51 |
const_scatter / IPartOpt / cpu / Forward |
0.000017712999999999998 s |
0.000012706300021818604 s |
1.39 |
const_scatter / DefOpt / cpu / Forward |
0.000018010000000000002 s |
0.000011037019939976744 s |
1.63 |
const_scatter / IDefOpt / cpu / Forward |
0.000017930000000000003 s |
0.000011218979943805607 s |
1.60 |
const_scatter / JaXPipe / cpu / PreRev |
0.000496989 s |
0.000288881239976 s |
1.72 |
const_scatter / JaXPipe / cpu / PostRev |
0.000502689 s |
0.0002845111800525 s |
1.77 |
const_scatter / JaXPipe / cpu / BothRev |
0.000516574 s |
0.000285194879998 s |
1.81 |
const_scatter / Jax / cpu / BothRev |
0.000509047 s |
0.0002820368200627 s |
1.80 |
const_scatter / HLOOpt / cpu / PreRev |
0.00049836 s |
0.0002914486799454 s |
1.71 |
const_scatter / HLOOpt / cpu / PostRev |
0.000514258 s |
0.0002887479600394 s |
1.78 |
const_scatter / HLOOpt / cpu / BothRev |
0.000521565 s |
0.0002859900799921 s |
1.82 |
const_scatter / PartOpt / cpu / PreRev |
0.000530882 s |
0.0002842538599543 s |
1.87 |
const_scatter / PartOpt / cpu / PostRev |
0.000513247 s |
0.0002842179198887 s |
1.81 |
const_scatter / PartOpt / cpu / BothRev |
0.000496544 s |
0.0002973716400265 s |
1.67 |
const_scatter / IPartOpt / cpu / PreRev |
0.000503489 s |
0.0002858380599536 s |
1.76 |
const_scatter / IPartOpt / cpu / PostRev |
0.00048986 s |
0.0002828226599194 s |
1.73 |
const_scatter / IPartOpt / cpu / BothRev |
0.000532755 s |
0.0002875438000046 s |
1.85 |
const_scatter / DefOpt / cpu / PreRev |
0.000507676 s |
0.0002848754000115 s |
1.78 |
const_scatter / DefOpt / cpu / PostRev |
0.000514424 s |
0.000288019380132 s |
1.79 |
const_scatter / DefOpt / cpu / BothRev |
0.000509288 s |
0.0002848568600711 s |
1.79 |
const_scatter / IDefOpt / cpu / PreRev |
0.000518629 s |
0.0002855306399942 s |
1.82 |
const_scatter / IDefOpt / cpu / PostRev |
0.00050351 s |
0.0002858346200628 s |
1.76 |
const_scatter / IDefOpt / cpu / BothRev |
0.000501344 s |
0.0002871412999957 s |
1.75 |
GenDot / JaXPipe / cuda / Primal |
0.000002016 s |
0.000002015 s |
1.00 |
GenDot / Jax / cuda / Primal |
0.000002016 s |
0.000002015 s |
1.00 |
GenDot / HLOOpt / cuda / Primal |
0.000002015 s |
0.000002015 s |
1 |
GenDot / PartOpt / cuda / Primal |
0.000002016 s |
0.000002015 s |
1.00 |
GenDot / IPartOpt / cuda / Primal |
0.000002016 s |
0.000002016 s |
1 |
GenDot / DefOpt / cuda / Primal |
0.000001984 s |
0.000002015 s |
0.98 |
GenDot / IDefOpt / cuda / Primal |
0.000002015 s |
0.000001984 s |
1.02 |
GenDot / JaXPipe / cuda / Forward |
0.000010368 s |
0.000010304 s |
1.01 |
GenDot / Jax / cuda / Forward |
0.0000104 s |
0.000010209 s |
1.02 |
GenDot / HLOOpt / cuda / Forward |
0.00000992 s |
0.000010304 s |
0.96 |
GenDot / PartOpt / cuda / Forward |
0.000010176 s |
0.000010272 s |
0.99 |
GenDot / IPartOpt / cuda / Forward |
0.000010304 s |
0.000009984 s |
1.03 |
GenDot / DefOpt / cuda / Forward |
0.000010432 s |
0.000010304 s |
1.01 |
GenDot / IDefOpt / cuda / Forward |
0.000010048 s |
0.00001008 s |
1.00 |
GenDot / JaXPipe / cuda / PreRev |
0.000010304 s |
0.000010687 s |
0.96 |
GenDot / JaXPipe / cuda / PostRev |
0.000010016 s |
0.000010144 s |
0.99 |
GenDot / JaXPipe / cuda / BothRev |
0.000010112 s |
0.00001024 s |
0.99 |
GenDot / Jax / cuda / BothRev |
0.000010176 s |
0.000010592 s |
0.96 |
GenDot / HLOOpt / cuda / PreRev |
0.00001056 s |
0.000010464 s |
1.01 |
GenDot / HLOOpt / cuda / PostRev |
0.000010112 s |
0.000010368 s |
0.98 |
GenDot / HLOOpt / cuda / BothRev |
0.00000992 s |
0.000009888 s |
1.00 |
GenDot / PartOpt / cuda / PreRev |
0.000009856 s |
0.000010591 s |
0.93 |
GenDot / PartOpt / cuda / PostRev |
0.000010208 s |
0.000011136 s |
0.92 |
GenDot / PartOpt / cuda / BothRev |
0.000009984 s |
0.000010048 s |
0.99 |
GenDot / IPartOpt / cuda / PreRev |
0.000010176 s |
0.000010432 s |
0.98 |
GenDot / IPartOpt / cuda / PostRev |
0.000010208 s |
0.000011393 s |
0.90 |
GenDot / IPartOpt / cuda / BothRev |
0.000009824 s |
0.000009825 s |
1.00 |
GenDot / DefOpt / cuda / PreRev |
0.00001024 s |
0.000010912 s |
0.94 |
GenDot / DefOpt / cuda / PostRev |
0.000009984 s |
0.000010113 s |
0.99 |
GenDot / DefOpt / cuda / BothRev |
0.000010144 s |
0.000010144 s |
1 |
GenDot / IDefOpt / cuda / PreRev |
0.000010271 s |
0.000010304 s |
1.00 |
GenDot / IDefOpt / cuda / PostRev |
0.000010303 s |
0.000010336 s |
1.00 |
GenDot / IDefOpt / cuda / BothRev |
0.000010112 s |
0.000010464 s |
0.97 |
GenDot / JaXPipe / tpu / Primal |
9.2945e-7 s |
9.294e-7 s |
1.00 |
GenDot / Jax / tpu / Primal |
9.25125e-7 s |
9.2555e-7 s |
1.00 |
GenDot / HLOOpt / tpu / Primal |
0.0000015664 s |
0.000001581275 s |
0.99 |
GenDot / PartOpt / tpu / Primal |
9.2575e-7 s |
9.25675e-7 s |
1.00 |
GenDot / IPartOpt / tpu / Primal |
9.294e-7 s |
9.297e-7 s |
1.00 |
GenDot / DefOpt / tpu / Primal |
0.00000148535 s |
0.000001502775 s |
0.99 |
GenDot / IDefOpt / tpu / Primal |
0.0000015699 s |
0.0000015887 s |
0.99 |
GenDot / JaXPipe / tpu / Forward |
0.0000031532 s |
0.00000317675 s |
0.99 |
GenDot / Jax / tpu / Forward |
0.00000231895 s |
0.00000232435 s |
1.00 |
GenDot / HLOOpt / tpu / Forward |
0.000003106625 s |
0.000003131975 s |
0.99 |
GenDot / PartOpt / tpu / Forward |
0.00000320955 s |
0.0000032324 s |
0.99 |
GenDot / IPartOpt / tpu / Forward |
0.000003103475 s |
0.0000031277500000000003 s |
0.99 |
GenDot / DefOpt / tpu / Forward |
0.000003212475 s |
0.00000322695 s |
1.00 |
GenDot / IDefOpt / tpu / Forward |
0.0000031068000000000005 s |
0.000003140975 s |
0.99 |
GenDot / JaXPipe / tpu / PreRev |
0.0000029597250000000003 s |
0.000002981425 s |
0.99 |
GenDot / JaXPipe / tpu / PostRev |
0.000002406925 s |
0.0000024025000000000003 s |
1.00 |
GenDot / JaXPipe / tpu / BothRev |
0.0000029669000000000003 s |
0.0000029811 s |
1.00 |
GenDot / Jax / tpu / BothRev |
0.0000024024 s |
0.000002399675 s |
1.00 |
GenDot / HLOOpt / tpu / PreRev |
0.0000029507 s |
0.000002974 s |
0.99 |
GenDot / HLOOpt / tpu / PostRev |
0.000002947625 s |
0.000002929375 s |
1.01 |
GenDot / HLOOpt / tpu / BothRev |
0.0000029464 s |
0.00000298385 s |
0.99 |
GenDot / PartOpt / tpu / PreRev |
0.0000029233 s |
0.000002923525 s |
1.00 |
GenDot / PartOpt / tpu / PostRev |
0.0000024043 s |
0.00000239995 s |
1.00 |
GenDot / PartOpt / tpu / BothRev |
0.000002937975 s |
0.00000292955 s |
1.00 |
GenDot / IPartOpt / tpu / PreRev |
0.000002951175 s |
0.000002978925 s |
0.99 |
GenDot / IPartOpt / tpu / PostRev |
0.0000024099 s |
0.000002407425 s |
1.00 |
GenDot / IPartOpt / tpu / BothRev |
0.000002956175 s |
0.00000297985 s |
0.99 |
GenDot / DefOpt / tpu / PreRev |
0.0000029298000000000003 s |
0.0000029243750000000003 s |
1.00 |
GenDot / DefOpt / tpu / PostRev |
0.000002962875 s |
0.0000029796 s |
0.99 |
GenDot / DefOpt / tpu / BothRev |
0.000002932625 s |
0.0000029346499999999995 s |
1.00 |
GenDot / IDefOpt / tpu / PreRev |
0.000002957775 s |
0.0000029920249999999995 s |
0.99 |
GenDot / IDefOpt / tpu / PostRev |
0.000002937875 s |
0.0000029314000000000004 s |
1.00 |
GenDot / IDefOpt / tpu / BothRev |
0.000002965975 s |
0.0000029811 s |
0.99 |
GenDot / JaXPipe / cpu / Primal |
0.000015233999999999998 s |
0.000008721840058569797 s |
1.75 |
GenDot / Jax / cpu / Primal |
0.000015128000000000002 s |
0.000008433120092377066 s |
1.79 |
GenDot / HLOOpt / cpu / Primal |
0.00001388 s |
0.000008394980050070444 s |
1.65 |
GenDot / PartOpt / cpu / Primal |
0.000015032 s |
0.00000837149993458297 s |
1.80 |
GenDot / IPartOpt / cpu / Primal |
0.000015295 s |
0.00000854325995533145 s |
1.79 |
GenDot / DefOpt / cpu / Primal |
0.000013963 s |
0.000008057859904511133 s |
1.73 |
GenDot / IDefOpt / cpu / Primal |
0.000013919 s |
0.000007973839965416118 s |
1.75 |
GenDot / JaXPipe / cpu / Forward |
0.000019893 s |
0.000011691660020005655 s |
1.70 |
GenDot / Jax / cpu / Forward |
0.000020812 s |
0.000011326959993311905 s |
1.84 |
GenDot / HLOOpt / cpu / Forward |
0.000019095 s |
0.000011817120102932676 s |
1.62 |
GenDot / PartOpt / cpu / Forward |
0.00001864 s |
0.000011394320008548677 s |
1.64 |
GenDot / IPartOpt / cpu / Forward |
0.000019555 s |
0.00001216484004544327 s |
1.61 |
GenDot / DefOpt / cpu / Forward |
0.000019317 s |
0.00001196550010718056 s |
1.61 |
GenDot / IDefOpt / cpu / Forward |
0.000019205 s |
0.00001184767999802716 s |
1.62 |
GenDot / JaXPipe / cpu / PreRev |
0.00001945 s |
0.000012314019968471258 s |
1.58 |
GenDot / JaXPipe / cpu / PostRev |
0.000020175 s |
0.000011148480025440222 s |
1.81 |
GenDot / JaXPipe / cpu / BothRev |
0.000019256 s |
0.000012437979858077596 s |
1.55 |
GenDot / Jax / cpu / BothRev |
0.000020528 s |
0.000011231699991185453 s |
1.83 |
GenDot / HLOOpt / cpu / PreRev |
0.000019207 s |
0.000012683259974437532 s |
1.51 |
GenDot / HLOOpt / cpu / PostRev |
0.000019352 s |
0.000013554239958466496 s |
1.43 |
GenDot / HLOOpt / cpu / BothRev |
0.000019169 s |
0.000011783660029323074 s |
1.63 |
GenDot / PartOpt / cpu / PreRev |
0.000019796 s |
0.000011460859968792648 s |
1.73 |
GenDot / PartOpt / cpu / PostRev |
0.0000202 s |
0.000010740839989011875 s |
1.88 |
GenDot / PartOpt / cpu / BothRev |
0.000018923 s |
0.000011952399981964843 s |
1.58 |
GenDot / IPartOpt / cpu / PreRev |
0.00001943 s |
0.00001170288000139408 s |
1.66 |
GenDot / IPartOpt / cpu / PostRev |
0.000020156 s |
0.000011515260030137145 s |
1.75 |
GenDot / IPartOpt / cpu / BothRev |
0.000019228 s |
0.000012036140014970444 s |
1.60 |
GenDot / DefOpt / cpu / PreRev |
0.000019253 s |
0.00001183655995191657 s |
1.63 |
GenDot / DefOpt / cpu / PostRev |
0.000019202 s |
0.000011791820052167168 s |
1.63 |
GenDot / DefOpt / cpu / BothRev |
0.00001932 s |
0.000012289839978620876 s |
1.57 |
GenDot / IDefOpt / cpu / PreRev |
0.000018827 s |
0.000011853979976876872 s |
1.59 |
GenDot / IDefOpt / cpu / PostRev |
0.000019058 s |
0.000011159560017404146 s |
1.71 |
GenDot / IDefOpt / cpu / BothRev |
0.000018982 s |
0.000012050519981130492 s |
1.58 |
hlo_ffi / JaXPipe / cuda / Primal |
0.000001984 s |
0.000001984 s |
1 |
hlo_ffi / Jax / cuda / Primal |
0.000001984 s |
0.000001983 s |
1.00 |
hlo_ffi / HLOOpt / cuda / Primal |
0.000001983 s |
0.000001984 s |
1.00 |
hlo_ffi / PartOpt / cuda / Primal |
0.000001984 s |
0.000001984 s |
1 |
hlo_ffi / IPartOpt / cuda / Primal |
0.000001984 s |
0.000001983 s |
1.00 |
hlo_ffi / DefOpt / cuda / Primal |
0.000001984 s |
0.000001983 s |
1.00 |
hlo_ffi / IDefOpt / cuda / Primal |
0.000001984 s |
0.000001983 s |
1.00 |
hlo_ffi / JaXPipe / cuda / Forward |
0.000002049 s |
0.00000208 s |
0.99 |
hlo_ffi / Jax / cuda / Forward |
0.00000208 s |
0.00000208 s |
1 |
hlo_ffi / HLOOpt / cuda / Forward |
0.00000208 s |
0.00000208 s |
1 |
hlo_ffi / PartOpt / cuda / Forward |
0.00000208 s |
0.00000208 s |
1 |
hlo_ffi / IPartOpt / cuda / Forward |
0.00000208 s |
0.00000208 s |
1 |
hlo_ffi / DefOpt / cuda / Forward |
0.00000208 s |
0.00000208 s |
1 |
hlo_ffi / IDefOpt / cuda / Forward |
0.00000208 s |
0.00000208 s |
1 |
hlo_ffi / JaXPipe / cuda / PreRev |
0.000002048 s |
0.000002047 s |
1.00 |
hlo_ffi / JaXPipe / cuda / PostRev |
0.000002047 s |
0.000002048 s |
1.00 |
hlo_ffi / JaXPipe / cuda / BothRev |
0.000002048 s |
0.000002047 s |
1.00 |
hlo_ffi / Jax / cuda / BothRev |
0.000002047 s |
0.000002048 s |
1.00 |
hlo_ffi / HLOOpt / cuda / PreRev |
0.000002047 s |
0.000002047 s |
1 |
hlo_ffi / HLOOpt / cuda / PostRev |
0.000002048 s |
0.000002047 s |
1.00 |
hlo_ffi / HLOOpt / cuda / BothRev |
0.000002048 s |
0.000002048 s |
1 |
hlo_ffi / PartOpt / cuda / PreRev |
0.000002047 s |
0.000002048 s |
1.00 |
hlo_ffi / PartOpt / cuda / PostRev |
0.000002047 s |
0.000002048 s |
1.00 |
hlo_ffi / PartOpt / cuda / BothRev |
0.000002048 s |
0.000002048 s |
1 |
hlo_ffi / IPartOpt / cuda / PreRev |
0.000002047 s |
0.000002047 s |
1 |
hlo_ffi / IPartOpt / cuda / PostRev |
0.000002048 s |
0.000002047 s |
1.00 |
hlo_ffi / IPartOpt / cuda / BothRev |
0.000002048 s |
0.000002048 s |
1 |
hlo_ffi / DefOpt / cuda / PreRev |
0.000002048 s |
0.000002048 s |
1 |
hlo_ffi / DefOpt / cuda / PostRev |
0.000002048 s |
0.000002048 s |
1 |
hlo_ffi / DefOpt / cuda / BothRev |
0.000002048 s |
0.000002047 s |
1.00 |
hlo_ffi / IDefOpt / cuda / PreRev |
0.000002048 s |
0.000002048 s |
1 |
hlo_ffi / IDefOpt / cuda / PostRev |
0.000002047 s |
0.000002048 s |
1.00 |
hlo_ffi / IDefOpt / cuda / BothRev |
0.000002048 s |
0.000002047 s |
1.00 |
hlo_ffi / JaXPipe / tpu / Primal |
9.28575e-7 s |
9.315e-7 s |
1.00 |
hlo_ffi / Jax / tpu / Primal |
9.5065e-7 s |
9.547e-7 s |
1.00 |
hlo_ffi / HLOOpt / tpu / Primal |
9.06675e-7 s |
9.0845e-7 s |
1.00 |
hlo_ffi / PartOpt / tpu / Primal |
9.527e-7 s |
9.568749999999998e-7 s |
1.00 |
hlo_ffi / IPartOpt / tpu / Primal |
9.06675e-7 s |
9.10425e-7 s |
1.00 |
hlo_ffi / DefOpt / tpu / Primal |
9.5065e-7 s |
9.53225e-7 s |
1.00 |
hlo_ffi / IDefOpt / tpu / Primal |
9.08825e-7 s |
9.049e-7 s |
1.00 |
hlo_ffi / JaXPipe / tpu / Forward |
9.4945e-7 s |
9.495e-7 s |
1.00 |
hlo_ffi / Jax / tpu / Forward |
9.81775e-7 s |
9.81775e-7 s |
1 |
hlo_ffi / HLOOpt / tpu / Forward |
9.74675e-7 s |
9.736e-7 s |
1.00 |
hlo_ffi / PartOpt / tpu / Forward |
9.34625e-7 s |
9.34375e-7 s |
1.00 |
hlo_ffi / IPartOpt / tpu / Forward |
9.73775e-7 s |
9.73775e-7 s |
1 |
hlo_ffi / DefOpt / tpu / Forward |
9.3455e-7 s |
9.3415e-7 s |
1.00 |
hlo_ffi / IDefOpt / tpu / Forward |
9.74625e-7 s |
9.742e-7 s |
1.00 |
hlo_ffi / JaXPipe / tpu / PreRev |
9.37825e-7 s |
9.377e-7 s |
1.00 |
hlo_ffi / JaXPipe / tpu / PostRev |
9.6525e-7 s |
9.6535e-7 s |
1.00 |
hlo_ffi / JaXPipe / tpu / BothRev |
9.625749999999998e-7 s |
9.6205e-7 s |
1.00 |
hlo_ffi / Jax / tpu / BothRev |
9.651e-7 s |
9.65125e-7 s |
1.00 |
hlo_ffi / HLOOpt / tpu / PreRev |
9.62125e-7 s |
9.628e-7 s |
1.00 |
hlo_ffi / HLOOpt / tpu / PostRev |
9.6505e-7 s |
9.64675e-7 s |
1.00 |
hlo_ffi / HLOOpt / tpu / BothRev |
9.62125e-7 s |
9.61425e-7 s |
1.00 |
hlo_ffi / PartOpt / tpu / PreRev |
9.65175e-7 s |
9.64875e-7 s |
1.00 |
hlo_ffi / PartOpt / tpu / PostRev |
9.62225e-7 s |
9.616e-7 s |
1.00 |
hlo_ffi / PartOpt / tpu / BothRev |
9.65225e-7 s |
9.650749999999998e-7 s |
1.00 |
hlo_ffi / IPartOpt / tpu / PreRev |
9.61875e-7 s |
9.62125e-7 s |
1.00 |
hlo_ffi / IPartOpt / tpu / PostRev |
9.64625e-7 s |
9.6515e-7 s |
1.00 |
hlo_ffi / IPartOpt / tpu / BothRev |
9.618e-7 s |
9.62425e-7 s |
1.00 |
hlo_ffi / DefOpt / tpu / PreRev |
9.651e-7 s |
9.64625e-7 s |
1.00 |
hlo_ffi / DefOpt / tpu / PostRev |
9.62475e-7 s |
9.62125e-7 s |
1.00 |
hlo_ffi / DefOpt / tpu / BothRev |
9.64675e-7 s |
9.64575e-7 s |
1.00 |
hlo_ffi / IDefOpt / tpu / PreRev |
9.62225e-7 s |
9.61425e-7 s |
1.00 |
hlo_ffi / IDefOpt / tpu / PostRev |
9.649e-7 s |
9.64925e-7 s |
1.00 |
hlo_ffi / IDefOpt / tpu / BothRev |
9.625250000000002e-7 s |
9.62175e-7 s |
1.00 |
hlo_ffi / JaXPipe / cpu / Primal |
0.000017303 s |
0.00001101443993320572 s |
1.57 |
hlo_ffi / Jax / cpu / Primal |
0.000017049999999999998 s |
0.00001038607993905316 s |
1.64 |
hlo_ffi / HLOOpt / cpu / Primal |
0.000016889 s |
0.00000995977994534769 s |
1.70 |
hlo_ffi / PartOpt / cpu / Primal |
0.000017099 s |
0.000010140960057469783 s |
1.69 |
hlo_ffi / IPartOpt / cpu / Primal |
0.000017047 s |
0.000010418199926789383 s |
1.64 |
hlo_ffi / DefOpt / cpu / Primal |
0.000017088 s |
0.000009951919982995604 s |
1.72 |
hlo_ffi / IDefOpt / cpu / Primal |
0.000017173 s |
0.000009953820026566972 s |
1.73 |
hlo_ffi / JaXPipe / cpu / Forward |
0.000024609 s |
0.000014495539962808836 s |
1.70 |
hlo_ffi / Jax / cpu / Forward |
0.000023795 s |
0.00001425110014679376 s |
1.67 |
hlo_ffi / HLOOpt / cpu / Forward |
0.000024204 s |
0.000014269539951783372 s |
1.70 |
hlo_ffi / PartOpt / cpu / Forward |
0.000023764 s |
0.000014493419930659002 s |
1.64 |
hlo_ffi / IPartOpt / cpu / Forward |
0.000023974 s |
0.000014325679949251936 s |
1.67 |
hlo_ffi / DefOpt / cpu / Forward |
0.000023912 s |
0.000014539819985657232 s |
1.64 |
hlo_ffi / IDefOpt / cpu / Forward |
0.000023515 s |
0.000014344060036819427 s |
1.64 |
hlo_ffi / JaXPipe / cpu / PreRev |
0.00002426 s |
0.00001870272000815021 s |
1.30 |
hlo_ffi / JaXPipe / cpu / PostRev |
0.000023166 s |
0.000014635620009357807 s |
1.58 |
hlo_ffi / JaXPipe / cpu / BothRev |
0.000023496 s |
0.000014097020048211562 s |
1.67 |
hlo_ffi / Jax / cpu / BothRev |
0.000023183 s |
0.000014634620056312997 s |
1.58 |
hlo_ffi / HLOOpt / cpu / PreRev |
0.000023997 s |
0.000014630739970016294 s |
1.64 |
hlo_ffi / HLOOpt / cpu / PostRev |
0.0000235 s |
0.000018660599998838734 s |
1.26 |
hlo_ffi / HLOOpt / cpu / BothRev |
0.000023225 s |
0.000014310059978015488 s |
1.62 |
hlo_ffi / PartOpt / cpu / PreRev |
0.000023511 s |
0.000014527380026265746 s |
1.62 |
hlo_ffi / PartOpt / cpu / PostRev |
0.000023395 s |
0.000014577159945474704 s |
1.60 |
hlo_ffi / PartOpt / cpu / BothRev |
0.000023626 s |
0.00001433232004274032 s |
1.65 |
hlo_ffi / IPartOpt / cpu / PreRev |
0.000023608 s |
0.00001449733992558322 s |
1.63 |
hlo_ffi / IPartOpt / cpu / PostRev |
0.000023435 s |
0.000014475579864665634 s |
1.62 |
hlo_ffi / IPartOpt / cpu / BothRev |
0.000023071 s |
0.00001461617996028508 s |
1.58 |
hlo_ffi / DefOpt / cpu / PreRev |
0.000023847 s |
0.000014917540011083474 s |
1.60 |
hlo_ffi / DefOpt / cpu / PostRev |
0.000023922 s |
0.000014270540013967547 s |
1.68 |
hlo_ffi / DefOpt / cpu / BothRev |
0.000023143 s |
0.000014421520027099178 s |
1.60 |
hlo_ffi / IDefOpt / cpu / PreRev |
0.000023709 s |
0.000014733080060977954 s |
1.61 |
hlo_ffi / IDefOpt / cpu / PostRev |
0.000023419 s |
0.000014166279979690444 s |
1.65 |
hlo_ffi / IDefOpt / cpu / BothRev |
0.000023752 s |
0.000014473339997493894 s |
1.64 |
jaxmd20 / JaXPipe / cuda / Primal |
0.0014578249999999 s |
0.001461953 s |
1.00 |
jaxmd20 / Jax / cuda / Primal |
0.001506528 s |
0.001513856 s |
1.00 |
jaxmd20 / HLOOpt / cuda / Primal |
0.00133968 s |
0.0013292479999999 s |
1.01 |
jaxmd20 / PartOpt / cuda / Primal |
0.0013274559999999 s |
0.001310145 s |
1.01 |
jaxmd20 / IPartOpt / cuda / Primal |
0.001320415 s |
0.001310016 s |
1.01 |
jaxmd20 / DefOpt / cuda / Primal |
0.000913184 s |
0.000921888 s |
0.99 |
jaxmd20 / IDefOpt / cuda / Primal |
0.000944736 s |
0.000957184 s |
0.99 |
jaxmd20 / JaXPipe / cuda / Forward |
0.001560065 s |
0.001541601 s |
1.01 |
jaxmd20 / Jax / cuda / Forward |
0.001797631 s |
0.0017879039999999 s |
1.01 |
jaxmd20 / HLOOpt / cuda / Forward |
0.001667296 s |
0.001628385 s |
1.02 |
jaxmd20 / PartOpt / cuda / Forward |
0.0016425 s |
0.001642914 s |
1.00 |
jaxmd20 / IPartOpt / cuda / Forward |
0.00161024 s |
0.0016282879999999 s |
0.99 |
jaxmd20 / DefOpt / cuda / Forward |
0.001635167 s |
0.0016393599999999 s |
1.00 |
jaxmd20 / IDefOpt / cuda / Forward |
0.001616801 s |
0.001614976 s |
1.00 |
jaxmd20 / JaXPipe / cuda / PreRev |
0.002667714 s |
0.002704768 s |
0.99 |
jaxmd20 / JaXPipe / cuda / PostRev |
0.005376929 s |
0.005357793 s |
1.00 |
jaxmd20 / JaXPipe / cuda / BothRev |
0.0027833999999999 s |
0.002702431 s |
1.03 |
jaxmd20 / Jax / cuda / BothRev |
0.005346277 s |
0.005367966 s |
1.00 |
jaxmd20 / HLOOpt / cuda / PreRev |
0.00275184 s |
0.002775808 s |
0.99 |
jaxmd20 / HLOOpt / cuda / PostRev |
0.00532515 s |
0.005392673 s |
0.99 |
jaxmd20 / HLOOpt / cuda / BothRev |
0.002715555 s |
0.002727138 s |
1.00 |
jaxmd20 / PartOpt / cuda / PreRev |
0.00281008 s |
0.002822815 s |
1.00 |
jaxmd20 / PartOpt / cuda / PostRev |
0.0054553599999999 s |
0.0054714609999999 s |
1.00 |
jaxmd20 / PartOpt / cuda / BothRev |
0.002756899 s |
0.002776291 s |
0.99 |
jaxmd20 / IPartOpt / cuda / PreRev |
0.002830081 s |
0.002825184 s |
1.00 |
jaxmd20 / IPartOpt / cuda / PostRev |
0.005462531 s |
0.005470689 s |
1.00 |
jaxmd20 / IPartOpt / cuda / BothRev |
0.002756481 s |
0.002770497 s |
0.99 |
jaxmd20 / DefOpt / cuda / PreRev |
0.00282221 s |
0.002826978 s |
1.00 |
jaxmd20 / DefOpt / cuda / PostRev |
0.002792354 s |
0.002780129 s |
1.00 |
jaxmd20 / DefOpt / cuda / BothRev |
0.002755296 s |
0.002751842 s |
1.00 |
jaxmd20 / IDefOpt / cuda / PreRev |
0.0028104 s |
0.002833476 s |
0.99 |
jaxmd20 / IDefOpt / cuda / PostRev |
0.002306015 s |
0.002308319 s |
1.00 |
jaxmd20 / IDefOpt / cuda / BothRev |
0.002864993 s |
0.002736737 s |
1.05 |
jaxmd20 / JaXPipe / tpu / Primal |
0.009286555 s |
0.00926679 s |
1.00 |
jaxmd20 / Jax / tpu / Primal |
0.0092696137499999 s |
0.009271479375 s |
1.00 |
jaxmd20 / HLOOpt / tpu / Primal |
0.0092450181249999 s |
0.009153965 s |
1.01 |
jaxmd20 / PartOpt / tpu / Primal |
0.0091991975 s |
0.009208060625 s |
1.00 |
jaxmd20 / IPartOpt / tpu / Primal |
0.0091993399999999 s |
0.00920467375 s |
1.00 |
jaxmd20 / DefOpt / tpu / Primal |
0.008796989375 s |
0.008802515 s |
1.00 |
jaxmd20 / IDefOpt / tpu / Primal |
0.008700651875 s |
0.008702111875 s |
1.00 |
jaxmd20 / JaXPipe / tpu / Forward |
0.017415965 s |
0.017405166875 s |
1.00 |
jaxmd20 / Jax / tpu / Forward |
0.018733326875 s |
0.018732650625 s |
1.00 |
jaxmd20 / HLOOpt / tpu / Forward |
0.01739920375 s |
0.0174011031249999 s |
1.00 |
jaxmd20 / PartOpt / tpu / Forward |
0.01741451375 s |
0.017417733125 s |
1.00 |
jaxmd20 / IPartOpt / tpu / Forward |
0.017409101875 s |
0.01741585875 s |
1.00 |
jaxmd20 / DefOpt / tpu / Forward |
0.017413069375 s |
0.01742094875 s |
1.00 |
jaxmd20 / IDefOpt / tpu / Forward |
0.017412073125 s |
0.017415853125 s |
1.00 |
jaxmd20 / JaXPipe / tpu / PreRev |
0.02544395125 s |
0.025454308125 s |
1.00 |
jaxmd20 / JaXPipe / tpu / PostRev |
0.02187623375 s |
0.021875835625 s |
1.00 |
jaxmd20 / JaXPipe / tpu / BothRev |
0.02546934 s |
0.025454168125 s |
1.00 |
jaxmd20 / Jax / tpu / BothRev |
0.0218733775 s |
0.02187132625 s |
1.00 |
jaxmd20 / HLOOpt / tpu / PreRev |
0.025586291875 s |
0.025574765625 s |
1.00 |
jaxmd20 / HLOOpt / tpu / PostRev |
0.0210028875 s |
0.020701680625 s |
1.01 |
jaxmd20 / HLOOpt / tpu / BothRev |
0.0256876681249999 s |
0.02567604875 s |
1.00 |
jaxmd20 / PartOpt / tpu / PreRev |
0.0254885425 s |
0.0254586274999999 s |
1.00 |
jaxmd20 / PartOpt / tpu / PostRev |
0.021538471875 s |
0.021507803125 s |
1.00 |
jaxmd20 / PartOpt / tpu / BothRev |
0.025570053125 s |
0.025554400625 s |
1.00 |
jaxmd20 / IPartOpt / tpu / PreRev |
0.02547263125 s |
0.025461835 s |
1.00 |
jaxmd20 / IPartOpt / tpu / PostRev |
0.021521653125 s |
0.02151244375 s |
1.00 |
jaxmd20 / IPartOpt / tpu / BothRev |
0.025569895625 s |
0.025553270625 s |
1.00 |
jaxmd20 / DefOpt / tpu / PreRev |
0.02548715875 s |
0.02545994625 s |
1.00 |
jaxmd20 / DefOpt / tpu / PostRev |
0.0188232662499999 s |
0.018819655 s |
1.00 |
jaxmd20 / DefOpt / tpu / BothRev |
0.025572689375 s |
0.025548431875 s |
1.00 |
jaxmd20 / IDefOpt / tpu / PreRev |
0.025472184375 s |
0.025461729375 s |
1.00 |
jaxmd20 / IDefOpt / tpu / PostRev |
0.0183257174999999 s |
0.018307651875 s |
1.00 |
jaxmd20 / IDefOpt / tpu / BothRev |
0.0255687 s |
0.02554862875 s |
1.00 |
jaxmd40 / JaXPipe / cpu / Primal |
0.072995583 s |
0.0684312 s |
1.07 |
jaxmd40 / Jax / cpu / Primal |
0.077871358 s |
0.073527102 s |
1.06 |
jaxmd40 / HLOOpt / cpu / Primal |
0.0919285629999999 s |
0.0889373889999999 s |
1.03 |
jaxmd40 / PartOpt / cpu / Primal |
0.074208656 s |
0.074375692 s |
1.00 |
jaxmd40 / IPartOpt / cpu / Primal |
0.064872874 s |
0.072943314 s |
0.89 |
jaxmd40 / DefOpt / cpu / Primal |
0.096048702 s |
0.0914378459999999 s |
1.05 |
jaxmd40 / IDefOpt / cpu / Primal |
0.0894831 s |
0.092339397 s |
0.97 |
jaxmd40 / JaXPipe / cpu / Forward |
0.170373676 s |
0.1712232119999999 s |
1.00 |
jaxmd40 / Jax / cpu / Forward |
0.086743587 s |
0.0903042959999999 s |
0.96 |
jaxmd40 / HLOOpt / cpu / Forward |
0.173024437 s |
0.1781862639999999 s |
0.97 |
jaxmd40 / PartOpt / cpu / Forward |
0.163292465 s |
0.164765115 s |
0.99 |
jaxmd40 / IPartOpt / cpu / Forward |
0.162668462 s |
0.175371687 s |
0.93 |
jaxmd40 / DefOpt / cpu / Forward |
0.167398347 s |
0.166921464 s |
1.00 |
jaxmd40 / IDefOpt / cpu / Forward |
0.161209102 s |
0.166643084 s |
0.97 |
jaxmd40 / JaXPipe / cpu / PreRev |
0.243966287 s |
0.225262923 s |
1.08 |
jaxmd40 / JaXPipe / cpu / PostRev |
0.144313544 s |
0.142715861 s |
1.01 |
jaxmd40 / JaXPipe / cpu / BothRev |
0.236825549 s |
0.221521487 s |
1.07 |
jaxmd40 / Jax / cpu / BothRev |
0.134210464 s |
0.141799249 s |
0.95 |
jaxmd40 / HLOOpt / cpu / PreRev |
0.219472778 s |
0.220571377 s |
1.00 |
jaxmd40 / HLOOpt / cpu / PostRev |
0.179709087 s |
0.1786106069999999 s |
1.01 |
jaxmd40 / HLOOpt / cpu / BothRev |
0.2551370169999999 s |
0.252948686 s |
1.01 |
jaxmd40 / PartOpt / cpu / PreRev |
0.237557499 s |
0.227161047 s |
1.05 |
jaxmd40 / PartOpt / cpu / PostRev |
0.124745071 s |
0.132065355 s |
0.94 |
jaxmd40 / PartOpt / cpu / BothRev |
0.25122594 s |
0.239801865 s |
1.05 |
jaxmd40 / IPartOpt / cpu / PreRev |
0.223053461 s |
0.216225257 s |
1.03 |
jaxmd40 / IPartOpt / cpu / PostRev |
0.133288232 s |
0.131393485 s |
1.01 |
jaxmd40 / IPartOpt / cpu / BothRev |
0.247687632 s |
0.2527059299999999 s |
0.98 |
jaxmd40 / DefOpt / cpu / PreRev |
0.2257147099999999 s |
0.224775399 s |
1.00 |
jaxmd40 / DefOpt / cpu / PostRev |
0.167109407 s |
0.171236742 s |
0.98 |
jaxmd40 / DefOpt / cpu / BothRev |
0.252800479 s |
0.2603366589999999 s |
0.97 |
jaxmd40 / IDefOpt / cpu / PreRev |
0.2208882439999999 s |
0.21852543 s |
1.01 |
jaxmd40 / IDefOpt / cpu / PostRev |
0.170210224 s |
0.175843656 s |
0.97 |
jaxmd40 / IDefOpt / cpu / BothRev |
0.2327637 s |
0.242549485 s |
0.96 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / JaXPipe / cuda / Primal |
0.000279488 s |
0.000281344 s |
0.99 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / Jax / cuda / Primal |
0.000279168 s |
0.000280544 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / HLOOpt / cuda / Primal |
0.0002863999999999 s |
0.000288127 s |
0.99 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / PartOpt / cuda / Primal |
0.00027888 s |
0.0002802879999999 s |
0.99 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IPartOpt / cuda / Primal |
0.0002790079999999 s |
0.0002807669999999 s |
0.99 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / DefOpt / cuda / Primal |
0.00028608 s |
0.000287712 s |
0.99 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IDefOpt / cuda / Primal |
0.000288352 s |
0.000288673 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / JaXPipe / cuda / Forward |
0.000557728 s |
0.0005579199999999 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / Jax / cuda / Forward |
0.000538879 s |
0.0005402559999999 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / HLOOpt / cuda / Forward |
0.0005574719999999 s |
0.000558048 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / PartOpt / cuda / Forward |
0.000557056 s |
0.000556992 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IPartOpt / cuda / Forward |
0.000558368 s |
0.00055888 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / DefOpt / cuda / Forward |
0.000557728 s |
0.000557985 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IDefOpt / cuda / Forward |
0.000558208 s |
0.000557952 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / JaXPipe / cuda / PreRev |
0.001024416 s |
0.0010287999999999 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / JaXPipe / cuda / PostRev |
0.000984864 s |
0.000987616 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / JaXPipe / cuda / BothRev |
0.001026175 s |
0.001027104 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / Jax / cuda / BothRev |
0.000988768 s |
0.000989728 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / HLOOpt / cuda / PreRev |
0.001012865 s |
0.001014016 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / HLOOpt / cuda / PostRev |
0.001037856 s |
0.001041407 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / HLOOpt / cuda / BothRev |
0.001011616 s |
0.00101472 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / PartOpt / cuda / PreRev |
0.00102784 s |
0.00103056 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / PartOpt / cuda / PostRev |
0.000976064 s |
0.000978272 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / PartOpt / cuda / BothRev |
0.001028355 s |
0.001029024 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IPartOpt / cuda / PreRev |
0.001026624 s |
0.001029248 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IPartOpt / cuda / PostRev |
0.000976225 s |
0.00097744 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IPartOpt / cuda / BothRev |
0.00102528 s |
0.001028896 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / DefOpt / cuda / PreRev |
0.001022081 s |
0.001025217 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / DefOpt / cuda / PostRev |
0.00096064 s |
0.000963104 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / DefOpt / cuda / BothRev |
0.001024672 s |
0.001026944 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IDefOpt / cuda / PreRev |
0.001021986 s |
0.001022464 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IDefOpt / cuda / PostRev |
0.001020897 s |
0.001023265 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IDefOpt / cuda / BothRev |
0.001022305 s |
0.001024065 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / JaXPipe / tpu / Primal |
0.000130708 s |
0.0001308955 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / Jax / tpu / Primal |
0.00012383425 s |
0.00012357475 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / HLOOpt / tpu / Primal |
0.0001600842499999 s |
0.00015994075 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / PartOpt / tpu / Primal |
0.0001313199999999 s |
0.0001310315 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IPartOpt / tpu / Primal |
0.00013863775 s |
0.0001380562499999 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / DefOpt / tpu / Primal |
0.00014560525 s |
0.000145431 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IDefOpt / tpu / Primal |
0.0001580635 s |
0.0001580635 s |
1 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / JaXPipe / tpu / Forward |
0.0002137719999999 s |
0.0002134492499999 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / Jax / tpu / Forward |
0.00026212075 s |
0.000262756 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / HLOOpt / tpu / Forward |
0.00022022475 s |
0.00022040825 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / PartOpt / tpu / Forward |
0.0002150175 s |
0.0002148825 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IPartOpt / tpu / Forward |
0.000215695 s |
0.00021563875 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / DefOpt / tpu / Forward |
0.000217985 s |
0.00021784225 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IDefOpt / tpu / Forward |
0.000216062 s |
0.00021554975 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / JaXPipe / tpu / PreRev |
0.00035520125 s |
0.00035536275 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / JaXPipe / tpu / PostRev |
0.000256249 s |
0.00025649775 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / JaXPipe / tpu / BothRev |
0.0003552665 s |
0.00035530275 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / Jax / tpu / BothRev |
0.000256134 s |
0.00025637775 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / HLOOpt / tpu / PreRev |
0.0003551575 s |
0.0003550855 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / HLOOpt / tpu / PostRev |
0.00029047175 s |
0.0002903284999999 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / HLOOpt / tpu / BothRev |
0.00035539675 s |
0.0003555837499999 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / PartOpt / tpu / PreRev |
0.0003548515 s |
0.00035495925 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / PartOpt / tpu / PostRev |
0.0002717762499999 s |
0.0002719545 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / PartOpt / tpu / BothRev |
0.0003546925 s |
0.00035480775 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IPartOpt / tpu / PreRev |
0.0003553074999999 s |
0.0003554125 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IPartOpt / tpu / PostRev |
0.00027166775 s |
0.000271823 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IPartOpt / tpu / BothRev |
0.000355185 s |
0.0003553045 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / DefOpt / tpu / PreRev |
0.0003573155 s |
0.000357019 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / DefOpt / tpu / PostRev |
0.00028388325 s |
0.00028346925 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / DefOpt / tpu / BothRev |
0.00035686225 s |
0.0003572665 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IDefOpt / tpu / PreRev |
0.00035776925 s |
0.000357612 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IDefOpt / tpu / PostRev |
0.00030046675 s |
0.0003007119999999 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IDefOpt / tpu / BothRev |
0.00035785775 s |
0.00035751075 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / JaXPipe / cpu / Primal |
0.002014537 s |
0.000942452199888 s |
2.14 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / Jax / cpu / Primal |
0.001874894 s |
0.0009337108000181 s |
2.01 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / HLOOpt / cpu / Primal |
0.002008521 s |
0.0010303778000888 s |
1.95 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / PartOpt / cpu / Primal |
0.001761161 s |
0.0009287074000894 s |
1.90 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IPartOpt / cpu / Primal |
0.001543682 s |
0.0009517532000245 s |
1.62 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / DefOpt / cpu / Primal |
0.0017202679999999 s |
0.0009927374001563 s |
1.73 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IDefOpt / cpu / Primal |
0.001921045 s |
0.0009864354000455 s |
1.95 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / JaXPipe / cpu / Forward |
0.004989449 s |
0.0024338845998499 s |
2.05 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / Jax / cpu / Forward |
0.0052621149999999 s |
0.0024766360000285 s |
2.12 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / HLOOpt / cpu / Forward |
0.005029493 s |
0.0023427183999956 s |
2.15 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / PartOpt / cpu / Forward |
0.005188018 s |
0.002410401599991 s |
2.15 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IPartOpt / cpu / Forward |
0.005162447 s |
0.0022836635998828 s |
2.26 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / DefOpt / cpu / Forward |
0.005029522 s |
0.0023569746001157 s |
2.13 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IDefOpt / cpu / Forward |
0.004948357 s |
0.0023437149997334 s |
2.11 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / JaXPipe / cpu / PreRev |
0.0094842289999999 s |
0.0051605818000098 s |
1.84 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / JaXPipe / cpu / PostRev |
0.009833693 s |
0.0058353222000732 s |
1.69 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / JaXPipe / cpu / BothRev |
0.007870036 s |
0.0058945830003722 s |
1.34 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / Jax / cpu / BothRev |
0.008958305 s |
0.0056922598001619 s |
1.57 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / HLOOpt / cpu / PreRev |
0.008127435 s |
0.0058994788001655 s |
1.38 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / HLOOpt / cpu / PostRev |
0.008445199 s |
0.0056784344000334 s |
1.49 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / HLOOpt / cpu / BothRev |
0.008108179 s |
0.006528671200067 s |
1.24 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / PartOpt / cpu / PreRev |
0.00774212 s |
0.0035868383998604 s |
2.16 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / PartOpt / cpu / PostRev |
0.008358503 s |
0.0063951135998649 s |
1.31 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / PartOpt / cpu / BothRev |
0.009540762 s |
0.0056096246000379 s |
1.70 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IPartOpt / cpu / PreRev |
0.007322973 s |
0.0056195833998572 s |
1.30 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IPartOpt / cpu / PostRev |
0.009188155 s |
0.0056347833999097 s |
1.63 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IPartOpt / cpu / BothRev |
0.0091648169999999 s |
0.005715845800114 s |
1.60 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / DefOpt / cpu / PreRev |
0.008688635 s |
0.0036831345998507 s |
2.36 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / DefOpt / cpu / PostRev |
0.006714003 s |
0.0055874166000648 s |
1.20 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / DefOpt / cpu / BothRev |
0.008662615 s |
0.0065726325998184 s |
1.32 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IDefOpt / cpu / PreRev |
0.008276221 s |
0.0036468901998887 s |
2.27 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IDefOpt / cpu / PostRev |
0.007799209 s |
0.0057078440000623 s |
1.37 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IDefOpt / cpu / BothRev |
0.008331845 s |
0.0058645370001613 s |
1.42 |
neuralgcm_v1/deterministic_2_8_deg_inner_steps_24_outer_steps_4 / JaXPipe / cuda / Primal |
1.702005693 s |
1.702956951 s |
1.00 |
neuralgcm_v1/deterministic_2_8_deg_inner_steps_24_outer_steps_4 / Jax / cuda / Primal |
1.704635883 s |
1.704677063 s |
1.00 |
neuralgcm_v1/deterministic_2_8_deg_inner_steps_24_outer_steps_4 / HLOOpt / cuda / Primal |
1.715270902 s |
1.7155073010000002 s |
1.00 |
neuralgcm_v1/deterministic_2_8_deg_inner_steps_24_outer_steps_4 / PartOpt / cuda / Primal |
1.697321589 s |
1.69709385 s |
1.00 |
neuralgcm_v1/deterministic_2_8_deg_inner_steps_24_outer_steps_4 / IPartOpt / cuda / Primal |
1.695032195 s |
1.694971321 s |
1.00 |
neuralgcm_v1/deterministic_2_8_deg_inner_steps_24_outer_steps_4 / DefOpt / cuda / Primal |
1.666882187 s |
1.665851588 s |
1.00 |
neuralgcm_v1/deterministic_2_8_deg_inner_steps_24_outer_steps_4 / IDefOpt / cuda / Primal |
1.910868471 s |
1.912409822 s |
1.00 |
neuralgcm_v1/deterministic_2_8_deg_inner_steps_24_outer_steps_4 / JaXPipe / tpu / Primal |
3.038949814375 s |
3.038985894375 s |
1.00 |
neuralgcm_v1/deterministic_2_8_deg_inner_steps_24_outer_steps_4 / Jax / tpu / Primal |
3.03955622 s |
3.03952863125 s |
1.00 |
neuralgcm_v1/deterministic_2_8_deg_inner_steps_24_outer_steps_4 / HLOOpt / tpu / Primal |
3.121863075625 s |
3.121900450625 s |
1.00 |
neuralgcm_v1/deterministic_2_8_deg_inner_steps_24_outer_steps_4 / PartOpt / tpu / Primal |
3.06030164 s |
3.060244081875 s |
1.00 |
neuralgcm_v1/deterministic_2_8_deg_inner_steps_24_outer_steps_4 / IPartOpt / tpu / Primal |
3.060495848125 s |
3.06051306875 s |
1.00 |
neuralgcm_v1/deterministic_2_8_deg_inner_steps_24_outer_steps_4 / DefOpt / tpu / Primal |
2.10247015875 s |
2.1025194700000003 s |
1.00 |
neuralgcm_v1/deterministic_2_8_deg_inner_steps_24_outer_steps_4 / IDefOpt / tpu / Primal |
4.35654923375 s |
4.35645801625 s |
1.00 |
neuralgcm_v1/deterministic_2_8_deg_inner_steps_2_outer_steps_2 / JaXPipe / cpu / Primal |
6.317055561 s |
6.19071066 s |
1.02 |
neuralgcm_v1/deterministic_2_8_deg_inner_steps_2_outer_steps_2 / Jax / cpu / Primal |
6.172251312999999 s |
6.268450071 s |
0.98 |
neuralgcm_v1/deterministic_2_8_deg_inner_steps_2_outer_steps_2 / HLOOpt / cpu / Primal |
6.254881232 s |
6.097149982 s |
1.03 |
neuralgcm_v1/deterministic_2_8_deg_inner_steps_2_outer_steps_2 / PartOpt / cpu / Primal |
6.167481484 s |
6.316660036999999 s |
0.98 |
neuralgcm_v1/deterministic_2_8_deg_inner_steps_2_outer_steps_2 / IPartOpt / cpu / Primal |
6.30376062 s |
6.199206968 s |
1.02 |
neuralgcm_v1/deterministic_2_8_deg_inner_steps_2_outer_steps_2 / DefOpt / cpu / Primal |
2.5054433090000003 s |
2.463648363 s |
1.02 |
neuralgcm_v1/deterministic_2_8_deg_inner_steps_2_outer_steps_2 / IDefOpt / cpu / Primal |
6.540719687 s |
6.656476705 s |
0.98 |
scatter_sum / JaXPipe / cuda / Primal |
0.000009856 s |
0.000010048 s |
0.98 |
scatter_sum / Jax / cuda / Primal |
0.00001008 s |
0.000009888 s |
1.02 |
scatter_sum / HLOOpt / cuda / Primal |
0.000009856 s |
0.000009952 s |
0.99 |
scatter_sum / PartOpt / cuda / Primal |
0.00001008 s |
0.000009888 s |
1.02 |
scatter_sum / IPartOpt / cuda / Primal |
0.00001104 s |
0.000010496 s |
1.05 |
scatter_sum / DefOpt / cuda / Primal |
0.000011424 s |
0.000010464 s |
1.09 |
scatter_sum / IDefOpt / cuda / Primal |
0.000009888 s |
0.000010112 s |
0.98 |
scatter_sum / JaXPipe / cuda / Forward |
0.000017088 s |
0.000017503999999999997 s |
0.98 |
scatter_sum / Jax / cuda / Forward |
0.000017601 s |
0.000017536 s |
1.00 |
scatter_sum / HLOOpt / cuda / Forward |
0.000017184 s |
0.000017344 s |
0.99 |
scatter_sum / PartOpt / cuda / Forward |
0.000017247999999999998 s |
0.000017472 s |
0.99 |
scatter_sum / IPartOpt / cuda / Forward |
0.000017631 s |
0.000017568000000000002 s |
1.00 |
scatter_sum / DefOpt / cuda / Forward |
0.00001728 s |
0.00001744 s |
0.99 |
scatter_sum / IDefOpt / cuda / Forward |
0.00001712 s |
0.000017344 s |
0.99 |
scatter_sum / JaXPipe / cuda / PreRev |
0.000017056 s |
0.000017568000000000002 s |
0.97 |
scatter_sum / JaXPipe / cuda / PostRev |
0.00001664 s |
0.00001728 s |
0.96 |
scatter_sum / JaXPipe / cuda / BothRev |
0.000016768000000000003 s |
0.0000184 s |
0.91 |
scatter_sum / Jax / cuda / BothRev |
0.000017056 s |
0.000017247 s |
0.99 |
scatter_sum / HLOOpt / cuda / PreRev |
0.00001728 s |
0.000017375999999999998 s |
0.99 |
scatter_sum / HLOOpt / cuda / PostRev |
0.000016896999999999998 s |
0.000017216 s |
0.98 |
scatter_sum / HLOOpt / cuda / BothRev |
0.000017247999999999998 s |
0.000017824 s |
0.97 |
scatter_sum / PartOpt / cuda / PreRev |
0.000017472 s |
0.000020064 s |
0.87 |
scatter_sum / PartOpt / cuda / PostRev |
0.0000168 s |
0.000017184 s |
0.98 |
scatter_sum / PartOpt / cuda / BothRev |
0.000017568000000000002 s |
0.000017056 s |
1.03 |
scatter_sum / IPartOpt / cuda / PreRev |
0.000017152 s |
0.000017726999999999998 s |
0.97 |
scatter_sum / IPartOpt / cuda / PostRev |
0.000017696 s |
0.000019648 s |
0.90 |
scatter_sum / IPartOpt / cuda / BothRev |
0.000017152 s |
0.00001744 s |
0.98 |
scatter_sum / DefOpt / cuda / PreRev |
0.000017247999999999998 s |
0.000017728 s |
0.97 |
scatter_sum / DefOpt / cuda / PostRev |
0.00001744 s |
0.000017375999999999998 s |
1.00 |
scatter_sum / DefOpt / cuda / BothRev |
0.000017216 s |
0.000017471 s |
0.99 |
scatter_sum / IDefOpt / cuda / PreRev |
0.000017728 s |
0.000017728 s |
1 |
scatter_sum / IDefOpt / cuda / PostRev |
0.000016736 s |
0.00001712 s |
0.98 |
scatter_sum / IDefOpt / cuda / BothRev |
0.000017119 s |
0.000017472 s |
0.98 |
scatter_sum / JaXPipe / tpu / Primal |
0.000001343625 s |
0.000001343 s |
1.00 |
scatter_sum / Jax / tpu / Primal |
0.0000014043 s |
0.0000014042750000000002 s |
1.00 |
scatter_sum / HLOOpt / tpu / Primal |
0.000001343475 s |
0.00000134435 s |
1.00 |
scatter_sum / PartOpt / tpu / Primal |
0.000001404025 s |
0.0000014044 s |
1.00 |
scatter_sum / IPartOpt / tpu / Primal |
0.000001343275 s |
0.0000013437 s |
1.00 |
scatter_sum / DefOpt / tpu / Primal |
0.00000140405 s |
0.000001404675 s |
1.00 |
scatter_sum / IDefOpt / tpu / Primal |
0.00000134325 s |
0.00000134335 s |
1.00 |
scatter_sum / JaXPipe / tpu / Forward |
0.0000027019500000000004 s |
0.000002697975 s |
1.00 |
scatter_sum / Jax / tpu / Forward |
0.0000027221 s |
0.000002713525 s |
1.00 |
scatter_sum / HLOOpt / tpu / Forward |
0.000002707325 s |
0.0000027010000000000005 s |
1.00 |
scatter_sum / PartOpt / tpu / Forward |
0.0000026926 s |
0.000002683925 s |
1.00 |
scatter_sum / IPartOpt / tpu / Forward |
0.000002711075 s |
0.000002698175 s |
1.00 |
scatter_sum / DefOpt / tpu / Forward |
0.00000268695 s |
0.0000026904 s |
1.00 |
scatter_sum / IDefOpt / tpu / Forward |
0.000002711 s |
0.00000270295 s |
1.00 |
scatter_sum / JaXPipe / tpu / PreRev |
0.0000026928000000000003 s |
0.00000268745 s |
1.00 |
scatter_sum / JaXPipe / tpu / PostRev |
0.0000026872750000000005 s |
0.000002688775 s |
1.00 |
scatter_sum / JaXPipe / tpu / BothRev |
0.000002698725 s |
0.0000026938250000000004 s |
1.00 |
scatter_sum / Jax / tpu / BothRev |
0.000002741475 s |
0.00000274215 s |
1.00 |
scatter_sum / HLOOpt / tpu / PreRev |
0.000002698775 s |
0.000002695575 s |
1.00 |
scatter_sum / HLOOpt / tpu / PostRev |
0.0000027431250000000005 s |
0.000002761025 s |
0.99 |
scatter_sum / HLOOpt / tpu / BothRev |
0.000002695275 s |
0.0000026945750000000004 s |
1.00 |
scatter_sum / PartOpt / tpu / PreRev |
0.0000027447500000000003 s |
0.0000027512 s |
1.00 |
scatter_sum / PartOpt / tpu / PostRev |
0.000002695475 s |
0.000002691725 s |
1.00 |
scatter_sum / PartOpt / tpu / BothRev |
0.0000027447749999999995 s |
0.0000027453 s |
1.00 |
scatter_sum / IPartOpt / tpu / PreRev |
0.0000026996 s |
0.0000026937 s |
1.00 |
scatter_sum / IPartOpt / tpu / PostRev |
0.000002749525 s |
0.0000027463500000000004 s |
1.00 |
scatter_sum / IPartOpt / tpu / BothRev |
0.000002697975 s |
0.0000026924 s |
1.00 |
scatter_sum / DefOpt / tpu / PreRev |
0.00000274075 s |
0.000002739525 s |
1.00 |
scatter_sum / DefOpt / tpu / PostRev |
0.000002711675 s |
0.0000026955 s |
1.01 |
scatter_sum / DefOpt / tpu / BothRev |
0.000002739975 s |
0.00000273795 s |
1.00 |
scatter_sum / IDefOpt / tpu / PreRev |
0.00000270325 s |
0.0000026977 s |
1.00 |
scatter_sum / IDefOpt / tpu / PostRev |
0.00000274865 s |
0.00000273565 s |
1.00 |
scatter_sum / IDefOpt / tpu / BothRev |
0.000002696575 s |
0.0000027043 s |
1.00 |
scatter_sum / JaXPipe / cpu / Primal |
0.000015665 s |
0.000007906760001787916 s |
1.98 |
scatter_sum / Jax / cpu / Primal |
0.000015958 s |
0.000008491600056004245 s |
1.88 |
scatter_sum / HLOOpt / cpu / Primal |
0.000015695 s |
0.00000850741997055593 s |
1.84 |
scatter_sum / PartOpt / cpu / Primal |
0.000015453 s |
0.000008762860034039477 s |
1.76 |
scatter_sum / IPartOpt / cpu / Primal |
0.000015765999999999998 s |
0.000008235420009441441 s |
1.91 |
scatter_sum / DefOpt / cpu / Primal |
0.000015765999999999998 s |
0.000008393819953198545 s |
1.88 |
scatter_sum / IDefOpt / cpu / Primal |
0.000015472 s |
0.000008244539985753363 s |
1.88 |
scatter_sum / JaXPipe / cpu / Forward |
0.000023147 s |
0.000012941920012963236 s |
1.79 |
scatter_sum / Jax / cpu / Forward |
0.000022074 s |
0.0000128758200116863 s |
1.71 |
scatter_sum / HLOOpt / cpu / Forward |
0.000022661 s |
0.000013030260015511886 s |
1.74 |
scatter_sum / PartOpt / cpu / Forward |
0.000022276 s |
0.00001253974001883762 s |
1.78 |
scatter_sum / IPartOpt / cpu / Forward |
0.000021786 s |
0.000013102699940645834 s |
1.66 |
scatter_sum / DefOpt / cpu / Forward |
0.000022616 s |
0.000012640539953281405 s |
1.79 |
scatter_sum / IDefOpt / cpu / Forward |
0.000022472 s |
0.000012317439995968016 s |
1.82 |
scatter_sum / JaXPipe / cpu / PreRev |
0.00002341 s |
0.00001348465992123238 s |
1.74 |
scatter_sum / JaXPipe / cpu / PostRev |
0.000022547 s |
0.000012731879978673533 s |
1.77 |
scatter_sum / JaXPipe / cpu / BothRev |
0.000022338 s |
0.000012610540015884909 s |
1.77 |
scatter_sum / Jax / cpu / BothRev |
0.000022252 s |
0.000013955280082882384 s |
1.59 |
scatter_sum / HLOOpt / cpu / PreRev |
0.000022457 s |
0.00001342508003290277 s |
1.67 |
scatter_sum / HLOOpt / cpu / PostRev |
0.000023001 s |
0.000014285839952208337 s |
1.61 |
scatter_sum / HLOOpt / cpu / BothRev |
0.000022686000000000003 s |
0.000012689840004895814 s |
1.79 |
scatter_sum / PartOpt / cpu / PreRev |
0.000022885 s |
0.00001280814005440334 s |
1.79 |
scatter_sum / PartOpt / cpu / PostRev |
0.000022865 s |
0.000012537939892354188 s |
1.82 |
scatter_sum / PartOpt / cpu / BothRev |
0.000022125 s |
0.000013116259906382763 s |
1.69 |
scatter_sum / IPartOpt / cpu / PreRev |
0.000022604 s |
0.000012832160045945784 s |
1.76 |
scatter_sum / IPartOpt / cpu / PostRev |
0.000022253 s |
0.000012869779984612253 s |
1.73 |
scatter_sum / IPartOpt / cpu / BothRev |
0.000022728000000000003 s |
0.000012823780089092909 s |
1.77 |
scatter_sum / DefOpt / cpu / PreRev |
0.000022877 s |
0.000013038839988439576 s |
1.75 |
scatter_sum / DefOpt / cpu / PostRev |
0.000021823 s |
0.00001323500006037648 s |
1.65 |
scatter_sum / DefOpt / cpu / BothRev |
0.000022387 s |
0.0000122463000479911 s |
1.83 |
scatter_sum / IDefOpt / cpu / PreRev |
0.000022417 s |
0.000013496480041794712 s |
1.66 |
scatter_sum / IDefOpt / cpu / PostRev |
0.000022295 s |
0.000012729499976558144 s |
1.75 |
scatter_sum / IDefOpt / cpu / BothRev |
0.000022397 s |
0.00001315758005148382 s |
1.70 |
slicing / JaXPipe / cuda / Primal |
0.000001888 s |
0.000001887 s |
1.00 |
slicing / Jax / cuda / Primal |
0.000001887 s |
0.000001887 s |
1 |
slicing / HLOOpt / cuda / Primal |
0.000001887 s |
0.000001888 s |
1.00 |
slicing / PartOpt / cuda / Primal |
0.000001888 s |
0.000001887 s |
1.00 |
slicing / IPartOpt / cuda / Primal |
0.000001887 s |
0.000001887 s |
1 |
slicing / DefOpt / cuda / Primal |
0.000001887 s |
0.000001887 s |
1 |
slicing / IDefOpt / cuda / Primal |
0.000001887 s |
0.000001888 s |
1.00 |
slicing / JaXPipe / cuda / Forward |
0.000010208 s |
0.000010048 s |
1.02 |
slicing / Jax / cuda / Forward |
0.000010112 s |
0.000010049 s |
1.01 |
slicing / HLOOpt / cuda / Forward |
0.00001008 s |
0.000010272 s |
0.98 |
slicing / PartOpt / cuda / Forward |
0.00001056 s |
0.000010304 s |
1.02 |
slicing / IPartOpt / cuda / Forward |
0.000009856 s |
0.000010175 s |
0.97 |
slicing / DefOpt / cuda / Forward |
0.000010208 s |
0.000009696 s |
1.05 |
slicing / IDefOpt / cuda / Forward |
0.000010048 s |
0.000009824 s |
1.02 |
slicing / JaXPipe / cuda / PreRev |
0.000009984 s |
0.000010049 s |
0.99 |
slicing / JaXPipe / cuda / PostRev |
0.000009856 s |
0.000010272 s |
0.96 |
slicing / JaXPipe / cuda / BothRev |
0.000010144 s |
0.000009984 s |
1.02 |
slicing / Jax / cuda / BothRev |
0.000010143 s |
0.000010336 s |
0.98 |
slicing / HLOOpt / cuda / PreRev |
0.000009824 s |
0.000010176 s |
0.97 |
slicing / HLOOpt / cuda / PostRev |
0.000009664 s |
0.000010304 s |
0.94 |
slicing / HLOOpt / cuda / BothRev |
0.000009857 s |
0.00001008 s |
0.98 |
slicing / PartOpt / cuda / PreRev |
0.000009536 s |
0.000009952 s |
0.96 |
slicing / PartOpt / cuda / PostRev |
0.000009984 s |
0.000009888 s |
1.01 |
slicing / PartOpt / cuda / BothRev |
0.000011072 s |
0.000010112 s |
1.09 |
slicing / IPartOpt / cuda / PreRev |
0.000009952 s |
0.000009984 s |
1.00 |
slicing / IPartOpt / cuda / PostRev |
0.000009887 s |
0.000009952 s |
0.99 |
slicing / IPartOpt / cuda / BothRev |
0.000010144 s |
0.000009984 s |
1.02 |
slicing / DefOpt / cuda / PreRev |
0.000010208 s |
0.000010145 s |
1.01 |
slicing / DefOpt / cuda / PostRev |
0.000009632 s |
0.000010272 s |
0.94 |
slicing / DefOpt / cuda / BothRev |
0.00000992 s |
0.000010176 s |
0.97 |
slicing / IDefOpt / cuda / PreRev |
0.000010112 s |
0.000010272 s |
0.98 |
slicing / IDefOpt / cuda / PostRev |
0.000010016 s |
0.000010111 s |
0.99 |
slicing / IDefOpt / cuda / BothRev |
0.000009984 s |
0.00000976 s |
1.02 |
slicing / JaXPipe / tpu / Primal |
0.000001018075 s |
9.70875e-7 s |
1.05 |
slicing / Jax / tpu / Primal |
9.65225e-7 s |
9.697e-7 s |
1.00 |
slicing / HLOOpt / tpu / Primal |
0.00000102025 s |
9.6505e-7 s |
1.06 |
slicing / PartOpt / tpu / Primal |
9.704e-7 s |
9.71275e-7 s |
1.00 |
slicing / IPartOpt / tpu / Primal |
0.000001018 s |
9.67175e-7 s |
1.05 |
slicing / DefOpt / tpu / Primal |
9.73e-7 s |
9.754e-7 s |
1.00 |
slicing / IDefOpt / tpu / Primal |
0.000001019225 s |
9.7415e-7 s |
1.05 |
slicing / JaXPipe / tpu / Forward |
0.00000140195 s |
0.000001414775 s |
0.99 |
slicing / Jax / tpu / Forward |
0.00000146755 s |
0.0000014294000000000002 s |
1.03 |
slicing / HLOOpt / tpu / Forward |
0.000001509825 s |
0.00000151655 s |
1.00 |
slicing / PartOpt / tpu / Forward |
0.000001486675 s |
0.0000014375750000000002 s |
1.03 |
slicing / IPartOpt / tpu / Forward |
0.00000151645 s |
0.0000015182 s |
1.00 |
slicing / DefOpt / tpu / Forward |
0.000001495675 s |
0.00000144075 s |
1.04 |
slicing / IDefOpt / tpu / Forward |
0.0000015095249999999995 s |
0.00000152035 s |
0.99 |
slicing / JaXPipe / tpu / PreRev |
0.00000257065 s |
0.000002381825 s |
1.08 |
slicing / JaXPipe / tpu / PostRev |
0.0000025186 s |
0.000002513125 s |
1.00 |
slicing / JaXPipe / tpu / BothRev |
0.0000025813750000000003 s |
0.0000023991 s |
1.08 |
slicing / Jax / tpu / BothRev |
0.00000253735 s |
0.000002529925 s |
1.00 |
slicing / HLOOpt / tpu / PreRev |
0.0000025897 s |
0.000002401425 s |
1.08 |
slicing / HLOOpt / tpu / PostRev |
0.000002536125 s |
0.0000025351750000000003 s |
1.00 |
slicing / HLOOpt / tpu / BothRev |
0.0000025919750000000003 s |
0.0000023994 s |
1.08 |
slicing / PartOpt / tpu / PreRev |
0.000002532725 s |
0.000002540825 s |
1.00 |
slicing / PartOpt / tpu / PostRev |
0.0000025911 s |
0.00000240305 s |
1.08 |
slicing / PartOpt / tpu / BothRev |
0.000002556925 s |
0.000002537425 s |
1.01 |
slicing / IPartOpt / tpu / PreRev |
0.0000025867 s |
0.00000240415 s |
1.08 |
slicing / IPartOpt / tpu / PostRev |
0.0000025371500000000003 s |
0.0000025435000000000003 s |
1.00 |
slicing / IPartOpt / tpu / BothRev |
0.0000025869500000000003 s |
0.000002398425 s |
1.08 |
slicing / DefOpt / tpu / PreRev |
0.000002541525 s |
0.0000025363 s |
1.00 |
slicing / DefOpt / tpu / PostRev |
0.00000258315 s |
0.000002393275 s |
1.08 |
slicing / DefOpt / tpu / BothRev |
0.000002540475 s |
0.000002544625 s |
1.00 |
slicing / IDefOpt / tpu / PreRev |
0.0000025843 s |
0.000002406925 s |
1.07 |
slicing / IDefOpt / tpu / PostRev |
0.000002545825 s |
0.0000025308 s |
1.01 |
slicing / IDefOpt / tpu / BothRev |
0.0000025863750000000003 s |
0.00000240595 s |
1.07 |
slicing / JaXPipe / cpu / Primal |
0.000012766 s |
0.000007376100038527511 s |
1.73 |
slicing / Jax / cpu / Primal |
0.000012574 s |
0.000006616539958486101 s |
1.90 |
slicing / HLOOpt / cpu / Primal |
0.000012487 s |
0.0000066052799411409065 s |
1.89 |
slicing / PartOpt / cpu / Primal |
0.000012425 s |
0.000006780179974157363 s |
1.83 |
slicing / IPartOpt / cpu / Primal |
0.000012599 s |
0.000006698280012642499 s |
1.88 |
slicing / DefOpt / cpu / Primal |
0.000012671 s |
0.000006589420008822344 s |
1.92 |
slicing / IDefOpt / cpu / Primal |
0.000012486 s |
0.000006755839985999046 s |
1.85 |
slicing / JaXPipe / cpu / Forward |
0.000016743000000000002 s |
0.000010073260018543806 s |
1.66 |
slicing / Jax / cpu / Forward |
0.000016698999999999997 s |
0.000010152760005439631 s |
1.64 |
slicing / HLOOpt / cpu / Forward |
0.000016549 s |
0.000011093820012320066 s |
1.49 |
slicing / PartOpt / cpu / Forward |
0.000016595 s |
0.00001048930002070847 s |
1.58 |
slicing / IPartOpt / cpu / Forward |
0.000016715000000000002 s |
0.000010540679995756364 s |
1.59 |
slicing / DefOpt / cpu / Forward |
0.000016714 s |
0.000010364099998696475 s |
1.61 |
slicing / IDefOpt / cpu / Forward |
0.000016915 s |
0.000010913759979302997 s |
1.55 |
slicing / JaXPipe / cpu / PreRev |
0.000017919999999999998 s |
0.000010723480008891784 s |
1.67 |
slicing / JaXPipe / cpu / PostRev |
0.000017826000000000002 s |
0.00001101179987017531 s |
1.62 |
slicing / JaXPipe / cpu / BothRev |
0.000017419 s |
0.000011341339886712376 s |
1.54 |
slicing / Jax / cpu / BothRev |
0.000017628 s |
0.000011364640085957944 s |
1.55 |
slicing / HLOOpt / cpu / PreRev |
0.000017368 s |
0.00001158138000391773 s |
1.50 |
slicing / HLOOpt / cpu / PostRev |
0.000017298 s |
0.00001279515990972868 s |
1.35 |
slicing / HLOOpt / cpu / BothRev |
0.000017533 s |
0.000010748420045274545 s |
1.63 |
slicing / PartOpt / cpu / PreRev |
0.000017534000000000002 s |
0.00001081095997506054 s |
1.62 |
slicing / PartOpt / cpu / PostRev |
0.000017420000000000003 s |
0.000010711120103223947 s |
1.63 |
slicing / PartOpt / cpu / BothRev |
0.00001754 s |
0.000010752960006357173 s |
1.63 |
slicing / IPartOpt / cpu / PreRev |
0.000017336 s |
0.00001082108001355664 s |
1.60 |
slicing / IPartOpt / cpu / PostRev |
0.000017392999999999998 s |
0.00001111292003770359 s |
1.57 |
slicing / IPartOpt / cpu / BothRev |
0.000016988 s |
0.000010380180010542971 s |
1.64 |
slicing / DefOpt / cpu / PreRev |
0.000017718000000000002 s |
0.000010644059839250986 s |
1.66 |
slicing / DefOpt / cpu / PostRev |
0.000017292 s |
0.000010868580047826982 s |
1.59 |
slicing / DefOpt / cpu / BothRev |
0.000017554 s |
0.000010702139952627477 s |
1.64 |
slicing / IDefOpt / cpu / PreRev |
0.000017471 s |
0.000010620120028761447 s |
1.65 |
slicing / IDefOpt / cpu / PostRev |
0.000017171 s |
0.000011314000075799414 s |
1.52 |
slicing / IDefOpt / cpu / BothRev |
0.000017204 s |
0.00001076451997505501 s |
1.60 |
sum / JaXPipe / cuda / Primal |
0.000002111 s |
0.000002047 s |
1.03 |
sum / Jax / cuda / Primal |
0.000002111 s |
0.000002047 s |
1.03 |
sum / HLOOpt / cuda / Primal |
0.000002111 s |
0.000002048 s |
1.03 |
sum / PartOpt / cuda / Primal |
0.000002111 s |
0.000002048 s |
1.03 |
sum / IPartOpt / cuda / Primal |
0.000002111 s |
0.000002047 s |
1.03 |
sum / DefOpt / cuda / Primal |
0.00000208 s |
0.000002048 s |
1.02 |
sum / IDefOpt / cuda / Primal |
0.000002111 s |
0.000002048 s |
1.03 |
sum / JaXPipe / cuda / Forward |
0.000010913 s |
0.000010144 s |
1.08 |
sum / Jax / cuda / Forward |
0.000010272 s |
0.000010176 s |
1.01 |
sum / HLOOpt / cuda / Forward |
0.00001056 s |
0.000010272 s |
1.03 |
sum / PartOpt / cuda / Forward |
0.00001056 s |
0.000010113 s |
1.04 |
sum / IPartOpt / cuda / Forward |
0.000010528 s |
0.000010049 s |
1.05 |
sum / DefOpt / cuda / Forward |
0.000010496 s |
0.000009984 s |
1.05 |
sum / IDefOpt / cuda / Forward |
0.000010368 s |
0.000010368 s |
1 |
sum / JaXPipe / cuda / PreRev |
0.000009792 s |
0.000009728 s |
1.01 |
sum / JaXPipe / cuda / PostRev |
0.000010272 s |
0.000010048 s |
1.02 |
sum / JaXPipe / cuda / BothRev |
0.000010208 s |
0.000009984 s |
1.02 |
sum / Jax / cuda / BothRev |
0.000009856 s |
0.000010208 s |
0.97 |
sum / HLOOpt / cuda / PreRev |
0.000009984 s |
0.000010144 s |
0.98 |
sum / HLOOpt / cuda / PostRev |
0.000009248 s |
0.000010016 s |
0.92 |
sum / HLOOpt / cuda / BothRev |
0.000009792 s |
0.000009825 s |
1.00 |
sum / PartOpt / cuda / PreRev |
0.000010208 s |
0.000009888 s |
1.03 |
sum / PartOpt / cuda / PostRev |
0.000009568 s |
0.000010112 s |
0.95 |
sum / PartOpt / cuda / BothRev |
0.00000976 s |
0.000010144 s |
0.96 |
sum / IPartOpt / cuda / PreRev |
0.000009825 s |
0.000009984 s |
0.98 |
sum / IPartOpt / cuda / PostRev |
0.00000992 s |
0.000009888 s |
1.00 |
sum / IPartOpt / cuda / BothRev |
0.00000944 s |
0.000009952 s |
0.95 |
sum / DefOpt / cuda / PreRev |
0.000010336 s |
0.000010112 s |
1.02 |
sum / DefOpt / cuda / PostRev |
0.000010177 s |
0.000009951 s |
1.02 |
sum / DefOpt / cuda / BothRev |
0.000009952 s |
0.000010272 s |
0.97 |
sum / IDefOpt / cuda / PreRev |
0.000009792 s |
0.000010112 s |
0.97 |
sum / IDefOpt / cuda / PostRev |
0.00001008 s |
0.0000096 s |
1.05 |
sum / IDefOpt / cuda / BothRev |
0.00000976 s |
0.000010304 s |
0.95 |
sum / JaXPipe / tpu / Primal |
5.10575e-7 s |
5.103250000000001e-7 s |
1.00 |
sum / Jax / tpu / Primal |
5.4685e-7 s |
5.47075e-7 s |
1.00 |
sum / HLOOpt / tpu / Primal |
5.10625e-7 s |
5.0995e-7 s |
1.00 |
sum / PartOpt / tpu / Primal |
5.46975e-7 s |
5.469999999999999e-7 s |
1.00 |
sum / IPartOpt / tpu / Primal |
5.1035e-7 s |
5.10525e-7 s |
1.00 |
sum / DefOpt / tpu / Primal |
5.471999999999999e-7 s |
5.47125e-7 s |
1.00 |
sum / IDefOpt / tpu / Primal |
5.106499999999999e-7 s |
5.106e-7 s |
1.00 |
sum / JaXPipe / tpu / Forward |
0.00000155135 s |
0.0000015465 s |
1.00 |
sum / Jax / tpu / Forward |
0.00000149875 s |
0.0000014981 s |
1.00 |
sum / HLOOpt / tpu / Forward |
0.0000015379 s |
0.0000015301999999999998 s |
1.01 |
sum / PartOpt / tpu / Forward |
0.0000015006 s |
0.0000014973 s |
1.00 |
sum / IPartOpt / tpu / Forward |
0.0000015319250000000002 s |
0.000001534625 s |
1.00 |
sum / DefOpt / tpu / Forward |
0.000001496875 s |
0.00000150475 s |
0.99 |
sum / IDefOpt / tpu / Forward |
0.0000015317500000000002 s |
0.00000153635 s |
1.00 |
sum / JaXPipe / tpu / PreRev |
0.000001054475 s |
0.000001002975 s |
1.05 |
sum / JaXPipe / tpu / PostRev |
0.000001083475 s |
0.0000010413500000000002 s |
1.04 |
sum / JaXPipe / tpu / BothRev |
0.0000010579 s |
0.0000010012999999999998 s |
1.06 |
sum / Jax / tpu / BothRev |
0.0000010977 s |
0.0000010361 s |
1.06 |
sum / HLOOpt / tpu / PreRev |
0.00000104765 s |
0.0000010052 s |
1.04 |
sum / HLOOpt / tpu / PostRev |
0.000001094325 s |
0.000001038575 s |
1.05 |
sum / HLOOpt / tpu / BothRev |
0.000001048425 s |
0.0000010065 s |
1.04 |
sum / PartOpt / tpu / PreRev |
0.0000010895499999999998 s |
0.000001035675 s |
1.05 |
sum / PartOpt / tpu / PostRev |
0.0000010536 s |
0.000001005275 s |
1.05 |
sum / PartOpt / tpu / BothRev |
0.000001090675 s |
0.000001038525 s |
1.05 |
sum / IPartOpt / tpu / PreRev |
0.00000105495 s |
9.99775e-7 s |
1.06 |
sum / IPartOpt / tpu / PostRev |
0.000001089725 s |
0.0000010369000000000002 s |
1.05 |
sum / IPartOpt / tpu / BothRev |
0.000001051325 s |
0.000001001675 s |
1.05 |
sum / DefOpt / tpu / PreRev |
0.0000010858 s |
0.000001037525 s |
1.05 |
sum / DefOpt / tpu / PostRev |
0.00000104795 s |
0.0000010019 s |
1.05 |
sum / DefOpt / tpu / BothRev |
0.000001090275 s |
0.000001035175 s |
1.05 |
sum / IDefOpt / tpu / PreRev |
0.0000010545 s |
0.000001003675 s |
1.05 |
sum / IDefOpt / tpu / PostRev |
0.000001088925 s |
0.000001035425 s |
1.05 |
sum / IDefOpt / tpu / BothRev |
0.0000010465 s |
0.0000010136 s |
1.03 |
sum / JaXPipe / cpu / Primal |
0.000014626 s |
0.000008208279996324564 s |
1.78 |
sum / Jax / cpu / Primal |
0.000014473 s |
0.000008321959958266234 s |
1.74 |
sum / HLOOpt / cpu / Primal |
0.000013809 s |
0.000008680300034029642 s |
1.59 |
sum / PartOpt / cpu / Primal |
0.00001424 s |
0.00000838275993373827 s |
1.70 |
sum / IPartOpt / cpu / Primal |
0.000014436 s |
0.00000831879990073503 s |
1.74 |
sum / DefOpt / cpu / Primal |
0.000014459 s |
0.000008195060017897048 s |
1.76 |
sum / IDefOpt / cpu / Primal |
0.000014184 s |
0.000008042199933697702 s |
1.76 |
sum / JaXPipe / cpu / Forward |
0.000019922 s |
0.000012547039968922036 s |
1.59 |
sum / Jax / cpu / Forward |
0.000019391 s |
0.000012595159932971 s |
1.54 |
sum / HLOOpt / cpu / Forward |
0.000019642 s |
0.000012731680035358297 s |
1.54 |
sum / PartOpt / cpu / Forward |
0.00002036 s |
0.000012554319928312907 s |
1.62 |
sum / IPartOpt / cpu / Forward |
0.000020027 s |
0.000012770499943144386 s |
1.57 |
sum / DefOpt / cpu / Forward |
0.000019755000000000003 s |
0.00001245491997906356 s |
1.59 |
sum / IDefOpt / cpu / Forward |
0.000019994 s |
0.000012343320031504846 s |
1.62 |
sum / JaXPipe / cpu / PreRev |
0.000019048 s |
0.000012384859983285424 s |
1.54 |
sum / JaXPipe / cpu / PostRev |
0.000018467 s |
0.00001178538008389296 s |
1.57 |
sum / JaXPipe / cpu / BothRev |
0.000018881 s |
0.000012185880059405465 s |
1.55 |
sum / Jax / cpu / BothRev |
0.000018662 s |
0.00001187664001918165 s |
1.57 |
sum / HLOOpt / cpu / PreRev |
0.000018744 s |
0.000011666739937936657 s |
1.61 |
sum / HLOOpt / cpu / PostRev |
0.000018936000000000003 s |
0.000013745919950451937 s |
1.38 |
sum / HLOOpt / cpu / BothRev |
0.00001859 s |
0.000011835500063170911 s |
1.57 |
sum / PartOpt / cpu / PreRev |
0.000018961 s |
0.000011959500097873388 s |
1.59 |
sum / PartOpt / cpu / PostRev |
0.00001867 s |
0.00001190447996123112 s |
1.57 |
sum / PartOpt / cpu / BothRev |
0.000018745 s |
0.00001176630008558277 s |
1.59 |
sum / IPartOpt / cpu / PreRev |
0.000018849 s |
0.0000117687799865962 s |
1.60 |
sum / IPartOpt / cpu / PostRev |
0.000018692 s |
0.000011750659941753838 s |
1.59 |
sum / IPartOpt / cpu / BothRev |
0.000018699 s |
0.000011083579938713228 s |
1.69 |
sum / DefOpt / cpu / PreRev |
0.000018751 s |
0.00001203769996209303 s |
1.56 |
sum / DefOpt / cpu / PostRev |
0.000018359 s |
0.000011714959982782605 s |
1.57 |
sum / DefOpt / cpu / BothRev |
0.000018761 s |
0.000011712280047504464 s |
1.60 |
sum / IDefOpt / cpu / PreRev |
0.000018983 s |
0.000011774200029321946 s |
1.61 |
sum / IDefOpt / cpu / PostRev |
0.000018407 s |
0.000011629780019575264 s |
1.58 |
sum / IDefOpt / cpu / BothRev |
0.000018614 s |
0.000012202980087749893 s |
1.53 |
value_and_grad / JaXPipe / cuda / Primal |
0.000033568 s |
0.000033984 s |
0.99 |
value_and_grad / Jax / cuda / Primal |
0.00003424 s |
0.00003968 s |
0.86 |
value_and_grad / HLOOpt / cuda / Primal |
0.00003264 s |
0.000034176 s |
0.96 |
value_and_grad / PartOpt / cuda / Primal |
0.00003264 s |
0.000034049000000000006 s |
0.96 |
value_and_grad / IPartOpt / cuda / Primal |
0.00003728 s |
0.000034016 s |
1.10 |
value_and_grad / DefOpt / cuda / Primal |
0.000033696 s |
0.000034016 s |
0.99 |
value_and_grad / IDefOpt / cuda / Primal |
0.000034176 s |
0.00003424 s |
1.00 |
value_and_grad / JaXPipe / tpu / Primal |
0 s |
0 s |
1 |
value_and_grad / Jax / tpu / Primal |
0 s |
0 s |
1 |
value_and_grad / HLOOpt / tpu / Primal |
0 s |
0 s |
1 |
value_and_grad / PartOpt / tpu / Primal |
0 s |
0 s |
1 |
value_and_grad / IPartOpt / tpu / Primal |
0 s |
0 s |
1 |
value_and_grad / DefOpt / tpu / Primal |
0 s |
0 s |
1 |
value_and_grad / IDefOpt / tpu / Primal |
0 s |
0 s |
1 |
value_and_grad / JaXPipe / cpu / Primal |
0.00002316 s |
0.000015194400020845931 s |
1.52 |
value_and_grad / Jax / cpu / Primal |
0.000022256 s |
0.000015133479901123791 s |
1.47 |
value_and_grad / HLOOpt / cpu / Primal |
0.000022601 s |
0.00001517563994639204 s |
1.49 |
value_and_grad / PartOpt / cpu / Primal |
0.000022468 s |
0.000014475799962383465 s |
1.55 |
value_and_grad / IPartOpt / cpu / Primal |
0.000022951 s |
0.00001491770000939141 s |
1.54 |
value_and_grad / DefOpt / cpu / Primal |
0.000022582 s |
0.000014694860037707258 s |
1.54 |
value_and_grad / IDefOpt / cpu / Primal |
0.000022694 s |
0.000015206619955279168 s |
1.49 |
This comment was automatically generated by workflow using github-action-benchmark.
97fccbf to 2c8b0af
wsmoses reviewed Dec 19, 2025
wsmoses approved these changes Dec 19, 2025
d6724da to c1992d4
wip wip testing wip wip done? file remove unneeded files actually run test revert isl dep dialects missingheader stray file
c1992d4 to 16b7be2