test: add jaxley benchmark #1896
Draft: avik-pal wants to merge 4 commits into main from ap/neuro_benchmark
94c8aa8 to 9f60d0f
EnzymeJAX Benchmarks
Details
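The Ratio column in the table that follows is consistent with the current time divided by the previous time, so values above 1 mean the current commit is slower. A minimal Python sketch of that arithmetic, using two rows from the table as sample input; the helper names and the 1.2 cutoff are hypothetical and not taken from the benchmark bot:

```python
from typing import Iterable, List, Tuple

# (benchmark name, current seconds, previous seconds)
Row = Tuple[str, float, float]


def ratio(current_s: float, previous_s: float) -> float:
    """Current time divided by previous time; above 1.0 means slower now."""
    return current_s / previous_s


def flag_slowdowns(rows: Iterable[Row], threshold: float = 1.2) -> List[str]:
    """Return names whose ratio exceeds `threshold` (a hypothetical cutoff)."""
    return [name for name, cur, prev in rows if ratio(cur, prev) > threshold]


# Two rows copied from the table as sample input.
rows = [
    ("actmtch / JaXPipe / cpu / Primal",
     0.000007320480008274898, 0.000007391040071524912),
    ("actmtch / PartOpt / cuda / BothRev", 0.000014144, 0.000010848),
]

for name, cur, prev in rows:
    print(f"{name}: {ratio(cur, prev):.2f}")
print(flag_slowdowns(rows))
```

With these two rows the printed ratios match the table (0.99 and 1.30), and only the second row crosses the illustrative cutoff.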
| Benchmark suite | Current: 42571bd | Previous: d083e29 | Ratio |
|---|---|---|---|
| actmtch / JaXPipe / cpu / Primal | 0.000007320480008274898 s | 0.000007391040071524912 s | 0.99 |
| actmtch / Jax / cpu / Primal | 0.000007492399972761632 s | 0.000007562260007034638 s | 0.99 |
| actmtch / HLOOpt / cpu / Primal | 0.000009067820037671482 s | 0.000011601599999266907 s | 0.78 |
| actmtch / PartOpt / cpu / Primal | 0.000008064339990596636 s | 0.000007577580017823493 s | 1.06 |
| actmtch / IPartOpt / cpu / Primal | 0.000008011020054254913 s | 0.000006997520004006219 s | 1.14 |
| actmtch / DefOpt / cpu / Primal | 0.00000801829996817105 s | 0.00001217241997437668 s | 0.66 |
| actmtch / IDefOpt / cpu / Primal | 0.000009153419960057364 s | 0.00000807944000371208 s | 1.13 |
| actmtch / JaXPipe / cpu / Forward | 0.000012967659986315991 s | 0.000011693580045175624 s | 1.11 |
| actmtch / Jax / cpu / Forward | 0.000011346659975970397 s | 0.000011192800002390868 s | 1.01 |
| actmtch / HLOOpt / cpu / Forward | 0.000012758000002577318 s | 0.000016192819975913154 s | 0.79 |
| actmtch / PartOpt / cpu / Forward | 0.0000131860199780931 s | 0.000016015620030884747 s | 0.82 |
| actmtch / IPartOpt / cpu / Forward | 0.000011969079978371157 s | 0.000011876220005433425 s | 1.01 |
| actmtch / DefOpt / cpu / Forward | 0.000012896379976155004 s | 0.000016005079969545478 s | 0.81 |
| actmtch / IDefOpt / cpu / Forward | 0.000012289659989619395 s | 0.000011197740032002911 s | 1.10 |
| actmtch / JaXPipe / cpu / PreRev | 0.000011744420016839285 s | 0.000012147400002504583 s | 0.97 |
| actmtch / JaXPipe / cpu / PostRev | 0.000011209439990125248 s | 0.000011047780008084374 s | 1.01 |
| actmtch / JaXPipe / cpu / BothRev | 0.00001243812003849598 s | 0.00001310317994466459 s | 0.95 |
| actmtch / Jax / cpu / BothRev | 0.00001007000002573477 s | 0.00001115773999117664 s | 0.90 |
| actmtch / HLOOpt / cpu / PreRev | 0.000013017120018048443 s | 0.00001213313996231591 s | 1.07 |
| actmtch / HLOOpt / cpu / PostRev | 0.00001432083998224698 s | 0.000016364479988624226 s | 0.88 |
| actmtch / HLOOpt / cpu / BothRev | 0.000012737320002997875 s | 0.00001449323996894236 s | 0.88 |
| actmtch / PartOpt / cpu / PreRev | 0.000014759919986317982 s | 0.000012657140041483216 s | 1.17 |
| actmtch / PartOpt / cpu / PostRev | 0.000011180499996044093 s | 0.00001099301995964197 s | 1.02 |
| actmtch / PartOpt / cpu / BothRev | 0.000012957059989275876 s | 0.000012769560016749891 s | 1.01 |
| actmtch / IPartOpt / cpu / PreRev | 0.000012168760022177594 s | 0.000012451840038920636 s | 0.98 |
| actmtch / IPartOpt / cpu / PostRev | 0.00001132361997406406 s | 0.000010973939997711567 s | 1.03 |
| actmtch / IPartOpt / cpu / BothRev | 0.0000121122000291507 s | 0.000012715380007648491 s | 0.95 |
| actmtch / DefOpt / cpu / PreRev | 0.000012035319978167536 s | 0.000012457099992388977 s | 0.97 |
| actmtch / DefOpt / cpu / PostRev | 0.0000121648000094865 s | 0.000012909559973195429 s | 0.94 |
| actmtch / DefOpt / cpu / BothRev | 0.000011860180011353805 s | 0.00001202078001369955 s | 0.99 |
| actmtch / IDefOpt / cpu / PreRev | 0.000012603080003827926 s | 0.00001288414000555349 s | 0.98 |
| actmtch / IDefOpt / cpu / PostRev | 0.000012897580027129153 s | 0.00001246106001417502 s | 1.04 |
| actmtch / IDefOpt / cpu / BothRev | 0.00001254967999557266 s | 0.00001241245998244267 s | 1.01 |
| actmtch / JaXPipe / cuda / Primal | 0.000002015 s | 0.000002016 s | 1.00 |
| actmtch / Jax / cuda / Primal | 0.000002016 s | 0.000002016 s | 1 |
| actmtch / HLOOpt / cuda / Primal | 0.000002015 s | 0.000002016 s | 1.00 |
| actmtch / PartOpt / cuda / Primal | 0.000002016 s | 0.000002016 s | 1 |
| actmtch / IPartOpt / cuda / Primal | 0.000002016 s | 0.000002015 s | 1.00 |
| actmtch / DefOpt / cuda / Primal | 0.000002015 s | 0.000002015 s | 1 |
| actmtch / IDefOpt / cuda / Primal | 0.000002015 s | 0.000002016 s | 1.00 |
| actmtch / JaXPipe / cuda / Forward | 0.00000944 s | 0.000010943 s | 0.86 |
| actmtch / Jax / cuda / Forward | 0.000010591 s | 0.000011072 s | 0.96 |
| actmtch / HLOOpt / cuda / Forward | 0.000010016 s | 0.000010752 s | 0.93 |
| actmtch / PartOpt / cuda / Forward | 0.000009503 s | 0.000010848 s | 0.88 |
| actmtch / IPartOpt / cuda / Forward | 0.000009632 s | 0.00001088 s | 0.89 |
| actmtch / DefOpt / cuda / Forward | 0.000011104 s | 0.000010784 s | 1.03 |
| actmtch / IDefOpt / cuda / Forward | 0.000009728 s | 0.000010688 s | 0.91 |
| actmtch / JaXPipe / cuda / PreRev | 0.000010016 s | 0.000010912 s | 0.92 |
| actmtch / JaXPipe / cuda / PostRev | 0.000010209 s | 0.000010368 s | 0.98 |
| actmtch / JaXPipe / cuda / BothRev | 0.000010111 s | 0.000010465 s | 0.97 |
| actmtch / Jax / cuda / BothRev | 0.00001088 s | 0.000010401 s | 1.05 |
| actmtch / HLOOpt / cuda / PreRev | 0.00001104 s | 0.00001024 s | 1.08 |
| actmtch / HLOOpt / cuda / PostRev | 0.000011008 s | 0.000010433 s | 1.06 |
| actmtch / HLOOpt / cuda / BothRev | 0.000009376 s | 0.000010688 s | 0.88 |
| actmtch / PartOpt / cuda / PreRev | 0.00001136 s | 0.000010624 s | 1.07 |
| actmtch / PartOpt / cuda / PostRev | 0.000011136 s | 0.000010368 s | 1.07 |
| actmtch / PartOpt / cuda / BothRev | 0.000014144 s | 0.000010848 s | 1.30 |
| actmtch / IPartOpt / cuda / PreRev | 0.000010048 s | 0.000010752 s | 0.93 |
| actmtch / IPartOpt / cuda / PostRev | 0.000010144 s | 0.000011136 s | 0.91 |
| actmtch / IPartOpt / cuda / BothRev | 0.000010016 s | 0.000010624 s | 0.94 |
| actmtch / DefOpt / cuda / PreRev | 0.000010336 s | 0.000010976 s | 0.94 |
| actmtch / DefOpt / cuda / PostRev | 0.000009951 s | 0.000010496 s | 0.95 |
| actmtch / DefOpt / cuda / BothRev | 0.000010112 s | 0.000010304 s | 0.98 |
| actmtch / IDefOpt / cuda / PreRev | 0.000010048 s | 0.000010688 s | 0.94 |
| actmtch / IDefOpt / cuda / PostRev | 0.00001008 s | 0.000010848 s | 0.93 |
| actmtch / IDefOpt / cuda / BothRev | 0.00000992 s | 0.000010336 s | 0.96 |
| actmtch / JaXPipe / cpu / Primal | 0.000013343 s | 0.000007391040071524912 s | 1.81 |
| actmtch / Jax / cpu / Primal | 0.000013406 s | 0.000007562260007034638 s | 1.77 |
| actmtch / HLOOpt / cpu / Primal | 0.000013989 s | 0.000011601599999266907 s | 1.21 |
| actmtch / PartOpt / cpu / Primal | 0.000013296 s | 0.000007577580017823493 s | 1.75 |
| actmtch / IPartOpt / cpu / Primal | 0.000013475 s | 0.000006997520004006219 s | 1.93 |
| actmtch / DefOpt / cpu / Primal | 0.000013943 s | 0.00001217241997437668 s | 1.15 |
| actmtch / IDefOpt / cpu / Primal | 0.000013641 s | 0.00000807944000371208 s | 1.69 |
| actmtch / JaXPipe / cpu / Forward | 0.000019586 s | 0.000011693580045175624 s | 1.67 |
| actmtch / Jax / cpu / Forward | 0.000018009 s | 0.000011192800002390868 s | 1.61 |
| actmtch / HLOOpt / cpu / Forward | 0.000019172 s | 0.000016192819975913154 s | 1.18 |
| actmtch / PartOpt / cpu / Forward | 0.000019128 s | 0.000016015620030884747 s | 1.19 |
| actmtch / IPartOpt / cpu / Forward | 0.000019221 s | 0.000011876220005433425 s | 1.62 |
| actmtch / DefOpt / cpu / Forward | 0.000019314 s | 0.000016005079969545478 s | 1.21 |
| actmtch / IDefOpt / cpu / Forward | 0.000019324 s | 0.000011197740032002911 s | 1.73 |
| actmtch / JaXPipe / cpu / PreRev | 0.000019597 s | 0.000012147400002504583 s | 1.61 |
| actmtch / JaXPipe / cpu / PostRev | 0.000017676999999999997 s | 0.000011047780008084374 s | 1.60 |
| actmtch / JaXPipe / cpu / BothRev | 0.00001943 s | 0.00001310317994466459 s | 1.48 |
| actmtch / Jax / cpu / BothRev | 0.000017771 s | 0.00001115773999117664 s | 1.59 |
| actmtch / HLOOpt / cpu / PreRev | 0.000019348 s | 0.00001213313996231591 s | 1.59 |
| actmtch / HLOOpt / cpu / PostRev | 0.000019587 s | 0.000016364479988624226 s | 1.20 |
| actmtch / HLOOpt / cpu / BothRev | 0.000019499 s | 0.00001449323996894236 s | 1.35 |
| actmtch / PartOpt / cpu / PreRev | 0.000018927 s | 0.000012657140041483216 s | 1.50 |
| actmtch / PartOpt / cpu / PostRev | 0.000017913999999999998 s | 0.00001099301995964197 s | 1.63 |
| actmtch / PartOpt / cpu / BothRev | 0.000019367 s | 0.000012769560016749891 s | 1.52 |
| actmtch / IPartOpt / cpu / PreRev | 0.000019916 s | 0.000012451840038920636 s | 1.60 |
| actmtch / IPartOpt / cpu / PostRev | 0.000017356 s | 0.000010973939997711567 s | 1.58 |
| actmtch / IPartOpt / cpu / BothRev | 0.000020141 s | 0.000012715380007648491 s | 1.58 |
| actmtch / DefOpt / cpu / PreRev | 0.000018809 s | 0.000012457099992388977 s | 1.51 |
| actmtch / DefOpt / cpu / PostRev | 0.000019541 s | 0.000012909559973195429 s | 1.51 |
| actmtch / DefOpt / cpu / BothRev | 0.00001983 s | 0.00001202078001369955 s | 1.65 |
| actmtch / IDefOpt / cpu / PreRev | 0.000019325 s | 0.00001288414000555349 s | 1.50 |
| actmtch / IDefOpt / cpu / PostRev | 0.000019288 s | 0.00001246106001417502 s | 1.55 |
| actmtch / IDefOpt / cpu / BothRev | 0.000019483 s | 0.00001241245998244267 s | 1.57 |
| add_one / JaXPipe / cpu / Primal | 0.000007586720030303695 s | 0.000007896179986346397 s | 0.96 |
| add_one / Jax / cpu / Primal | 0.000007218099999590777 s | 0.000008059760020842077 s | 0.90 |
| add_one / HLOOpt / cpu / Primal | 0.000007243620002554962 s | 0.000011271700041106667 s | 0.64 |
| add_one / PartOpt / cpu / Primal | 0.000007362819997069892 s | 0.00000737778007533052 s | 1.00 |
| add_one / IPartOpt / cpu / Primal | 0.00000721000003068184 s | 0.000007323179988816264 s | 0.98 |
| add_one / DefOpt / cpu / Primal | 0.0000071258399748330704 s | 0.000011454540026534232 s | 0.62 |
| add_one / IDefOpt / cpu / Primal | 0.000007469640022463863 s | 0.000007599880000270786 s | 0.98 |
| add_one / JaXPipe / cpu / Forward | 0.000010538739970797906 s | 0.000011489840044305313 s | 0.92 |
| add_one / Jax / cpu / Forward | 0.00001060033997418941 s | 0.00001128000002609042 s | 0.94 |
| add_one / HLOOpt / cpu / Forward | 0.000011068659987358842 s | 0.000015948640002534375 s | 0.69 |
| add_one / PartOpt / cpu / Forward | 0.00001063808002982114 s | 0.00001568542000313755 s | 0.68 |
| add_one / IPartOpt / cpu / Forward | 0.00001085157999114017 s | 0.000011263400028838078 s | 0.96 |
| add_one / DefOpt / cpu / Forward | 0.000010817299980772077 s | 0.00001617061998331337 s | 0.67 |
| add_one / IDefOpt / cpu / Forward | 0.00001134470000579313 s | 0.000011673080025502714 s | 0.97 |
| add_one / JaXPipe / cpu / PreRev | 0.000013125519963068657 s | 0.00001287380003304861 s | 1.02 |
| add_one / JaXPipe / cpu / PostRev | 0.000013657619965670164 s | 0.000012663920015256736 s | 1.08 |
| add_one / JaXPipe / cpu / BothRev | 0.00001297515996157017 s | 0.000017538880010761206 s | 0.74 |
| add_one / Jax / cpu / BothRev | 0.00001289801997700124 s | 0.000012782919984601904 s | 1.01 |
| add_one / HLOOpt / cpu / PreRev | 0.000013139760003468837 s | 0.000013180160012780106 s | 1.00 |
| add_one / HLOOpt / cpu / PostRev | 0.00001522512001429277 s | 0.000012923799995405717 s | 1.18 |
| add_one / HLOOpt / cpu / BothRev | 0.000013810899954478373 s | 0.000014851300011287096 s | 0.93 |
| add_one / PartOpt / cpu / PreRev | 0.000012790659984602826 s | 0.000013053980010226951 s | 0.98 |
| add_one / PartOpt / cpu / PostRev | 0.000013482940012181644 s | 0.00001259243996173609 s | 1.07 |
| add_one / PartOpt / cpu / BothRev | 0.000014272800008257036 s | 0.000013003179947190802 s | 1.10 |
| add_one / IPartOpt / cpu / PreRev | 0.000012330859963185503 s | 0.000015980460029823008 s | 0.77 |
| add_one / IPartOpt / cpu / PostRev | 0.000013083040039418848 s | 0.00001266117998966365 s | 1.03 |
| add_one / IPartOpt / cpu / BothRev | 0.000013511619990822507 s | 0.0000128849999873637 s | 1.05 |
| add_one / DefOpt / cpu / PreRev | 0.000012403039982018526 s | 0.000013108160019328352 s | 0.95 |
| add_one / DefOpt / cpu / PostRev | 0.0000129790199935087 s | 0.000012861699933637284 s | 1.01 |
| add_one / DefOpt / cpu / BothRev | 0.000013252779945105433 s | 0.000012957519984411193 s | 1.02 |
| add_one / IDefOpt / cpu / PreRev | 0.000012776920048054308 s | 0.000013135819945091498 s | 0.97 |
| add_one / IDefOpt / cpu / PostRev | 0.000013778880002064395 s | 0.000012651040069613371 s | 1.09 |
| add_one / IDefOpt / cpu / BothRev | 0.000013495919965862413 s | 0.000012945279995619786 s | 1.04 |
| add_one / JaXPipe / cuda / Primal | 0.0000019200000000000003 s | 0.0000019200000000000003 s | 1 |
| add_one / Jax / cuda / Primal | 0.0000019200000000000003 s | 0.0000019200000000000003 s | 1 |
| add_one / HLOOpt / cuda / Primal | 0.0000019200000000000003 s | 0.0000019200000000000003 s | 1 |
| add_one / PartOpt / cuda / Primal | 0.0000019200000000000003 s | 0.0000019200000000000003 s | 1 |
| add_one / IPartOpt / cuda / Primal | 0.0000019200000000000003 s | 0.0000019200000000000003 s | 1 |
| add_one / DefOpt / cuda / Primal | 0.0000019200000000000003 s | 0.0000019200000000000003 s | 1 |
| add_one / IDefOpt / cuda / Primal | 0.0000019200000000000003 s | 0.0000019200000000000003 s | 1 |
| add_one / JaXPipe / cuda / Forward | 0.000010048 s | 0.000010624 s | 0.95 |
| add_one / Jax / cuda / Forward | 0.000010272 s | 0.000010688 s | 0.96 |
| add_one / HLOOpt / cuda / Forward | 0.000010271 s | 0.000010496 s | 0.98 |
| add_one / PartOpt / cuda / Forward | 0.000009952 s | 0.00001056 s | 0.94 |
| add_one / IPartOpt / cuda / Forward | 0.00001008 s | 0.000010688 s | 0.94 |
| add_one / DefOpt / cuda / Forward | 0.000010016 s | 0.000010656 s | 0.94 |
| add_one / IDefOpt / cuda / Forward | 0.000010015 s | 0.000010848 s | 0.92 |
| add_one / JaXPipe / cuda / PreRev | 0.000024672 s | 0.000025632 s | 0.96 |
| add_one / JaXPipe / cuda / PostRev | 0.000024384 s | 0.000026272 s | 0.93 |
| add_one / JaXPipe / cuda / BothRev | 0.000024416 s | 0.000025952 s | 0.94 |
| add_one / Jax / cuda / BothRev | 0.000024384 s | 0.00002528 s | 0.96 |
| add_one / HLOOpt / cuda / PreRev | 0.00002496 s | 0.000025825 s | 0.97 |
| add_one / HLOOpt / cuda / PostRev | 0.000024481 s | 0.000025024 s | 0.98 |
| add_one / HLOOpt / cuda / BothRev | 0.000025055 s | 0.000026144 s | 0.96 |
| add_one / PartOpt / cuda / PreRev | 0.000024832 s | 0.000026017 s | 0.95 |
| add_one / PartOpt / cuda / PostRev | 0.000024608 s | 0.000026144 s | 0.94 |
| add_one / PartOpt / cuda / BothRev | 0.000024864 s | 0.00002544 s | 0.98 |
| add_one / IPartOpt / cuda / PreRev | 0.000024384 s | 0.00002592 s | 0.94 |
| add_one / IPartOpt / cuda / PostRev | 0.000024704 s | 0.000026016 s | 0.95 |
| add_one / IPartOpt / cuda / BothRev | 0.000024608 s | 0.000026144 s | 0.94 |
| add_one / DefOpt / cuda / PreRev | 0.000024447 s | 0.000025792 s | 0.95 |
| add_one / DefOpt / cuda / PostRev | 0.000025152 s | 0.00002624 s | 0.96 |
| add_one / DefOpt / cuda / BothRev | 0.0000304 s | 0.000025568 s | 1.19 |
| add_one / IDefOpt / cuda / PreRev | 0.00002464 s | 0.00002576 s | 0.96 |
| add_one / IDefOpt / cuda / PostRev | 0.000024864 s | 0.000025632 s | 0.97 |
| add_one / IDefOpt / cuda / BothRev | 0.000025792 s | 0.000025921 s | 1.00 |
| add_one / JaXPipe / cpu / Primal | 0.000013127 s | 0.000007896179986346397 s | 1.66 |
| add_one / Jax / cpu / Primal | 0.000013236 s | 0.000008059760020842077 s | 1.64 |
| add_one / HLOOpt / cpu / Primal | 0.000012742 s | 0.000011271700041106667 s | 1.13 |
| add_one / PartOpt / cpu / Primal | 0.000013036 s | 0.00000737778007533052 s | 1.77 |
| add_one / IPartOpt / cpu / Primal | 0.000012855 s | 0.000007323179988816264 s | 1.76 |
| add_one / DefOpt / cpu / Primal | 0.000013027 s | 0.000011454540026534232 s | 1.14 |
| add_one / IDefOpt / cpu / Primal | 0.000012877 s | 0.000007599880000270786 s | 1.69 |
| add_one / JaXPipe / cpu / Forward | 0.000018354 s | 0.000011489840044305313 s | 1.60 |
| add_one / Jax / cpu / Forward | 0.000017350999999999997 s | 0.00001128000002609042 s | 1.54 |
| add_one / HLOOpt / cpu / Forward | 0.000017722 s | 0.000015948640002534375 s | 1.11 |
| add_one / PartOpt / cpu / Forward | 0.000017746999999999998 s | 0.00001568542000313755 s | 1.13 |
| add_one / IPartOpt / cpu / Forward | 0.000017690999999999997 s | 0.000011263400028838078 s | 1.57 |
| add_one / DefOpt / cpu / Forward | 0.000017623 s | 0.00001617061998331337 s | 1.09 |
| add_one / IDefOpt / cpu / Forward | 0.000017814 s | 0.000011673080025502714 s | 1.53 |
| add_one / JaXPipe / cpu / PreRev | 0.000020325 s | 0.00001287380003304861 s | 1.58 |
| add_one / JaXPipe / cpu / PostRev | 0.000019802 s | 0.000012663920015256736 s | 1.56 |
| add_one / JaXPipe / cpu / BothRev | 0.000019925 s | 0.000017538880010761206 s | 1.14 |
| add_one / Jax / cpu / BothRev | 0.000019513 s | 0.000012782919984601904 s | 1.53 |
| add_one / HLOOpt / cpu / PreRev | 0.000019597 s | 0.000013180160012780106 s | 1.49 |
| add_one / HLOOpt / cpu / PostRev | 0.000020332 s | 0.000012923799995405717 s | 1.57 |
| add_one / HLOOpt / cpu / BothRev | 0.000019687 s | 0.000014851300011287096 s | 1.33 |
| add_one / PartOpt / cpu / PreRev | 0.000019923 s | 0.000013053980010226951 s | 1.53 |
| add_one / PartOpt / cpu / PostRev | 0.000019959 s | 0.00001259243996173609 s | 1.58 |
| add_one / PartOpt / cpu / BothRev | 0.0000196 s | 0.000013003179947190802 s | 1.51 |
| add_one / IPartOpt / cpu / PreRev | 0.000019717 s | 0.000015980460029823008 s | 1.23 |
| add_one / IPartOpt / cpu / PostRev | 0.000019836 s | 0.00001266117998966365 s | 1.57 |
| add_one / IPartOpt / cpu / BothRev | 0.000019772 s | 0.0000128849999873637 s | 1.53 |
| add_one / DefOpt / cpu / PreRev | 0.00001992 s | 0.000013108160019328352 s | 1.52 |
| add_one / DefOpt / cpu / PostRev | 0.000019939 s | 0.000012861699933637284 s | 1.55 |
| add_one / DefOpt / cpu / BothRev | 0.000019651 s | 0.000012957519984411193 s | 1.52 |
| add_one / IDefOpt / cpu / PreRev | 0.000019722 s | 0.000013135819945091498 s | 1.50 |
| add_one / IDefOpt / cpu / PostRev | 0.000020005 s | 0.000012651040069613371 s | 1.58 |
| add_one / IDefOpt / cpu / BothRev | 0.0000196 s | 0.000012945279995619786 s | 1.51 |
| add_two / JaXPipe / cpu / Primal | 0.000007133140034056851 s | 0.000008078299997578142 s | 0.88 |
| add_two / Jax / cpu / Primal | 0.000007417800015900866 s | 0.000007480699987354456 s | 0.99 |
| add_two / HLOOpt / cpu / Primal | 0.000007086120021995157 s | 0.000011705360002451924 s | 0.61 |
| add_two / PartOpt / cpu / Primal | 0.000007688979985687184 s | 0.000007847820043025422 s | 0.98 |
| add_two / IPartOpt / cpu / Primal | 0.000007968279987835558 s | 0.000008033739977690856 s | 0.99 |
| add_two / DefOpt / cpu / Primal | 0.000007639960012966184 s | 0.000011882460030392397 s | 0.64 |
| add_two / IDefOpt / cpu / Primal | 0.000007584000022688997 s | 0.000007653040029254043 s | 0.99 |
| add_two / JaXPipe / cpu / Forward | 0.000010768160009320127 s | 0.00001155064003796724 s | 0.93 |
| add_two / Jax / cpu / Forward | 0.0000116874999821448 s | 0.00001183109997327847 s | 0.99 |
| add_two / HLOOpt / cpu / Forward | 0.000011625779970927397 s | 0.000016174619977391556 s | 0.72 |
| add_two / PartOpt / cpu / Forward | 0.000011849280017486308 s | 0.00001656830003412324 s | 0.72 |
| add_two / IPartOpt / cpu / Forward | 0.0000112847799937299 s | 0.000011428399993747009 s | 0.99 |
| add_two / DefOpt / cpu / Forward | 0.000011702460033120588 s | 0.000011539719998836518 s | 1.01 |
| add_two / IDefOpt / cpu / Forward | 0.000011423139985708986 s | 0.00001197085999592673 s | 0.95 |
| add_two / JaXPipe / cpu / PreRev | 0.00001566770000863471 s | 0.000015447079995283275 s | 1.01 |
| add_two / JaXPipe / cpu / PostRev | 0.000015425000010509394 s | 0.000015476439966732868 s | 1.00 |
| add_two / JaXPipe / cpu / BothRev | 0.00001544820003800851 s | 0.00001541200000247045 s | 1.00 |
| add_two / Jax / cpu / BothRev | 0.00001594783996551996 s | 0.00001537471996016393 s | 1.04 |
| add_two / HLOOpt / cpu / PreRev | 0.000016009779974410775 s | 0.00001551939993078122 s | 1.03 |
| add_two / HLOOpt / cpu / PostRev | 0.000017757300001903785 s | 0.00001580970001668902 s | 1.12 |
| add_two / HLOOpt / cpu / BothRev | 0.000015089119997355738 s | 0.000017628919977141777 s | 0.86 |
| add_two / PartOpt / cpu / PreRev | 0.000015416840024045087 s | 0.000015835540016269078 s | 0.97 |
| add_two / PartOpt / cpu / PostRev | 0.000015211459976853802 s | 0.000015323180014092942 s | 0.99 |
| add_two / PartOpt / cpu / BothRev | 0.000015825000000404545 s | 0.000015682500015827828 s | 1.01 |
| add_two / IPartOpt / cpu / PreRev | 0.000015988779996405355 s | 0.000015858660008234436 s | 1.01 |
| add_two / IPartOpt / cpu / PostRev | 0.000015689260008002747 s | 0.000015334940017055487 s | 1.02 |
| add_two / IPartOpt / cpu / BothRev | 0.000015154360071392147 s | 0.000015353399994637585 s | 0.99 |
| add_two / DefOpt / cpu / PreRev | 0.000016077440004664823 s | 0.000015210360006676635 s | 1.06 |
| add_two / DefOpt / cpu / PostRev | 0.000015724759987278957 s | 0.00001574001999870234 s | 1.00 |
| add_two / DefOpt / cpu / BothRev | 0.000015513580010519945 s | 0.000015454039994438064 s | 1.00 |
| add_two / IDefOpt / cpu / PreRev | 0.000015886140026850626 s | 0.000015190960048130363 s | 1.05 |
| add_two / IDefOpt / cpu / PostRev | 0.000014770479992876062 s | 0.000016048160032369195 s | 0.92 |
| add_two / IDefOpt / cpu / BothRev | 0.00001545979996990354 s | 0.000015832279987080257 s | 0.98 |
| add_two / JaXPipe / cuda / Primal | 0.0000019200000000000003 s | 0.0000019200000000000003 s | 1 |
| add_two / Jax / cuda / Primal | 0.0000019200000000000003 s | 0.0000019200000000000003 s | 1 |
| add_two / HLOOpt / cuda / Primal | 0.0000019200000000000003 s | 0.0000019200000000000003 s | 1 |
| add_two / PartOpt / cuda / Primal | 0.0000019200000000000003 s | 0.0000019200000000000003 s | 1 |
| add_two / IPartOpt / cuda / Primal | 0.0000019200000000000003 s | 0.000001888 s | 1.02 |
| add_two / DefOpt / cuda / Primal | 0.0000019200000000000003 s | 0.000001889 s | 1.02 |
| add_two / IDefOpt / cuda / Primal | 0.0000019200000000000003 s | 0.0000019200000000000003 s | 1 |
| add_two / JaXPipe / cuda / Forward | 0.000009633 s | 0.000010272 s | 0.94 |
| add_two / Jax / cuda / Forward | 0.000009568 s | 0.000010464 s | 0.91 |
| add_two / HLOOpt / cuda / Forward | 0.000009472 s | 0.000010367 s | 0.91 |
| add_two / PartOpt / cuda / Forward | 0.000009632 s | 0.00000976 s | 0.99 |
| add_two / IPartOpt / cuda / Forward | 0.00000976 s | 0.000010368 s | 0.94 |
| add_two / DefOpt / cuda / Forward | 0.000009792 s | 0.000009664 s | 1.01 |
| add_two / IDefOpt / cuda / Forward | 0.000009824 s | 0.000009952 s | 0.99 |
| add_two / JaXPipe / cuda / PreRev | 0.000032225 s | 0.00003344 s | 0.96 |
| add_two / JaXPipe / cuda / PostRev | 0.000031264 s | 0.000034049000000000006 s | 0.92 |
| add_two / JaXPipe / cuda / BothRev | 0.00003168 s | 0.000034464 s | 0.92 |
| add_two / Jax / cuda / BothRev | 0.000031904000000000005 s | 0.000032769 s | 0.97 |
| add_two / HLOOpt / cuda / PreRev | 0.000032288 s | 0.000033312 s | 0.97 |
| add_two / HLOOpt / cuda / PostRev | 0.000031456 s | 0.000033119999999999995 s | 0.95 |
| add_two / HLOOpt / cuda / BothRev | 0.000031776 s | 0.000033856 s | 0.94 |
| add_two / PartOpt / cuda / PreRev | 0.000031712 s | 0.000033568 s | 0.94 |
| add_two / PartOpt / cuda / PostRev | 0.000040769 s | 0.000033569 s | 1.21 |
| add_two / PartOpt / cuda / BothRev | 0.000032032 s | 0.000032800000000000004 s | 0.98 |
| add_two / IPartOpt / cuda / PreRev | 0.000031552 s | 0.000032704 s | 0.96 |
| add_two / IPartOpt / cuda / PostRev | 0.00003152 s | 0.00003328 s | 0.95 |
| add_two / IPartOpt / cuda / BothRev | 0.000031712 s | 0.000032673000000000004 s | 0.97 |
| add_two / DefOpt / cuda / PreRev | 0.000031808000000000004 s | 0.000033184 s | 0.96 |
| add_two / DefOpt / cuda / PostRev | 0.000032224 s | 0.0000336 s | 0.96 |
| add_two / DefOpt / cuda / BothRev | 0.000032288 s | 0.000034336 s | 0.94 |
| add_two / IDefOpt / cuda / PreRev | 0.000032512 s | 0.000034144000000000004 s | 0.95 |
| add_two / IDefOpt / cuda / PostRev | 0.000031648 s | 0.000033824 s | 0.94 |
| add_two / IDefOpt / cuda / BothRev | 0.000031265 s | 0.000034048 s | 0.92 |
| add_two / JaXPipe / cpu / Primal | 0.000021159 s | 0.000008078299997578142 s | 2.62 |
| add_two / Jax / cpu / Primal | 0.000013409000000000002 s | 0.000007480699987354456 s | 1.79 |
| add_two / HLOOpt / cpu / Primal | 0.000013475 s | 0.000011705360002451924 s | 1.15 |
| add_two / PartOpt / cpu / Primal | 0.000013254 s | 0.000007847820043025422 s | 1.69 |
| add_two / IPartOpt / cpu / Primal | 0.000013562 s | 0.000008033739977690856 s | 1.69 |
| add_two / DefOpt / cpu / Primal | 0.000013011 s | 0.000011882460030392397 s | 1.09 |
| add_two / IDefOpt / cpu / Primal | 0.000013226 s | 0.000007653040029254043 s | 1.73 |
| add_two / JaXPipe / cpu / Forward | 0.000018317 s | 0.00001155064003796724 s | 1.59 |
| add_two / Jax / cpu / Forward | 0.000017885999999999998 s | 0.00001183109997327847 s | 1.51 |
| add_two / HLOOpt / cpu / Forward | 0.000018375 s | 0.000016174619977391556 s | 1.14 |
| add_two / PartOpt / cpu / Forward | 0.00001828 s | 0.00001656830003412324 s | 1.10 |
| add_two / IPartOpt / cpu / Forward | 0.00001816 s | 0.000011428399993747009 s | 1.59 |
| add_two / DefOpt / cpu / Forward | 0.000017941 s | 0.000011539719998836518 s | 1.55 |
| add_two / IDefOpt / cpu / Forward | 0.000018243 s | 0.00001197085999592673 s | 1.52 |
| add_two / JaXPipe / cpu / PreRev | 0.000023963 s | 0.000015447079995283275 s | 1.55 |
| add_two / JaXPipe / cpu / PostRev | 0.000023168 s | 0.000015476439966732868 s | 1.50 |
| add_two / JaXPipe / cpu / BothRev | 0.000023773 s | 0.00001541200000247045 s | 1.54 |
| add_two / Jax / cpu / BothRev | 0.000023244 s | 0.00001537471996016393 s | 1.51 |
| add_two / HLOOpt / cpu / PreRev | 0.00002421 s | 0.00001551939993078122 s | 1.56 |
| add_two / HLOOpt / cpu / PostRev | 0.000023285 s | 0.00001580970001668902 s | 1.47 |
| add_two / HLOOpt / cpu / BothRev | 0.000023493 s | 0.000017628919977141777 s | 1.33 |
| add_two / PartOpt / cpu / PreRev | 0.000023579 s | 0.000015835540016269078 s | 1.49 |
| add_two / PartOpt / cpu / PostRev | 0.000023824 s | 0.000015323180014092942 s | 1.55 |
| add_two / PartOpt / cpu / BothRev | 0.000023195 s | 0.000015682500015827828 s | 1.48 |
| add_two / IPartOpt / cpu / PreRev | 0.00002324 s | 0.000015858660008234436 s | 1.47 |
| add_two / IPartOpt / cpu / PostRev | 0.000023447 s | 0.000015334940017055487 s | 1.53 |
| add_two / IPartOpt / cpu / BothRev | 0.000023597 s | 0.000015353399994637585 s | 1.54 |
| add_two / DefOpt / cpu / PreRev | 0.000024082 s | 0.000015210360006676635 s | 1.58 |
| add_two / DefOpt / cpu / PostRev | 0.000023117 s | 0.00001574001999870234 s | 1.47 |
| add_two / DefOpt / cpu / BothRev | 0.000023393 s | 0.000015454039994438064 s | 1.51 |
| add_two / IDefOpt / cpu / PreRev | 0.00002379 s | 0.000015190960048130363 s | 1.57 |
| add_two / IDefOpt / cpu / PostRev | 0.000023502 s | 0.000016048160032369195 s | 1.46 |
| add_two / IDefOpt / cpu / BothRev | 0.000023427 s | 0.000015832279987080257 s | 1.48 |
| cache / JaXPipe / cpu / Primal | 0.000006432140016841004 s | 0.000007311580002351548 s | 0.88 |
| cache / Jax / cpu / Primal | 0.000006882260004204 s | 0.000008188780020645936 s | 0.84 |
| cache / HLOOpt / cpu / Primal | 0.000007008400016275118 s | 0.000007291099991562077 s | 0.96 |
| cache / PartOpt / cpu / Primal | 0.000006896540007801377 s | 0.000007267500013767858 s | 0.95 |
| cache / IPartOpt / cpu / Primal | 0.0000067228599891677736 s | 0.000007347700038735638 s | 0.91 |
| cache / DefOpt / cpu / Primal | 0.000006815819970142911 s | 0.000007079820006765658 s | 0.96 |
| cache / IDefOpt / cpu / Primal | 0.000007488499977625906 s | 0.000007294299994100584 s | 1.03 |
| cache / JaXPipe / cpu / Forward | 0.000014228039981389884 s | 0.000015604340023855912 s | 0.91 |
| cache / Jax / cpu / Forward | 0.000015017699961390465 s | 0.000016084959997897386 s | 0.93 |
| cache / HLOOpt / cpu / Forward | 0.00001559464003548783 s | 0.000016539119997105445 s | 0.94 |
| cache / PartOpt / cpu / Forward | 0.00001505413998529548 s | 0.00002018942001996038 s | 0.75 |
| cache / IPartOpt / cpu / Forward | 0.000015285179988495655 s | 0.00001527753997834225 s | 1.00 |
| cache / DefOpt / cpu / Forward | 0.00001502036001511442 s | 0.00002051123997262039 s | 0.73 |
| cache / IDefOpt / cpu / Forward | 0.000014946979981687036 s | 0.00001486773998294666 s | 1.01 |
| cache / JaXPipe / cpu / PreRev | 0.000016526060026080813 s | 0.000016669340011503665 s | 0.99 |
| cache / JaXPipe / cpu / PostRev | 0.00002095940000799601 s | 0.000022038719989723177 s | 0.95 |
| cache / JaXPipe / cpu / BothRev | 0.000017082559952541488 s | 0.00001681013998677372 s | 1.02 |
| cache / Jax / cpu / BothRev | 0.000021224540014372905 s | 0.000022604220002904186 s | 0.94 |
| cache / HLOOpt / cpu / PreRev | 0.000016343339993909466 s | 0.000017696700033411617 s | 0.92 |
| cache / HLOOpt / cpu / PostRev | 0.000020281999977669328 s | 0.00002060288000393484 s | 0.98 |
| cache / HLOOpt / cpu / BothRev | 0.000016862659977050497 s | 0.0000200398199740448 s | 0.84 |
| cache / PartOpt / cpu / PreRev | 0.000016638980005154734 s | 0.000017317400006504615 s | 0.96 |
| cache / PartOpt / cpu / PostRev | 0.00002204678002271976 s | 0.00002674846000445541 s | 0.82 |
| cache / PartOpt / cpu / BothRev | 0.000016388900012316298 s | 0.00001720827998724417 s | 0.95 |
| cache / IPartOpt / cpu / PreRev | 0.00001643406000766845 s | 0.00001730513999973482 s | 0.95 |
| cache / IPartOpt / cpu / PostRev | 0.00002149917997485318 s | 0.0000223068999730458 s | 0.96 |
| cache / IPartOpt / cpu / BothRev | 0.000016439380024166893 s | 0.000017066479967979832 s | 0.96 |
| cache / DefOpt / cpu / PreRev | 0.00001581144004376256 s | 0.000017424999950890195 s | 0.91 |
| cache / DefOpt / cpu / PostRev | 0.000017582260034032517 s | 0.000017306580002696136 s | 1.02 |
| cache / DefOpt / cpu / BothRev | 0.000016724999914004 s | 0.00001792524001757556 s | 0.93 |
| cache / IDefOpt / cpu / PreRev | 0.000016733059965190478 s | 0.00001806160001251556 s | 0.93 |
| cache / IDefOpt / cpu / PostRev | 0.000016693339957782883 s | 0.00001825745999667561 s | 0.91 |
| cache / IDefOpt / cpu / BothRev | 0.000016379819990106625 s | 0.00001824015998863615 s | 0.90 |
| cache / JaXPipe / cuda / Primal | 0.000002303 s | 0.000002272 s | 1.01 |
| cache / Jax / cuda / Primal | 0.000002335 s | 0.00000224 s | 1.04 |
| cache / HLOOpt / cuda / Primal | 0.000002272 s | 0.00000224 s | 1.01 |
| cache / PartOpt / cuda / Primal | 0.00000224 s | 0.00000224 s | 1 |
| cache / IPartOpt / cuda / Primal | 0.000002303 s | 0.000002208 s | 1.04 |
| cache / DefOpt / cuda / Primal | 0.000002272 s | 0.00000224 s | 1.01 |
| cache / IDefOpt / cuda / Primal | 0.000002272 s | 0.000002303 s | 0.99 |
| cache / JaXPipe / cuda / Forward | 0.000002335 s | 0.000002304 s | 1.01 |
| cache / Jax / cuda / Forward | 0.000002336 s | 0.000002272 s | 1.03 |
| cache / HLOOpt / cuda / Forward | 0.000002336 s | 0.000002304 s | 1.01 |
| cache / PartOpt / cuda / Forward | 0.000002336 s | 0.000002304 s | 1.01 |
| cache / IPartOpt / cuda / Forward | 0.000002336 s | 0.000002273 s | 1.03 |
| cache / DefOpt / cuda / Forward | 0.000002304 s | 0.00000224 s | 1.03 |
| cache / IDefOpt / cuda / Forward | 0.000002335 s | 0.00000224 s | 1.04 |
| cache / JaXPipe / cuda / PreRev | 0.000010496 s | 0.000012128 s | 0.87 |
| cache / JaXPipe / cuda / PostRev | 0.00001056 s | 0.000011712 s | 0.90 |
| cache / JaXPipe / cuda / BothRev | 0.00001088 s | 0.000012096 s | 0.90 |
| cache / Jax / cuda / BothRev | 0.00001072 s | 0.000012224 s | 0.88 |
| cache / HLOOpt / cuda / PreRev | 0.000013215 s | 0.000013408 s | 0.99 |
| cache / HLOOpt / cuda / PostRev | 0.000013152 s | 0.000013408 s | 0.98 |
| cache / HLOOpt / cuda / BothRev | 0.000013215 s | 0.000013376 s | 0.99 |
| cache / PartOpt / cuda / PreRev | 0.000010529 s | 0.00001216 s | 0.87 |
| cache / PartOpt / cuda / PostRev | 0.000010593 s | 0.000011872 s | 0.89 |
| cache / PartOpt / cuda / BothRev | 0.000010848 s | 0.000012192 s | 0.89 |
| cache / IPartOpt / cuda / PreRev | 0.000010848 s | 0.000012032 s | 0.90 |
| cache / IPartOpt / cuda / PostRev | 0.000010848 s | 0.000012128 s | 0.89 |
| cache / IPartOpt / cuda / BothRev | 0.000010816 s | 0.000011936 s | 0.91 |
| cache / DefOpt / cuda / PreRev | 0.000013696 s | 0.000011904 s | 1.15 |
| cache / DefOpt / cuda / PostRev | 0.00001056 s | 0.000012415 s | 0.85 |
| cache / DefOpt / cuda / BothRev | 0.000010752 s | 0.000011968 s | 0.90 |
| cache / IDefOpt / cuda / PreRev | 0.000010784 s | 0.00001248 s | 0.86 |
| cache / IDefOpt / cuda / PostRev | 0.000010752 s | 0.00001184 s | 0.91 |
| cache / IDefOpt / cuda / BothRev | 0.000010784 s | 0.000011585 s | 0.93 |
| cache / JaXPipe / cpu / Primal | 0.00001356 s | 0.000007311580002351548 s | 1.85 |
| cache / Jax / cpu / Primal | 0.000012661 s | 0.000008188780020645936 s | 1.55 |
| cache / HLOOpt / cpu / Primal | 0.00001285 s | 0.000007291099991562077 s | 1.76 |
| cache / PartOpt / cpu / Primal | 0.000012785 s | 0.000007267500013767858 s | 1.76 |
| cache / IPartOpt / cpu / Primal | 0.000012509 s | 0.000007347700038735638 s | 1.70 |
| cache / DefOpt / cpu / Primal | 0.000012773 s | 0.000007079820006765658 s | 1.80 |
| cache / IDefOpt / cpu / Primal | 0.000012641 s | 0.000007294299994100584 s | 1.73 |
| cache / JaXPipe / cpu / Forward | 0.000016743999999999998 s | 0.000015604340023855912 s | 1.07 |
| cache / Jax / cpu / Forward | 0.000016997 s | 0.000016084959997897386 s | 1.06 |
| cache / HLOOpt / cpu / Forward | 0.000016963 s | 0.000016539119997105445 s | 1.03 |
| cache / PartOpt / cpu / Forward | 0.000017327 s | 0.00002018942001996038 s | 0.86 |
| cache / IPartOpt / cpu / Forward | 0.00001757 s | 0.00001527753997834225 s | 1.15 |
| cache / DefOpt / cpu / Forward | 0.000016728 s | 0.00002051123997262039 s | 0.82 |
| cache / IDefOpt / cpu / Forward | 0.000017302 s | 0.00001486773998294666 s | 1.16 |
| cache / JaXPipe / cpu / PreRev | 0.000017846 s | 0.000016669340011503665 s | 1.07 |
| cache / JaXPipe / cpu / PostRev | 0.00001957 s | 0.000022038719989723177 s | 0.89 |
| cache / JaXPipe / cpu / BothRev | 0.000018164 s | 0.00001681013998677372 s | 1.08 |
| cache / Jax / cpu / BothRev | 0.000020201 s | 0.000022604220002904186 s | 0.89 |
| cache / HLOOpt / cpu / PreRev | 0.000017575 s | 0.000017696700033411617 s | 0.99 |
| cache / HLOOpt / cpu / PostRev | 0.000018065 s | 0.00002060288000393484 s | 0.88 |
| cache / HLOOpt / cpu / BothRev | 0.000017817 s | 0.0000200398199740448 s | 0.89 |
| cache / PartOpt / cpu / PreRev | 0.000018004 s | 0.000017317400006504615 s | 1.04 |
| cache / PartOpt / cpu / PostRev | 0.000019774 s | 0.00002674846000445541 s | 0.74 |
| cache / PartOpt / cpu / BothRev | 0.000017559000000000002 s | 0.00001720827998724417 s | 1.02 |
| cache / IPartOpt / cpu / PreRev | 0.000017397 s | 0.00001730513999973482 s | 1.01 |
| cache / IPartOpt / cpu / PostRev | 0.000019418000000000003 s | 0.0000223068999730458 s | 0.87 |
| cache / IPartOpt / cpu / BothRev | 0.000018048 s | 0.000017066479967979832 s | 1.06 |
| cache / DefOpt / cpu / PreRev | 0.000017804 s | 0.000017424999950890195 s | 1.02 |
| cache / DefOpt / cpu / PostRev | 0.000017899999999999998 s | 0.000017306580002696136 s | 1.03 |
| cache / DefOpt / cpu / BothRev | 0.000017706 s | 0.00001792524001757556 s | 0.99 |
| cache / IDefOpt / cpu / PreRev | 0.000017933999999999998 s | 0.00001806160001251556 s | 0.99 |
| cache / IDefOpt / cpu / PostRev | 0.000017552 s | 0.00001825745999667561 s | 0.96 |
| cache / IDefOpt / cpu / BothRev | 0.000017759 s | 0.00001824015998863615 s | 0.97 |
| Concat / JaXPipe / cpu / Primal | 0.000007221699979709228 s | 0.000007725860023128916 s | 0.93 |
| Concat / Jax / cpu / Primal | 0.000007020120010565734 s | 0.000007840020052753972 s | 0.90 |
| Concat / HLOOpt / cpu / Primal | 0.000007035699991320144 s | 0.00000750840002183395 s | 0.94 |
| Concat / PartOpt / cpu / Primal | 0.000007001780022619641 s | 0.000007161900002756738 s | 0.98 |
| Concat / IPartOpt / cpu / Primal | 0.000007600919998367317 s | 0.000007567560014649644 s | 1.00 |
| Concat / DefOpt / cpu / Primal | 0.000007240119975904235 s | 0.000011547119993338128 s | 0.63 |
| Concat / IDefOpt / cpu / Primal | 0.000007343900006162585 s | 0.00000715518004653859 s | 1.03 |
| Concat / JaXPipe / cpu / Forward | 0.0000112459200045123 s | 0.000010496600007172674 s | 1.07 |
| Concat / Jax / cpu / Forward | 0.000011367659972165711 s | 0.000011853659989355949 s | 0.96 |
| Concat / HLOOpt / cpu / Forward | 0.000011833399976239888 s | 0.000015137940035856443 s | 0.78 |
| Concat / PartOpt / cpu / Forward | 0.000011919760017917725 s | 0.000015778240040162928 s | 0.76 |
| Concat / IPartOpt / cpu / Forward | 0.000011222819975955643 s | 0.00001113319998694351 s | 1.01 |
| Concat / DefOpt / cpu / Forward | 0.000011398100023143342 s | 0.00001572293999743124 s | 0.72 |
| Concat / IDefOpt / cpu / Forward | 0.000011427639992689364 s | 0.00001081772003999504 s | 1.06 |
| Concat / JaXPipe / cpu / PreRev | 0.000013679579960808042 s | 0.000012534260013126189 s | 1.09 |
| Concat / JaXPipe / cpu / PostRev | 0.00001313661999120086 s | 0.000013171179989512894 s | 1.00 |
| Concat / JaXPipe / cpu / BothRev | 0.000012453260023903568 s | 0.000015525199987678207 s | 0.80 |
| Concat / Jax / cpu / BothRev | 0.000012461420019462822 s | 0.00001536104002298089 s | 0.81 |
| Concat / HLOOpt / cpu / PreRev | 0.000013440659977277391 s | 0.000012696619996859228 s | 1.06 |
| Concat / HLOOpt / cpu / PostRev | 0.000014991819980423316 s | 0.000016494120009156175 s | 0.91 |
| Concat / HLOOpt / cpu / BothRev | 0.00001329656002781121 s | 0.000014265039962992887 s | 0.93 |
| Concat / PartOpt / cpu / PreRev | 0.000013204819988459348 s | 0.000012370900021778652 s | 1.07 |
| Concat / PartOpt / cpu / PostRev | 0.000012459880017559043 s | 0.000012754700019286247 s | 0.98 |
| Concat / PartOpt / cpu / BothRev | 0.000013172339986340376 s | 0.00001251313999091508 s | 1.05 |
| Concat / IPartOpt / cpu / PreRev | 0.000013480920006259112 s | 0.000015078859996719985 s | 0.89 |
| Concat / IPartOpt / cpu / PostRev | 0.00001280384001802304 s | 0.000012710540013358696 s | 1.01 |
| Concat / IPartOpt / cpu / BothRev | 0.000012921859988637153 s | 0.00001284707999730017 s | 1.01 |
| Concat / DefOpt / cpu / PreRev | 0.000013157880021026358 s | 0.000012832560005335835 s | 1.03 |
| Concat / DefOpt / cpu / PostRev | 0.000012598600005730986 s | 0.000012438900012057274 s | 1.01 |
| Concat / DefOpt / cpu / BothRev | 0.000012484200015023816 s | 0.000012456539980121309 s | 1.00 |
| Concat / IDefOpt / cpu / PreRev | 0.000012715679995380924 s | 0.000012725619917546284 s | 1.00 |
| Concat / IDefOpt / cpu / PostRev | 0.000013090719985484613 s | 0.000012815459976991406 s | 1.02 |
| Concat / IDefOpt / cpu / BothRev | 0.0000128324000070279 s | 0.000016934820041569765 s | 0.76 |
| Concat / JaXPipe / cuda / Primal | 0.0000019200000000000003 s | 0.0000019200000000000003 s | 1 |
| Concat / Jax / cuda / Primal | 0.0000019200000000000003 s | 0.0000019200000000000003 s | 1 |
| Concat / HLOOpt / cuda / Primal | 0.0000019200000000000003 s | 0.0000019200000000000003 s | 1 |
| Concat / PartOpt / cuda / Primal | 0.0000019200000000000003 s | 0.0000019200000000000003 s | 1 |
| Concat / IPartOpt / cuda / Primal | 0.0000019200000000000003 s | 0.0000019200000000000003 s | 1 |
| Concat / DefOpt / cuda / Primal | 0.0000019200000000000003 s | 0.0000019200000000000003 s | 1 |
| Concat / IDefOpt / cuda / Primal | 0.0000019200000000000003 s | 0.0000019200000000000003 s | 1 |
| Concat / JaXPipe / cuda / Forward | 0.00000976 s | 0.000010752 s | 0.91 |
| Concat / Jax / cuda / Forward | 0.000009664 s | 0.000010945 s | 0.88 |
| Concat / HLOOpt / cuda / Forward | 0.000010016 s | 0.000010592 s | 0.95 |
| Concat / PartOpt / cuda / Forward | 0.000009696 s | 0.000010176 s | 0.95 |
| Concat / IPartOpt / cuda / Forward | 0.000010016 s | 0.000010528 s | 0.95 |
| Concat / DefOpt / cuda / Forward | 0.000010177 s | 0.00001072 s | 0.95 |
| Concat / IDefOpt / cuda / Forward | 0.000009696 s | 0.000010272 s | 0.94 |
| Concat / JaXPipe / cuda / PreRev | 0.00001584 s | 0.000017344 s | 0.91 |
| Concat / JaXPipe / cuda / PostRev | 0.000016192 s | 0.000017664 s | 0.92 |
| Concat / JaXPipe / cuda / BothRev | 0.000016383999999999998 s | 0.000017568000000000002 s | 0.93 |
| Concat / Jax / cuda / BothRev | 0.000021792 s | 0.000017632 s | 1.24 |
| Concat / HLOOpt / cuda / PreRev | 0.000016544 s | 0.0000176 s | 0.94 |
| Concat / HLOOpt / cuda / PostRev | 0.00001616 s | 0.000017696 s | 0.91 |
| Concat / HLOOpt / cuda / BothRev | 0.000016448000000000002 s | 0.0000176 s | 0.93 |
| Concat / PartOpt / cuda / PreRev | 0.000016704 s | 0.000017696 s | 0.94 |
| Concat / PartOpt / cuda / PostRev | 0.000016768999999999998 s | 0.000017247999999999998 s | 0.97 |
| Concat / PartOpt / cuda / BothRev | 0.000016 s | 0.000017344 s | 0.92 |
| Concat / IPartOpt / cuda / PreRev | 0.00001632 s | 0.000017408 s | 0.94 |
| Concat / IPartOpt / cuda / PostRev | 0.00002128 s | 0.000017760000000000003 s | 1.20 |
| Concat / IPartOpt / cuda / BothRev | 0.000024416 s | 0.000017567 s | 1.39 |
| Concat / DefOpt / cuda / PreRev | 0.000016383999999999998 s | 0.000017664 s | 0.93 |
| Concat / DefOpt / cuda / PostRev | 0.000016414999999999998 s | 0.000017311 s | 0.95 |
| Concat / DefOpt / cuda / BothRev | 0.000016063999999999997 s | 0.000018112 s | 0.89 |
| Concat / IDefOpt / cuda / PreRev | 0.000016224 s | 0.000019809 s | 0.82 |
| Concat / IDefOpt / cuda / PostRev | 0.00002128 s | 0.00001728 s | 1.23 |
| Concat / IDefOpt / cuda / BothRev | 0.000015776 s | 0.000017185 s | 0.92 |
| Concat / JaXPipe / cpu / Primal | 0.000012782 s | 0.000007725860023128916 s | 1.65 |
| Concat / Jax / cpu / Primal | 0.000012741 s | 0.000007840020052753972 s | 1.63 |
| Concat / HLOOpt / cpu / Primal | 0.000012885 s | 0.00000750840002183395 s | 1.72 |
| Concat / PartOpt / cpu / Primal | 0.000012818 s | 0.000007161900002756738 s | 1.79 |
| Concat / IPartOpt / cpu / Primal | 0.000012801 s | 0.000007567560014649644 s | 1.69 |
| Concat / DefOpt / cpu / Primal | 0.000012828000000000002 s | 0.000011547119993338128 s | 1.11 |
| Concat / IDefOpt / cpu / Primal | 0.000012896 s | 0.00000715518004653859 s | 1.80 |
| Concat / JaXPipe / cpu / Forward | 0.000017842 s | 0.000010496600007172674 s | 1.70 |
| Concat / Jax / cpu / Forward | 0.000017468999999999998 s | 0.000011853659989355949 s | 1.47 |
| Concat / HLOOpt / cpu / Forward | 0.000017962 s | 0.000015137940035856443 s | 1.19 |
| Concat / PartOpt / cpu / Forward | 0.000017562999999999998 s | 0.000015778240040162928 s | 1.11 |
| Concat / IPartOpt / cpu / Forward | 0.000017725 s | 0.00001113319998694351 s | 1.59 |
| Concat / DefOpt / cpu / Forward | 0.000017799 s | 0.00001572293999743124 s | 1.13 |
| Concat / IDefOpt / cpu / Forward | 0.000018273 s | 0.00001081772003999504 s | 1.69 |
| Concat / JaXPipe / cpu / PreRev | 0.000020651 s | 0.000012534260013126189 s | 1.65 |
| Concat / JaXPipe / cpu / PostRev | 0.000020066 s | 0.000013171179989512894 s | 1.52 |
| Concat / JaXPipe / cpu / BothRev | 0.000019943 s | 0.000015525199987678207 s | 1.28 |
| Concat / Jax / cpu / BothRev | 0.000020255 s | 0.00001536104002298089 s | 1.32 |
| Concat / HLOOpt / cpu / PreRev | 0.000019807 s | 0.000012696619996859228 s | 1.56 |
| Concat / HLOOpt / cpu / PostRev | 0.000020051 s | 0.000016494120009156175 s | 1.22 |
| Concat / HLOOpt / cpu / BothRev | 0.000020586 s | 0.000014265039962992887 s | 1.44 |
| Concat / PartOpt / cpu / PreRev | 0.000020417 s | 0.000012370900021778652 s | 1.65 |
| Concat / PartOpt / cpu / PostRev | 0.000019568 s | 0.000012754700019286247 s | 1.53 |
| Concat / PartOpt / cpu / BothRev | 0.000019945 s | 0.00001251313999091508 s | 1.59 |
| Concat / IPartOpt / cpu / PreRev | 0.000020321 s | 0.000015078859996719985 s | 1.35 |
| Concat / IPartOpt / cpu / PostRev | 0.000019551 s | 0.000012710540013358696 s | 1.54 |
| Concat / IPartOpt / cpu / BothRev | 0.000019931 s | 0.00001284707999730017 s | 1.55 |
| Concat / DefOpt / cpu / PreRev | 0.000020228 s | 0.000012832560005335835 s | 1.58 |
| Concat / DefOpt / cpu / PostRev | 0.000019591 s | 0.000012438900012057274 s | 1.57 |
| Concat / DefOpt / cpu / BothRev | 0.000020009 s | 0.000012456539980121309 s | 1.61 |
| Concat / IDefOpt / cpu / PreRev | 0.000020193 s | 0.000012725619917546284 s | 1.59 |
| Concat / IDefOpt / cpu / PostRev | 0.00001974 s | 0.000012815459976991406 s | 1.54 |
| Concat / IDefOpt / cpu / BothRev | 0.000019962 s | 0.000016934820041569765 s | 1.18 |
| const_scatter / JaXPipe / cpu / Primal | 0.00000747919999412261 s | 0.000009149859961326 s | 0.82 |
| const_scatter / Jax / cpu / Primal | 0.000006641100017077406 s | 0.000008083280008577275 s | 0.82 |
| const_scatter / HLOOpt / cpu / Primal | 0.000007145539957491564 s | 0.0000071863200082589175 s | 0.99 |
| const_scatter / PartOpt / cpu / Primal | 0.000007380379975074902 s | 0.000007375739960480132 s | 1.00 |
| const_scatter / IPartOpt / cpu / Primal | 0.000006980480038691894 s | 0.000007060400021146052 s | 0.99 |
| const_scatter / DefOpt / cpu / Primal | 0.000006869640037621138 s | 0.000011595319983825904 s | 0.59 |
| const_scatter / IDefOpt / cpu / Primal | 0.000007087079975462984 s | 0.000007401519987979555 s | 0.96 |
| const_scatter / JaXPipe / cpu / Forward | 0.00001131333995544992 s | 0.000010921500024778652 s | 1.04 |
| const_scatter / Jax / cpu / Forward | 0.000011043839976991877 s | 0.000011650580008790711 s | 0.95 |
| const_scatter / HLOOpt / cpu / Forward | 0.000010862119979719865 s | 0.000018608339969432565 s | 0.58 |
| const_scatter / PartOpt / cpu / Forward | 0.000010625499990055686 s | 0.000015195739988485002 s | 0.70 |
| const_scatter / IPartOpt / cpu / Forward | 0.000011346719975335872 s | 0.000010543240014158071 s | 1.08 |
| const_scatter / DefOpt / cpu / Forward | 0.0000107986600141885 s | 0.00001736160001200915 s | 0.62 |
| const_scatter / IDefOpt / cpu / Forward | 0.000010474859982423368 s | 0.00001035087999298412 s | 1.01 |
| const_scatter / JaXPipe / cpu / PreRev | 0.0002874413600329 s | 0.0003018126200458 s | 0.95 |
| const_scatter / JaXPipe / cpu / PostRev | 0.0002857827799834 s | 0.0002955402599673 s | 0.97 |
| const_scatter / JaXPipe / cpu / BothRev | 0.0002855203800027 s | 0.0002844799399917 s | 1.00 |
| const_scatter / Jax / cpu / BothRev | 0.0002838158000122 s | 0.0002837072600505 s | 1.00 |
| const_scatter / HLOOpt / cpu / PreRev | 0.000284826219995 s | 0.0002846959200269 s | 1.00 |
| const_scatter / HLOOpt / cpu / PostRev | 0.0002857457200025 s | 0.0002833803199246 s | 1.01 |
| const_scatter / HLOOpt / cpu / BothRev | 0.0002840817000014 s | 0.0002847788800045 s | 1.00 |
| const_scatter / PartOpt / cpu / PreRev | 0.0002851620600176 s | 0.0002836582799682 s | 1.01 |
| const_scatter / PartOpt / cpu / PostRev | 0.0002866757000174 s | 0.0002841413000351 s | 1.01 |
| const_scatter / PartOpt / cpu / BothRev | 0.0003002397000091 s | 0.0002875587399739 s | 1.04 |
| const_scatter / IPartOpt / cpu / PreRev | 0.0002843613999993 s | 0.0002839712200147 s | 1.00 |
| const_scatter / IPartOpt / cpu / PostRev | 0.0002858131800076 s | 0.0002899488999992 s | 0.99 |
| const_scatter / IPartOpt / cpu / BothRev | 0.0002842763799981 s | 0.0002884466199793 s | 0.99 |
| const_scatter / DefOpt / cpu / PreRev | 0.0002845163999154 s | 0.0002823218399953 s | 1.01 |
| const_scatter / DefOpt / cpu / PostRev | 0.0002852482400066 s | 0.0002883031800047 s | 0.99 |
| const_scatter / DefOpt / cpu / BothRev | 0.0002836111999749 s | 0.0002947299199695 s | 0.96 |
| const_scatter / IDefOpt / cpu / PreRev | 0.0002846000200224 s | 0.000284863320021 s | 1.00 |
| const_scatter / IDefOpt / cpu / PostRev | 0.0002853388200492 s | 0.0002873838200321 s | 0.99 |
| const_scatter / IDefOpt / cpu / BothRev | 0.0002846108599987 s | 0.0002888906400039 s | 0.99 |
| const_scatter / JaXPipe / cuda / Primal | 0.000001887 s | 0.000001887 s | 1 |
| const_scatter / Jax / cuda / Primal | 0.000001887 s | 0.000001887 s | 1 |
| const_scatter / HLOOpt / cuda / Primal | 0.000001887 s | 0.000001887 s | 1 |
| const_scatter / PartOpt / cuda / Primal | 0.000001887 s | 0.000001887 s | 1 |
| const_scatter / IPartOpt / cuda / Primal | 0.000001887 s | 0.000001887 s | 1 |
| const_scatter / DefOpt / cuda / Primal | 0.000001887 s | 0.000001887 s | 1 |
| const_scatter / IDefOpt / cuda / Primal | 0.000001887 s | 0.000001887 s | 1 |
| const_scatter / JaXPipe / cuda / Forward | 0.000009344 s | 0.000011968 s | 0.78 |
| const_scatter / Jax / cuda / Forward | 0.00000992 s | 0.000011776 s | 0.84 |
| const_scatter / HLOOpt / cuda / Forward | 0.000009472 s | 0.000010176 s | 0.93 |
| const_scatter / PartOpt / cuda / Forward | 0.000009824 s | 0.000011744 s | 0.84 |
| const_scatter / IPartOpt / cuda / Forward | 0.000009727 s | 0.000009824 s | 0.99 |
| const_scatter / DefOpt / cuda / Forward | 0.000009824 s | 0.000011904 s | 0.83 |
| const_scatter / IDefOpt / cuda / Forward | 0.000010048 s | 0.00001024 s | 0.98 |
| const_scatter / JaXPipe / cuda / PreRev | 0.000012928 s | 0.000013152 s | 0.98 |
| const_scatter / JaXPipe / cuda / PostRev | 0.000016065 s | 0.000017856 s | 0.90 |
| const_scatter / JaXPipe / cuda / BothRev | 0.0000128 s | 0.000013408 s | 0.95 |
| const_scatter / Jax / cuda / BothRev | 0.000016704 s | 0.000017792 s | 0.94 |
| const_scatter / HLOOpt / cuda / PreRev | 0.000012704 s | 0.000013152 s | 0.97 |
| const_scatter / HLOOpt / cuda / PostRev | 0.000012736 s | 0.000013504 s | 0.94 |
| const_scatter / HLOOpt / cuda / BothRev | 0.00001264 s | 0.000013312 s | 0.95 |
| const_scatter / PartOpt / cuda / PreRev | 0.00001264 s | 0.000013664 s | 0.93 |
| const_scatter / PartOpt / cuda / PostRev | 0.000016255999999999998 s | 0.0000176 s | 0.92 |
| const_scatter / PartOpt / cuda / BothRev | 0.000012736 s | 0.00001328 s | 0.96 |
| const_scatter / IPartOpt / cuda / PreRev | 0.00001264 s | 0.000014816 s | 0.85 |
| const_scatter / IPartOpt / cuda / PostRev | 0.000016096 s | 0.00001824 s | 0.88 |
| const_scatter / IPartOpt / cuda / BothRev | 0.000012704 s | 0.000012992 s | 0.98 |
| const_scatter / DefOpt / cuda / PreRev | 0.000012736 s | 0.000012928 s | 0.99 |
| const_scatter / DefOpt / cuda / PostRev | 0.000014912 s | 0.00001328 s | 1.12 |
| const_scatter / DefOpt / cuda / BothRev | 0.000012767 s | 0.000013216 s | 0.97 |
| const_scatter / IDefOpt / cuda / PreRev | 0.000012864 s | 0.00001312 s | 0.98 |
| const_scatter / IDefOpt / cuda / PostRev | 0.000012896 s | 0.000013568 s | 0.95 |
| const_scatter / IDefOpt / cuda / BothRev | 0.000011904 s | 0.00001296 s | 0.92 |
| const_scatter / JaXPipe / cpu / Primal | 0.00001269 s | 0.000009149859961326 s | 1.39 |
| const_scatter / Jax / cpu / Primal | 0.000012849 s | 0.000008083280008577275 s | 1.59 |
| const_scatter / HLOOpt / cpu / Primal | 0.000012805 s | 0.0000071863200082589175 s | 1.78 |
| const_scatter / PartOpt / cpu / Primal | 0.000012751 s | 0.000007375739960480132 s | 1.73 |
| const_scatter / IPartOpt / cpu / Primal | 0.000012663 s | 0.000007060400021146052 s | 1.79 |
| const_scatter / DefOpt / cpu / Primal | 0.000012875 s | 0.000011595319983825904 s | 1.11 |
| const_scatter / IDefOpt / cpu / Primal | 0.000012726 s | 0.000007401519987979555 s | 1.72 |
| const_scatter / JaXPipe / cpu / Forward | 0.000017249 s | 0.000010921500024778652 s | 1.58 |
| const_scatter / Jax / cpu / Forward | 0.000016767 s | 0.000011650580008790711 s | 1.44 |
| const_scatter / HLOOpt / cpu / Forward | 0.000016929 s | 0.000018608339969432565 s | 0.91 |
| const_scatter / PartOpt / cpu / Forward | 0.000017105 s | 0.000015195739988485002 s | 1.13 |
| const_scatter / IPartOpt / cpu / Forward | 0.000016919 s | 0.000010543240014158071 s | 1.60 |
| const_scatter / DefOpt / cpu / Forward | 0.000016734 s | 0.00001736160001200915 s | 0.96 |
| const_scatter / IDefOpt / cpu / Forward | 0.000017028 s | 0.00001035087999298412 s | 1.65 |
| const_scatter / JaXPipe / cpu / PreRev | 0.000505407 s | 0.0003018126200458 s | 1.67 |
| const_scatter / JaXPipe / cpu / PostRev | 0.00050104 s | 0.0002955402599673 s | 1.70 |
| const_scatter / JaXPipe / cpu / BothRev | 0.000494736 s | 0.0002844799399917 s | 1.74 |
| const_scatter / Jax / cpu / BothRev | 0.000526503 s | 0.0002837072600505 s | 1.86 |
| const_scatter / HLOOpt / cpu / PreRev | 0.000508877 s | 0.0002846959200269 s | 1.79 |
| const_scatter / HLOOpt / cpu / PostRev | 0.000494903 s | 0.0002833803199246 s | 1.75 |
| const_scatter / HLOOpt / cpu / BothRev | 0.000489656 s | 0.0002847788800045 s | 1.72 |
| const_scatter / PartOpt / cpu / PreRev | 0.000522429 s | 0.0002836582799682 s | 1.84 |
| const_scatter / PartOpt / cpu / PostRev | 0.000528957 s | 0.0002841413000351 s | 1.86 |
| const_scatter / PartOpt / cpu / BothRev | 0.000523817 s | 0.0002875587399739 s | 1.82 |
| const_scatter / IPartOpt / cpu / PreRev | 0.000509036 s | 0.0002839712200147 s | 1.79 |
| const_scatter / IPartOpt / cpu / PostRev | 0.000513426 s | 0.0002899488999992 s | 1.77 |
| const_scatter / IPartOpt / cpu / BothRev | 0.00052009 s | 0.0002884466199793 s | 1.80 |
| const_scatter / DefOpt / cpu / PreRev | 0.000498483 s | 0.0002823218399953 s | 1.77 |
| const_scatter / DefOpt / cpu / PostRev | 0.000517853 s | 0.0002883031800047 s | 1.80 |
| const_scatter / DefOpt / cpu / BothRev | 0.000502688 s | 0.0002947299199695 s | 1.71 |
| const_scatter / IDefOpt / cpu / PreRev | 0.000514704 s | 0.000284863320021 s | 1.81 |
| const_scatter / IDefOpt / cpu / PostRev | 0.000508156 s | 0.0002873838200321 s | 1.77 |
| const_scatter / IDefOpt / cpu / BothRev | 0.0005298859999999 s | 0.0002888906400039 s | 1.83 |
| GenDot / JaXPipe / cpu / Primal | 0.000009011980027935351 s | 0.000010176719997616602 s | 0.89 |
| GenDot / Jax / cpu / Primal | 0.000008056360020418651 s | 0.000008191240021915292 s | 0.98 |
| GenDot / HLOOpt / cpu / Primal | 0.00000891810002940474 s | 0.000013252619964987389 s | 0.67 |
| GenDot / PartOpt / cpu / Primal | 0.00000830388004942506 s | 0.000008220199997595046 s | 1.01 |
| GenDot / IPartOpt / cpu / Primal | 0.000008328260028065415 s | 0.00000903473997823312 s | 0.92 |
| GenDot / DefOpt / cpu / Primal | 0.000008400059987252462 s | 0.000012143100011599016 s | 0.69 |
| GenDot / IDefOpt / cpu / Primal | 0.000008407279992752592 s | 0.000008335859993167105 s | 1.01 |
| GenDot / JaXPipe / cpu / Forward | 0.0000128066799879889 s | 0.000012309759986237625 s | 1.04 |
| GenDot / Jax / cpu / Forward | 0.000011314759995002532 s | 0.000011215760005143236 s | 1.01 |
| GenDot / HLOOpt / cpu / Forward | 0.000012570180006150622 s | 0.00001224884002112958 s | 1.03 |
| GenDot / PartOpt / cpu / Forward | 0.000012373539984764648 s | 0.000012817179995181504 s | 0.97 |
| GenDot / IPartOpt / cpu / Forward | 0.000012824420027754968 s | 0.0000129771400224854 s | 0.99 |
| GenDot / DefOpt / cpu / Forward | 0.000012670619989876286 s | 0.00001729242000692466 s | 0.73 |
| GenDot / IDefOpt / cpu / Forward | 0.000012118899967390462 s | 0.000011872980030602775 s | 1.02 |
| GenDot / JaXPipe / cpu / PreRev | 0.00001202866003040981 s | 0.000011857600020448444 s | 1.01 |
| GenDot / JaXPipe / cpu / PostRev | 0.000011505520014907233 s | 0.000011128899986942997 s | 1.03 |
| GenDot / JaXPipe / cpu / BothRev | 0.000013176360016586842 s | 0.000012159080024503054 s | 1.08 |
| GenDot / Jax / cpu / BothRev | 0.00001155633996859251 s | 0.000011684280007102644 s | 0.99 |
| GenDot / HLOOpt / cpu / PreRev | 0.000012753279997923528 s | 0.0000117795399910392 s | 1.08 |
| GenDot / HLOOpt / cpu / PostRev | 0.000014186440002958987 s | 0.000011763639986384076 s | 1.21 |
| GenDot / HLOOpt / cpu / BothRev | 0.000012762119959006667 s | 0.000013835919953635313 s | 0.92 |
| GenDot / PartOpt / cpu / PreRev | 0.00001237024001966347 s | 0.000012417100015227332 s | 1.00 |
| GenDot / PartOpt / cpu / PostRev | 0.000011037740005122032 s | 0.000011310380014037946 s | 0.98 |
| GenDot / PartOpt / cpu / BothRev | 0.000012793879968739929 s | 0.000011758400014514336 s | 1.09 |
| GenDot / IPartOpt / cpu / PreRev | 0.000012097600001652608 s | 0.000013740780032094337 s | 0.88 |
| GenDot / IPartOpt / cpu / PostRev | 0.000010857959978238796 s | 0.000012285620023249068 s | 0.88 |
| GenDot / IPartOpt / cpu / BothRev | 0.00001304770001297584 s | 0.000011663839995890157 s | 1.12 |
| GenDot / DefOpt / cpu / PreRev | 0.000012337860007392007 s | 0.000012125860048399772 s | 1.02 |
| GenDot / DefOpt / cpu / PostRev | 0.00001255270005458442 s | 0.000011902560027010624 s | 1.05 |
| GenDot / DefOpt / cpu / BothRev | 0.00001281071998164407 s | 0.000012017819954053264 s | 1.07 |
| GenDot / IDefOpt / cpu / PreRev | 0.000011987359985141664 s | 0.000012305520049267216 s | 0.97 |
| GenDot / IDefOpt / cpu / PostRev | 0.000012748979970638175 s | 0.00001163964001534623 s | 1.10 |
| GenDot / IDefOpt / cpu / BothRev | 0.000013034820012762792 s | 0.000012038459981340566 s | 1.08 |
| GenDot / JaXPipe / cuda / Primal | 0.000002016 s | 0.000002016 s | 1 |
| GenDot / Jax / cuda / Primal | 0.000002015 s | 0.000002015 s | 1 |
| GenDot / HLOOpt / cuda / Primal | 0.000001984 s | 0.000001984 s | 1 |
| GenDot / PartOpt / cuda / Primal | 0.000002015 s | 0.000002016 s | 1.00 |
| GenDot / IPartOpt / cuda / Primal | 0.000002015 s | 0.000002015 s | 1 |
| GenDot / DefOpt / cuda / Primal | 0.000001984 s | 0.000001984 s | 1 |
| GenDot / IDefOpt / cuda / Primal | 0.000001984 s | 0.000001984 s | 1 |
| GenDot / JaXPipe / cuda / Forward | 0.00001008 s | 0.0000104 s | 0.97 |
| GenDot / Jax / cuda / Forward | 0.000010112 s | 0.000010273 s | 0.98 |
| GenDot / HLOOpt / cuda / Forward | 0.000010145 s | 0.00001024 s | 0.99 |
| GenDot / PartOpt / cuda / Forward | 0.000010176 s | 0.000010208 s | 1.00 |
| GenDot / IPartOpt / cuda / Forward | 0.00000992 s | 0.000010432 s | 0.95 |
| GenDot / DefOpt / cuda / Forward | 0.000010016 s | 0.000010528 s | 0.95 |
| GenDot / IDefOpt / cuda / Forward | 0.000010751 s | 0.000010272 s | 1.05 |
| GenDot / JaXPipe / cuda / PreRev | 0.000009952 s | 0.000010464 s | 0.95 |
| GenDot / JaXPipe / cuda / PostRev | 0.000010048 s | 0.00001072 s | 0.94 |
| GenDot / JaXPipe / cuda / BothRev | 0.00000992 s | 0.000010113 s | 0.98 |
| GenDot / Jax / cuda / BothRev | 0.00000992 s | 0.000010592 s | 0.94 |
| GenDot / HLOOpt / cuda / PreRev | 0.000009952 s | 0.000010304 s | 0.97 |
| GenDot / HLOOpt / cuda / PostRev | 0.00001056 s | 0.000010592 s | 1.00 |
| GenDot / HLOOpt / cuda / BothRev | 0.000010016 s | 0.000009984 s | 1.00 |
| GenDot / PartOpt / cuda / PreRev | 0.000010016 s | 0.000010464 s | 0.96 |
| GenDot / PartOpt / cuda / PostRev | 0.000009696 s | 0.000010336 s | 0.94 |
| GenDot / PartOpt / cuda / BothRev | 0.000010336 s | 0.000010432 s | 0.99 |
| GenDot / IPartOpt / cuda / PreRev | 0.000011616 s | 0.000010656 s | 1.09 |
| GenDot / IPartOpt / cuda / PostRev | 0.00001008 s | 0.000010336 s | 0.98 |
| GenDot / IPartOpt / cuda / BothRev | 0.000010048 s | 0.000009984 s | 1.01 |
| GenDot / DefOpt / cuda / PreRev | 0.000010048 s | 0.000010433 s | 0.96 |
| GenDot / DefOpt / cuda / PostRev | 0.000009984 s | 0.000010624 s | 0.94 |
| GenDot / DefOpt / cuda / BothRev | 0.000010112 s | 0.00001056 s | 0.96 |
| GenDot / IDefOpt / cuda / PreRev | 0.000010144 s | 0.000010304 s | 0.98 |
| GenDot / IDefOpt / cuda / PostRev | 0.000010304 s | 0.000010944 s | 0.94 |
| GenDot / IDefOpt / cuda / BothRev | 0.000009984 s | 0.000010528 s | 0.95 |
| GenDot / JaXPipe / cpu / Primal | 0.000015433 s | 0.000010176719997616602 s | 1.52 |
| GenDot / Jax / cpu / Primal | 0.000014574 s | 0.000008191240021915292 s | 1.78 |
| GenDot / HLOOpt / cpu / Primal | 0.00001418 s | 0.000013252619964987389 s | 1.07 |
| GenDot / PartOpt / cpu / Primal | 0.000015038 s | 0.000008220199997595046 s | 1.83 |
| GenDot / IPartOpt / cpu / Primal | 0.000014880000000000002 s | 0.00000903473997823312 s | 1.65 |
| GenDot / DefOpt / cpu / Primal | 0.000014064 s | 0.000012143100011599016 s | 1.16 |
| GenDot / IDefOpt / cpu / Primal | 0.000013892 s | 0.000008335859993167105 s | 1.67 |
| GenDot / JaXPipe / cpu / Forward | 0.000019604 s | 0.000012309759986237625 s | 1.59 |
| GenDot / Jax / cpu / Forward | 0.000020303 s | 0.000011215760005143236 s | 1.81 |
| GenDot / HLOOpt / cpu / Forward | 0.000019152 s | 0.00001224884002112958 s | 1.56 |
| GenDot / PartOpt / cpu / Forward | 0.000019666 s | 0.000012817179995181504 s | 1.53 |
| GenDot / IPartOpt / cpu / Forward | 0.000031682 s | 0.0000129771400224854 s | 2.44 |
| GenDot / DefOpt / cpu / Forward | 0.000019936 s | 0.00001729242000692466 s | 1.15 |
| GenDot / IDefOpt / cpu / Forward | 0.000018864 s | 0.000011872980030602775 s | 1.59 |
| GenDot / JaXPipe / cpu / PreRev | 0.00001972 s | 0.000011857600020448444 s | 1.66 |
| GenDot / JaXPipe / cpu / PostRev | 0.000020703 s | 0.000011128899986942997 s | 1.86 |
| GenDot / JaXPipe / cpu / BothRev | 0.000019625 s | 0.000012159080024503054 s | 1.61 |
| GenDot / Jax / cpu / BothRev | 0.000020281 s | 0.000011684280007102644 s | 1.74 |
| GenDot / HLOOpt / cpu / PreRev | 0.000019512 s | 0.0000117795399910392 s | 1.66 |
| GenDot / HLOOpt / cpu / PostRev | 0.000019608 s | 0.000011763639986384076 s | 1.67 |
| GenDot / HLOOpt / cpu / BothRev | 0.000019621 s | 0.000013835919953635313 s | 1.42 |
| GenDot / PartOpt / cpu / PreRev | 0.00001966 s | 0.000012417100015227332 s | 1.58 |
| GenDot / PartOpt / cpu / PostRev | 0.000020657 s | 0.000011310380014037946 s | 1.83 |
| GenDot / PartOpt / cpu / BothRev | 0.000019409 s | 0.000011758400014514336 s | 1.65 |
| GenDot / IPartOpt / cpu / PreRev | 0.000019336 s | 0.000013740780032094337 s | 1.41 |
| GenDot / IPartOpt / cpu / PostRev | 0.000021115 s | 0.000012285620023249068 s | 1.72 |
| GenDot / IPartOpt / cpu / BothRev | 0.000019082 s | 0.000011663839995890157 s | 1.64 |
| GenDot / DefOpt / cpu / PreRev | 0.000019019 s | 0.000012125860048399772 s | 1.57 |
| GenDot / DefOpt / cpu / PostRev | 0.000019287 s | 0.000011902560027010624 s | 1.62 |
| GenDot / DefOpt / cpu / BothRev | 0.00001974 s | 0.000012017819954053264 s | 1.64 |
| GenDot / IDefOpt / cpu / PreRev | 0.000019567 s | 0.000012305520049267216 s | 1.59 |
| GenDot / IDefOpt / cpu / PostRev | 0.000019825 s | 0.00001163964001534623 s | 1.70 |
| GenDot / IDefOpt / cpu / BothRev | 0.000019865 s | 0.000012038459981340566 s | 1.65 |
| hlo_ffi / JaXPipe / cpu / Primal | 0.000010646659957274096 s | 0.000011426740011302172 s | 0.93 |
| hlo_ffi / Jax / cpu / Primal | 0.000010196740013270756 s | 0.000011207280040252954 s | 0.91 |
| hlo_ffi / HLOOpt / cpu / Primal | 0.000010464960032550153 s | 0.000014693479970446788 s | 0.71 |
| hlo_ffi / PartOpt / cpu / Primal | 0.000009691080003904064 s | 0.00001102379998883407 s | 0.88 |
| hlo_ffi / IPartOpt / cpu / Primal | 0.000010357959963585015 s | 0.000010534919974816148 s | 0.98 |
| hlo_ffi / DefOpt / cpu / Primal | 0.000009698259982542369 s | 0.000010974839942718972 s | 0.88 |
| hlo_ffi / IDefOpt / cpu / Primal | 0.00000969200000326964 s | 0.00001072148001185269 s | 0.90 |
| hlo_ffi / JaXPipe / cpu / Forward | 0.000014440700024351828 s | 0.00001635991998227837 s | 0.88 |
| hlo_ffi / Jax / cpu / Forward | 0.000014635179986726144 s | 0.00001635847996112716 s | 0.89 |
| hlo_ffi / HLOOpt / cpu / Forward | 0.000014627160007876227 s | 0.00001682942001025367 s | 0.87 |
| hlo_ffi / PartOpt / cpu / Forward | 0.00001398189997416921 s | 0.00001639388000512554 s | 0.85 |
| hlo_ffi / IPartOpt / cpu / Forward | 0.000014039860006960224 s | 0.000016555979991608184 s | 0.85 |
| hlo_ffi / DefOpt / cpu / Forward | 0.00001436322002518864 s | 0.000017256500013900224 s | 0.83 |
| hlo_ffi / IDefOpt / cpu / Forward | 0.000014722099958817123 s | 0.000016987700000754557 s | 0.87 |
| hlo_ffi / JaXPipe / cpu / PreRev | 0.000015184239955488013 s | 0.000015459200012628573 s | 0.98 |
| hlo_ffi / JaXPipe / cpu / PostRev | 0.000014085679977142718 s | 0.0000153673000386334 s | 0.92 |
| hlo_ffi / JaXPipe / cpu / BothRev | 0.000014270879973992124 s | 0.000015725039938843112 s | 0.91 |
| hlo_ffi / Jax / cpu / BothRev | 0.000015199559993561706 s | 0.000015293040023607317 s | 0.99 |
| hlo_ffi / HLOOpt / cpu / PreRev | 0.00001487048001763469 s | 0.000015225179995468352 s | 0.98 |
| hlo_ffi / HLOOpt / cpu / PostRev | 0.00001670466003815818 s | 0.00001628216003155103 s | 1.03 |
| hlo_ffi / HLOOpt / cpu / BothRev | 0.00001424614000825386 s | 0.000017381139996359708 s | 0.82 |
| hlo_ffi / PartOpt / cpu / PreRev | 0.00001521206000688835 s | 0.000015519260014116298 s | 0.98 |
| hlo_ffi / PartOpt / cpu / PostRev | 0.000014135860010355828 s | 0.000015806079964022502 s | 0.89 |
| hlo_ffi / PartOpt / cpu / BothRev | 0.00001444904000891256 s | 0.000015853700006118743 s | 0.91 |
| hlo_ffi / IPartOpt / cpu / PreRev | 0.000014789319975534452 s | 0.000015407880000566364 s | 0.96 |
| hlo_ffi / IPartOpt / cpu / PostRev | 0.000014395599964700524 s | 0.000015501520019824967 s | 0.93 |
| hlo_ffi / IPartOpt / cpu / BothRev | 0.000014712419988427427 s | 0.00001576622002176009 s | 0.93 |
| hlo_ffi / DefOpt / cpu / PreRev | 0.000014385320000656063 s | 0.000015992219960025978 s | 0.90 |
| hlo_ffi / DefOpt / cpu / PostRev | 0.000014271599993662676 s | 0.000015379819997178858 s | 0.93 |
| hlo_ffi / DefOpt / cpu / BothRev | 0.000014359319993673126 s | 0.000015584500006298184 s | 0.92 |
| hlo_ffi / IDefOpt / cpu / PreRev | 0.00001522316001683066 s | 0.000015718799995738665 s | 0.97 |
| hlo_ffi / IDefOpt / cpu / PostRev | 0.000014725319979334016 s | 0.000015333300034399144 s | 0.96 |
| hlo_ffi / IDefOpt / cpu / BothRev | 0.000014237479999792412 s | 0.000015546340018772754 s | 0.92 |
| hlo_ffi / JaXPipe / cuda / Primal | 0.000001983 s | 0.000001983 s | 1 |
| hlo_ffi / Jax / cuda / Primal | 0.000001983 s | 0.000001983 s | 1 |
| hlo_ffi / HLOOpt / cuda / Primal | 0.000001983 s | 0.000001983 s | 1 |
| hlo_ffi / PartOpt / cuda / Primal | 0.000001984 s | 0.000001983 s | 1.00 |
| hlo_ffi / IPartOpt / cuda / Primal | 0.000001983 s | 0.000001983 s | 1 |
| hlo_ffi / DefOpt / cuda / Primal | 0.000001983 s | 0.000001983 s | 1 |
| hlo_ffi / IDefOpt / cuda / Primal | 0.000001983 s | 0.000001984 s | 1.00 |
| hlo_ffi / JaXPipe / cuda / Forward | 0.000002047 s | 0.00000208 s | 0.98 |
| hlo_ffi / Jax / cuda / Forward | 0.000002048 s | 0.000002047 s | 1.00 |
| hlo_ffi / HLOOpt / cuda / Forward | 0.000002047 s | 0.00000208 s | 0.98 |
| hlo_ffi / PartOpt / cuda / Forward | 0.000002047 s | 0.00000208 s | 0.98 |
| hlo_ffi / IPartOpt / cuda / Forward | 0.000002048 s | 0.000002048 s | 1 |
| hlo_ffi / DefOpt / cuda / Forward | 0.000002048 s | 0.000002048 s | 1 |
| hlo_ffi / IDefOpt / cuda / Forward | 0.000002047 s | 0.00000208 s | 0.98 |
| hlo_ffi / JaXPipe / cuda / PreRev | 0.000002047 s | 0.000002048 s | 1.00 |
| hlo_ffi / JaXPipe / cuda / PostRev | 0.000002047 s | 0.000002047 s | 1 |
| hlo_ffi / JaXPipe / cuda / BothRev | 0.000002047 s | 0.000002047 s | 1 |
| hlo_ffi / Jax / cuda / BothRev | 0.000002048 s | 0.000002047 s | 1.00 |
| hlo_ffi / HLOOpt / cuda / PreRev | 0.000002047 s | 0.000002047 s | 1 |
| hlo_ffi / HLOOpt / cuda / PostRev | 0.000002047 s | 0.000002048 s | 1.00 |
| hlo_ffi / HLOOpt / cuda / BothRev | 0.000002047 s | 0.000002048 s | 1.00 |
| hlo_ffi / PartOpt / cuda / PreRev | 0.000002047 s | 0.000002047 s | 1 |
| hlo_ffi / PartOpt / cuda / PostRev | 0.000002047 s | 0.000002047 s | 1 |
| hlo_ffi / PartOpt / cuda / BothRev | 0.000002047 s | 0.000002048 s | 1.00 |
| hlo_ffi / IPartOpt / cuda / PreRev | 0.000002048 s | 0.000002047 s | 1.00 |
| hlo_ffi / IPartOpt / cuda / PostRev | 0.000002047 s | 0.000002048 s | 1.00 |
| hlo_ffi / IPartOpt / cuda / BothRev | 0.000002048 s | 0.000002048 s | 1 |
| hlo_ffi / DefOpt / cuda / PreRev | 0.000002047 s | 0.000002047 s | 1 |
| hlo_ffi / DefOpt / cuda / PostRev | 0.000002047 s | 0.000002048 s | 1.00 |
| hlo_ffi / DefOpt / cuda / BothRev | 0.000002047 s | 0.000002048 s | 1.00 |
| hlo_ffi / IDefOpt / cuda / PreRev | 0.000002047 s | 0.000002048 s | 1.00 |
| hlo_ffi / IDefOpt / cuda / PostRev | 0.000002047 s | 0.000002047 s | 1 |
| hlo_ffi / IDefOpt / cuda / BothRev | 0.000002047 s | 0.000002047 s | 1 |
| hlo_ffi / JaXPipe / cpu / Primal | 0.000017902999999999998 s | 0.000011426740011302172 s | 1.57 |
| hlo_ffi / Jax / cpu / Primal | 0.000017721000000000002 s | 0.000011207280040252954 s | 1.58 |
| hlo_ffi / HLOOpt / cpu / Primal | 0.000017564 s | 0.000014693479970446788 s | 1.20 |
| hlo_ffi / PartOpt / cpu / Primal | 0.00001789 s | 0.00001102379998883407 s | 1.62 |
| hlo_ffi / IPartOpt / cpu / Primal | 0.000017809 s | 0.000010534919974816148 s | 1.69 |
| hlo_ffi / DefOpt / cpu / Primal | 0.000017356 s | 0.000010974839942718972 s | 1.58 |
| hlo_ffi / IDefOpt / cpu / Primal | 0.000017881 s | 0.00001072148001185269 s | 1.67 |
| hlo_ffi / JaXPipe / cpu / Forward | 0.000025247 s | 0.00001635991998227837 s | 1.54 |
| hlo_ffi / Jax / cpu / Forward | 0.0000244 s | 0.00001635847996112716 s | 1.49 |
| hlo_ffi / HLOOpt / cpu / Forward | 0.000024894 s | 0.00001682942001025367 s | 1.48 |
| hlo_ffi / PartOpt / cpu / Forward | 0.000024733 s | 0.00001639388000512554 s | 1.51 |
| hlo_ffi / IPartOpt / cpu / Forward | 0.000024511000000000003 s | 0.000016555979991608184 s | 1.48 |
| hlo_ffi / DefOpt / cpu / Forward | 0.000024324 s | 0.000017256500013900224 s | 1.41 |
| hlo_ffi / IDefOpt / cpu / Forward | 0.000024588 s | 0.000016987700000754557 s | 1.45 |
| hlo_ffi / JaXPipe / cpu / PreRev | 0.000025242 s | 0.000015459200012628573 s | 1.63 |
| hlo_ffi / JaXPipe / cpu / PostRev | 0.000023934 s | 0.0000153673000386334 s | 1.56 |
| hlo_ffi / JaXPipe / cpu / BothRev | 0.000024908 s | 0.000015725039938843112 s | 1.58 |
| hlo_ffi / Jax / cpu / BothRev | 0.000024242 s | 0.000015293040023607317 s | 1.59 |
| hlo_ffi / HLOOpt / cpu / PreRev | 0.000024488 s | 0.000015225179995468352 s | 1.61 |
| hlo_ffi / HLOOpt / cpu / PostRev | 0.000024313000000000003 s | 0.00001628216003155103 s | 1.49 |
| hlo_ffi / HLOOpt / cpu / BothRev | 0.000024791 s | 0.000017381139996359708 s | 1.43 |
| hlo_ffi / PartOpt / cpu / PreRev | 0.000025005 s | 0.000015519260014116298 s | 1.61 |
| hlo_ffi / PartOpt / cpu / PostRev | 0.000023793 s | 0.000015806079964022502 s | 1.51 |
| hlo_ffi / PartOpt / cpu / BothRev | 0.000024208 s | 0.000015853700006118743 s | 1.53 |
| hlo_ffi / IPartOpt / cpu / PreRev | 0.000024715 s | 0.000015407880000566364 s | 1.60 |
| hlo_ffi / IPartOpt / cpu / PostRev | 0.000023874 s | 0.000015501520019824967 s | 1.54 |
| hlo_ffi / IPartOpt / cpu / BothRev | 0.000023745 s | 0.00001576622002176009 s | 1.51 |
| hlo_ffi / DefOpt / cpu / PreRev | 0.000024638 s | 0.000015992219960025978 s | 1.54 |
| hlo_ffi / DefOpt / cpu / PostRev | 0.000023884 s | 0.000015379819997178858 s | 1.55 |
| hlo_ffi / DefOpt / cpu / BothRev | 0.000024693 s | 0.000015584500006298184 s | 1.58 |
| hlo_ffi / IDefOpt / cpu / PreRev | 0.000024905 s | 0.000015718799995738665 s | 1.58 |
| hlo_ffi / IDefOpt / cpu / PostRev | 0.000024796 s | 0.000015333300034399144 s | 1.62 |
| hlo_ffi / IDefOpt / cpu / BothRev | 0.000024745 s | 0.000015546340018772754 s | 1.59 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / JaXPipe / cpu / Primal | 0.000909142599994 s | 0.0011389066000447 s | 0.80 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / Jax / cpu / Primal | 0.0008930682000936 s | 0.0009614714001145 s | 0.93 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / HLOOpt / cpu / Primal | 0.0009642207999604 s | 0.0009669938000115 s | 1.00 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / PartOpt / cpu / Primal | 0.0008773352000389 s | 0.0009036465999088 s | 0.97 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IPartOpt / cpu / Primal | 0.0008999962000416 s | 0.0009796298000765 s | 0.92 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / DefOpt / cpu / Primal | 0.0009689775999504 s | 0.0009802251999644 s | 0.99 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IDefOpt / cpu / Primal | 0.0009589201999915 s | 0.0009544247999656 s | 1.00 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / JaXPipe / cpu / Forward | 0.0021403331998953 s | 0.0029485540000678 s | 0.73 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / Jax / cpu / Forward | 0.0022495545998935 s | 0.0024281695999889 s | 0.93 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / HLOOpt / cpu / Forward | 0.0022536387999025 s | 0.0023517207999248 s | 0.96 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / PartOpt / cpu / Forward | 0.0022493579999718 s | 0.0023108835999664 s | 0.97 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IPartOpt / cpu / Forward | 0.002200862600057 s | 0.0023455092001313 s | 0.94 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / DefOpt / cpu / Forward | 0.0022139668000818 s | 0.0025466422000135 s | 0.87 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IDefOpt / cpu / Forward | 0.002138706200003 s | 0.0022995338001237 s | 0.93 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / JaXPipe / cpu / PreRev | 0.0053438298000401 s | 0.0056316531999982 s | 0.95 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / JaXPipe / cpu / PostRev | 0.0057192146000488 s | 0.0058624548000807 s | 0.98 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / JaXPipe / cpu / BothRev | 0.0054770018000454 s | 0.0052789269999266 s | 1.04 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / Jax / cpu / BothRev | 0.0056400045999907 s | 0.0053872159998718 s | 1.05 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / HLOOpt / cpu / PreRev | 0.0060005948000252 s | 0.0053421269999489 s | 1.12 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / HLOOpt / cpu / PostRev | 0.0054377167999518 s | 0.0050962933999471 s | 1.07 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / HLOOpt / cpu / BothRev | 0.0051659220000146 s | 0.0053552024000055 s | 0.96 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / PartOpt / cpu / PreRev | 0.0058729204000883 s | 0.0045724182000412 s | 1.28 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / PartOpt / cpu / PostRev | 0.0053620283999407 s | 0.0057914356001674 s | 0.93 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / PartOpt / cpu / BothRev | 0.0056775981999635 s | 0.0035167963999811 s | 1.61 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IPartOpt / cpu / PreRev | 0.0052306380000118 s | 0.0053226364001602 s | 0.98 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IPartOpt / cpu / PostRev | 0.0055487057999016 s | 0.003688061199864 s | 1.50 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IPartOpt / cpu / BothRev | 0.0053289516000404 s | 0.0053972733999216 s | 0.99 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / DefOpt / cpu / PreRev | 0.0054750196001805 s | 0.004328001200065 s | 1.27 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / DefOpt / cpu / PostRev | 0.0054679216001204 s | 0.0051587899999503 s | 1.06 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / DefOpt / cpu / BothRev | 0.0054783073999715 s | 0.0035836637999636 s | 1.53 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IDefOpt / cpu / PreRev | 0.0047155254001154 s | 0.0053208182000162 s | 0.89 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IDefOpt / cpu / PostRev | 0.0052271308001763 s | 0.0035166720000233 s | 1.49 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IDefOpt / cpu / BothRev | 0.004650854799911 s | 0.0053604224000082 s | 0.87 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / JaXPipe / cuda / Primal | 0.00028336 s | 0.000273697 s | 1.04 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / Jax / cuda / Primal | 0.000282209 s | 0.000272704 s | 1.03 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / HLOOpt / cuda / Primal | 0.000289536 s | 0.000287073 s | 1.01 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / PartOpt / cuda / Primal | 0.000281633 s | 0.00027264 s | 1.03 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IPartOpt / cuda / Primal | 0.000282048 s | 0.000273408 s | 1.03 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / DefOpt / cuda / Primal | 0.0002893439999999 s | 0.000287297 s | 1.01 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IDefOpt / cuda / Primal | 0.000289952 s | 0.000286977 s | 1.01 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / JaXPipe / cuda / Forward | 0.000559297 s | 0.000557409 s | 1.00 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / Jax / cuda / Forward | 0.000540929 s | 0.000539138 s | 1.00 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / HLOOpt / cuda / Forward | 0.000559649 s | 0.000557666 s | 1.00 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / PartOpt / cuda / Forward | 0.000559777 s | 0.000557762 s | 1.00 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IPartOpt / cuda / Forward | 0.000558817 s | 0.00055853 s | 1.00 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / DefOpt / cuda / Forward | 0.000560225 s | 0.000558146 s | 1.00 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IDefOpt / cuda / Forward | 0.000558625 s | 0.000557857 s | 1.00 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / JaXPipe / cuda / PreRev | 0.001032802 s | 0.001022467 s | 1.01 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / JaXPipe / cuda / PostRev | 0.000991427 s | 0.000985283 s | 1.01 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / JaXPipe / cuda / BothRev | 0.001030434 s | 0.001019234 s | 1.01 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / Jax / cuda / BothRev | 0.000994146 s | 0.000979234 s | 1.02 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / HLOOpt / cuda / PreRev | 0.001019138 s | 0.001006947 s | 1.01 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / HLOOpt / cuda / PostRev | 0.0010429139999999 s | 0.001029091 s | 1.01 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / HLOOpt / cuda / BothRev | 0.001018145 s | 0.001006211 s | 1.01 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / PartOpt / cuda / PreRev | 0.00103213 s | 0.001020899 s | 1.01 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / PartOpt / cuda / PostRev | 0.000982497 s | 0.000970883 s | 1.01 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / PartOpt / cuda / BothRev | 0.0010322579999999 s | 0.001020387 s | 1.01 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IPartOpt / cuda / PreRev | 0.001032578 s | 0.0010213779999999 s | 1.01 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IPartOpt / cuda / PostRev | 0.000978913 s | 0.000972674 s | 1.01 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IPartOpt / cuda / BothRev | 0.001030849 s | 0.001019586 s | 1.01 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / DefOpt / cuda / PreRev | 0.0010284489999999 s | 0.001016707 s | 1.01 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / DefOpt / cuda / PostRev | 0.000965665 s | 0.0009542429999999 s | 1.01 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / DefOpt / cuda / BothRev | 0.00102973 s | 0.001016771 s | 1.01 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IDefOpt / cuda / PreRev | 0.00102669 s | 0.001017827 s | 1.01 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IDefOpt / cuda / PostRev | 0.001027106 s | 0.001017539 s | 1.01 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IDefOpt / cuda / BothRev | 0.00102733 s | 0.001015939 s | 1.01 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / JaXPipe / cpu / Primal | 0.002059492 s | 0.0011389066000447 s | 1.81 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / Jax / cpu / Primal | 0.002013479 s | 0.0009614714001145 s | 2.09 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / HLOOpt / cpu / Primal | 0.002000168 s | 0.0009669938000115 s | 2.07 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / PartOpt / cpu / Primal | 0.0022017259999999 s | 0.0009036465999088 s | 2.44 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IPartOpt / cpu / Primal | 0.002330934 s | 0.0009796298000765 s | 2.38 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / DefOpt / cpu / Primal | 0.002076994 s | 0.0009802251999644 s | 2.12 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IDefOpt / cpu / Primal | 0.002070974 s | 0.0009544247999656 s | 2.17 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / JaXPipe / cpu / Forward | 0.005512387 s | 0.0029485540000678 s | 1.87 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / Jax / cpu / Forward | 0.005161339 s | 0.0024281695999889 s | 2.13 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / HLOOpt / cpu / Forward | 0.005069082 s | 0.0023517207999248 s | 2.16 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / PartOpt / cpu / Forward | 0.005971315 s | 0.0023108835999664 s | 2.58 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IPartOpt / cpu / Forward | 0.005181494 s | 0.0023455092001313 s | 2.21 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / DefOpt / cpu / Forward | 0.005311074 s | 0.0025466422000135 s | 2.09 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IDefOpt / cpu / Forward | 0.0053584889999999 s | 0.0022995338001237 s | 2.33 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / JaXPipe / cpu / PreRev | 0.008667888 s | 0.0056316531999982 s | 1.54 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / JaXPipe / cpu / PostRev | 0.008521015 s | 0.0058624548000807 s | 1.45 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / JaXPipe / cpu / BothRev | 0.008189748 s | 0.0052789269999266 s | 1.55 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / Jax / cpu / BothRev | 0.009757756 s | 0.0053872159998718 s | 1.81 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / HLOOpt / cpu / PreRev | 0.008251333 s | 0.0053421269999489 s | 1.54 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / HLOOpt / cpu / PostRev | 0.00786702 s | 0.0050962933999471 s | 1.54 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / HLOOpt / cpu / BothRev | 0.0107104909999999 s | 0.0053552024000055 s | 2.00 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / PartOpt / cpu / PreRev | 0.007837765 s | 0.0045724182000412 s | 1.71 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / PartOpt / cpu / PostRev | 0.008512463 s | 0.0057914356001674 s | 1.47 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / PartOpt / cpu / BothRev | 0.008575216 s | 0.0035167963999811 s | 2.44 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IPartOpt / cpu / PreRev | 0.008703698 s | 0.0053226364001602 s | 1.64 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IPartOpt / cpu / PostRev | 0.008490905 s | 0.003688061199864 s | 2.30 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IPartOpt / cpu / BothRev | 0.008134477 s | 0.0053972733999216 s | 1.51 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / DefOpt / cpu / PreRev | 0.0080831079999999 s | 0.004328001200065 s | 1.87 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / DefOpt / cpu / PostRev | 0.0078451919999999 s | 0.0051587899999503 s | 1.52 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / DefOpt / cpu / BothRev | 0.009053359 s | 0.0035836637999636 s | 2.53 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IDefOpt / cpu / PreRev | 0.008424411 s | 0.0053208182000162 s | 1.58 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IDefOpt / cpu / PostRev | 0.00832173 s | 0.0035166720000233 s | 2.37 |
| llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IDefOpt / cpu / BothRev | 0.0081787919999999 s | 0.0053604224000082 s | 1.53 |
| scatter_sum / JaXPipe / cpu / Primal | 0.00000972508000813832 s | 0.00001004172002467385 s | 0.97 |
| scatter_sum / Jax / cpu / Primal | 0.000008742420031921938 s | 0.000009407980005562422 s | 0.93 |
| scatter_sum / HLOOpt / cpu / Primal | 0.00000850580001497292 s | 0.000012474039995140628 s | 0.68 |
| scatter_sum / PartOpt / cpu / Primal | 0.000008469579979646369 s | 0.000008274519987025996 s | 1.02 |
| scatter_sum / IPartOpt / cpu / Primal | 0.000008990079995783163 s | 0.000008602779998909683 s | 1.05 |
| scatter_sum / DefOpt / cpu / Primal | 0.00000817566001387604 s | 0.000008273100002043066 s | 0.99 |
| scatter_sum / IDefOpt / cpu / Primal | 0.00000844290001623449 s | 0.000008372360016437596 s | 1.01 |
| scatter_sum / JaXPipe / cpu / Forward | 0.00001362705997962621 s | 0.000013005980044908935 s | 1.05 |
| scatter_sum / Jax / cpu / Forward | 0.00001384859994686849 s | 0.00001275019997592608 s | 1.09 |
| scatter_sum / HLOOpt / cpu / Forward | 0.00001388544001201808 s | 0.00001813943997149181 s | 0.77 |
| scatter_sum / PartOpt / cpu / Forward | 0.000013663940017067943 s | 0.00001831355997637729 s | 0.75 |
| scatter_sum / IPartOpt / cpu / Forward | 0.000014187879969540518 s | 0.000012779299959220224 s | 1.11 |
| scatter_sum / DefOpt / cpu / Forward | 0.000013570700039053918 s | 0.000018687720003072174 s | 0.73 |
| scatter_sum / IDefOpt / cpu / Forward | 0.000014151500017760556 s | 0.00001276049996704387 s | 1.11 |
| scatter_sum / JaXPipe / cpu / PreRev | 0.000013773059963568812 s | 0.000013078879983368096 s | 1.05 |
| scatter_sum / JaXPipe / cpu / PostRev | 0.000014089119977143128 s | 0.000012596560036399751 s | 1.12 |
| scatter_sum / JaXPipe / cpu / BothRev | 0.000014170099993862096 s | 0.000012585200020112095 s | 1.13 |
| scatter_sum / Jax / cpu / BothRev | 0.000013238119981906491 s | 0.00001235963996805367 s | 1.07 |
| scatter_sum / HLOOpt / cpu / PreRev | 0.000013299800011736806 s | 0.00001306199997998192 s | 1.02 |
| scatter_sum / HLOOpt / cpu / PostRev | 0.00001598886000465427 s | 0.00001713214001938468 s | 0.93 |
| scatter_sum / HLOOpt / cpu / BothRev | 0.000013662780002050568 s | 0.00002021306001552148 s | 0.68 |
| scatter_sum / PartOpt / cpu / PreRev | 0.000013532059974750154 s | 0.000012858079999205077 s | 1.05 |
| scatter_sum / PartOpt / cpu / PostRev | 0.000013680439978998038 s | 0.000013021439999647554 s | 1.05 |
| scatter_sum / PartOpt / cpu / BothRev | 0.000013951259998066234 s | 0.00001270738001949212 s | 1.10 |
| scatter_sum / IPartOpt / cpu / PreRev | 0.000013349480013857828 s | 0.000013217800005804748 s | 1.01 |
| scatter_sum / IPartOpt / cpu / PostRev | 0.000013620720028484356 s | 0.000013108480006849276 s | 1.04 |
| scatter_sum / IPartOpt / cpu / BothRev | 0.00001436140000805608 s | 0.000012461719979910412 s | 1.15 |
| scatter_sum / DefOpt / cpu / PreRev | 0.000013089760004731945 s | 0.00001264913998056727 s | 1.03 |
| scatter_sum / DefOpt / cpu / PostRev | 0.000013845579996996091 s | 0.0000127456999689457 s | 1.09 |
| scatter_sum / DefOpt / cpu / BothRev | 0.000013906799958931516 s | 0.0000126665800507908 s | 1.10 |
| scatter_sum / IDefOpt / cpu / PreRev | 0.00001309300004322722 s | 0.0000127709199932724 s | 1.03 |
| scatter_sum / IDefOpt / cpu / PostRev | 0.000014226699931896292 s | 0.00001324225998359907 s | 1.07 |
| scatter_sum / IDefOpt / cpu / BothRev | 0.00001410080002642644 s | 0.000013151660004950828 s | 1.07 |
| scatter_sum / JaXPipe / cuda / Primal | 0.000009567 s | 0.000010464 s | 0.91 |
| scatter_sum / Jax / cuda / Primal | 0.000009792 s | 0.000010464 s | 0.94 |
| scatter_sum / HLOOpt / cuda / Primal | 0.000009889 s | 0.00001024 s | 0.97 |
| scatter_sum / PartOpt / cuda / Primal | 0.000010016 s | 0.000010368 s | 0.97 |
| scatter_sum / IPartOpt / cuda / Primal | 0.000009984 s | 0.000010048 s | 0.99 |
| scatter_sum / DefOpt / cuda / Primal | 0.000009856 s | 0.000010176 s | 0.97 |
| scatter_sum / IDefOpt / cuda / Primal | 0.0000112 s | 0.000010497 s | 1.07 |
| scatter_sum / JaXPipe / cuda / Forward | 0.00001712 s | 0.000017472 s | 0.98 |
| scatter_sum / Jax / cuda / Forward | 0.000016768000000000003 s | 0.00001744 s | 0.96 |
| scatter_sum / HLOOpt / cuda / Forward | 0.000016288 s | 0.000017408 s | 0.94 |
| scatter_sum / PartOpt / cuda / Forward | 0.000017024 s | 0.000017824 s | 0.96 |
| scatter_sum / IPartOpt / cuda / Forward | 0.000016736 s | 0.000017569 s | 0.95 |
| scatter_sum / DefOpt / cuda / Forward | 0.000017375999999999998 s | 0.000017984 s | 0.97 |
| scatter_sum / IDefOpt / cuda / Forward | 0.000017408 s | 0.000018272 s | 0.95 |
| scatter_sum / JaXPipe / cuda / PreRev | 0.0000168 s | 0.000018112 s | 0.93 |
| scatter_sum / JaXPipe / cuda / PostRev | 0.000016832 s | 0.000018144 s | 0.93 |
| scatter_sum / JaXPipe / cuda / BothRev | 0.000016704 s | 0.00002016 s | 0.83 |
| scatter_sum / Jax / cuda / BothRev | 0.000016063999999999997 s | 0.000019776 s | 0.81 |
| scatter_sum / HLOOpt / cuda / PreRev | 0.000017024999999999997 s | 0.000020128 s | 0.85 |
| scatter_sum / HLOOpt / cuda / PostRev | 0.000016192 s | 0.000018207 s | 0.89 |
| scatter_sum / HLOOpt / cuda / BothRev | 0.000016608 s | 0.000017919999999999998 s | 0.93 |
| scatter_sum / PartOpt / cuda / PreRev | 0.000016832 s | 0.000018047 s | 0.93 |
| scatter_sum / PartOpt / cuda / PostRev | 0.00001632 s | 0.000018144 s | 0.90 |
| scatter_sum / PartOpt / cuda / BothRev | 0.000016704 s | 0.000018144 s | 0.92 |
| scatter_sum / IPartOpt / cuda / PreRev | 0.000016832 s | 0.000018208 s | 0.92 |
| scatter_sum / IPartOpt / cuda / PostRev | 0.000016864 s | 0.00001728 s | 0.98 |
| scatter_sum / IPartOpt / cuda / BothRev | 0.000016128 s | 0.000017696 s | 0.91 |
| scatter_sum / DefOpt / cuda / PreRev | 0.000017312 s | 0.00002016 s | 0.86 |
| scatter_sum / DefOpt / cuda / PostRev | 0.000016224 s | 0.000017793 s | 0.91 |
| scatter_sum / DefOpt / cuda / BothRev | 0.000016736 s | 0.000018016 s | 0.93 |
| scatter_sum / IDefOpt / cuda / PreRev | 0.000017184 s | 0.000018112 s | 0.95 |
| scatter_sum / IDefOpt / cuda / PostRev | 0.000016737 s | 0.000018016 s | 0.93 |
| scatter_sum / IDefOpt / cuda / BothRev | 0.000016512 s | 0.000018176 s | 0.91 |
| scatter_sum / JaXPipe / cpu / Primal | 0.000015781 s | 0.00001004172002467385 s | 1.57 |
| scatter_sum / Jax / cpu / Primal | 0.000015734000000000002 s | 0.000009407980005562422 s | 1.67 |
| scatter_sum / HLOOpt / cpu / Primal | 0.000015612 s | 0.000012474039995140628 s | 1.25 |
| scatter_sum / PartOpt / cpu / Primal | 0.000015632 s | 0.000008274519987025996 s | 1.89 |
| scatter_sum / IPartOpt / cpu / Primal | 0.000015929999999999998 s | 0.000008602779998909683 s | 1.85 |
| scatter_sum / DefOpt / cpu / Primal | 0.000015642 s | 0.000008273100002043066 s | 1.89 |
| scatter_sum / IDefOpt / cpu / Primal | 0.000015723000000000002 s | 0.000008372360016437596 s | 1.88 |
| scatter_sum / JaXPipe / cpu / Forward | 0.000022744 s | 0.000013005980044908935 s | 1.75 |
| scatter_sum / Jax / cpu / Forward | 0.000022946 s | 0.00001275019997592608 s | 1.80 |
| scatter_sum / HLOOpt / cpu / Forward | 0.000022973 s | 0.00001813943997149181 s | 1.27 |
| scatter_sum / PartOpt / cpu / Forward | 0.000023566 s | 0.00001831355997637729 s | 1.29 |
| scatter_sum / IPartOpt / cpu / Forward | 0.000023046 s | 0.000012779299959220224 s | 1.80 |
| scatter_sum / DefOpt / cpu / Forward | 0.000023324 s | 0.000018687720003072174 s | 1.25 |
| scatter_sum / IDefOpt / cpu / Forward | 0.000022714000000000003 s | 0.00001276049996704387 s | 1.78 |
| scatter_sum / JaXPipe / cpu / PreRev | 0.000022472 s | 0.000013078879983368096 s | 1.72 |
| scatter_sum / JaXPipe / cpu / PostRev | 0.000022528 s | 0.000012596560036399751 s | 1.79 |
| scatter_sum / JaXPipe / cpu / BothRev | 0.000022815 s | 0.000012585200020112095 s | 1.81 |
| scatter_sum / Jax / cpu / BothRev | 0.000022344 s | 0.00001235963996805367 s | 1.81 |
| scatter_sum / HLOOpt / cpu / PreRev | 0.000022969 s | 0.00001306199997998192 s | 1.76 |
| scatter_sum / HLOOpt / cpu / PostRev | 0.000023275 s | 0.00001713214001938468 s | 1.36 |
| scatter_sum / HLOOpt / cpu / BothRev | 0.000033007000000000004 s | 0.00002021306001552148 s | 1.63 |
| scatter_sum / PartOpt / cpu / PreRev | 0.00002243 s | 0.000012858079999205077 s | 1.74 |
| scatter_sum / PartOpt / cpu / PostRev | 0.000022138 s | 0.000013021439999647554 s | 1.70 |
| scatter_sum / PartOpt / cpu / BothRev | 0.000022731 s | 0.00001270738001949212 s | 1.79 |
| scatter_sum / IPartOpt / cpu / PreRev | 0.000023048 s | 0.000013217800005804748 s | 1.74 |
| scatter_sum / IPartOpt / cpu / PostRev | 0.000022638 s | 0.000013108480006849276 s | 1.73 |
| scatter_sum / IPartOpt / cpu / BothRev | 0.000022448 s | 0.000012461719979910412 s | 1.80 |
| scatter_sum / DefOpt / cpu / PreRev | 0.000023491000000000003 s | 0.00001264913998056727 s | 1.86 |
| scatter_sum / DefOpt / cpu / PostRev | 0.000022474 s | 0.0000127456999689457 s | 1.76 |
| scatter_sum / DefOpt / cpu / BothRev | 0.000022499000000000003 s | 0.0000126665800507908 s | 1.78 |
| scatter_sum / IDefOpt / cpu / PreRev | 0.000022935 s | 0.0000127709199932724 s | 1.80 |
| scatter_sum / IDefOpt / cpu / PostRev | 0.000023147 s | 0.00001324225998359907 s | 1.75 |
| scatter_sum / IDefOpt / cpu / BothRev | 0.00002302 s | 0.000013151660004950828 s | 1.75 |
| slicing / JaXPipe / cpu / Primal | 0.000006961219942240859 s | 0.000007570659972770954 s | 0.92 |
| slicing / Jax / cpu / Primal | 0.000006587300040337141 s | 0.000006614699987039785 s | 1.00 |
| slicing / HLOOpt / cpu / Primal | 0.000007017339985395665 s | 0.000010874300032810425 s | 0.65 |
| slicing / PartOpt / cpu / Primal | 0.000006776899945180048 s | 0.000006771719990865677 s | 1.00 |
| slicing / IPartOpt / cpu / Primal | 0.000007537599976785714 s | 0.000007026199973552139 s | 1.07 |
| slicing / DefOpt / cpu / Primal | 0.0000071145999936561565 s | 0.000011422439993111766 s | 0.62 |
| slicing / IDefOpt / cpu / Primal | 0.00000717582001016126 s | 0.000006992520020503435 s | 1.03 |
| slicing / JaXPipe / cpu / Forward | 0.00001087349996851117 s | 0.000010335900005884469 s | 1.05 |
| slicing / Jax / cpu / Forward | 0.000010363239989601424 s | 0.0000114975600172329 s | 0.90 |
| slicing / HLOOpt / cpu / Forward | 0.000010758779981188126 s | 0.000014506579973385669 s | 0.74 |
| slicing / PartOpt / cpu / Forward | 0.000010333779982829585 s | 0.000014642859960076748 s | 0.71 |
| slicing / IPartOpt / cpu / Forward | 0.000010521219983274933 s | 0.000010182800015172688 s | 1.03 |
| slicing / DefOpt / cpu / Forward | 0.000011079019977842108 s | 0.000014841600022919013 s | 0.75 |
| slicing / IDefOpt / cpu / Forward | 0.00001045469996824977 s | 0.0000099242199848959 s | 1.05 |
| slicing / JaXPipe / cpu / PreRev | 0.00001092337996851711 s | 0.000011109560018667253 s | 0.98 |
| slicing / JaXPipe / cpu / PostRev | 0.000011358179963281145 s | 0.00001135981993684254 s | 1.00 |
| slicing / JaXPipe / cpu / BothRev | 0.000011218099998586695 s | 0.000014937100022507366 s | 0.75 |
| slicing / Jax / cpu / BothRev | 0.000010701860001063323 s | 0.000011023059978469973 s | 0.97 |
| slicing / HLOOpt / cpu / PreRev | 0.00001125627998590062 s | 0.000011034760009351884 s | 1.02 |
| slicing / HLOOpt / cpu / PostRev | 0.000013229220021457876 s | 0.00001136896004936716 s | 1.16 |
| slicing / HLOOpt / cpu / BothRev | 0.000011079880050601786 s | 0.000012524459989435857 s | 0.88 |
| slicing / PartOpt / cpu / PreRev | 0.00001139243999205064 s | 0.000010715039989008802 s | 1.06 |
| slicing / PartOpt / cpu / PostRev | 0.000011565360027816496 s | 0.000011674460020003608 s | 0.99 |
| slicing / PartOpt / cpu / BothRev | 0.00001119515998652787 s | 0.000010953879991575375 s | 1.02 |
slicing / IPartOpt / cpu / PreRev |
0.00001151464000940905 s |
0.00001082678002603643 s |
1.06 |
slicing / IPartOpt / cpu / PostRev |
0.00001172587998553354 s |
0.000011218379995625584 s |
1.05 |
slicing / IPartOpt / cpu / BothRev |
0.000010626199991747853 s |
0.00001056629998856806 s |
1.01 |
slicing / DefOpt / cpu / PreRev |
0.000011276040040684166 s |
0.00001046529998347978 s |
1.08 |
slicing / DefOpt / cpu / PostRev |
0.00001134040002398251 s |
0.000011711559973264229 s |
0.97 |
slicing / DefOpt / cpu / BothRev |
0.000010638159947120583 s |
0.000010657140037437783 s |
1.00 |
slicing / IDefOpt / cpu / PreRev |
0.000011101899990535458 s |
0.00001072921995728393 s |
1.03 |
slicing / IDefOpt / cpu / PostRev |
0.00001081748002434324 s |
0.000011978920001638471 s |
0.90 |
slicing / IDefOpt / cpu / BothRev |
0.000011017480046575656 s |
0.000010620360008033458 s |
1.04 |
slicing / JaXPipe / cuda / Primal |
0.000001887 s |
0.000001887 s |
1 |
slicing / Jax / cuda / Primal |
0.000001887 s |
0.000001887 s |
1 |
slicing / HLOOpt / cuda / Primal |
0.000001888 s |
0.000001887 s |
1.00 |
slicing / PartOpt / cuda / Primal |
0.000001888 s |
0.000001887 s |
1.00 |
slicing / IPartOpt / cuda / Primal |
0.000001887 s |
0.000001887 s |
1 |
slicing / DefOpt / cuda / Primal |
0.000001888 s |
0.000001887 s |
1.00 |
slicing / IDefOpt / cuda / Primal |
0.000001888 s |
0.000001887 s |
1.00 |
slicing / JaXPipe / cuda / Forward |
0.000009696 s |
0.000010368 s |
0.94 |
slicing / Jax / cuda / Forward |
0.000009536 s |
0.000010144 s |
0.94 |
slicing / HLOOpt / cuda / Forward |
0.000009856 s |
0.00000976 s |
1.01 |
slicing / PartOpt / cuda / Forward |
0.000009792 s |
0.000010368 s |
0.94 |
slicing / IPartOpt / cuda / Forward |
0.00000992 s |
0.000010177 s |
0.97 |
slicing / DefOpt / cuda / Forward |
0.0000096 s |
0.000011681 s |
0.82 |
slicing / IDefOpt / cuda / Forward |
0.000009824 s |
0.000010016 s |
0.98 |
slicing / JaXPipe / cuda / PreRev |
0.000009567 s |
0.000010656 s |
0.90 |
slicing / JaXPipe / cuda / PostRev |
0.000010432 s |
0.000010176 s |
1.03 |
slicing / JaXPipe / cuda / BothRev |
0.000009537 s |
0.000010848 s |
0.88 |
slicing / Jax / cuda / BothRev |
0.000010048 s |
0.000011616 s |
0.87 |
slicing / HLOOpt / cuda / PreRev |
0.000010112 s |
0.000010625 s |
0.95 |
slicing / HLOOpt / cuda / PostRev |
0.000009856 s |
0.000010656 s |
0.92 |
slicing / HLOOpt / cuda / BothRev |
0.000009504 s |
0.000011616 s |
0.82 |
slicing / PartOpt / cuda / PreRev |
0.000010048 s |
0.000010593 s |
0.95 |
slicing / PartOpt / cuda / PostRev |
0.000010017 s |
0.000009792 s |
1.02 |
slicing / PartOpt / cuda / BothRev |
0.000010047 s |
0.000010816 s |
0.93 |
slicing / IPartOpt / cuda / PreRev |
0.000010176 s |
0.00001056 s |
0.96 |
slicing / IPartOpt / cuda / PostRev |
0.000010016 s |
0.000011744 s |
0.85 |
slicing / IPartOpt / cuda / BothRev |
0.000010016 s |
0.00001184 s |
0.85 |
slicing / DefOpt / cuda / PreRev |
0.000009887 s |
0.000010687 s |
0.93 |
slicing / DefOpt / cuda / PostRev |
0.000009696 s |
0.000009536 s |
1.02 |
slicing / DefOpt / cuda / BothRev |
0.000009824 s |
0.000010529 s |
0.93 |
slicing / IDefOpt / cuda / PreRev |
0.000010048 s |
0.00001056 s |
0.95 |
slicing / IDefOpt / cuda / PostRev |
0.000009696 s |
0.000010496 s |
0.92 |
slicing / IDefOpt / cuda / BothRev |
0.000010016 s |
0.00001056 s |
0.95 |
slicing / JaXPipe / cpu / Primal |
0.000012674 s |
0.000007570659972770954 s |
1.67 |
slicing / Jax / cpu / Primal |
0.000020345 s |
0.000006614699987039785 s |
3.08 |
slicing / HLOOpt / cpu / Primal |
0.000012514 s |
0.000010874300032810425 s |
1.15 |
slicing / PartOpt / cpu / Primal |
0.000012582 s |
0.000006771719990865677 s |
1.86 |
slicing / IPartOpt / cpu / Primal |
0.000012498 s |
0.000007026199973552139 s |
1.78 |
slicing / DefOpt / cpu / Primal |
0.000012673 s |
0.000011422439993111766 s |
1.11 |
slicing / IDefOpt / cpu / Primal |
0.000012555 s |
0.000006992520020503435 s |
1.80 |
slicing / JaXPipe / cpu / Forward |
0.000017147 s |
0.000010335900005884469 s |
1.66 |
slicing / Jax / cpu / Forward |
0.000017 s |
0.0000114975600172329 s |
1.48 |
slicing / HLOOpt / cpu / Forward |
0.000017046 s |
0.000014506579973385669 s |
1.18 |
slicing / PartOpt / cpu / Forward |
0.000016743999999999998 s |
0.000014642859960076748 s |
1.14 |
slicing / IPartOpt / cpu / Forward |
0.000016868999999999997 s |
0.000010182800015172688 s |
1.66 |
slicing / DefOpt / cpu / Forward |
0.000017109 s |
0.000014841600022919013 s |
1.15 |
slicing / IDefOpt / cpu / Forward |
0.000016910000000000002 s |
0.0000099242199848959 s |
1.70 |
slicing / JaXPipe / cpu / PreRev |
0.000017714 s |
0.000011109560018667253 s |
1.59 |
slicing / JaXPipe / cpu / PostRev |
0.000017565000000000002 s |
0.00001135981993684254 s |
1.55 |
slicing / JaXPipe / cpu / BothRev |
0.00001726 s |
0.000014937100022507366 s |
1.16 |
slicing / Jax / cpu / BothRev |
0.000017538 s |
0.000011023059978469973 s |
1.59 |
slicing / HLOOpt / cpu / PreRev |
0.000017463 s |
0.000011034760009351884 s |
1.58 |
slicing / HLOOpt / cpu / PostRev |
0.0000174 s |
0.00001136896004936716 s |
1.53 |
slicing / HLOOpt / cpu / BothRev |
0.000017322 s |
0.000012524459989435857 s |
1.38 |
slicing / PartOpt / cpu / PreRev |
0.000017204 s |
0.000010715039989008802 s |
1.61 |
slicing / PartOpt / cpu / PostRev |
0.000017019 s |
0.000011674460020003608 s |
1.46 |
slicing / PartOpt / cpu / BothRev |
0.000017534000000000002 s |
0.000010953879991575375 s |
1.60 |
slicing / IPartOpt / cpu / PreRev |
0.000017353 s |
0.00001082678002603643 s |
1.60 |
slicing / IPartOpt / cpu / PostRev |
0.000017593000000000002 s |
0.000011218379995625584 s |
1.57 |
slicing / IPartOpt / cpu / BothRev |
0.000017426 s |
0.00001056629998856806 s |
1.65 |
slicing / DefOpt / cpu / PreRev |
0.000017433 s |
0.00001046529998347978 s |
1.67 |
slicing / DefOpt / cpu / PostRev |
0.000017373 s |
0.000011711559973264229 s |
1.48 |
slicing / DefOpt / cpu / BothRev |
0.000017459 s |
0.000010657140037437783 s |
1.64 |
slicing / IDefOpt / cpu / PreRev |
0.000017423999999999998 s |
0.00001072921995728393 s |
1.62 |
slicing / IDefOpt / cpu / PostRev |
0.000017361999999999997 s |
0.000011978920001638471 s |
1.45 |
slicing / IDefOpt / cpu / BothRev |
0.000017389000000000002 s |
0.000010620360008033458 s |
1.64 |
sum / JaXPipe / cpu / Primal |
0.000008106539953587343 s |
0.00000879642001564207 s |
0.92 |
sum / Jax / cpu / Primal |
0.000008307740017698961 s |
0.000007896020024418249 s |
1.05 |
sum / HLOOpt / cpu / Primal |
0.00000815605998468527 s |
0.00001246118001290597 s |
0.65 |
sum / PartOpt / cpu / Primal |
0.000008652859987705597 s |
0.000008450039977105917 s |
1.02 |
sum / IPartOpt / cpu / Primal |
0.000008399719999943045 s |
0.000008476499997414067 s |
0.99 |
sum / DefOpt / cpu / Primal |
0.000008127160026560886 s |
0.000012441299986676311 s |
0.65 |
sum / IDefOpt / cpu / Primal |
0.000007980719983606832 s |
0.000008302099977299805 s |
0.96 |
sum / JaXPipe / cpu / Forward |
0.000012733200028378631 s |
0.000012332720007179888 s |
1.03 |
sum / Jax / cpu / Forward |
0.000011883179986398318 s |
0.000012614460001714178 s |
0.94 |
sum / HLOOpt / cpu / Forward |
0.000013075720053166151 s |
0.00001719820002108463 s |
0.76 |
sum / PartOpt / cpu / Forward |
0.000012884019979537695 s |
0.000016864640010680886 s |
0.76 |
sum / IPartOpt / cpu / Forward |
0.000012988420012334244 s |
0.000012690599969573669 s |
1.02 |
sum / DefOpt / cpu / Forward |
0.000012722919982479652 s |
0.000017216219976035064 s |
0.74 |
sum / IDefOpt / cpu / Forward |
0.000012474299983296078 s |
0.000012815299996873363 s |
0.97 |
sum / JaXPipe / cpu / PreRev |
0.00001254475999303395 s |
0.00001255094000043755 s |
1.00 |
sum / JaXPipe / cpu / PostRev |
0.000012328680013524715 s |
0.00001229197998327436 s |
1.00 |
sum / JaXPipe / cpu / BothRev |
0.000012259679979251814 s |
0.00001570249994074402 s |
0.78 |
sum / Jax / cpu / BothRev |
0.000011711559955074337 s |
0.00001219255997966684 s |
0.96 |
sum / HLOOpt / cpu / PreRev |
0.000012747319970003443 s |
0.00001153561995124619 s |
1.11 |
sum / HLOOpt / cpu / PostRev |
0.000013985319974381129 s |
0.00001561490000312915 s |
0.90 |
sum / HLOOpt / cpu / BothRev |
0.000012278380017960444 s |
0.000013388559946179155 s |
0.92 |
sum / PartOpt / cpu / PreRev |
0.00001254334003533586 s |
0.00001177519997327181 s |
1.07 |
sum / PartOpt / cpu / PostRev |
0.000012013340019620954 s |
0.000011917339970750615 s |
1.01 |
sum / PartOpt / cpu / BothRev |
0.00001297390002036991 s |
0.000011633939984676544 s |
1.12 |
sum / IPartOpt / cpu / PreRev |
0.000012279720012884354 s |
0.000015914320001684246 s |
0.77 |
sum / IPartOpt / cpu / PostRev |
0.000012229979984113016 s |
0.00001144826000199828 s |
1.07 |
sum / IPartOpt / cpu / BothRev |
0.000012115379986425978 s |
0.000011945719979848943 s |
1.01 |
sum / DefOpt / cpu / PreRev |
0.00001228570000421314 s |
0.0000116967799840495 s |
1.05 |
sum / DefOpt / cpu / PostRev |
0.000011966960018980898 s |
0.000012046660003761644 s |
0.99 |
sum / DefOpt / cpu / BothRev |
0.00001214937998156529 s |
0.00001183411999591044 s |
1.03 |
sum / IDefOpt / cpu / PreRev |
0.000011881639993589488 s |
0.000011886359998243278 s |
1.00 |
sum / IDefOpt / cpu / PostRev |
0.000012083139999958804 s |
0.000012097099952370627 s |
1.00 |
sum / IDefOpt / cpu / BothRev |
0.000012212680021548294 s |
0.000011305200005153892 s |
1.08 |
sum / JaXPipe / cuda / Primal |
0.00000208 s |
0.000002047 s |
1.02 |
sum / Jax / cuda / Primal |
0.00000208 s |
0.000002048 s |
1.02 |
sum / HLOOpt / cuda / Primal |
0.00000208 s |
0.000002048 s |
1.02 |
sum / PartOpt / cuda / Primal |
0.00000208 s |
0.000002047 s |
1.02 |
sum / IPartOpt / cuda / Primal |
0.00000208 s |
0.000002047 s |
1.02 |
sum / DefOpt / cuda / Primal |
0.00000208 s |
0.000002047 s |
1.02 |
sum / IDefOpt / cuda / Primal |
0.000002049 s |
0.000002048 s |
1.00 |
sum / JaXPipe / cuda / Forward |
0.000010016 s |
0.00001072 s |
0.93 |
sum / Jax / cuda / Forward |
0.00000992 s |
0.00001056 s |
0.94 |
sum / HLOOpt / cuda / Forward |
0.000009888 s |
0.000010431 s |
0.95 |
sum / PartOpt / cuda / Forward |
0.000009984 s |
0.000010912 s |
0.91 |
sum / IPartOpt / cuda / Forward |
0.000009888 s |
0.000010368 s |
0.95 |
sum / DefOpt / cuda / Forward |
0.000009824 s |
0.00001072 s |
0.92 |
sum / IDefOpt / cuda / Forward |
0.00000992 s |
0.000010367 s |
0.96 |
sum / JaXPipe / cuda / PreRev |
0.000010048 s |
0.000009951 s |
1.01 |
sum / JaXPipe / cuda / PostRev |
0.000009856 s |
0.000009887 s |
1.00 |
sum / JaXPipe / cuda / BothRev |
0.000009888 s |
0.000010176 s |
0.97 |
sum / Jax / cuda / BothRev |
0.000009984 s |
0.0000104 s |
0.96 |
sum / HLOOpt / cuda / PreRev |
0.000010016 s |
0.000010335 s |
0.97 |
sum / HLOOpt / cuda / PostRev |
0.00000992 s |
0.000011712 s |
0.85 |
sum / HLOOpt / cuda / BothRev |
0.000009664 s |
0.000011169 s |
0.87 |
sum / PartOpt / cuda / PreRev |
0.000009984 s |
0.000015966999999999998 s |
0.63 |
sum / PartOpt / cuda / PostRev |
0.000010208 s |
0.000010592 s |
0.96 |
sum / PartOpt / cuda / BothRev |
0.000009408 s |
0.000010496 s |
0.90 |
sum / IPartOpt / cuda / PreRev |
0.000010048 s |
0.0000104 s |
0.97 |
sum / IPartOpt / cuda / PostRev |
0.000009824 s |
0.000010784 s |
0.91 |
sum / IPartOpt / cuda / BothRev |
0.000009952 s |
0.00001072 s |
0.93 |
sum / DefOpt / cuda / PreRev |
0.000009888 s |
0.000010528 s |
0.94 |
sum / DefOpt / cuda / PostRev |
0.000009984 s |
0.000010944 s |
0.91 |
sum / DefOpt / cuda / BothRev |
0.000010848 s |
0.000010464 s |
1.04 |
sum / IDefOpt / cuda / PreRev |
0.000009503 s |
0.000010528 s |
0.90 |
sum / IDefOpt / cuda / PostRev |
0.000009824 s |
0.000010464 s |
0.94 |
sum / IDefOpt / cuda / BothRev |
0.000009792 s |
0.000010368 s |
0.94 |
sum / JaXPipe / cpu / Primal |
0.000014528 s |
0.00000879642001564207 s |
1.65 |
sum / Jax / cpu / Primal |
0.000014329 s |
0.000007896020024418249 s |
1.81 |
sum / HLOOpt / cpu / Primal |
0.00001476 s |
0.00001246118001290597 s |
1.18 |
sum / PartOpt / cpu / Primal |
0.000014609 s |
0.000008450039977105917 s |
1.73 |
sum / IPartOpt / cpu / Primal |
0.000014703 s |
0.000008476499997414067 s |
1.73 |
sum / DefOpt / cpu / Primal |
0.000014557 s |
0.000012441299986676311 s |
1.17 |
sum / IDefOpt / cpu / Primal |
0.000014835 s |
0.000008302099977299805 s |
1.79 |
sum / JaXPipe / cpu / Forward |
0.00002069 s |
0.000012332720007179888 s |
1.68 |
sum / Jax / cpu / Forward |
0.000035537 s |
0.000012614460001714178 s |
2.82 |
sum / HLOOpt / cpu / Forward |
0.00001928 s |
0.00001719820002108463 s |
1.12 |
sum / PartOpt / cpu / Forward |
0.000019881 s |
0.000016864640010680886 s |
1.18 |
sum / IPartOpt / cpu / Forward |
0.000019946 s |
0.000012690599969573669 s |
1.57 |
sum / DefOpt / cpu / Forward |
0.000019951 s |
0.000017216219976035064 s |
1.16 |
sum / IDefOpt / cpu / Forward |
0.000020123 s |
0.000012815299996873363 s |
1.57 |
sum / JaXPipe / cpu / PreRev |
0.000018555 s |
0.00001255094000043755 s |
1.48 |
sum / JaXPipe / cpu / PostRev |
0.000018874 s |
0.00001229197998327436 s |
1.54 |
sum / JaXPipe / cpu / BothRev |
0.00001903 s |
0.00001570249994074402 s |
1.21 |
sum / Jax / cpu / BothRev |
0.00001902 s |
0.00001219255997966684 s |
1.56 |
sum / HLOOpt / cpu / PreRev |
0.000018475 s |
0.00001153561995124619 s |
1.60 |
sum / HLOOpt / cpu / PostRev |
0.000019677 s |
0.00001561490000312915 s |
1.26 |
sum / HLOOpt / cpu / BothRev |
0.000018748 s |
0.000013388559946179155 s |
1.40 |
sum / PartOpt / cpu / PreRev |
0.000018875 s |
0.00001177519997327181 s |
1.60 |
sum / PartOpt / cpu / PostRev |
0.000018892 s |
0.000011917339970750615 s |
1.59 |
sum / PartOpt / cpu / BothRev |
0.000018962 s |
0.000011633939984676544 s |
1.63 |
sum / IPartOpt / cpu / PreRev |
0.000019828 s |
0.000015914320001684246 s |
1.25 |
sum / IPartOpt / cpu / PostRev |
0.000018945 s |
0.00001144826000199828 s |
1.65 |
sum / IPartOpt / cpu / BothRev |
0.000019622 s |
0.000011945719979848943 s |
1.64 |
sum / DefOpt / cpu / PreRev |
0.000019229 s |
0.0000116967799840495 s |
1.64 |
sum / DefOpt / cpu / PostRev |
0.000018565000000000003 s |
0.000012046660003761644 s |
1.54 |
sum / DefOpt / cpu / BothRev |
0.000018885 s |
0.00001183411999591044 s |
1.60 |
sum / IDefOpt / cpu / PreRev |
0.000019138 s |
0.000011886359998243278 s |
1.61 |
sum / IDefOpt / cpu / PostRev |
0.000019005 s |
0.000012097099952370627 s |
1.57 |
sum / IDefOpt / cpu / BothRev |
0.000018705 s |
0.000011305200005153892 s |
1.65 |
value_and_grad / JaXPipe / cpu / Primal |
0.000015510279999944033 s |
0.00001624343995899835 s |
0.95 |
value_and_grad / Jax / cpu / Primal |
0.000015272459968400654 s |
0.00001564650000545953 s |
0.98 |
value_and_grad / HLOOpt / cpu / Primal |
0.000015540640015387906 s |
0.000015223200025502592 s |
1.02 |
value_and_grad / PartOpt / cpu / Primal |
0.000014516180017380976 s |
0.000015167499996096012 s |
0.96 |
value_and_grad / IPartOpt / cpu / Primal |
0.000014752859924556105 s |
0.000015559860030407436 s |
0.95 |
value_and_grad / DefOpt / cpu / Primal |
0.00001514715997473104 s |
0.00001555013997858623 s |
0.97 |
value_and_grad / IDefOpt / cpu / Primal |
0.000015440399993167374 s |
0.000015306759996747134 s |
1.01 |
value_and_grad / JaXPipe / cuda / Primal |
0.000033216 s |
0.000033759999999999995 s |
0.98 |
value_and_grad / Jax / cuda / Primal |
0.000033024 s |
0.000034496 s |
0.96 |
value_and_grad / HLOOpt / cuda / Primal |
0.000032512 s |
0.00003408 s |
0.95 |
value_and_grad / PartOpt / cuda / Primal |
0.000032288 s |
0.000033728 s |
0.96 |
value_and_grad / IPartOpt / cuda / Primal |
0.000032864 s |
0.000034048 s |
0.97 |
value_and_grad / DefOpt / cuda / Primal |
0.000032449 s |
0.000033984 s |
0.95 |
value_and_grad / IDefOpt / cuda / Primal |
0.000032832 s |
0.000033825 s |
0.97 |
value_and_grad / JaXPipe / cpu / Primal |
0.000022902 s |
0.00001624343995899835 s |
1.41 |
value_and_grad / Jax / cpu / Primal |
0.000022607 s |
0.00001564650000545953 s |
1.44 |
value_and_grad / HLOOpt / cpu / Primal |
0.000023194 s |
0.000015223200025502592 s |
1.52 |
value_and_grad / PartOpt / cpu / Primal |
0.00002286 s |
0.000015167499996096012 s |
1.51 |
value_and_grad / IPartOpt / cpu / Primal |
0.000022569 s |
0.000015559860030407436 s |
1.45 |
value_and_grad / DefOpt / cpu / Primal |
0.000023229 s |
0.00001555013997858623 s |
1.49 |
value_and_grad / IDefOpt / cpu / Primal |
0.000023317 s |
0.000015306759996747134 s |
1.52 |
jaxmd20 / JaXPipe / cuda / Primal |
0.001446722 s |
0.001498085 s |
0.97 |
jaxmd20 / Jax / cuda / Primal |
0.001479235 s |
0.00143706 s |
1.03 |
jaxmd20 / HLOOpt / cuda / Primal |
0.001084067 s |
0.001077793 s |
1.01 |
jaxmd20 / PartOpt / cuda / Primal |
0.0013091539999999 s |
0.0013310759999999 s |
0.98 |
jaxmd20 / IPartOpt / cuda / Primal |
0.002089961 s |
0.001321539 s |
1.58 |
jaxmd20 / DefOpt / cuda / Primal |
0.0005512 s |
0.000576802 s |
0.96 |
jaxmd20 / IDefOpt / cuda / Primal |
0.000517281 s |
0.000512705 s |
1.01 |
jaxmd20 / JaXPipe / cuda / Forward |
0.000864962 s |
0.00085533 s |
1.01 |
jaxmd20 / Jax / cuda / Forward |
0.0018664019999999 s |
0.001797765 s |
1.04 |
jaxmd20 / HLOOpt / cuda / Forward |
0.000882561 s |
0.000873794 s |
1.01 |
jaxmd20 / PartOpt / cuda / Forward |
0.000882657 s |
0.000860515 s |
1.03 |
jaxmd20 / IPartOpt / cuda / Forward |
0.000872229 s |
0.000874658 s |
1.00 |
jaxmd20 / DefOpt / cuda / Forward |
0.000868611 s |
0.000864994 s |
1.00 |
jaxmd20 / IDefOpt / cuda / Forward |
0.000866562 s |
0.0008576979999999 s |
1.01 |
jaxmd20 / JaXPipe / cuda / PreRev |
0.00177565 s |
0.0017367399999999 s |
1.02 |
jaxmd20 / JaXPipe / cuda / PostRev |
0.005371812 s |
0.005293839 s |
1.01 |
jaxmd20 / JaXPipe / cuda / BothRev |
0.001745891 s |
0.001768356 s |
0.99 |
jaxmd20 / Jax / cuda / BothRev |
0.005328612 s |
0.005272846 s |
1.01 |
jaxmd20 / HLOOpt / cuda / PreRev |
0.001734947 s |
0.001725541 s |
1.01 |
jaxmd20 / HLOOpt / cuda / PostRev |
0.005222755 s |
0.005171214 s |
1.01 |
jaxmd20 / HLOOpt / cuda / BothRev |
0.00164157 s |
0.001644709 s |
1.00 |
jaxmd20 / PartOpt / cuda / PreRev |
0.001836228 s |
0.0017877479999999 s |
1.03 |
jaxmd20 / PartOpt / cuda / PostRev |
0.005503913 s |
0.005330222 s |
1.03 |
jaxmd20 / PartOpt / cuda / BothRev |
0.001732389 s |
0.001718148 s |
1.01 |
jaxmd20 / IPartOpt / cuda / PreRev |
0.001803299 s |
0.001810405 s |
1.00 |
jaxmd20 / IPartOpt / cuda / PostRev |
0.005486759 s |
0.00536619 s |
1.02 |
jaxmd20 / IPartOpt / cuda / BothRev |
0.001743845 s |
0.001705605 s |
1.02 |
jaxmd20 / DefOpt / cuda / PreRev |
0.00182381 s |
0.001828677 s |
1.00 |
jaxmd20 / DefOpt / cuda / PostRev |
0.002778084 s |
0.002758728 s |
1.01 |
jaxmd20 / DefOpt / cuda / BothRev |
0.0017567079999999 s |
0.001713924 s |
1.02 |
jaxmd20 / IDefOpt / cuda / PreRev |
0.001821634 s |
0.001793538 s |
1.02 |
jaxmd20 / IDefOpt / cuda / PostRev |
0.002211588 s |
0.00220023 s |
1.01 |
jaxmd20 / IDefOpt / cuda / BothRev |
0.0017492179999999 s |
0.0017347889999999 s |
1.01 |
jaxmd40 / JaXPipe / cpu / Primal |
0.069749396 s |
0.078501014 s |
0.89 |
jaxmd40 / Jax / cpu / Primal |
0.068745719 s |
0.06135705 s |
1.12 |
jaxmd40 / HLOOpt / cpu / Primal |
0.083372136 s |
0.089335615 s |
0.93 |
jaxmd40 / PartOpt / cpu / Primal |
0.0693396429999999 s |
0.066812787 s |
1.04 |
jaxmd40 / IPartOpt / cpu / Primal |
0.067900529 s |
0.0664021269999999 s |
1.02 |
jaxmd40 / DefOpt / cpu / Primal |
0.085895901 s |
0.096352063 s |
0.89 |
jaxmd40 / IDefOpt / cpu / Primal |
0.087004714 s |
0.094918816 s |
0.92 |
jaxmd40 / JaXPipe / cpu / Forward |
0.153300959 s |
0.178721884 s |
0.86 |
jaxmd40 / Jax / cpu / Forward |
0.081985254 s |
0.085444761 s |
0.96 |
jaxmd40 / HLOOpt / cpu / Forward |
0.1615208729999999 s |
0.172861772 s |
0.93 |
jaxmd40 / PartOpt / cpu / Forward |
0.153138962 s |
0.164854612 s |
0.93 |
jaxmd40 / IPartOpt / cpu / Forward |
0.15558232 s |
0.175380794 s |
0.89 |
jaxmd40 / DefOpt / cpu / Forward |
0.15915119 s |
0.171326453 s |
0.93 |
jaxmd40 / IDefOpt / cpu / Forward |
0.155583368 s |
0.162296659 s |
0.96 |
jaxmd40 / JaXPipe / cpu / PreRev |
0.207109949 s |
0.225926733 s |
0.92 |
jaxmd40 / JaXPipe / cpu / PostRev |
0.136035715 s |
0.140691395 s |
0.97 |
jaxmd40 / JaXPipe / cpu / BothRev |
0.2290441999999999 s |
0.226341833 s |
1.01 |
jaxmd40 / Jax / cpu / BothRev |
0.133088624 s |
0.139427355 s |
0.95 |
jaxmd40 / HLOOpt / cpu / PreRev |
0.248671811 s |
0.233465711 s |
1.07 |
jaxmd40 / HLOOpt / cpu / PostRev |
0.174644899 s |
0.195115829 s |
0.90 |
jaxmd40 / HLOOpt / cpu / BothRev |
0.250072736 s |
0.252705333 s |
0.99 |
jaxmd40 / PartOpt / cpu / PreRev |
0.209427227 s |
0.229384134 s |
0.91 |
jaxmd40 / PartOpt / cpu / PostRev |
0.13048333 s |
0.139846213 s |
0.93 |
jaxmd40 / PartOpt / cpu / BothRev |
0.250974862 s |
0.262421069 s |
0.96 |
jaxmd40 / IPartOpt / cpu / PreRev |
0.215119747 s |
0.2264402069999999 s |
0.95 |
jaxmd40 / IPartOpt / cpu / PostRev |
0.134995106 s |
0.129780184 s |
1.04 |
jaxmd40 / IPartOpt / cpu / BothRev |
0.2495402689999999 s |
0.256815733 s |
0.97 |
jaxmd40 / DefOpt / cpu / PreRev |
0.221662037 s |
0.2222731839999999 s |
1.00 |
jaxmd40 / DefOpt / cpu / PostRev |
0.163847947 s |
0.1827078899999999 s |
0.90 |
jaxmd40 / DefOpt / cpu / BothRev |
0.237772896 s |
0.258181016 s |
0.92 |
jaxmd40 / IDefOpt / cpu / PreRev |
0.228257717 s |
0.213095599 s |
1.07 |
jaxmd40 / IDefOpt / cpu / PostRev |
0.172050083 s |
0.176822343 s |
0.97 |
jaxmd40 / IDefOpt / cpu / BothRev |
0.253138497 s |
0.27048417 s |
0.94 |
jaxley_l5pc / JaXPipe / cuda / Primal |
3.012432446499588 s |
||
jaxley_l5pc / Jax / cuda / Primal |
3.0137086390004697 s |
||
jaxley_l5pc / HLOOpt / cuda / Primal |
2.7824022770009837 s |
||
jaxley_l5pc / PartOpt / cuda / Primal |
3.360719225500361 s |
||
jaxley_l5pc / IPartOpt / cuda / Primal |
3.360365016000287 s |
||
jaxley_l5pc / DefOpt / cuda / Primal |
3.2233221834994765 s |
||
jaxley_l5pc / IDefOpt / cuda / Primal |
2.4428500339990933 s |
||
jaxley_l5pc / JaXPipe / cuda / Forward |
4.394162649498867 s |
||
jaxley_l5pc / Jax / cuda / Forward |
5.681939494999824 s |
||
jaxley_l5pc / HLOOpt / cuda / Forward |
4.5025541609993525 s |
||
jaxley_l5pc / PartOpt / cuda / Forward |
4.394435124000665 s |
||
jaxley_l5pc / IPartOpt / cuda / Forward |
4.392944320999959 s |
||
jaxley_l5pc / DefOpt / cuda / Forward |
4.399738644000536 s |
||
jaxley_l5pc / IDefOpt / cuda / Forward |
4.399920733500039 s |
||
jaxley_l5pc / JaXPipe / cpu / Primal |
1.1059799534999684 s |
||
jaxley_l5pc / Jax / cpu / Primal |
1.0846036344998993 s |
||
jaxley_l5pc / HLOOpt / cpu / Primal |
0.8523874545001036 s |
||
jaxley_l5pc / PartOpt / cpu / Primal |
1.1058975690000352 s |
||
jaxley_l5pc / IPartOpt / cpu / Primal |
1.1513098715000751 s |
||
jaxley_l5pc / DefOpt / cpu / Primal |
1.0940710224999748 s |
||
jaxley_l5pc / IDefOpt / cpu / Primal |
0.9035338644998774 s |
||
jaxley_l5pc / JaXPipe / cpu / Forward |
19.265770481000004 s |
||
jaxley_l5pc / Jax / cpu / Forward |
27.380352829000003 s |
||
jaxley_l5pc / HLOOpt / cpu / Forward |
19.282762799999887 s |
||
jaxley_l5pc / PartOpt / cpu / Forward |
19.30915174100005 s |
||
jaxley_l5pc / IPartOpt / cpu / Forward |
19.30350434249988 s |
||
jaxley_l5pc / DefOpt / cpu / Forward |
19.252223302500056 s |
||
jaxley_l5pc / IDefOpt / cpu / Forward |
19.140816490999896 s |
||
neuralgcm_v1/deterministic_2_8_deg_inner_steps_24_outer_steps_4 / JaXPipe / cuda / Primal |
1.708051732 s |
1.702424068 s |
1.00 |
neuralgcm_v1/deterministic_2_8_deg_inner_steps_24_outer_steps_4 / Jax / cuda / Primal |
1.710539104 s |
1.705451355 s |
1.00 |
neuralgcm_v1/deterministic_2_8_deg_inner_steps_24_outer_steps_4 / HLOOpt / cuda / Primal |
1.720696087 s |
1.715508605 s |
1.00 |
neuralgcm_v1/deterministic_2_8_deg_inner_steps_24_outer_steps_4 / PartOpt / cuda / Primal |
1.700907091 s |
1.6967465709999998 s |
1.00 |
neuralgcm_v1/deterministic_2_8_deg_inner_steps_24_outer_steps_4 / IPartOpt / cuda / Primal |
1.698549633 s |
1.695191784 s |
1.00 |
neuralgcm_v1/deterministic_2_8_deg_inner_steps_24_outer_steps_4 / DefOpt / cuda / Primal |
1.670448631 s |
1.665287495 s |
1.00 |
neuralgcm_v1/deterministic_2_8_deg_inner_steps_24_outer_steps_4 / IDefOpt / cuda / Primal |
1.917920298 s |
1.911164248 s |
1.00 |
neuralgcm_v1/deterministic_2_8_deg_inner_steps_2_outer_steps_2 / JaXPipe / cpu / Primal |
6.008861091 s |
6.319441286 s |
0.95 |
neuralgcm_v1/deterministic_2_8_deg_inner_steps_2_outer_steps_2 / Jax / cpu / Primal |
6.090234929999999 s |
6.091046684 s |
1.00 |
neuralgcm_v1/deterministic_2_8_deg_inner_steps_2_outer_steps_2 / HLOOpt / cpu / Primal |
5.983466196999999 s |
6.052236802 s |
0.99 |
neuralgcm_v1/deterministic_2_8_deg_inner_steps_2_outer_steps_2 / PartOpt / cpu / Primal |
5.937653672000001 s |
6.141555782999999 s |
0.97 |
neuralgcm_v1/deterministic_2_8_deg_inner_steps_2_outer_steps_2 / IPartOpt / cpu / Primal |
6.04406637 s |
6.261373249 s |
0.97 |
neuralgcm_v1/deterministic_2_8_deg_inner_steps_2_outer_steps_2 / DefOpt / cpu / Primal |
2.3659161130000004 s |
2.439859331 s |
0.97 |
neuralgcm_v1/deterministic_2_8_deg_inner_steps_2_outer_steps_2 / IDefOpt / cpu / Primal |
6.417996259 s |
6.681259111999999 s |
0.96 |
This comment was automatically generated by workflow using github-action-benchmark.
Collaborator
Author
~25% speedup, not bad...
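The comment does not say which rows the ~25% figure refers to. As a rough check, the sketch below assumes the comparison is the plain `Jax` pipeline against a faster pipeline on the new `jaxley_l5pc` rows, and computes the fraction of time saved. The timings are copied from the table above. The helper name `time_reduction` and the choice of rows are illustrative assumptions, not part of the benchmark workflow.

```python
# Timings in seconds, copied from the jaxley_l5pc rows of the benchmark table.
# Which rows the "~25%" remark refers to is an assumption; these are the
# Jax baseline and one faster pipeline per (backend, mode) pair.
jaxley_l5pc = {
    ("cuda", "Primal"): {"Jax": 3.0137086390004697, "IDefOpt": 2.4428500339990933},
    ("cuda", "Forward"): {"Jax": 5.681939494999824, "JaXPipe": 4.394162649498867},
    ("cpu", "Primal"): {"Jax": 1.0846036344998993, "HLOOpt": 0.8523874545001036},
    ("cpu", "Forward"): {"Jax": 27.380352829000003, "JaXPipe": 19.265770481000004},
}


def time_reduction(baseline_s: float, candidate_s: float) -> float:
    """Fraction of the baseline time saved by the candidate (0.25 means 25% less time)."""
    return 1.0 - candidate_s / baseline_s


# Map (backend, mode, pipeline) -> fraction of time saved relative to Jax.
reductions = {}
for (backend, mode), timings in jaxley_l5pc.items():
    baseline = timings["Jax"]
    for pipeline, seconds in timings.items():
        if pipeline != "Jax":
            reductions[(backend, mode, pipeline)] = time_reduction(baseline, seconds)

for (backend, mode, pipeline), saved in reductions.items():
    print(f"jaxley_l5pc / {pipeline} / {backend} / {mode}: {saved:.1%} less time than Jax")
```

With these rows the reductions fall between roughly 19% and 30%, which is the same order of magnitude as the ~25% quoted above. Note that this is a different quantity from the table's Ratio column, which divides the current commit's time by the previous commit's time for the same pipeline.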