feat: gather iota indexing optimizations #1903
Merged
710321c to 768ce4d
768ce4d to b6bdb77
b6bdb77 to 0018a68
wsmoses approved these changes on Jan 9, 2026
Collaborator, Author: the failure is real; fixing in a follow-up.
Contributor
EnzymeJAX Benchmarks
Details
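The Ratio column in the table below appears to be the Current time divided by the Previous time, rounded to two decimals, so values above 1 mean the current commit ran slower. A minimal sketch that reproduces two of the reported ratios (the helper name `ratio` is hypothetical and is not part of the benchmark tooling):

```python
def ratio(current_s: float, previous_s: float) -> float:
    """Return current / previous, rounded to two decimals like the Ratio column."""
    return round(current_s / previous_s, 2)


# Two rows copied from the table: (current seconds, previous seconds).
rows = {
    "actmtch / JaXPipe / cpu / Primal": (0.000006480840029325918, 0.000007178499872679822),
    "actmtch / DefOpt / cuda / PreRev": (0.000016414999999999998, 0.000011168),
}

for name, (current_s, previous_s) in rows.items():
    r = ratio(current_s, previous_s)
    # A ratio above 1 means the current commit took longer than the previous one.
    verdict = "slower" if r > 1 else "not slower"
    print(f"{name}: {r:.2f} ({verdict})")
```

Timings at this scale are in the microsecond range, so small deviations from 1.00 may simply be measurement noise.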
| Benchmark suite | Current: 0018a68 | Previous: 9749c28 | Ratio |
|---|---|---|---|
| actmtch / JaXPipe / cpu / Primal | 0.000006480840029325918 s | 0.000007178499872679822 s | 0.90 |
| actmtch / Jax / cpu / Primal | 0.000006448240037570941 s | 0.0000069215799157973376 s | 0.93 |
| actmtch / HLOOpt / cpu / Primal | 0.0000069831399468967 s | 0.00000738472012017155 s | 0.95 |
| actmtch / PartOpt / cpu / Primal | 0.000006246259972613188 s | 0.000006676359917037189 s | 0.94 |
| actmtch / IPartOpt / cpu / Primal | 0.000006322779991023708 s | 0.0000066632801463129 s | 0.95 |
| actmtch / DefOpt / cpu / Primal | 0.0000071219199890038 s | 0.0000070232199868769385 s | 1.01 |
| actmtch / IDefOpt / cpu / Primal | 0.000007055779997244827 s | 0.0000071157399361254646 s | 0.99 |
| actmtch / JaXPipe / cpu / Forward | 0.000010433720008222736 s | 0.00001071821985533461 s | 0.97 |
| actmtch / Jax / cpu / Forward | 0.000009771019949766924 s | 0.000010183720041823107 s | 0.96 |
| actmtch / HLOOpt / cpu / Forward | 0.000010915839957306162 s | 0.000011394539869797882 s | 0.96 |
| actmtch / PartOpt / cpu / Forward | 0.000010746339958132012 s | 0.000010248900107399096 s | 1.05 |
| actmtch / IPartOpt / cpu / Forward | 0.00001084615999388916 s | 0.000010885099873121365 s | 1.00 |
| actmtch / DefOpt / cpu / Forward | 0.000010668199993233429 s | 0.000010958880120597312 s | 0.97 |
| actmtch / IDefOpt / cpu / Forward | 0.000010595879984975907 s | 0.000010758259886642918 s | 0.98 |
| actmtch / JaXPipe / cpu / PreRev | 0.00001059808001627971 s | 0.00001075559983291896 s | 0.99 |
| actmtch / JaXPipe / cpu / PostRev | 0.0000099785200018232 s | 0.000010429819885757752 s | 0.96 |
| actmtch / JaXPipe / cpu / BothRev | 0.00001115823998588894 s | 0.000011315619995002636 s | 0.99 |
| actmtch / Jax / cpu / BothRev | 0.000009109599968724067 s | 0.000009239179926225915 s | 0.99 |
| actmtch / HLOOpt / cpu / PreRev | 0.000010915519987975133 s | 0.000010561299859546123 s | 1.03 |
| actmtch / HLOOpt / cpu / PostRev | 0.000012069239955962985 s | 0.000013761459958914202 s | 0.88 |
| actmtch / HLOOpt / cpu / BothRev | 0.000010682180000003428 s | 0.000011175479849043769 s | 0.96 |
| actmtch / PartOpt / cpu / PreRev | 0.00001047332001689938 s | 0.000010472680114617106 s | 1.00 |
| actmtch / PartOpt / cpu / PostRev | 0.000009623299993108958 s | 0.000009276260025217198 s | 1.04 |
| actmtch / PartOpt / cpu / BothRev | 0.000011377439986972604 s | 0.000011865220039908307 s | 0.96 |
| actmtch / IPartOpt / cpu / PreRev | 0.000010196500015808852 s | 0.000011052040244976524 s | 0.92 |
| actmtch / IPartOpt / cpu / PostRev | 0.000009512899996479972 s | 0.000010075359750771894 s | 0.94 |
| actmtch / IPartOpt / cpu / BothRev | 0.000011297240016574506 s | 0.00001128298001276562 s | 1.00 |
| actmtch / DefOpt / cpu / PreRev | 0.000010503240046091378 s | 0.000011099379953520838 s | 0.95 |
| actmtch / DefOpt / cpu / PostRev | 0.00001128756003708986 s | 0.0000108453000211739 s | 1.04 |
| actmtch / DefOpt / cpu / BothRev | 0.000010898359978455118 s | 0.000012118520025978796 s | 0.90 |
| actmtch / IDefOpt / cpu / PreRev | 0.000010891380015891628 s | 0.00001125396007410018 s | 0.97 |
| actmtch / IDefOpt / cpu / PostRev | 0.000011162919990965747 s | 0.000011109960032626988 s | 1.00 |
| actmtch / IDefOpt / cpu / BothRev | 0.000011074839985667497 s | 0.000011985499986622017 s | 0.92 |
| actmtch / JaXPipe / cuda / Primal | 0.0000024 s | 0.0000024 s | 1 |
| actmtch / Jax / cuda / Primal | 0.0000024 s | 0.000002431 s | 0.99 |
| actmtch / HLOOpt / cuda / Primal | 0.0000024 s | 0.000002431 s | 0.99 |
| actmtch / PartOpt / cuda / Primal | 0.0000024 s | 0.0000024 s | 1 |
| actmtch / IPartOpt / cuda / Primal | 0.0000024 s | 0.0000024 s | 1 |
| actmtch / DefOpt / cuda / Primal | 0.0000024 s | 0.0000024 s | 1 |
| actmtch / IDefOpt / cuda / Primal | 0.0000024 s | 0.0000024 s | 1 |
| actmtch / JaXPipe / cuda / Forward | 0.00001056 s | 0.00001104 s | 0.96 |
| actmtch / Jax / cuda / Forward | 0.000010112 s | 0.000010465 s | 0.97 |
| actmtch / HLOOpt / cuda / Forward | 0.000010656 s | 0.000010527 s | 1.01 |
| actmtch / PartOpt / cuda / Forward | 0.000010495 s | 0.000010912 s | 0.96 |
| actmtch / IPartOpt / cuda / Forward | 0.000010432 s | 0.00001056 s | 0.99 |
| actmtch / DefOpt / cuda / Forward | 0.000010976 s | 0.000010592 s | 1.04 |
| actmtch / IDefOpt / cuda / Forward | 0.000010527 s | 0.00001088 s | 0.97 |
| actmtch / JaXPipe / cuda / PreRev | 0.000010912 s | 0.000010592 s | 1.03 |
| actmtch / JaXPipe / cuda / PostRev | 0.00001104 s | 0.000010848 s | 1.02 |
| actmtch / JaXPipe / cuda / BothRev | 0.000010369 s | 0.000010688 s | 0.97 |
| actmtch / Jax / cuda / BothRev | 0.00001072 s | 0.0000104 s | 1.03 |
| actmtch / HLOOpt / cuda / PreRev | 0.000011904 s | 0.000011007 s | 1.08 |
| actmtch / HLOOpt / cuda / PostRev | 0.000011071 s | 0.00001072 s | 1.03 |
| actmtch / HLOOpt / cuda / BothRev | 0.000010656 s | 0.000010751 s | 0.99 |
| actmtch / PartOpt / cuda / PreRev | 0.00001088 s | 0.000010848 s | 1.00 |
| actmtch / PartOpt / cuda / PostRev | 0.000010528 s | 0.000011008 s | 0.96 |
| actmtch / PartOpt / cuda / BothRev | 0.000010784 s | 0.000010656 s | 1.01 |
| actmtch / IPartOpt / cuda / PreRev | 0.00001072 s | 0.000010816 s | 0.99 |
| actmtch / IPartOpt / cuda / PostRev | 0.000010752 s | 0.000010912 s | 0.99 |
| actmtch / IPartOpt / cuda / BothRev | 0.000010847 s | 0.000010847 s | 1 |
| actmtch / DefOpt / cuda / PreRev | 0.000016414999999999998 s | 0.000011168 s | 1.47 |
| actmtch / DefOpt / cuda / PostRev | 0.000011264 s | 0.000011136 s | 1.01 |
| actmtch / DefOpt / cuda / BothRev | 0.000010848 s | 0.000011169 s | 0.97 |
| actmtch / IDefOpt / cuda / PreRev | 0.00000992 s | 0.000011104 s | 0.89 |
| actmtch / IDefOpt / cuda / PostRev | 0.000010912 s | 0.000010304 s | 1.06 |
| actmtch / IDefOpt / cuda / BothRev | 0.000010592 s | 0.000010528 s | 1.01 |
| actmtch / JaXPipe / tpu / Primal | 5.6365e-7 s | 5.823e-7 s | 0.97 |
| actmtch / Jax / tpu / Primal | 5.967e-7 s | 5.63425e-7 s | 1.06 |
| actmtch / HLOOpt / tpu / Primal | 0.000002115525 s | 0.00000216355 s | 0.98 |
| actmtch / PartOpt / tpu / Primal | 5.971000000000001e-7 s | 5.63175e-7 s | 1.06 |
| actmtch / IPartOpt / tpu / Primal | 5.528e-7 s | 5.7545e-7 s | 0.96 |
| actmtch / DefOpt / tpu / Primal | 0.000002174575 s | 0.000002061875 s | 1.05 |
| actmtch / IDefOpt / tpu / Primal | 0.000002107 s | 0.00000216755 s | 0.97 |
| actmtch / JaXPipe / tpu / Forward | 0.000003833274999999999 s | 0.000003854575 s | 0.99 |
| actmtch / Jax / tpu / Forward | 0.000001214475 s | 0.00000123035 s | 0.99 |
| actmtch / HLOOpt / tpu / Forward | 0.000003931875 s | 0.0000036632 s | 1.07 |
| actmtch / PartOpt / tpu / Forward | 0.000003914025 s | 0.000003898975 s | 1.00 |
| actmtch / IPartOpt / tpu / Forward | 0.0000039373250000000005 s | 0.000003676975000000001 s | 1.07 |
| actmtch / DefOpt / tpu / Forward | 0.0000039175 s | 0.000003894825 s | 1.01 |
| actmtch / IDefOpt / tpu / Forward | 0.000003935675 s | 0.0000036601 s | 1.08 |
| actmtch / JaXPipe / tpu / PreRev | 0.000003481625 s | 0.00000374785 s | 0.93 |
| actmtch / JaXPipe / tpu / PostRev | 0.0000016329 s | 0.0000016244500000000002 s | 1.01 |
| actmtch / JaXPipe / tpu / BothRev | 0.00000347945 s | 0.00000374185 s | 0.93 |
| actmtch / Jax / tpu / BothRev | 0.000001641775 s | 0.0000016273499999999998 s | 1.01 |
| actmtch / HLOOpt / tpu / PreRev | 0.0000034899 s | 0.000003769575 s | 0.93 |
| actmtch / HLOOpt / tpu / PostRev | 0.00000340945 s | 0.000003456725 s | 0.99 |
| actmtch / HLOOpt / tpu / BothRev | 0.000003499275 s | 0.000003745975 s | 0.93 |
| actmtch / PartOpt / tpu / PreRev | 0.0000034107500000000003 s | 0.000003455375 s | 0.99 |
| actmtch / PartOpt / tpu / PostRev | 0.000001583175 s | 0.00000167475 s | 0.95 |
| actmtch / PartOpt / tpu / BothRev | 0.000003422175 s | 0.0000034385 s | 1.00 |
| actmtch / IPartOpt / tpu / PreRev | 0.000003486325 s | 0.00000374925 s | 0.93 |
| actmtch / IPartOpt / tpu / PostRev | 0.000001638625 s | 0.000001629225 s | 1.01 |
| actmtch / IPartOpt / tpu / BothRev | 0.000003466625 s | 0.00000374965 s | 0.92 |
| actmtch / DefOpt / tpu / PreRev | 0.000003415425 s | 0.00000344795 s | 0.99 |
| actmtch / DefOpt / tpu / PostRev | 0.000003414125 s | 0.00000366565 s | 0.93 |
| actmtch / DefOpt / tpu / BothRev | 0.000003403625 s | 0.000003442575 s | 0.99 |
| actmtch / IDefOpt / tpu / PreRev | 0.0000034718500000000005 s | 0.000003745325 s | 0.93 |
| actmtch / IDefOpt / tpu / PostRev | 0.00000341105 s | 0.0000034381 s | 0.99 |
| actmtch / IDefOpt / tpu / BothRev | 0.0000034642 s | 0.0000037527500000000006 s | 0.92 |
| actmtch / JaXPipe / cpu / Primal | 0.000013295 s | 0.000007178499872679822 s | 1.85 |
| actmtch / Jax / cpu / Primal | 0.000013471 s | 0.0000069215799157973376 s | 1.95 |
| actmtch / HLOOpt / cpu / Primal | 0.000013977 s | 0.00000738472012017155 s | 1.89 |
| actmtch / PartOpt / cpu / Primal | 0.000013289 s | 0.000006676359917037189 s | 1.99 |
| actmtch / IPartOpt / cpu / Primal | 0.000013722 s | 0.0000066632801463129 s | 2.06 |
| actmtch / DefOpt / cpu / Primal | 0.000014444 s | 0.0000070232199868769385 s | 2.06 |
| actmtch / IDefOpt / cpu / Primal | 0.00001408 s | 0.0000071157399361254646 s | 1.98 |
| actmtch / JaXPipe / cpu / Forward | 0.00001995 s | 0.00001071821985533461 s | 1.86 |
| actmtch / Jax / cpu / Forward | 0.000018292 s | 0.000010183720041823107 s | 1.80 |
| actmtch / HLOOpt / cpu / Forward | 0.000019565 s | 0.000011394539869797882 s | 1.72 |
| actmtch / PartOpt / cpu / Forward | 0.000019646 s | 0.000010248900107399096 s | 1.92 |
| actmtch / IPartOpt / cpu / Forward | 0.000019301 s | 0.000010885099873121365 s | 1.77 |
| actmtch / DefOpt / cpu / Forward | 0.000019767 s | 0.000010958880120597312 s | 1.80 |
| actmtch / IDefOpt / cpu / Forward | 0.000019396 s | 0.000010758259886642918 s | 1.80 |
| actmtch / JaXPipe / cpu / PreRev | 0.000019509 s | 0.00001075559983291896 s | 1.81 |
| actmtch / JaXPipe / cpu / PostRev | 0.000017755 s | 0.000010429819885757752 s | 1.70 |
| actmtch / JaXPipe / cpu / BothRev | 0.000019495 s | 0.000011315619995002636 s | 1.72 |
| actmtch / Jax / cpu / BothRev | 0.000018214 s | 0.000009239179926225915 s | 1.97 |
| actmtch / HLOOpt / cpu / PreRev | 0.00001873 s | 0.000010561299859546123 s | 1.77 |
| actmtch / HLOOpt / cpu / PostRev | 0.000019458 s | 0.000013761459958914202 s | 1.41 |
| actmtch / HLOOpt / cpu / BothRev | 0.000019206 s | 0.000011175479849043769 s | 1.72 |
| actmtch / PartOpt / cpu / PreRev | 0.000019491 s | 0.000010472680114617106 s | 1.86 |
| actmtch / PartOpt / cpu / PostRev | 0.000017958 s | 0.000009276260025217198 s | 1.94 |
| actmtch / PartOpt / cpu / BothRev | 0.000019243 s | 0.000011865220039908307 s | 1.62 |
| actmtch / IPartOpt / cpu / PreRev | 0.000019074 s | 0.000011052040244976524 s | 1.73 |
| actmtch / IPartOpt / cpu / PostRev | 0.000018032 s | 0.000010075359750771894 s | 1.79 |
| actmtch / IPartOpt / cpu / BothRev | 0.000019142 s | 0.00001128298001276562 s | 1.70 |
| actmtch / DefOpt / cpu / PreRev | 0.000018969 s | 0.000011099379953520838 s | 1.71 |
| actmtch / DefOpt / cpu / PostRev | 0.000019113 s | 0.0000108453000211739 s | 1.76 |
| actmtch / DefOpt / cpu / BothRev | 0.000019417 s | 0.000012118520025978796 s | 1.60 |
| actmtch / IDefOpt / cpu / PreRev | 0.000019257 s | 0.00001125396007410018 s | 1.71 |
| actmtch / IDefOpt / cpu / PostRev | 0.000019382 s | 0.000011109960032626988 s | 1.74 |
| actmtch / IDefOpt / cpu / BothRev | 0.000019575 s | 0.000011985499986622017 s | 1.63 |
| add_one / JaXPipe / cpu / Primal | 0.000006749640024281689 s | 0.000006788360224163625 s | 0.99 |
| add_one / Jax / cpu / Primal | 0.000006460519998654491 s | 0.000006639620187343098 s | 0.97 |
| add_one / HLOOpt / cpu / Primal | 0.000006847339964224375 s | 0.000007346800084633287 s | 0.93 |
| add_one / PartOpt / cpu / Primal | 0.000006429439990824904 s | 0.000006595760023628827 s | 0.97 |
| add_one / IPartOpt / cpu / Primal | 0.00000661434001813177 s | 0.000006949459930183366 s | 0.95 |
| add_one / DefOpt / cpu / Primal | 0.000006262640008571907 s | 0.0000064497800485696645 s | 0.97 |
| add_one / IDefOpt / cpu / Primal | 0.000006418959983420791 s | 0.000006538660054502543 s | 0.98 |
| add_one / JaXPipe / cpu / Forward | 0.000009555560000080732 s | 0.000010474200062162708 s | 0.91 |
| add_one / Jax / cpu / Forward | 0.000011702879983204183 s | 0.00001017146001686342 s | 1.15 |
| add_one / HLOOpt / cpu / Forward | 0.00001026835998345632 s | 0.000010015480002039113 s | 1.03 |
| add_one / PartOpt / cpu / Forward | 0.000010552519997872876 s | 0.000010248180115013384 s | 1.03 |
| add_one / IPartOpt / cpu / Forward | 0.000009876400008579367 s | 0.000010215500260528645 s | 0.97 |
| add_one / DefOpt / cpu / Forward | 0.00000971111998296692 s | 0.000010283020164933987 s | 0.94 |
| add_one / IDefOpt / cpu / Forward | 0.000009708180032248492 s | 0.000009787819981283976 s | 0.99 |
| add_one / JaXPipe / cpu / PreRev | 0.000011194739990969535 s | 0.000011898379962076432 s | 0.94 |
| add_one / JaXPipe / cpu / PostRev | 0.000010815499990712851 s | 0.000011255620083829854 s | 0.96 |
| add_one / JaXPipe / cpu / BothRev | 0.000011414879982112324 s | 0.000011512960045365616 s | 0.99 |
| add_one / Jax / cpu / BothRev | 0.000010819220005942042 s | 0.00001132955996581586 s | 0.95 |
| add_one / HLOOpt / cpu / PreRev | 0.000011095739955635508 s | 0.00001178636000986444 s | 0.94 |
| add_one / HLOOpt / cpu / PostRev | 0.00001317952000135847 s | 0.00001347255994915031 s | 0.98 |
| add_one / HLOOpt / cpu / BothRev | 0.000011253259963268648 s | 0.0000117777798732277 s | 0.96 |
| add_one / PartOpt / cpu / PreRev | 0.000011011799924744992 s | 0.000011587020017032046 s | 0.95 |
| add_one / PartOpt / cpu / PostRev | 0.000010766780042104072 s | 0.00001128248004533816 s | 0.95 |
| add_one / PartOpt / cpu / BothRev | 0.000011455880021458142 s | 0.000012154820069554262 s | 0.94 |
| add_one / IPartOpt / cpu / PreRev | 0.00001155569999355066 s | 0.000011255660210736095 s | 1.03 |
| add_one / IPartOpt / cpu / PostRev | 0.000011315200026729144 s | 0.000011514540019561535 s | 0.98 |
| add_one / IPartOpt / cpu / BothRev | 0.000010946200000034878 s | 0.000011263940286880823 s | 0.97 |
| add_one / DefOpt / cpu / PreRev | 0.000011241059974054224 s | 0.00001126040006056428 s | 1.00 |
| add_one / DefOpt / cpu / PostRev | 0.000011185120047230158 s | 0.000011456360080046578 s | 0.98 |
| add_one / DefOpt / cpu / BothRev | 0.00001101718000427354 s | 0.00001145955997344572 s | 0.96 |
| add_one / IDefOpt / cpu / PreRev | 0.000010762839956441894 s | 0.00001153671990323346 s | 0.93 |
| add_one / IDefOpt / cpu / PostRev | 0.000010986199995386416 s | 0.000011387340055080131 s | 0.96 |
| add_one / IDefOpt / cpu / BothRev | 0.000011265280008956324 s | 0.00001162638007372152 s | 0.97 |
| add_one / JaXPipe / cuda / Primal | 0.000002304 s | 0.000002303 s | 1.00 |
| add_one / Jax / cuda / Primal | 0.000002304 s | 0.000002303 s | 1.00 |
| add_one / HLOOpt / cuda / Primal | 0.000002304 s | 0.000002303 s | 1.00 |
| add_one / PartOpt / cuda / Primal | 0.000002303 s | 0.000002303 s | 1 |
| add_one / IPartOpt / cuda / Primal | 0.000002304 s | 0.000002303 s | 1.00 |
| add_one / DefOpt / cuda / Primal | 0.000002304 s | 0.000002303 s | 1.00 |
| add_one / IDefOpt / cuda / Primal | 0.000002304 s | 0.000002303 s | 1.00 |
| add_one / JaXPipe / cuda / Forward | 0.000011328 s | 0.000010848 s | 1.04 |
| add_one / Jax / cuda / Forward | 0.00001104 s | 0.000010688 s | 1.03 |
| add_one / HLOOpt / cuda / Forward | 0.00001072 s | 0.000010816 s | 0.99 |
| add_one / PartOpt / cuda / Forward | 0.000010847 s | 0.000009952 s | 1.09 |
| add_one / IPartOpt / cuda / Forward | 0.000010592 s | 0.000010816 s | 0.98 |
| add_one / DefOpt / cuda / Forward | 0.00001072 s | 0.000010688 s | 1.00 |
| add_one / IDefOpt / cuda / Forward | 0.000011009 s | 0.00001024 s | 1.08 |
| add_one / JaXPipe / cuda / PreRev | 0.00002544 s | 0.000024704 s | 1.03 |
| add_one / JaXPipe / cuda / PostRev | 0.000025536 s | 0.0000256 s | 1.00 |
| add_one / JaXPipe / cuda / BothRev | 0.000025313 s | 0.000025824 s | 0.98 |
| add_one / Jax / cuda / BothRev | 0.000025888 s | 0.00002592 s | 1.00 |
| add_one / HLOOpt / cuda / PreRev | 0.000025312 s | 0.000026048 s | 0.97 |
| add_one / HLOOpt / cuda / PostRev | 0.000025472000000000003 s | 0.000025696 s | 0.99 |
| add_one / HLOOpt / cuda / BothRev | 0.000025088 s | 0.000025953 s | 0.97 |
| add_one / PartOpt / cuda / PreRev | 0.000025952 s | 0.000025696 s | 1.01 |
| add_one / PartOpt / cuda / PostRev | 0.000025408 s | 0.000025728 s | 0.99 |
| add_one / PartOpt / cuda / BothRev | 0.000025312 s | 0.000025599 s | 0.99 |
| add_one / IPartOpt / cuda / PreRev | 0.000025632 s | 0.000026016 s | 0.99 |
| add_one / IPartOpt / cuda / PostRev | 0.000025504 s | 0.000025664 s | 0.99 |
| add_one / IPartOpt / cuda / BothRev | 0.000025632 s | 0.000026144 s | 0.98 |
| add_one / DefOpt / cuda / PreRev | 0.00002528 s | 0.000025312 s | 1.00 |
| add_one / DefOpt / cuda / PostRev | 0.000025728 s | 0.000026432 s | 0.97 |
| add_one / DefOpt / cuda / BothRev | 0.000025216 s | 0.000025792 s | 0.98 |
| add_one / IDefOpt / cuda / PreRev | 0.000025536 s | 0.000026496 s | 0.96 |
| add_one / IDefOpt / cuda / PostRev | 0.000025568 s | 0.000025632 s | 1.00 |
| add_one / IDefOpt / cuda / BothRev | 0.00002608 s | 0.000025952 s | 1.00 |
| add_one / JaXPipe / tpu / Primal | 0.0000014332749999999998 s | 0.000001445175 s | 0.99 |
| add_one / Jax / tpu / Primal | 0.0000014069 s | 0.000001447025 s | 0.97 |
| add_one / HLOOpt / tpu / Primal | 0.0000014319 s | 0.00000145205 s | 0.99 |
| add_one / PartOpt / tpu / Primal | 0.0000014080999999999997 s | 0.0000014558 s | 0.97 |
| add_one / IPartOpt / tpu / Primal | 0.000001424575 s | 0.0000014553 s | 0.98 |
| add_one / DefOpt / tpu / Primal | 0.0000014011000000000002 s | 0.000001455125 s | 0.96 |
| add_one / IDefOpt / tpu / Primal | 0.000001423725 s | 0.000001453475 s | 0.98 |
| add_one / JaXPipe / tpu / Forward | 0.00000186385 s | 0.0000019106 s | 0.98 |
| add_one / Jax / tpu / Forward | 0.0000018374 s | 0.000001863875 s | 0.99 |
| add_one / HLOOpt / tpu / Forward | 0.00000184355 s | 0.000001903025 s | 0.97 |
| add_one / PartOpt / tpu / Forward | 0.000001847 s | 0.000001868175 s | 0.99 |
| add_one / IPartOpt / tpu / Forward | 0.000001848475 s | 0.0000019037 s | 0.97 |
| add_one / DefOpt / tpu / Forward | 0.0000018374 s | 0.000001863 s | 0.99 |
| add_one / IDefOpt / tpu / Forward | 0.000001844125 s | 0.0000019137 s | 0.96 |
| add_one / JaXPipe / tpu / PreRev | 0.0000022361 s | 0.0000022631 s | 0.99 |
| add_one / JaXPipe / tpu / PostRev | 0.00000223975 s | 0.000002302475 s | 0.97 |
| add_one / JaXPipe / tpu / BothRev | 0.0000022392750000000003 s | 0.0000022539 s | 0.99 |
| add_one / Jax / tpu / BothRev | 0.000002247575 s | 0.0000022978499999999995 s | 0.98 |
| add_one / HLOOpt / tpu / PreRev | 0.000002236275 s | 0.000002257775 s | 0.99 |
| add_one / HLOOpt / tpu / PostRev | 0.000002242225 s | 0.000002294675 s | 0.98 |
| add_one / HLOOpt / tpu / BothRev | 0.0000022413 s | 0.00000225465 s | 0.99 |
| add_one / PartOpt / tpu / PreRev | 0.0000022377 s | 0.000002291325 s | 0.98 |
| add_one / PartOpt / tpu / PostRev | 0.000002236475 s | 0.00000225305 s | 0.99 |
| add_one / PartOpt / tpu / BothRev | 0.000002235825 s | 0.000002290925 s | 0.98 |
| add_one / IPartOpt / tpu / PreRev | 0.0000022377500000000003 s | 0.000002252175 s | 0.99 |
| add_one / IPartOpt / tpu / PostRev | 0.000002241075 s | 0.000002288375 s | 0.98 |
| add_one / IPartOpt / tpu / BothRev | 0.00000224 s | 0.00000225975 s | 0.99 |
| add_one / DefOpt / tpu / PreRev | 0.000002240075 s | 0.000002292425 s | 0.98 |
| add_one / DefOpt / tpu / PostRev | 0.00000224615 s | 0.000002253675 s | 1.00 |
| add_one / DefOpt / tpu / BothRev | 0.000002238 s | 0.0000022969 s | 0.97 |
| add_one / IDefOpt / tpu / PreRev | 0.00000224155 s | 0.000002254075 s | 0.99 |
| add_one / IDefOpt / tpu / PostRev | 0.000002245125 s | 0.0000022885749999999995 s | 0.98 |
| add_one / IDefOpt / tpu / BothRev | 0.0000022363 s | 0.000002253675 s | 0.99 |
| add_one / JaXPipe / cpu / Primal | 0.00001317 s | 0.000006788360224163625 s | 1.94 |
| add_one / Jax / cpu / Primal | 0.000013115 s | 0.000006639620187343098 s | 1.98 |
| add_one / HLOOpt / cpu / Primal | 0.000013024 s | 0.000007346800084633287 s | 1.77 |
| add_one / PartOpt / cpu / Primal | 0.00001302 s | 0.000006595760023628827 s | 1.97 |
| add_one / IPartOpt / cpu / Primal | 0.00001333 s | 0.000006949459930183366 s | 1.92 |
| add_one / DefOpt / cpu / Primal | 0.00001308 s | 0.0000064497800485696645 s | 2.03 |
| add_one / IDefOpt / cpu / Primal | 0.000013126 s | 0.000006538660054502543 s | 2.01 |
| add_one / JaXPipe / cpu / Forward | 0.000018068 s | 0.000010474200062162708 s | 1.73 |
| add_one / Jax / cpu / Forward | 0.000018076 s | 0.00001017146001686342 s | 1.78 |
| add_one / HLOOpt / cpu / Forward | 0.000017963 s | 0.000010015480002039113 s | 1.79 |
| add_one / PartOpt / cpu / Forward | 0.000018022 s | 0.000010248180115013384 s | 1.76 |
| add_one / IPartOpt / cpu / Forward | 0.000017843000000000002 s | 0.000010215500260528645 s | 1.75 |
| add_one / DefOpt / cpu / Forward | 0.000017901 s | 0.000010283020164933987 s | 1.74 |
| add_one / IDefOpt / cpu / Forward | 0.000018004 s | 0.000009787819981283976 s | 1.84 |
| add_one / JaXPipe / cpu / PreRev | 0.000020385 s | 0.000011898379962076432 s | 1.71 |
| add_one / JaXPipe / cpu / PostRev | 0.000020104 s | 0.000011255620083829854 s | 1.79 |
| add_one / JaXPipe / cpu / BothRev | 0.000019999 s | 0.000011512960045365616 s | 1.74 |
| add_one / Jax / cpu / BothRev | 0.000019641 s | 0.00001132955996581586 s | 1.73 |
| add_one / HLOOpt / cpu / PreRev | 0.000019817 s | 0.00001178636000986444 s | 1.68 |
| add_one / HLOOpt / cpu / PostRev | 0.000019691 s | 0.00001347255994915031 s | 1.46 |
| add_one / HLOOpt / cpu / BothRev | 0.000020182 s | 0.0000117777798732277 s | 1.71 |
| add_one / PartOpt / cpu / PreRev | 0.00002 s | 0.000011587020017032046 s | 1.73 |
| add_one / PartOpt / cpu / PostRev | 0.000019811 s | 0.00001128248004533816 s | 1.76 |
| add_one / PartOpt / cpu / BothRev | 0.000019871 s | 0.000012154820069554262 s | 1.63 |
| add_one / IPartOpt / cpu / PreRev | 0.000019627 s | 0.000011255660210736095 s | 1.74 |
| add_one / IPartOpt / cpu / PostRev | 0.000019969 s | 0.000011514540019561535 s | 1.73 |
| add_one / IPartOpt / cpu / BothRev | 0.000019676 s | 0.000011263940286880823 s | 1.75 |
| add_one / DefOpt / cpu / PreRev | 0.000019593 s | 0.00001126040006056428 s | 1.74 |
| add_one / DefOpt / cpu / PostRev | 0.000033159 s | 0.000011456360080046578 s | 2.89 |
| add_one / DefOpt / cpu / BothRev | 0.000019797 s | 0.00001145955997344572 s | 1.73 |
| add_one / IDefOpt / cpu / PreRev | 0.000019587 s | 0.00001153671990323346 s | 1.70 |
| add_one / IDefOpt / cpu / PostRev | 0.000019802 s | 0.000011387340055080131 s | 1.74 |
| add_one / IDefOpt / cpu / BothRev | 0.000019737 s | 0.00001162638007372152 s | 1.70 |
| add_two / JaXPipe / cpu / Primal | 0.000006744399961462477 s | 0.000007508040071115829 s | 0.90 |
| add_two / Jax / cpu / Primal | 0.00000725338002666831 s | 0.000007063660050334875 s | 1.03 |
| add_two / HLOOpt / cpu / Primal | 0.000007474579997506226 s | 0.000006841000285930932 s | 1.09 |
| add_two / PartOpt / cpu / Primal | 0.000006766079968656413 s | 0.00000713966008333955 s | 0.95 |
| add_two / IPartOpt / cpu / Primal | 0.000007332519980991492 s | 0.000007327299972530454 s | 1.00 |
| add_two / DefOpt / cpu / Primal | 0.000006640560031883069 s | 0.0000073373199484194626 s | 0.91 |
| add_two / IDefOpt / cpu / Primal | 0.000006745600012436626 s | 0.00000649896013783291 s | 1.04 |
| add_two / JaXPipe / cpu / Forward | 0.000009772819994395832 s | 0.000010095840043504722 s | 0.97 |
| add_two / Jax / cpu / Forward | 0.000010164500044993474 s | 0.000010318800086679402 s | 0.99 |
| add_two / HLOOpt / cpu / Forward | 0.00001001997998173465 s | 0.000010510679967410397 s | 0.95 |
| add_two / PartOpt / cpu / Forward | 0.000009908600031849344 s | 0.000010098139973706566 s | 0.98 |
| add_two / IPartOpt / cpu / Forward | 0.00000988983999377524 s | 0.000010444340186950284 s | 0.95 |
| add_two / DefOpt / cpu / Forward | 0.000009689580019767164 s | 0.000010026939889939968 s | 0.97 |
| add_two / IDefOpt / cpu / Forward | 0.000010364240006310866 s | 0.000010137019962712655 s | 1.02 |
| add_two / JaXPipe / cpu / PreRev | 0.00001313253998887376 s | 0.00001420562002749648 s | 0.92 |
| add_two / JaXPipe / cpu / PostRev | 0.000014123119990472331 s | 0.000013980979929328896 s | 1.01 |
| add_two / JaXPipe / cpu / BothRev | 0.00001397969997015025 s | 0.00001418544001353439 s | 0.99 |
| add_two / Jax / cpu / BothRev | 0.000013415639969025506 s | 0.000013970599975436923 s | 0.96 |
| add_two / HLOOpt / cpu / PreRev | 0.000013454099989758108 s | 0.000013817140097671652 s | 0.97 |
| add_two / HLOOpt / cpu / PostRev | 0.000015322980043492862 s | 0.000016204279963858424 s | 0.95 |
| add_two / HLOOpt / cpu / BothRev | 0.000014095079968683422 s | 0.000014592959960282316 s | 0.97 |
| add_two / PartOpt / cpu / PreRev | 0.000013547700000344775 s | 0.000014299439899332355 s | 0.95 |
| add_two / PartOpt / cpu / PostRev | 0.00001392940002915566 s | 0.000013849920032953375 s | 1.01 |
| add_two / PartOpt / cpu / BothRev | 0.0000136702400050126 s | 0.000014024859992787242 s | 0.97 |
| add_two / IPartOpt / cpu / PreRev | 0.000013101460026518908 s | 0.00001402448000590084 s | 0.93 |
| add_two / IPartOpt / cpu / PostRev | 0.000014376560020537 s | 0.000014053559934836813 s | 1.02 |
| add_two / IPartOpt / cpu / BothRev | 0.000013773699984085396 s | 0.000014002400166646112 s | 0.98 |
| add_two / DefOpt / cpu / PreRev | 0.000013236279983175336 s | 0.000014023519870534074 s | 0.94 |
| add_two / DefOpt / cpu / PostRev | 0.000013807539999106664 s | 0.000013425140059553086 s | 1.03 |
| add_two / DefOpt / cpu / BothRev | 0.000013923879987487452 s | 0.000014358260050357786 s | 0.97 |
| add_two / IDefOpt / cpu / PreRev | 0.00001337076004347182 s | 0.000014406740083359182 s | 0.93 |
| add_two / IDefOpt / cpu / PostRev | 0.000014014559974384613 s | 0.0000142485800824943 s | 0.98 |
| add_two / IDefOpt / cpu / BothRev | 0.000014098019955781635 s | 0.000014178099991113414 s | 0.99 |
| add_two / JaXPipe / cuda / Primal | 0.000002432 s | 0.0000024 s | 1.01 |
| add_two / Jax / cuda / Primal | 0.000002431 s | 0.0000024 s | 1.01 |
| add_two / HLOOpt / cuda / Primal | 0.000002432 s | 0.000002431 s | 1.00 |
| add_two / PartOpt / cuda / Primal | 0.000002431 s | 0.000002431 s | 1 |
| add_two / IPartOpt / cuda / Primal | 0.000002431 s | 0.0000024 s | 1.01 |
| add_two / DefOpt / cuda / Primal | 0.000002431 s | 0.0000024 s | 1.01 |
| add_two / IDefOpt / cuda / Primal | 0.000002431 s | 0.0000024 s | 1.01 |
| add_two / JaXPipe / cuda / Forward | 0.000010847 s | 0.00001104 s | 0.98 |
| add_two / Jax / cuda / Forward | 0.000010624 s | 0.0000104 s | 1.02 |
| add_two / HLOOpt / cuda / Forward | 0.000010367 s | 0.00001072 s | 0.97 |
| add_two / PartOpt / cuda / Forward | 0.000010336 s | 0.000010335 s | 1.00 |
| add_two / IPartOpt / cuda / Forward | 0.000010689 s | 0.000010368 s | 1.03 |
| add_two / DefOpt / cuda / Forward | 0.000010752 s | 0.000010017 s | 1.07 |
| add_two / IDefOpt / cuda / Forward | 0.000010912 s | 0.000010113 s | 1.08 |
| add_two / JaXPipe / cuda / PreRev | 0.000032992 s | 0.000033152000000000004 s | 1.00 |
| add_two / JaXPipe / cuda / PostRev | 0.000033344 s | 0.000033376 s | 1.00 |
| add_two / JaXPipe / cuda / BothRev | 0.000032512 s | 0.000033088 s | 0.98 |
| add_two / Jax / cuda / BothRev | 0.000033119999999999995 s | 0.000033023 s | 1.00 |
| add_two / HLOOpt / cuda / PreRev | 0.000033729 s | 0.000033824 s | 1.00 |
| add_two / HLOOpt / cuda / PostRev | 0.000032705 s | 0.000032576 s | 1.00 |
| add_two / HLOOpt / cuda / BothRev | 0.00003344 s | 0.000033472 s | 1.00 |
| add_two / PartOpt / cuda / PreRev | 0.000033088 s | 0.000033791 s | 0.98 |
| add_two / PartOpt / cuda / PostRev | 0.000038016 s | 0.000033568 s | 1.13 |
| add_two / PartOpt / cuda / BothRev | 0.000032863 s | 0.000033088 s | 0.99 |
| add_two / IPartOpt / cuda / PreRev | 0.000034687 s | 0.000033696 s | 1.03 |
| add_two / IPartOpt / cuda / PostRev | 0.000032864 s | 0.00003296 s | 1.00 |
| add_two / IPartOpt / cuda / BothRev | 0.000033856 s | 0.000033376 s | 1.01 |
| add_two / DefOpt / cuda / PreRev | 0.000033856 s | 0.000032704 s | 1.04 |
| add_two / DefOpt / cuda / PostRev | 0.000033407 s | 0.000033089 s | 1.01 |
| add_two / DefOpt / cuda / BothRev | 0.000033248 s | 0.000033536000000000006 s | 0.99 |
| add_two / IDefOpt / cuda / PreRev | 0.000033664 s | 0.000032896000000000005 s | 1.02 |
| add_two / IDefOpt / cuda / PostRev | 0.000033216 s | 0.000032032 s | 1.04 |
| add_two / IDefOpt / cuda / BothRev | 0.000033504 s | 0.000034016 s | 0.98 |
| add_two / JaXPipe / tpu / Primal | 0.000001429525 s | 0.0000013973499999999998 s | 1.02 |
| add_two / Jax / tpu / Primal | 0.000001475975 s | 0.00000146445 s | 1.01 |
| add_two / HLOOpt / tpu / Primal | 0.000001440375 s | 0.00000143 s | 1.01 |
| add_two / PartOpt / tpu / Primal | 0.00000147945 s | 0.00000144125 s | 1.03 |
| add_two / IPartOpt / tpu / Primal | 0.000001431275 s | 0.00000139755 s | 1.02 |
| add_two / DefOpt / tpu / Primal | 0.00000147455 s | 0.000001449025 s | 1.02 |
| add_two / IDefOpt / tpu / Primal | 0.000001429975 s | 0.0000013983750000000002 s | 1.02 |
| add_two / JaXPipe / tpu / Forward | 0.000001828375 s | 0.000001801375 s | 1.01 |
| add_two / Jax / tpu / Forward | 0.00000183035 s | 0.000001790175 s | 1.02 |
| add_two / HLOOpt / tpu / Forward | 0.000001825 s | 0.000001801875 s | 1.01 |
| add_two / PartOpt / tpu / Forward | 0.00000184285 s | 0.000001786925 s | 1.03 |
| add_two / IPartOpt / tpu / Forward | 0.000001832175 s | 0.00000180065 s | 1.02 |
| add_two / DefOpt / tpu / Forward | 0.000001826175 s | 0.0000017977000000000002 s | 1.02 |
| add_two / IDefOpt / tpu / Forward | 0.0000018305 s | 0.00000180145 s | 1.02 |
| add_two / JaXPipe / tpu / PreRev | 0.000002835625 s | 0.0000027991 s | 1.01 |
| add_two / JaXPipe / tpu / PostRev | 0.00000274515 s | 0.000002727475 s | 1.01 |
| add_two / JaXPipe / tpu / BothRev | 0.000002838075 s | 0.0000027946250000000004 s | 1.02 |
| add_two / Jax / tpu / BothRev | 0.0000027635 s | 0.0000027328 s | 1.01 |
| add_two / HLOOpt / tpu / PreRev | 0.000002850675 s | 0.000002806575 s | 1.02 |
| add_two / HLOOpt / tpu / PostRev | 0.000002753975 s | 0.00000273025 s | 1.01 |
| add_two / HLOOpt / tpu / BothRev | 0.000002825975 s | 0.00000280145 s | 1.01 |
| add_two / PartOpt / tpu / PreRev | 0.00000275775 s | 0.000002737475 s | 1.01 |
| add_two / PartOpt / tpu / PostRev | 0.000002830375 s | 0.000002805375 s | 1.01 |
| add_two / PartOpt / tpu / BothRev | 0.0000027642 s | 0.000002730425 s | 1.01 |
| add_two / IPartOpt / tpu / PreRev | 0.00000283665 s | 0.000002812925 s | 1.01 |
| add_two / IPartOpt / tpu / PostRev | 0.0000027553 s | 0.0000027271 s | 1.01 |
| add_two / IPartOpt / tpu / BothRev | 0.0000028352250000000005 s | 0.0000027982 s | 1.01 |
| add_two / DefOpt / tpu / PreRev | 0.000002768925 s | 0.00000273005 s | 1.01 |
| add_two / DefOpt / tpu / PostRev | 0.0000028393000000000005 s | 0.000002803275 s | 1.01 |
| add_two / DefOpt / tpu / BothRev | 0.00000274915 s | 0.000002721425 s | 1.01 |
| add_two / IDefOpt / tpu / PreRev | 0.00000283195 s | 0.00000279425 s | 1.01 |
| add_two / IDefOpt / tpu / PostRev | 0.0000027539 s | 0.0000027338000000000004 s | 1.01 |
| add_two / IDefOpt / tpu / BothRev | 0.00000285205 s | 0.000002794225 s | 1.02 |
| add_two / JaXPipe / cpu / Primal | 0.000013656 s | 0.000007508040071115829 s | 1.82 |
| add_two / Jax / cpu / Primal | 0.000013588 s | 0.000007063660050334875 s | 1.92 |
| add_two / HLOOpt / cpu / Primal | 0.000013473 s | 0.000006841000285930932 s | 1.97 |
| add_two / PartOpt / cpu / Primal | 0.00001361 s | 0.00000713966008333955 s | 1.91 |
| add_two / IPartOpt / cpu / Primal | 0.000013457 s | 0.000007327299972530454 s | 1.84 |
| add_two / DefOpt / cpu / Primal | 0.000013344 s | 0.0000073373199484194626 s | 1.82 |
| add_two / IDefOpt / cpu / Primal | 0.000013304 s | 0.00000649896013783291 s | 2.05 |
| add_two / JaXPipe / cpu / Forward | 0.00001855 s | 0.000010095840043504722 s | 1.84 |
| add_two / Jax / cpu / Forward | 0.000018043 s | 0.000010318800086679402 s | 1.75 |
| add_two / HLOOpt / cpu / Forward | 0.000018073 s | 0.000010510679967410397 s | 1.72 |
| add_two / PartOpt / cpu / Forward | 0.000018139 s | 0.000010098139973706566 s | 1.80 |
| add_two / IPartOpt / cpu / Forward | 0.000017959 s | 0.000010444340186950284 s | 1.72 |
| add_two / DefOpt / cpu / Forward | 0.000018378 s | 0.000010026939889939968 s | 1.83 |
| add_two / IDefOpt / cpu / Forward | 0.000018292 s | 0.000010137019962712655 s | 1.80 |
| add_two / JaXPipe / cpu / PreRev | 0.000023766 s | 0.00001420562002749648 s | 1.67 |
| add_two / JaXPipe / cpu / PostRev | 0.000023392 s | 0.000013980979929328896 s | 1.67 |
| add_two / JaXPipe / cpu / BothRev | 0.000023323 s | 0.00001418544001353439 s | 1.64 |
| add_two / Jax / cpu / BothRev | 0.000023279 s | 0.000013970599975436923 s | 1.67 |
| add_two / HLOOpt / cpu / PreRev | 0.000023423 s | 0.000013817140097671652 s | 1.70 |
| add_two / HLOOpt / cpu / PostRev | 0.000023699 s | 0.000016204279963858424 s | 1.46 |
| add_two / HLOOpt / cpu / BothRev | 0.000023553 s | 0.000014592959960282316 s | 1.61 |
| add_two / PartOpt / cpu / PreRev | 0.000023469 s | 0.000014299439899332355 s | 1.64 |
| add_two / PartOpt / cpu / PostRev | 0.000023513 s | 0.000013849920032953375 s | 1.70 |
| add_two / PartOpt / cpu / BothRev | 0.000023701 s | 0.000014024859992787242 s | 1.69 |
add_two / IPartOpt / cpu / PreRev |
0.000023258 s |
0.00001402448000590084 s |
1.66 |
add_two / IPartOpt / cpu / PostRev |
0.000023695 s |
0.000014053559934836813 s |
1.69 |
add_two / IPartOpt / cpu / BothRev |
0.000023846 s |
0.000014002400166646112 s |
1.70 |
add_two / DefOpt / cpu / PreRev |
0.000023455 s |
0.000014023519870534074 s |
1.67 |
add_two / DefOpt / cpu / PostRev |
0.000023561 s |
0.000013425140059553086 s |
1.75 |
add_two / DefOpt / cpu / BothRev |
0.000023595 s |
0.000014358260050357786 s |
1.64 |
add_two / IDefOpt / cpu / PreRev |
0.000023333 s |
0.000014406740083359182 s |
1.62 |
add_two / IDefOpt / cpu / PostRev |
0.000023667 s |
0.0000142485800824943 s |
1.66 |
add_two / IDefOpt / cpu / BothRev |
0.000023166 s |
0.000014178099991113414 s |
1.63 |
cache / JaXPipe / cpu / Primal |
0.000006524520022139768 s |
0.0000069787199026905 s |
0.93 |
cache / Jax / cpu / Primal |
0.000006105239990574774 s |
0.0000063129400223260745 s |
0.97 |
cache / HLOOpt / cpu / Primal |
0.000005932719996053493 s |
0.000006742240257153753 s |
0.88 |
cache / PartOpt / cpu / Primal |
0.000006358220016409177 s |
0.000005976620013825596 s |
1.06 |
cache / IPartOpt / cpu / Primal |
0.000006492120010079816 s |
0.000006029200048942584 s |
1.08 |
cache / DefOpt / cpu / Primal |
0.000006659280034000403 s |
0.000006931579846423119 s |
0.96 |
cache / IDefOpt / cpu / Primal |
0.000006107260005592252 s |
0.000006147040076029953 s |
0.99 |
cache / JaXPipe / cpu / Forward |
0.000014796080022279056 s |
0.000015323639890993944 s |
0.97 |
cache / Jax / cpu / Forward |
0.000014785600005779998 s |
0.00001486237993958639 s |
0.99 |
cache / HLOOpt / cpu / Forward |
0.00001533258002382354 s |
0.000015309420123230667 s |
1.00 |
cache / PartOpt / cpu / Forward |
0.000014471960002993 s |
0.000015313760195567737 s |
0.95 |
cache / IPartOpt / cpu / Forward |
0.00001546555995446397 s |
0.00001572162003867561 s |
0.98 |
cache / DefOpt / cpu / Forward |
0.000015041760034364416 s |
0.000015381539778900333 s |
0.98 |
cache / IDefOpt / cpu / Forward |
0.000014735720014869004 s |
0.000015069819964992347 s |
0.98 |
cache / JaXPipe / cpu / PreRev |
0.00001574644001266279 s |
0.000015864039996813518 s |
0.99 |
cache / JaXPipe / cpu / PostRev |
0.00002089581999825896 s |
0.000020334559703769628 s |
1.03 |
cache / JaXPipe / cpu / BothRev |
0.000016230299970629857 s |
0.00001632793999306159 s |
0.99 |
cache / Jax / cpu / BothRev |
0.00002066409995677532 s |
0.000021568059783021453 s |
0.96 |
cache / HLOOpt / cpu / PreRev |
0.00001645104005547182 s |
0.00001589912008057581 s |
1.03 |
cache / HLOOpt / cpu / PostRev |
0.000019381060019441067 s |
0.00001920374001201708 s |
1.01 |
cache / HLOOpt / cpu / BothRev |
0.00001641673997255566 s |
0.00001632200008316431 s |
1.01 |
cache / PartOpt / cpu / PreRev |
0.000016237680001722764 s |
0.000016088679876702373 s |
1.01 |
cache / PartOpt / cpu / PostRev |
0.000020833939988733615 s |
0.00002114913997502299 s |
0.99 |
cache / PartOpt / cpu / BothRev |
0.00001675482003520301 s |
0.00001660899994021747 s |
1.01 |
cache / IPartOpt / cpu / PreRev |
0.000016451159999633092 s |
0.00001628128004085738 s |
1.01 |
cache / IPartOpt / cpu / PostRev |
0.000020962800008419435 s |
0.00002147131999663543 s |
0.98 |
cache / IPartOpt / cpu / BothRev |
0.000015446720008185368 s |
0.000015473139937967063 s |
1.00 |
cache / DefOpt / cpu / PreRev |
0.000015555799973299144 s |
0.000016298939954140224 s |
0.95 |
cache / DefOpt / cpu / PostRev |
0.000015160580005613156 s |
0.00001588609997270396 s |
0.95 |
cache / DefOpt / cpu / BothRev |
0.0000157176199627429 s |
0.000016655639992677607 s |
0.94 |
cache / IDefOpt / cpu / PreRev |
0.000015464740008610534 s |
0.000016662720081512817 s |
0.93 |
cache / IDefOpt / cpu / PostRev |
0.000015548019955531345 s |
0.000015710999869043007 s |
0.99 |
cache / IDefOpt / cpu / BothRev |
0.000015369940037999185 s |
0.00001563078003528062 s |
0.98 |
cache / JaXPipe / cuda / Primal |
0.000002335 s |
0.000002336 s |
1.00 |
cache / Jax / cuda / Primal |
0.000002335 s |
0.000002336 s |
1.00 |
cache / HLOOpt / cuda / Primal |
0.000002335 s |
0.000002336 s |
1.00 |
cache / PartOpt / cuda / Primal |
0.000002335 s |
0.000002336 s |
1.00 |
cache / IPartOpt / cuda / Primal |
0.000002304 s |
0.000002335 s |
0.99 |
cache / DefOpt / cuda / Primal |
0.000002335 s |
0.000002336 s |
1.00 |
cache / IDefOpt / cuda / Primal |
0.000002335 s |
0.000002336 s |
1.00 |
cache / JaXPipe / cuda / Forward |
0.0000023670000000000004 s |
0.000002368 s |
1.00 |
cache / Jax / cuda / Forward |
0.0000023670000000000004 s |
0.000002368 s |
1.00 |
cache / HLOOpt / cuda / Forward |
0.0000023670000000000004 s |
0.0000023670000000000004 s |
1.00 |
cache / PartOpt / cuda / Forward |
0.0000023670000000000004 s |
0.0000023670000000000004 s |
1.00 |
cache / IPartOpt / cuda / Forward |
0.0000023670000000000004 s |
0.000002368 s |
1.00 |
cache / DefOpt / cuda / Forward |
0.0000023670000000000004 s |
0.000002368 s |
1.00 |
cache / IDefOpt / cuda / Forward |
0.0000023670000000000004 s |
0.0000023670000000000004 s |
1.00 |
cache / JaXPipe / cuda / PreRev |
0.00001072 s |
0.000011488 s |
0.93 |
cache / JaXPipe / cuda / PostRev |
0.000010816 s |
0.000010816 s |
1.00 |
cache / JaXPipe / cuda / BothRev |
0.00001088 s |
0.000011296 s |
0.96 |
cache / Jax / cuda / BothRev |
0.000015745 s |
0.000011071 s |
1.42 |
cache / HLOOpt / cuda / PreRev |
0.00001376 s |
0.000013727 s |
1.00 |
cache / HLOOpt / cuda / PostRev |
0.000013696 s |
0.000013696 s |
1.00 |
cache / HLOOpt / cuda / BothRev |
0.00001376 s |
0.000013728 s |
1.00 |
cache / PartOpt / cuda / PreRev |
0.000010432 s |
0.0000112 s |
0.93 |
cache / PartOpt / cuda / PostRev |
0.000010752 s |
0.000010976 s |
0.98 |
cache / PartOpt / cuda / BothRev |
0.000010752 s |
0.000010592 s |
1.02 |
cache / IPartOpt / cuda / PreRev |
0.000011007 s |
0.000010848 s |
1.01 |
cache / IPartOpt / cuda / PostRev |
0.000011008 s |
0.000010752 s |
1.02 |
cache / IPartOpt / cuda / BothRev |
0.000011071 s |
0.000010785 s |
1.03 |
cache / DefOpt / cuda / PreRev |
0.000011136 s |
0.000010912 s |
1.02 |
cache / DefOpt / cuda / PostRev |
0.000011007 s |
0.00001088 s |
1.01 |
cache / DefOpt / cuda / BothRev |
0.000010784 s |
0.000010752 s |
1.00 |
cache / IDefOpt / cuda / PreRev |
0.000010592 s |
0.000011008 s |
0.96 |
cache / IDefOpt / cuda / PostRev |
0.000010976 s |
0.000011264 s |
0.97 |
cache / IDefOpt / cuda / BothRev |
0.000010783 s |
0.000011232 s |
0.96 |
cache / JaXPipe / tpu / Primal |
0.0000024626 s |
0.000002465675 s |
1.00 |
cache / Jax / tpu / Primal |
0.000002488075 s |
0.00000248255 s |
1.00 |
cache / HLOOpt / tpu / Primal |
0.000002462975 s |
0.0000024704750000000003 s |
1.00 |
cache / PartOpt / tpu / Primal |
0.0000024467500000000003 s |
0.00000246885 s |
0.99 |
cache / IPartOpt / tpu / Primal |
0.000002471 s |
0.000002460525 s |
1.00 |
cache / DefOpt / tpu / Primal |
0.000002467825 s |
0.000002455425 s |
1.01 |
cache / IDefOpt / tpu / Primal |
0.0000024628 s |
0.000002467525 s |
1.00 |
cache / JaXPipe / tpu / Forward |
0.0000035347250000000003 s |
0.000003551475 s |
1.00 |
cache / Jax / tpu / Forward |
0.000003525425 s |
0.00000355505 s |
0.99 |
cache / HLOOpt / tpu / Forward |
0.000003556325 s |
0.00000356615 s |
1.00 |
cache / PartOpt / tpu / Forward |
0.000003523275 s |
0.000003547725 s |
0.99 |
cache / IPartOpt / tpu / Forward |
0.00000356355 s |
0.0000035691 s |
1.00 |
cache / DefOpt / tpu / Forward |
0.0000035270249999999995 s |
0.00000354025 s |
1.00 |
cache / IDefOpt / tpu / Forward |
0.000003555475 s |
0.000003548275 s |
1.00 |
cache / JaXPipe / tpu / PreRev |
0.000005004425 s |
0.000004952975 s |
1.01 |
cache / JaXPipe / tpu / PostRev |
0.000005020075 s |
0.00000499115 s |
1.01 |
cache / JaXPipe / tpu / BothRev |
0.0000050448 s |
0.000004966925 s |
1.02 |
cache / Jax / tpu / BothRev |
0.000005026 s |
0.00000498985 s |
1.01 |
cache / HLOOpt / tpu / PreRev |
0.000003972825 s |
0.0000041277750000000005 s |
0.96 |
cache / HLOOpt / tpu / PostRev |
0.000004145175 s |
0.000004155575 s |
1.00 |
cache / HLOOpt / tpu / BothRev |
0.000003968475 s |
0.0000041377 s |
0.96 |
cache / PartOpt / tpu / PreRev |
0.000005037125 s |
0.000005022225 s |
1.00 |
cache / PartOpt / tpu / PostRev |
0.000005030425 s |
0.000004973099999999999 s |
1.01 |
cache / PartOpt / tpu / BothRev |
0.000005015825 s |
0.000005009025 s |
1.00 |
cache / IPartOpt / tpu / PreRev |
0.0000050363 s |
0.000004991375 s |
1.01 |
cache / IPartOpt / tpu / PostRev |
0.000005014475000000001 s |
0.000005010125 s |
1.00 |
cache / IPartOpt / tpu / BothRev |
0.000005016375 s |
0.00000499415 s |
1.00 |
cache / DefOpt / tpu / PreRev |
0.000005016925 s |
0.000005018575 s |
1.00 |
cache / DefOpt / tpu / PostRev |
0.000005009675 s |
0.000004991775 s |
1.00 |
cache / DefOpt / tpu / BothRev |
0.000005021849999999999 s |
0.0000050015 s |
1.00 |
cache / IDefOpt / tpu / PreRev |
0.000005022075 s |
0.00000499745 s |
1.00 |
cache / IDefOpt / tpu / PostRev |
0.000005037175000000001 s |
0.0000050139500000000005 s |
1.00 |
cache / IDefOpt / tpu / BothRev |
0.000005025875 s |
0.0000049622 s |
1.01 |
cache / JaXPipe / cpu / Primal |
0.000013006 s |
0.0000069787199026905 s |
1.86 |
cache / Jax / cpu / Primal |
0.000013179 s |
0.0000063129400223260745 s |
2.09 |
cache / HLOOpt / cpu / Primal |
0.000012862 s |
0.000006742240257153753 s |
1.91 |
cache / PartOpt / cpu / Primal |
0.000012926 s |
0.000005976620013825596 s |
2.16 |
cache / IPartOpt / cpu / Primal |
0.00001283 s |
0.000006029200048942584 s |
2.13 |
cache / DefOpt / cpu / Primal |
0.000013358 s |
0.000006931579846423119 s |
1.93 |
cache / IDefOpt / cpu / Primal |
0.000012958 s |
0.000006147040076029953 s |
2.11 |
cache / JaXPipe / cpu / Forward |
0.000024319 s |
0.000015323639890993944 s |
1.59 |
cache / Jax / cpu / Forward |
0.000023024 s |
0.00001486237993958639 s |
1.55 |
cache / HLOOpt / cpu / Forward |
0.000024887 s |
0.000015309420123230667 s |
1.63 |
cache / PartOpt / cpu / Forward |
0.000026892 s |
0.000015313760195567737 s |
1.76 |
cache / IPartOpt / cpu / Forward |
0.000023056 s |
0.00001572162003867561 s |
1.47 |
cache / DefOpt / cpu / Forward |
0.000023201 s |
0.000015381539778900333 s |
1.51 |
cache / IDefOpt / cpu / Forward |
0.00002603 s |
0.000015069819964992347 s |
1.73 |
cache / JaXPipe / cpu / PreRev |
0.000018183 s |
0.000015864039996813518 s |
1.15 |
cache / JaXPipe / cpu / PostRev |
0.000019487 s |
0.000020334559703769628 s |
0.96 |
cache / JaXPipe / cpu / BothRev |
0.00001841 s |
0.00001632793999306159 s |
1.13 |
cache / Jax / cpu / BothRev |
0.000020272 s |
0.000021568059783021453 s |
0.94 |
cache / HLOOpt / cpu / PreRev |
0.000018374 s |
0.00001589912008057581 s |
1.16 |
cache / HLOOpt / cpu / PostRev |
0.000018352 s |
0.00001920374001201708 s |
0.96 |
cache / HLOOpt / cpu / BothRev |
0.000017967 s |
0.00001632200008316431 s |
1.10 |
cache / PartOpt / cpu / PreRev |
0.000018041 s |
0.000016088679876702373 s |
1.12 |
cache / PartOpt / cpu / PostRev |
0.000019922 s |
0.00002114913997502299 s |
0.94 |
cache / PartOpt / cpu / BothRev |
0.000017879 s |
0.00001660899994021747 s |
1.08 |
cache / IPartOpt / cpu / PreRev |
0.00001817 s |
0.00001628128004085738 s |
1.12 |
cache / IPartOpt / cpu / PostRev |
0.000020734 s |
0.00002147131999663543 s |
0.97 |
cache / IPartOpt / cpu / BothRev |
0.000017749999999999998 s |
0.000015473139937967063 s |
1.15 |
cache / DefOpt / cpu / PreRev |
0.000017511 s |
0.000016298939954140224 s |
1.07 |
cache / DefOpt / cpu / PostRev |
0.000018119 s |
0.00001588609997270396 s |
1.14 |
cache / DefOpt / cpu / BothRev |
0.000017336 s |
0.000016655639992677607 s |
1.04 |
cache / IDefOpt / cpu / PreRev |
0.000017749999999999998 s |
0.000016662720081512817 s |
1.07 |
cache / IDefOpt / cpu / PostRev |
0.000025914 s |
0.000015710999869043007 s |
1.65 |
cache / IDefOpt / cpu / BothRev |
0.000034429 s |
0.00001563078003528062 s |
2.20 |
Concat / JaXPipe / cpu / Primal |
0.000006781759993828018 s |
0.000006810739942011423 s |
1.00 |
Concat / Jax / cpu / Primal |
0.000006882879988552304 s |
0.00000690752003720263 s |
1.00 |
Concat / HLOOpt / cpu / Primal |
0.000006927140047991998 s |
0.000006830919955973513 s |
1.01 |
Concat / PartOpt / cpu / Primal |
0.0000066143200274382255 s |
0.000006688800240226555 s |
0.99 |
Concat / IPartOpt / cpu / Primal |
0.000006680340020466247 s |
0.000006490459891210776 s |
1.03 |
Concat / DefOpt / cpu / Primal |
0.000006325279991870048 s |
0.000007036420065560378 s |
0.90 |
Concat / IDefOpt / cpu / Primal |
0.000006676539969703299 s |
0.0000067557199145085175 s |
0.99 |
Concat / JaXPipe / cpu / Forward |
0.000010183839949604587 s |
0.000010362099965277594 s |
0.98 |
Concat / Jax / cpu / Forward |
0.0000101644799906353 s |
0.000009953659937309568 s |
1.02 |
Concat / HLOOpt / cpu / Forward |
0.000009553699992466136 s |
0.00000976933995843865 s |
0.98 |
Concat / PartOpt / cpu / Forward |
0.000009925180029313196 s |
0.000009767280134838076 s |
1.02 |
Concat / IPartOpt / cpu / Forward |
0.000010219480036539608 s |
0.000010285560092597734 s |
0.99 |
Concat / DefOpt / cpu / Forward |
0.000009551540015309002 s |
0.000009621979988878594 s |
0.99 |
Concat / IDefOpt / cpu / Forward |
0.000009999760022765258 s |
0.000009680279981694185 s |
1.03 |
Concat / JaXPipe / cpu / PreRev |
0.00001126398004089424 s |
0.000012034819956170395 s |
0.94 |
Concat / JaXPipe / cpu / PostRev |
0.000010704339965741384 s |
0.000011304079926048871 s |
0.95 |
Concat / JaXPipe / cpu / BothRev |
0.000011370240044925595 s |
0.000011185099720023572 s |
1.02 |
Concat / Jax / cpu / BothRev |
0.000011071879980590892 s |
0.000011534079858392944 s |
0.96 |
Concat / HLOOpt / cpu / PreRev |
0.000011769080001613477 s |
0.000011462079819466452 s |
1.03 |
Concat / HLOOpt / cpu / PostRev |
0.000013192899987188866 s |
0.000013215480030339678 s |
1.00 |
Concat / HLOOpt / cpu / BothRev |
0.00001161439999123104 s |
0.000011233880068175494 s |
1.03 |
Concat / PartOpt / cpu / PreRev |
0.000011185499952262034 s |
0.000011222520042792892 s |
1.00 |
Concat / PartOpt / cpu / PostRev |
0.000011117320027551614 s |
0.000011747020071197769 s |
0.95 |
Concat / PartOpt / cpu / BothRev |
0.000011325219975333312 s |
0.000012005360076727811 s |
0.94 |
Concat / IPartOpt / cpu / PreRev |
0.000011885480007549632 s |
0.000011245579880778678 s |
1.06 |
Concat / IPartOpt / cpu / PostRev |
0.000011170639982083233 s |
0.00001154162015154725 s |
0.97 |
Concat / IPartOpt / cpu / BothRev |
0.000011304319959890563 s |
0.000011165699870616665 s |
1.01 |
Concat / DefOpt / cpu / PreRev |
0.00001165049992778222 s |
0.000011291160044493152 s |
1.03 |
Concat / DefOpt / cpu / PostRev |
0.000011115980023532756 s |
0.00001136698014306603 s |
0.98 |
Concat / DefOpt / cpu / BothRev |
0.000010888340011661057 s |
0.00001205559976369841 s |
0.90 |
Concat / IDefOpt / cpu / PreRev |
0.000011905620012839793 s |
0.000011354359921824652 s |
1.05 |
Concat / IDefOpt / cpu / PostRev |
0.000011301979966447109 s |
0.00001100361994758714 s |
1.03 |
Concat / IDefOpt / cpu / BothRev |
0.000011669060004351197 s |
0.000011541079984453971 s |
1.01 |
Concat / JaXPipe / cuda / Primal |
0.000002463 s |
0.000002431 s |
1.01 |
Concat / Jax / cuda / Primal |
0.000002464 s |
0.000002431 s |
1.01 |
Concat / HLOOpt / cuda / Primal |
0.000002432 s |
0.000002431 s |
1.00 |
Concat / PartOpt / cuda / Primal |
0.000002463 s |
0.000002431 s |
1.01 |
Concat / IPartOpt / cuda / Primal |
0.000002431 s |
0.000002432 s |
1.00 |
Concat / DefOpt / cuda / Primal |
0.000002463 s |
0.000002431 s |
1.01 |
Concat / IDefOpt / cuda / Primal |
0.000002432 s |
0.000002431 s |
1.00 |
Concat / JaXPipe / cuda / Forward |
0.000010688 s |
0.000010912 s |
0.98 |
Concat / Jax / cuda / Forward |
0.000010656 s |
0.000010944 s |
0.97 |
Concat / HLOOpt / cuda / Forward |
0.000011136 s |
0.000010496 s |
1.06 |
Concat / PartOpt / cuda / Forward |
0.000010784 s |
0.000010752 s |
1.00 |
Concat / IPartOpt / cuda / Forward |
0.000010848 s |
0.000010464 s |
1.04 |
Concat / DefOpt / cuda / Forward |
0.00001104 s |
0.000010719 s |
1.03 |
Concat / IDefOpt / cuda / Forward |
0.000010816 s |
0.000010848 s |
1.00 |
Concat / JaXPipe / cuda / PreRev |
0.00001728 s |
0.000016896000000000002 s |
1.02 |
Concat / JaXPipe / cuda / PostRev |
0.00001712 s |
0.000016896000000000002 s |
1.01 |
Concat / JaXPipe / cuda / BothRev |
0.0000168 s |
0.00001696 s |
0.99 |
Concat / Jax / cuda / BothRev |
0.000017345 s |
0.0000168 s |
1.03 |
Concat / HLOOpt / cuda / PreRev |
0.000017472 s |
0.000016832 s |
1.04 |
Concat / HLOOpt / cuda / PostRev |
0.000017088 s |
0.000016958999999999998 s |
1.01 |
Concat / HLOOpt / cuda / BothRev |
0.000017024 s |
0.000016670999999999997 s |
1.02 |
Concat / PartOpt / cuda / PreRev |
0.00001712 s |
0.000017184 s |
1.00 |
Concat / PartOpt / cuda / PostRev |
0.000017408 s |
0.000016672 s |
1.04 |
Concat / PartOpt / cuda / BothRev |
0.000017344 s |
0.000016927999999999998 s |
1.02 |
Concat / IPartOpt / cuda / PreRev |
0.000017184 s |
0.000017088 s |
1.01 |
Concat / IPartOpt / cuda / PostRev |
0.00001728 s |
0.000016607 s |
1.04 |
Concat / IPartOpt / cuda / BothRev |
0.000017312 s |
0.000016768000000000003 s |
1.03 |
Concat / DefOpt / cuda / PreRev |
0.00001712 s |
0.000016896000000000002 s |
1.01 |
Concat / DefOpt / cuda / PostRev |
0.000017152 s |
0.000016608 s |
1.03 |
Concat / DefOpt / cuda / BothRev |
0.000016929 s |
0.000016832 s |
1.01 |
Concat / IDefOpt / cuda / PreRev |
0.000017216 s |
0.000017312 s |
0.99 |
Concat / IDefOpt / cuda / PostRev |
0.000016768000000000003 s |
0.000017408 s |
0.96 |
Concat / IDefOpt / cuda / BothRev |
0.00001728 s |
0.00001696 s |
1.02 |
Concat / JaXPipe / tpu / Primal |
0.00000152335 s |
0.00000151785 s |
1.00 |
Concat / Jax / tpu / Primal |
0.000001542425 s |
0.00000151125 s |
1.02 |
Concat / HLOOpt / tpu / Primal |
0.000001531725 s |
0.000001529975 s |
1.00 |
Concat / PartOpt / tpu / Primal |
0.000001529025 s |
0.00000151725 s |
1.01 |
Concat / IPartOpt / tpu / Primal |
0.0000015254499999999995 s |
0.0000015205 s |
1.00 |
Concat / DefOpt / tpu / Primal |
0.00000152395 s |
0.0000015108750000000002 s |
1.01 |
Concat / IDefOpt / tpu / Primal |
0.000001522575 s |
0.0000015263 s |
1.00 |
Concat / JaXPipe / tpu / Forward |
0.0000015812999999999998 s |
0.000001552875 s |
1.02 |
Concat / Jax / tpu / Forward |
0.0000015659 s |
0.000001551275 s |
1.01 |
Concat / HLOOpt / tpu / Forward |
0.00000158265 s |
0.000001546475 s |
1.02 |
Concat / PartOpt / tpu / Forward |
0.00000155405 s |
0.000001565425 s |
0.99 |
Concat / IPartOpt / tpu / Forward |
0.000001572875 s |
0.000001556725 s |
1.01 |
Concat / DefOpt / tpu / Forward |
0.00000156155 s |
0.0000015515499999999998 s |
1.01 |
Concat / IDefOpt / tpu / Forward |
0.0000015759 s |
0.000001565875 s |
1.01 |
Concat / JaXPipe / tpu / PreRev |
0.00000200615 s |
0.000002029675 s |
0.99 |
Concat / JaXPipe / tpu / PostRev |
0.00000206715 s |
0.000002016 s |
1.03 |
Concat / JaXPipe / tpu / BothRev |
0.0000019972750000000003 s |
0.000002037175 s |
0.98 |
Concat / Jax / tpu / BothRev |
0.000002062275 s |
0.000001993125 s |
1.03 |
Concat / HLOOpt / tpu / PreRev |
0.000001999275 s |
0.0000020260500000000003 s |
0.99 |
Concat / HLOOpt / tpu / PostRev |
0.00000206455 s |
0.000002003875 s |
1.03 |
Concat / HLOOpt / tpu / BothRev |
0.0000020139250000000004 s |
0.000002037225 s |
0.99 |
Concat / PartOpt / tpu / PreRev |
0.00000206385 s |
0.000002002225 s |
1.03 |
Concat / PartOpt / tpu / PostRev |
0.0000020095 s |
0.0000020269 s |
0.99 |
Concat / PartOpt / tpu / BothRev |
0.0000020593 s |
0.000002004175 s |
1.03 |
Concat / IPartOpt / tpu / PreRev |
0.000002004825 s |
0.000002021725 s |
0.99 |
Concat / IPartOpt / tpu / PostRev |
0.0000020631250000000003 s |
0.000001998375 s |
1.03 |
Concat / IPartOpt / tpu / BothRev |
0.000002001675 s |
0.0000020281750000000003 s |
0.99 |
Concat / DefOpt / tpu / PreRev |
0.0000020721 s |
0.000001993525 s |
1.04 |
Concat / DefOpt / tpu / PostRev |
0.000001994075 s |
0.0000020290250000000004 s |
0.98 |
Concat / DefOpt / tpu / BothRev |
0.000002064275 s |
0.0000019899 s |
1.04 |
Concat / IDefOpt / tpu / PreRev |
0.0000019979 s |
0.0000020158250000000003 s |
0.99 |
Concat / IDefOpt / tpu / PostRev |
0.000002064975 s |
0.0000020024 s |
1.03 |
Concat / IDefOpt / tpu / BothRev |
0.000002001275 s |
0.0000020226 s |
0.99 |
Concat / JaXPipe / cpu / Primal |
0.00001284 s |
0.000006810739942011423 s |
1.89 |
Concat / Jax / cpu / Primal |
0.00001278 s |
0.00000690752003720263 s |
1.85 |
Concat / HLOOpt / cpu / Primal |
0.000012964 s |
0.000006830919955973513 s |
1.90 |
Concat / PartOpt / cpu / Primal |
0.000013193 s |
0.000006688800240226555 s |
1.97 |
Concat / IPartOpt / cpu / Primal |
0.000013143 s |
0.000006490459891210776 s |
2.02 |
Concat / DefOpt / cpu / Primal |
0.000013458 s |
0.000007036420065560378 s |
1.91 |
Concat / IDefOpt / cpu / Primal |
0.000012896 s |
0.0000067557199145085175 s |
1.91 |
Concat / JaXPipe / cpu / Forward |
0.000018213 s |
0.000010362099965277594 s |
1.76 |
Concat / Jax / cpu / Forward |
0.000017612 s |
0.000009953659937309568 s |
1.77 |
Concat / HLOOpt / cpu / Forward |
0.000017639 s |
0.00000976933995843865 s |
1.81 |
Concat / PartOpt / cpu / Forward |
0.000017678 s |
0.000009767280134838076 s |
1.81 |
Concat / IPartOpt / cpu / Forward |
0.000017766 s |
0.000010285560092597734 s |
1.73 |
Concat / DefOpt / cpu / Forward |
0.00001746 s |
0.000009621979988878594 s |
1.81 |
Concat / IDefOpt / cpu / Forward |
0.000017569 s |
0.000009680279981694185 s |
1.81 |
Concat / JaXPipe / cpu / PreRev |
0.000020523 s |
0.000012034819956170395 s |
1.71 |
Concat / JaXPipe / cpu / PostRev |
0.000019097 s |
0.000011304079926048871 s |
1.69 |
Concat / JaXPipe / cpu / BothRev |
0.000020148 s |
0.000011185099720023572 s |
1.80 |
Concat / Jax / cpu / BothRev |
0.000020085 s |
0.000011534079858392944 s |
1.74 |
Concat / HLOOpt / cpu / PreRev |
0.000019797 s |
0.000011462079819466452 s |
1.73 |
Concat / HLOOpt / cpu / PostRev |
0.000019392 s |
0.000013215480030339678 s |
1.47 |
Concat / HLOOpt / cpu / BothRev |
0.000020144 s |
0.000011233880068175494 s |
1.79 |
Concat / PartOpt / cpu / PreRev |
0.000020108 s |
0.000011222520042792892 s |
1.79 |
Concat / PartOpt / cpu / PostRev |
0.000019754 s |
0.000011747020071197769 s |
1.68 |
Concat / PartOpt / cpu / BothRev |
0.000019933 s |
0.000012005360076727811 s |
1.66 |
Concat / IPartOpt / cpu / PreRev |
0.000019959 s |
0.000011245579880778678 s |
1.77 |
Concat / IPartOpt / cpu / PostRev |
0.000019629 s |
0.00001154162015154725 s |
1.70 |
Concat / IPartOpt / cpu / BothRev |
0.000019989 s |
0.000011165699870616665 s |
1.79 |
Concat / DefOpt / cpu / PreRev |
0.000019958 s |
0.000011291160044493152 s |
1.77 |
Concat / DefOpt / cpu / PostRev |
0.000032514 s |
0.00001136698014306603 s |
2.86 |
Concat / DefOpt / cpu / BothRev |
0.000020309 s |
0.00001205559976369841 s |
1.68 |
Concat / IDefOpt / cpu / PreRev |
0.000020037 s |
0.000011354359921824652 s |
1.76 |
Concat / IDefOpt / cpu / PostRev |
0.000019829 s |
0.00001100361994758714 s |
1.80 |
Concat / IDefOpt / cpu / BothRev |
0.000020216 s |
0.000011541079984453971 s |
1.75 |
const_scatter / JaXPipe / cpu / Primal |
0.000006475779982793028 s |
0.000006286120042204857 s |
1.03 |
const_scatter / Jax / cpu / Primal |
0.000006203679959071451 s |
0.000006213160122570116 s |
1.00 |
const_scatter / HLOOpt / cpu / Primal |
0.000007251440028994693 s |
0.000007482039800379425 s |
0.97 |
const_scatter / PartOpt / cpu / Primal |
0.0000063402399973711 s |
0.000006517260080727283 s |
0.97 |
const_scatter / IPartOpt / cpu / Primal |
0.00000644616004137788 s |
0.0000066745399453793655 s |
0.97 |
const_scatter / DefOpt / cpu / Primal |
0.000006890359954923042 s |
0.000007434799845214002 s |
0.93 |
const_scatter / IDefOpt / cpu / Primal |
0.0000071125799786386776 s |
0.000007135140112950466 s |
1.00 |
const_scatter / JaXPipe / cpu / Forward |
0.000010735819951150917 s |
0.000010798659968713763 s |
0.99 |
const_scatter / Jax / cpu / Forward |
0.000009027119958773257 s |
0.000008980739839898889 s |
1.01 |
const_scatter / HLOOpt / cpu / Forward |
0.000010950879941447056 s |
0.000010923260015260894 s |
1.00 |
const_scatter / PartOpt / cpu / Forward |
0.00001035103997310216 s |
0.00001075406002200907 s |
0.96 |
const_scatter / IPartOpt / cpu / Forward |
0.000010979480039168266 s |
0.00001102800011722138 s |
1.00 |
const_scatter / DefOpt / cpu / Forward |
0.0000106474999756756 s |
0.000010180959907302168 s |
1.05 |
const_scatter / IDefOpt / cpu / Forward |
0.0000105772000097204 s |
0.000010711239956435748 s |
0.99 |
const_scatter / JaXPipe / cpu / PreRev |
0.0002891748000274 s |
0.0002903115599838 s |
1.00 |
const_scatter / JaXPipe / cpu / PostRev |
0.0002829711799859 s |
0.0002942251400963 s |
0.96 |
const_scatter / JaXPipe / cpu / BothRev |
0.0002858953199938 s |
0.0002862006801296 s |
1.00 |
const_scatter / Jax / cpu / BothRev |
0.0002804655800173 s |
0.0002829291398666 s |
0.99 |
const_scatter / HLOOpt / cpu / PreRev |
0.000283232699985 s |
0.0002868426199711 s |
0.99 |
const_scatter / HLOOpt / cpu / PostRev |
0.0003201737000108 s |
0.0002869060600642 s |
1.12 |
const_scatter / HLOOpt / cpu / BothRev |
0.000283079860028 s |
0.0003001028399376 s |
0.94 |
const_scatter / PartOpt / cpu / PreRev |
0.000282718760036 s |
0.0002874072000849 s |
0.98 |
const_scatter / PartOpt / cpu / PostRev |
0.0002814333599781 s |
0.0002842899800089 s |
0.99 |
const_scatter / PartOpt / cpu / BothRev |
0.0002842633000182 s |
0.0002859568200437 s |
0.99 |
const_scatter / IPartOpt / cpu / PreRev |
0.0002845302599689 s |
0.0002868205001141 s |
0.99 |
const_scatter / IPartOpt / cpu / PostRev |
0.0002792865399806 s |
0.0002999288201317 s |
0.93 |
const_scatter / IPartOpt / cpu / BothRev |
0.0002827281999634 s |
0.0002863051198437 s |
0.99 |
const_scatter / DefOpt / cpu / PreRev |
0.0002973312000267 s |
0.000284554440077 s |
1.04 |
const_scatter / DefOpt / cpu / PostRev |
0.0002825952600414 s |
0.000285246339954 s |
0.99 |
const_scatter / DefOpt / cpu / BothRev |
0.0002797493400339 s |
0.0002855436599202 s |
0.98 |
const_scatter / IDefOpt / cpu / PreRev |
0.0002827880599943 s |
0.0002843137198942 s |
0.99 |
const_scatter / IDefOpt / cpu / PostRev |
0.0002823460400031 s |
0.0002881772199543 s |
0.98 |
const_scatter / IDefOpt / cpu / BothRev |
0.0002871784999661 s |
0.0002883722399201 s |
1.00 |
const_scatter / JaXPipe / cuda / Primal |
0.000002463 s |
0.000002431 s |
1.01 |
const_scatter / Jax / cuda / Primal |
0.000002463 s |
0.000002431 s |
1.01 |
const_scatter / HLOOpt / cuda / Primal |
0.000002463 s |
0.000002432 s |
1.01 |
const_scatter / PartOpt / cuda / Primal |
0.000002463 s |
0.000002432 s |
1.01 |
const_scatter / IPartOpt / cuda / Primal |
0.000002463 s |
0.000002431 s |
1.01 |
const_scatter / DefOpt / cuda / Primal |
0.000002463 s |
0.000002431 s |
1.01 |
const_scatter / IDefOpt / cuda / Primal |
0.000002463 s |
0.000002431 s |
1.01 |
const_scatter / JaXPipe / cuda / Forward |
0.00001088 s |
0.000010624 s |
1.02 |
const_scatter / Jax / cuda / Forward |
0.000011072 s |
0.000010847 s |
1.02 |
const_scatter / HLOOpt / cuda / Forward |
0.000011232 s |
0.000010496 s |
1.07 |
const_scatter / PartOpt / cuda / Forward |
0.000010496 s |
0.000010464 s |
1.00 |
const_scatter / IPartOpt / cuda / Forward |
0.000011168 s |
0.000011168 s |
1.00 |
const_scatter / DefOpt / cuda / Forward |
0.00001088 s |
0.000010687 s |
1.02 |
const_scatter / IDefOpt / cuda / Forward |
0.000010656 s |
0.00001088 s |
0.98 |
const_scatter / JaXPipe / cuda / PreRev |
0.000017503999999999997 s |
0.00001696 s |
1.03 |
const_scatter / JaXPipe / cuda / PostRev |
0.000017056 s |
0.000017024 s |
1.00 |
const_scatter / JaXPipe / cuda / BothRev |
0.000017664 s |
0.0000184 s |
0.96 |
const_scatter / Jax / cuda / BothRev |
0.000017344 s |
0.000016639 s |
1.04 |
const_scatter / HLOOpt / cuda / PreRev |
0.000016895 s |
0.000016831 s |
1.00 |
const_scatter / HLOOpt / cuda / PostRev |
0.000017216 s |
0.000017344 s |
0.99 |
const_scatter / HLOOpt / cuda / BothRev |
0.000019232 s |
0.000017024 s |
1.13 |
const_scatter / PartOpt / cuda / PreRev |
0.00001744 s |
0.000017152 s |
1.02 |
const_scatter / PartOpt / cuda / PostRev |
0.000017503999999999997 s |
0.00001728 s |
1.01 |
const_scatter / PartOpt / cuda / BothRev |
0.000018304 s |
0.000017632 s |
1.04 |
const_scatter / IPartOpt / cuda / PreRev |
0.0000176 s |
0.000016544 s |
1.06 |
const_scatter / IPartOpt / cuda / PostRev |
0.000017152 s |
0.000016673 s |
1.03 |
const_scatter / IPartOpt / cuda / BothRev |
0.000017056 s |
0.000016992 s |
1.00 |
const_scatter / DefOpt / cuda / PreRev |
0.000017632 s |
0.000016832 s |
1.05 |
const_scatter / DefOpt / cuda / PostRev |
0.00001728 s |
0.00001728 s |
1.00 |
const_scatter / DefOpt / cuda / BothRev |
0.000017344 s |
0.000017536 s |
0.99 |
const_scatter / IDefOpt / cuda / PreRev |
0.000017632 s |
0.000017534999999999997 s |
1.01 |
const_scatter / IDefOpt / cuda / PostRev |
0.000017249 s |
0.000017536 s |
0.98 |
const_scatter / IDefOpt / cuda / BothRev |
0.00001744 s |
0.0000176 s |
0.99 |
const_scatter / JaXPipe / tpu / Primal |
0.0000038054 s |
0.00000379845 s |
1.00 |
const_scatter / Jax / tpu / Primal |
0.000003818625 s |
0.000003824975 s |
1.00 |
const_scatter / HLOOpt / tpu / Primal |
0.000003782175 s |
0.00000380035 s |
1.00 |
const_scatter / PartOpt / tpu / Primal |
0.000003810975 s |
0.000003851725 s |
0.99 |
const_scatter / IPartOpt / tpu / Primal |
0.000003799325 s |
0.000003791925 s |
1.00 |
const_scatter / DefOpt / tpu / Primal |
0.000003830575 s |
0.000003819925 s |
1.00 |
const_scatter / IDefOpt / tpu / Primal |
0.000003814100000000001 s |
0.00000378655 s |
1.01 |
const_scatter / JaXPipe / tpu / Forward |
0.0000064681 s |
0.000006479 s |
1.00 |
const_scatter / Jax / tpu / Forward |
0.000006506775 s |
0.00000648265 s |
1.00 |
const_scatter / HLOOpt / tpu / Forward |
0.000006485125 s |
0.0000064919000000000005 s |
1.00 |
const_scatter / PartOpt / tpu / Forward |
0.0000064809 s |
0.00000646155 s |
1.00 |
const_scatter / IPartOpt / tpu / Forward |
0.00000647215 s |
0.000006509575 s |
0.99 |
const_scatter / DefOpt / tpu / Forward |
0.000006481725000000001 s |
0.00000646095 s |
1.00 |
const_scatter / IDefOpt / tpu / Forward |
0.000006473125 s |
0.00000648355 s |
1.00 |
const_scatter / JaXPipe / tpu / PreRev |
0.0000066978000000000005 s |
0.000006650225 s |
1.01 |
const_scatter / JaXPipe / tpu / PostRev |
0.000006666975000000001 s |
0.0000066379250000000005 s |
1.00 |
const_scatter / JaXPipe / tpu / BothRev |
0.0000066444 s |
0.000006636999999999999 s |
1.00 |
const_scatter / Jax / tpu / BothRev |
0.000006644099999999999 s |
0.000006643 s |
1.00 |
const_scatter / HLOOpt / tpu / PreRev |
0.000006678575 s |
0.00000664195 s |
1.01 |
const_scatter / HLOOpt / tpu / PostRev |
0.000006644949999999999 s |
0.00000664285 s |
1.00 |
const_scatter / HLOOpt / tpu / BothRev |
0.000006663125 s |
0.000006624949999999999 s |
1.01 |
const_scatter / PartOpt / tpu / PreRev |
0.00000665045 s |
0.000006633075 s |
1.00 |
const_scatter / PartOpt / tpu / PostRev |
0.000006662850000000001 s |
0.00000661795 s |
1.01 |
const_scatter / PartOpt / tpu / BothRev |
0.00000667025 s |
0.0000066514 s |
1.00 |
const_scatter / IPartOpt / tpu / PreRev |
0.000006641025 s |
0.000006649350000000001 s |
1.00 |
const_scatter / IPartOpt / tpu / PostRev |
0.000006663900000000001 s |
0.0000066319 s |
1.00 |
const_scatter / IPartOpt / tpu / BothRev |
0.000006649825 s |
0.000006648775 s |
1.00 |
const_scatter / DefOpt / tpu / PreRev |
0.0000066514 s |
0.00000664805 s |
1.00 |
const_scatter / DefOpt / tpu / PostRev |
0.0000066691 s |
0.000006630575 s |
1.01 |
const_scatter / DefOpt / tpu / BothRev |
0.000006646025 s |
0.000006650775 s |
1.00 |
const_scatter / IDefOpt / tpu / PreRev |
0.0000066491 s |
0.000006623449999999999 s |
1.00 |
const_scatter / IDefOpt / tpu / PostRev |
0.000006645425 s |
0.000006627374999999999 s |
1.00 |
const_scatter / IDefOpt / tpu / BothRev |
0.000006681824999999999 s |
0.000006617875 s |
1.01 |
const_scatter / JaXPipe / cpu / Primal |
0.000013155 s |
0.000006286120042204857 s |
2.09 |
const_scatter / Jax / cpu / Primal |
0.000012858 s |
0.000006213160122570116 s |
2.07 |
const_scatter / HLOOpt / cpu / Primal |
0.000013416000000000002 s |
0.000007482039800379425 s |
1.79 |
const_scatter / PartOpt / cpu / Primal |
0.000012822 s |
0.000006517260080727283 s |
1.97 |
const_scatter / IPartOpt / cpu / Primal |
0.000013096999999999998 s |
0.0000066745399453793655 s |
1.96 |
const_scatter / DefOpt / cpu / Primal |
0.000013619 s |
0.000007434799845214002 s |
1.83 |
const_scatter / IDefOpt / cpu / Primal |
0.00001335 s |
0.000007135140112950466 s |
1.87 |
const_scatter / JaXPipe / cpu / Forward |
0.000018037 s |
0.000010798659968713763 s |
1.67 |
const_scatter / Jax / cpu / Forward |
0.00001698 s |
0.000008980739839898889 s |
1.89 |
const_scatter / HLOOpt / cpu / Forward |
0.000018241 s |
0.000010923260015260894 s |
1.67 |
const_scatter / PartOpt / cpu / Forward |
0.000017745 s |
0.00001075406002200907 s |
1.65 |
const_scatter / IPartOpt / cpu / Forward |
0.000017915 s |
0.00001102800011722138 s |
1.62 |
const_scatter / DefOpt / cpu / Forward |
0.000018022 s |
0.000010180959907302168 s |
1.77 |
const_scatter / IDefOpt / cpu / Forward |
0.000017807 s |
0.000010711239956435748 s |
1.66 |
const_scatter / JaXPipe / cpu / PreRev |
0.000542264 s |
0.0002903115599838 s |
1.87 |
const_scatter / JaXPipe / cpu / PostRev |
0.00052428 s |
0.0002942251400963 s |
1.78 |
const_scatter / JaXPipe / cpu / BothRev |
0.0005249979999999 s |
0.0002862006801296 s |
1.83 |
const_scatter / Jax / cpu / BothRev |
0.000528863 s |
0.0002829291398666 s |
1.87 |
const_scatter / HLOOpt / cpu / PreRev |
0.000524897 s |
0.0002868426199711 s |
1.83 |
const_scatter / HLOOpt / cpu / PostRev |
0.000524954 s |
0.0002869060600642 s |
1.83 |
const_scatter / HLOOpt / cpu / BothRev |
0.000525206 s |
0.0003001028399376 s |
1.75 |
const_scatter / PartOpt / cpu / PreRev |
0.000535813 s |
0.0002874072000849 s |
1.86 |
const_scatter / PartOpt / cpu / PostRev |
0.0005064889999999 s |
0.0002842899800089 s |
1.78 |
const_scatter / PartOpt / cpu / BothRev |
0.0005216239999999 s |
0.0002859568200437 s |
1.82 |
const_scatter / IPartOpt / cpu / PreRev |
0.000535832 s |
0.0002868205001141 s |
1.87 |
const_scatter / IPartOpt / cpu / PostRev |
0.0005254529999999 s |
0.0002999288201317 s |
1.75 |
const_scatter / IPartOpt / cpu / BothRev |
0.000515669 s |
0.0002863051198437 s |
1.80 |
const_scatter / DefOpt / cpu / PreRev |
0.000507963 s |
0.000284554440077 s |
1.79 |
const_scatter / DefOpt / cpu / PostRev |
0.000526204 s |
0.000285246339954 s |
1.84 |
const_scatter / DefOpt / cpu / BothRev |
0.000518865 s |
0.0002855436599202 s |
1.82 |
const_scatter / IDefOpt / cpu / PreRev |
0.000531266 s |
0.0002843137198942 s |
1.87 |
const_scatter / IDefOpt / cpu / PostRev |
0.000525134 s |
0.0002881772199543 s |
1.82 |
const_scatter / IDefOpt / cpu / BothRev |
0.00053395 s |
0.0002883722399201 s |
1.85 |
GenDot / JaXPipe / cpu / Primal |
0.000007058640039758757 s |
0.000007314039976336062 s |
0.97 |
GenDot / Jax / cpu / Primal |
0.000007158180023907335 s |
0.00000663925995468162 s |
1.08 |
GenDot / HLOOpt / cpu / Primal |
0.000007439219980369671 s |
0.000007459320186171681 s |
1.00 |
GenDot / PartOpt / cpu / Primal |
0.000006639479970544926 s |
0.000006850139907328412 s |
0.97 |
GenDot / IPartOpt / cpu / Primal |
0.000007283680006366922 s |
0.0000070201801645453086 s |
1.04 |
GenDot / DefOpt / cpu / Primal |
0.000007094319980751606 s |
0.00000712750006641727 s |
1.00 |
GenDot / IDefOpt / cpu / Primal |
0.000007065799982228782 s |
0.000007662799871468451 s |
0.92 |
GenDot / JaXPipe / cpu / Forward |
0.000011142140001538792 s |
0.000011181799964106176 s |
1.00 |
GenDot / Jax / cpu / Forward |
0.000010192559984716356 s |
0.000010393080046924296 s |
0.98 |
GenDot / HLOOpt / cpu / Forward |
0.000011260099981882376 s |
0.000011051219880755523 s |
1.02 |
GenDot / PartOpt / cpu / Forward |
0.000010684959997888654 s |
0.000010293439991073684 s |
1.04 |
GenDot / IPartOpt / cpu / Forward |
0.00001114266000513453 s |
0.000011467720323707908 s |
0.97 |
GenDot / DefOpt / cpu / Forward |
0.000010522500033403049 s |
0.000010869840079976712 s |
0.97 |
GenDot / IDefOpt / cpu / Forward |
0.000010230900024907895 s |
0.00001090998004656285 s |
0.94 |
GenDot / JaXPipe / cpu / PreRev |
0.00001133010000557988 s |
0.00001122697998653166 s |
1.01 |
GenDot / JaXPipe / cpu / PostRev |
0.00000938481997764029 s |
0.000010410219911136664 s |
0.90 |
GenDot / JaXPipe / cpu / BothRev |
0.000010959040009765886 s |
0.000010769860055006574 s |
1.02 |
GenDot / Jax / cpu / BothRev |
0.000010252279962514876 s |
0.000010883420036407188 s |
0.94 |
GenDot / HLOOpt / cpu / PreRev |
0.000011009320023731562 s |
0.000010963120112137404 s |
1.00 |
GenDot / HLOOpt / cpu / PostRev |
0.00001289864003410912 s |
0.00001305949997913558 s |
0.99 |
GenDot / HLOOpt / cpu / BothRev |
0.00001051152002219169 s |
0.000011048319975088816 s |
0.95 |
GenDot / PartOpt / cpu / PreRev |
0.000010857540037250146 s |
0.000011295960139250382 s |
0.96 |
GenDot / PartOpt / cpu / PostRev |
0.000009799860008570249 s |
0.000010292840015608817 s |
0.95 |
GenDot / PartOpt / cpu / BothRev |
0.000010963799977616871 s |
0.000010914960075751878 s |
1.00 |
GenDot / IPartOpt / cpu / PreRev |
0.000010451199968883885 s |
0.000011207640054635704 s |
0.93 |
GenDot / IPartOpt / cpu / PostRev |
0.000009857459999693676 s |
0.000010758139869722073 s |
0.92 |
GenDot / IPartOpt / cpu / BothRev |
0.000011122760024591117 s |
0.000010854679858312011 s |
1.02 |
GenDot / DefOpt / cpu / PreRev |
0.000010702600047807209 s |
0.000010493860172573476 s |
1.02 |
GenDot / DefOpt / cpu / PostRev |
0.000010777740017147152 s |
0.000011142780094814952 s |
0.97 |
GenDot / DefOpt / cpu / BothRev |
0.000010920439999608788 s |
0.000010720419995777774 s |
1.02 |
GenDot / IDefOpt / cpu / PreRev |
0.000010280499973305268 s |
0.00001113910027925158 s |
0.92 |
GenDot / IDefOpt / cpu / PostRev |
0.000010718200046540004 s |
0.000011556879871932325 s |
0.93 |
GenDot / IDefOpt / cpu / BothRev |
0.000010665100035112118 s |
0.000010867720084206666 s |
0.98 |
GenDot / JaXPipe / cuda / Primal |
0.000002528 s |
0.000002528 s |
1 |
GenDot / Jax / cuda / Primal |
0.000002527 s |
0.000002527 s |
1 |
GenDot / HLOOpt / cuda / Primal |
0.000002496 s |
0.000002527 s |
0.99 |
GenDot / PartOpt / cuda / Primal |
0.000002528 s |
0.000002527 s |
1.00 |
GenDot / IPartOpt / cuda / Primal |
0.000002528 s |
0.000002528 s |
1 |
GenDot / DefOpt / cuda / Primal |
0.000002527 s |
0.000002527 s |
1 |
GenDot / IDefOpt / cuda / Primal |
0.000002527 s |
0.000002527 s |
1 |
GenDot / JaXPipe / cuda / Forward |
0.000010624 s |
0.000012224 s |
0.87 |
GenDot / Jax / cuda / Forward |
0.000010848 s |
0.000010528 s |
1.03 |
GenDot / HLOOpt / cuda / Forward |
0.000010689 s |
0.000010687 s |
1.00 |
GenDot / PartOpt / cuda / Forward |
0.000011520000000000002 s |
0.000010464 s |
1.10 |
GenDot / IPartOpt / cuda / Forward |
0.00000992 s |
0.000010752 s |
0.92 |
GenDot / DefOpt / cuda / Forward |
0.000010592 s |
0.000012064 s |
0.88 |
GenDot / IDefOpt / cuda / Forward |
0.000010944 s |
0.000012224 s |
0.90 |
GenDot / JaXPipe / cuda / PreRev |
0.000010624 s |
0.000012288 s |
0.86 |
GenDot / JaXPipe / cuda / PostRev |
0.000010688 s |
0.000010656 s |
1.00 |
GenDot / JaXPipe / cuda / BothRev |
0.00001072 s |
0.00001072 s |
1 |
GenDot / Jax / cuda / BothRev |
0.0000112 s |
0.000010688 s |
1.05 |
GenDot / HLOOpt / cuda / PreRev |
0.000010784 s |
0.000010752 s |
1.00 |
GenDot / HLOOpt / cuda / PostRev |
0.00001072 s |
0.000010496 s |
1.02 |
GenDot / HLOOpt / cuda / BothRev |
0.000010432 s |
0.000010849 s |
0.96 |
GenDot / PartOpt / cuda / PreRev |
0.000010848 s |
0.000011071 s |
0.98 |
GenDot / PartOpt / cuda / PostRev |
0.000010624 s |
0.000010624 s |
1 |
GenDot / PartOpt / cuda / BothRev |
0.000010784 s |
0.000010592 s |
1.02 |
GenDot / IPartOpt / cuda / PreRev |
0.000010592 s |
0.000010848 s |
0.98 |
GenDot / IPartOpt / cuda / PostRev |
0.000010912 s |
0.00001072 s |
1.02 |
GenDot / IPartOpt / cuda / BothRev |
0.000010912 s |
0.00001088 s |
1.00 |
GenDot / DefOpt / cuda / PreRev |
0.000011168 s |
0.000010784 s |
1.04 |
GenDot / DefOpt / cuda / PostRev |
0.000010847 s |
0.000010784 s |
1.01 |
GenDot / DefOpt / cuda / BothRev |
0.000010816 s |
0.000010656 s |
1.02 |
GenDot / IDefOpt / cuda / PreRev |
0.000010623 s |
0.000010623 s |
1 |
GenDot / IDefOpt / cuda / PostRev |
0.000010688 s |
0.000010848 s |
0.99 |
GenDot / IDefOpt / cuda / BothRev |
0.000011039 s |
0.000010656 s |
1.04 |
GenDot / JaXPipe / tpu / Primal |
9.2685e-7 s |
9.43175e-7 s |
0.98 |
GenDot / Jax / tpu / Primal |
9.25575e-7 s |
9.3e-7 s |
1.00 |
GenDot / HLOOpt / tpu / Primal |
0.000001557925 s |
0.0000016040750000000002 s |
0.97 |
GenDot / PartOpt / tpu / Primal |
9.25925e-7 s |
9.303e-7 s |
1.00 |
GenDot / IPartOpt / tpu / Primal |
9.26075e-7 s |
9.436e-7 s |
0.98 |
GenDot / DefOpt / tpu / Primal |
0.0000014994 s |
0.0000014983 s |
1.00 |
GenDot / IDefOpt / tpu / Primal |
0.000001557325 s |
0.000001602275 s |
0.97 |
GenDot / JaXPipe / tpu / Forward |
0.000003170625 s |
0.0000030512 s |
1.04 |
GenDot / Jax / tpu / Forward |
0.000002311125 s |
0.000002273475 s |
1.02 |
GenDot / HLOOpt / tpu / Forward |
0.000003124975 s |
0.0000031171500000000003 s |
1.00 |
GenDot / PartOpt / tpu / Forward |
0.000003226675 s |
0.0000031349 s |
1.03 |
GenDot / IPartOpt / tpu / Forward |
0.000003113575 s |
0.00000311855 s |
1.00 |
GenDot / DefOpt / tpu / Forward |
0.00000321945 s |
0.000003145825 s |
1.02 |
GenDot / IDefOpt / tpu / Forward |
0.0000031173 s |
0.000003114475 s |
1.00 |
GenDot / JaXPipe / tpu / PreRev |
0.000002957 s |
0.0000030383 s |
0.97 |
GenDot / JaXPipe / tpu / PostRev |
0.000002416325 s |
0.0000023743000000000003 s |
1.02 |
GenDot / JaXPipe / tpu / BothRev |
0.0000029558 s |
0.00000302605 s |
0.98 |
GenDot / Jax / tpu / BothRev |
0.000002405025 s |
0.000002377425 s |
1.01 |
GenDot / HLOOpt / tpu / PreRev |
0.00000296925 s |
0.000003033975 s |
0.98 |
GenDot / HLOOpt / tpu / PostRev |
0.0000029382750000000003 s |
0.000002945275 s |
1.00 |
GenDot / HLOOpt / tpu / BothRev |
0.000002951775 s |
0.00000302865 s |
0.97 |
GenDot / PartOpt / tpu / PreRev |
0.0000029365999999999995 s |
0.0000029443000000000003 s |
1.00 |
GenDot / PartOpt / tpu / PostRev |
0.00000239175 s |
0.000002416025 s |
0.99 |
GenDot / PartOpt / tpu / BothRev |
0.0000029344 s |
0.000002944575 s |
1.00 |
GenDot / IPartOpt / tpu / PreRev |
0.0000029613250000000004 s |
0.000003024925 s |
0.98 |
GenDot / IPartOpt / tpu / PostRev |
0.0000024087000000000003 s |
0.00000238245 s |
1.01 |
GenDot / IPartOpt / tpu / BothRev |
0.0000029529 s |
0.00000302485 s |
0.98 |
GenDot / DefOpt / tpu / PreRev |
0.0000029382 s |
0.00000297045 s |
0.99 |
GenDot / DefOpt / tpu / PostRev |
0.00000295465 s |
0.0000030201 s |
0.98 |
GenDot / DefOpt / tpu / BothRev |
0.000002934975 s |
0.000002939575 s |
1.00 |
GenDot / IDefOpt / tpu / PreRev |
0.0000029626 s |
0.00000301605 s |
0.98 |
GenDot / IDefOpt / tpu / PostRev |
0.000002929725 s |
0.000002966875 s |
0.99 |
GenDot / IDefOpt / tpu / BothRev |
0.00000296775 s |
0.000003017975 s |
0.98 |
GenDot / JaXPipe / cpu / Primal |
0.000014994 s |
0.000007314039976336062 s |
2.05 |
GenDot / Jax / cpu / Primal |
0.000015317 s |
0.00000663925995468162 s |
2.31 |
GenDot / HLOOpt / cpu / Primal |
0.000014223 s |
0.000007459320186171681 s |
1.91 |
GenDot / PartOpt / cpu / Primal |
0.000015079 s |
0.000006850139907328412 s |
2.20 |
GenDot / IPartOpt / cpu / Primal |
0.000014777 s |
0.0000070201801645453086 s |
2.10 |
GenDot / DefOpt / cpu / Primal |
0.000014128 s |
0.00000712750006641727 s |
1.98 |
GenDot / IDefOpt / cpu / Primal |
0.000014334 s |
0.000007662799871468451 s |
1.87 |
GenDot / JaXPipe / cpu / Forward |
0.000019377 s |
0.000011181799964106176 s |
1.73 |
GenDot / Jax / cpu / Forward |
0.000020238 s |
0.000010393080046924296 s |
1.95 |
GenDot / HLOOpt / cpu / Forward |
0.000019147 s |
0.000011051219880755523 s |
1.73 |
GenDot / PartOpt / cpu / Forward |
0.000019468 s |
0.000010293439991073684 s |
1.89 |
GenDot / IPartOpt / cpu / Forward |
0.000019482 s |
0.000011467720323707908 s |
1.70 |
GenDot / DefOpt / cpu / Forward |
0.000019869 s |
0.000010869840079976712 s |
1.83 |
GenDot / IDefOpt / cpu / Forward |
0.000019446 s |
0.00001090998004656285 s |
1.78 |
GenDot / JaXPipe / cpu / PreRev |
0.000019747 s |
0.00001122697998653166 s |
1.76 |
GenDot / JaXPipe / cpu / PostRev |
0.000020508 s |
0.000010410219911136664 s |
1.97 |
GenDot / JaXPipe / cpu / BothRev |
0.000019463 s |
0.000010769860055006574 s |
1.81 |
GenDot / Jax / cpu / BothRev |
0.000021255 s |
0.000010883420036407188 s |
1.95 |
GenDot / HLOOpt / cpu / PreRev |
0.000019659 s |
0.000010963120112137404 s |
1.79 |
GenDot / HLOOpt / cpu / PostRev |
0.000019426 s |
0.00001305949997913558 s |
1.49 |
GenDot / HLOOpt / cpu / BothRev |
0.000019623 s |
0.000011048319975088816 s |
1.78 |
GenDot / PartOpt / cpu / PreRev |
0.000019093 s |
0.000011295960139250382 s |
1.69 |
GenDot / PartOpt / cpu / PostRev |
0.00002089 s |
0.000010292840015608817 s |
2.03 |
GenDot / PartOpt / cpu / BothRev |
0.000019675 s |
0.000010914960075751878 s |
1.80 |
GenDot / IPartOpt / cpu / PreRev |
0.000019429 s |
0.000011207640054635704 s |
1.73 |
GenDot / IPartOpt / cpu / PostRev |
0.000020497 s |
0.000010758139869722073 s |
1.91 |
GenDot / IPartOpt / cpu / BothRev |
0.000019604 s |
0.000010854679858312011 s |
1.81 |
GenDot / DefOpt / cpu / PreRev |
0.000019313 s |
0.000010493860172573476 s |
1.84 |
GenDot / DefOpt / cpu / PostRev |
0.000019731 s |
0.000011142780094814952 s |
1.77 |
GenDot / DefOpt / cpu / BothRev |
0.00001981 s |
0.000010720419995777774 s |
1.85 |
GenDot / IDefOpt / cpu / PreRev |
0.000019317 s |
0.00001113910027925158 s |
1.73 |
GenDot / IDefOpt / cpu / PostRev |
0.000019795 s |
0.000011556879871932325 s |
1.71 |
GenDot / IDefOpt / cpu / BothRev |
0.000019768 s |
0.000010867720084206666 s |
1.82 |
hlo_ffi / JaXPipe / cpu / Primal |
0.000011248860027990304 s |
0.000011581580110942014 s |
0.97 |
hlo_ffi / Jax / cpu / Primal |
0.000010887139997066697 s |
0.000011033399787265809 s |
0.99 |
hlo_ffi / HLOOpt / cpu / Primal |
0.000011678839946398512 s |
0.000011139259877381846 s |
1.05 |
hlo_ffi / PartOpt / cpu / Primal |
0.000011032420015908428 s |
0.00001092610018531559 s |
1.01 |
hlo_ffi / IPartOpt / cpu / Primal |
0.000011074700041717734 s |
0.000011699480055540334 s |
0.95 |
hlo_ffi / DefOpt / cpu / Primal |
0.00001087645998268272 s |
0.000011188660319021437 s |
0.97 |
hlo_ffi / IDefOpt / cpu / Primal |
0.00001081174003047636 s |
0.000011229419942537787 s |
0.96 |
hlo_ffi / JaXPipe / cpu / Forward |
0.000015464319976672413 s |
0.00001648509998631198 s |
0.94 |
hlo_ffi / Jax / cpu / Forward |
0.000016157620011654215 s |
0.0000156106400754652 s |
1.04 |
hlo_ffi / HLOOpt / cpu / Forward |
0.000015860419989621732 s |
0.000016109379939734935 s |
0.98 |
hlo_ffi / PartOpt / cpu / Forward |
0.00001607590001185599 s |
0.00001583156001288444 s |
1.02 |
hlo_ffi / IPartOpt / cpu / Forward |
0.000016245139995589853 s |
0.00001557716004754184 s |
1.04 |
hlo_ffi / DefOpt / cpu / Forward |
0.000015877420009928757 s |
0.00001562955996632809 s |
1.02 |
hlo_ffi / IDefOpt / cpu / Forward |
0.000017157500005851036 s |
0.000015224300077534282 s |
1.13 |
hlo_ffi / JaXPipe / cpu / PreRev |
0.00001564812001561222 s |
0.00001584945996000897 s |
0.99 |
hlo_ffi / JaXPipe / cpu / PostRev |
0.000014765619989702828 s |
0.00001478201989812078 s |
1.00 |
hlo_ffi / JaXPipe / cpu / BothRev |
0.000014673519999632844 s |
0.000015412480024679098 s |
0.95 |
hlo_ffi / Jax / cpu / BothRev |
0.000015155280016188044 s |
0.000016434639983344825 s |
0.92 |
hlo_ffi / HLOOpt / cpu / PreRev |
0.000016049400001065807 s |
0.000015840260020922868 s |
1.01 |
hlo_ffi / HLOOpt / cpu / PostRev |
0.000016979560014078744 s |
0.00001723093995678937 s |
0.99 |
hlo_ffi / HLOOpt / cpu / BothRev |
0.000014861960034977528 s |
0.000015616500058968084 s |
0.95 |
hlo_ffi / PartOpt / cpu / PreRev |
0.00001615130001482612 s |
0.000017233819999091793 s |
0.94 |
hlo_ffi / PartOpt / cpu / PostRev |
0.000015035579999675972 s |
0.00001562794001074508 s |
0.96 |
hlo_ffi / PartOpt / cpu / BothRev |
0.00001524263998362585 s |
0.000015728639846201985 s |
0.97 |
hlo_ffi / IPartOpt / cpu / PreRev |
0.00001541059998999117 s |
0.000016466500019305384 s |
0.94 |
hlo_ffi / IPartOpt / cpu / PostRev |
0.000014859400007480872 s |
0.000015306339992093854 s |
0.97 |
hlo_ffi / IPartOpt / cpu / BothRev |
0.000014832500000920846 s |
0.000015431739957421086 s |
0.96 |
hlo_ffi / DefOpt / cpu / PreRev |
0.00001579862007019983 s |
0.000016104480055219027 s |
0.98 |
hlo_ffi / DefOpt / cpu / PostRev |
0.000015516859939452843 s |
0.0000156216798859532 s |
0.99 |
hlo_ffi / DefOpt / cpu / BothRev |
0.000015102860015758778 s |
0.000015211660211207345 s |
0.99 |
hlo_ffi / IDefOpt / cpu / PreRev |
0.00001593401998434274 s |
0.00001618880000023637 s |
0.98 |
hlo_ffi / IDefOpt / cpu / PostRev |
0.000015209540060823202 s |
0.000015539559790340718 s |
0.98 |
hlo_ffi / IDefOpt / cpu / BothRev |
0.00001575933999447443 s |
0.00001603133998287376 s |
0.98 |
hlo_ffi / JaXPipe / cuda / Primal |
0.0000023670000000000004 s |
0.000002368 s |
1.00 |
hlo_ffi / Jax / cuda / Primal |
0.0000023670000000000004 s |
0.000002368 s |
1.00 |
hlo_ffi / HLOOpt / cuda / Primal |
0.0000023670000000000004 s |
0.000002368 s |
1.00 |
hlo_ffi / PartOpt / cuda / Primal |
0.000002368 s |
0.000002368 s |
1 |
hlo_ffi / IPartOpt / cuda / Primal |
0.0000023670000000000004 s |
0.000002368 s |
1.00 |
hlo_ffi / DefOpt / cuda / Primal |
0.000002368 s |
0.0000023670000000000004 s |
1.00 |
hlo_ffi / IDefOpt / cuda / Primal |
0.0000023670000000000004 s |
0.000002369 s |
1.00 |
hlo_ffi / JaXPipe / cuda / Forward |
0.000002463 s |
0.000002463 s |
1 |
hlo_ffi / Jax / cuda / Forward |
0.000002463 s |
0.000002463 s |
1 |
hlo_ffi / HLOOpt / cuda / Forward |
0.000002463 s |
0.000002463 s |
1 |
hlo_ffi / PartOpt / cuda / Forward |
0.000002463 s |
0.000002463 s |
1 |
hlo_ffi / IPartOpt / cuda / Forward |
0.000002463 s |
0.000002463 s |
1 |
hlo_ffi / DefOpt / cuda / Forward |
0.000002463 s |
0.000002463 s |
1 |
hlo_ffi / IDefOpt / cuda / Forward |
0.000002463 s |
0.000002464 s |
1.00 |
hlo_ffi / JaXPipe / cuda / PreRev |
0.000002432 s |
0.000002432 s |
1 |
hlo_ffi / JaXPipe / cuda / PostRev |
0.000002432 s |
0.000002463 s |
0.99 |
hlo_ffi / JaXPipe / cuda / BothRev |
0.000002431 s |
0.000002463 s |
0.99 |
hlo_ffi / Jax / cuda / BothRev |
0.000002432 s |
0.000002432 s |
1 |
hlo_ffi / HLOOpt / cuda / PreRev |
0.000002432 s |
0.000002433 s |
1.00 |
hlo_ffi / HLOOpt / cuda / PostRev |
0.000002432 s |
0.000002463 s |
0.99 |
hlo_ffi / HLOOpt / cuda / BothRev |
0.000002432 s |
0.000002463 s |
0.99 |
hlo_ffi / PartOpt / cuda / PreRev |
0.000002431 s |
0.000002432 s |
1.00 |
hlo_ffi / PartOpt / cuda / PostRev |
0.000002431 s |
0.000002463 s |
0.99 |
hlo_ffi / PartOpt / cuda / BothRev |
0.000002432 s |
0.000002432 s |
1 |
hlo_ffi / IPartOpt / cuda / PreRev |
0.000002432 s |
0.000002463 s |
0.99 |
hlo_ffi / IPartOpt / cuda / PostRev |
0.000002432 s |
0.000002432 s |
1 |
hlo_ffi / IPartOpt / cuda / BothRev |
0.000002431 s |
0.000002463 s |
0.99 |
hlo_ffi / DefOpt / cuda / PreRev |
0.000002432 s |
0.000002432 s |
1 |
hlo_ffi / DefOpt / cuda / PostRev |
0.000002432 s |
0.000002432 s |
1 |
hlo_ffi / DefOpt / cuda / BothRev |
0.000002432 s |
0.000002432 s |
1 |
hlo_ffi / IDefOpt / cuda / PreRev |
0.000002431 s |
0.000002432 s |
1.00 |
hlo_ffi / IDefOpt / cuda / PostRev |
0.000002433 s |
0.000002432 s |
1.00 |
hlo_ffi / IDefOpt / cuda / BothRev |
0.000002432 s |
0.000002432 s |
1 |
hlo_ffi / JaXPipe / tpu / Primal |
9.27725e-7 s |
9.07525e-7 s |
1.02 |
hlo_ffi / Jax / tpu / Primal |
9.533e-7 s |
9.7945e-7 s |
0.97 |
hlo_ffi / HLOOpt / tpu / Primal |
9.0905e-7 s |
9.32225e-7 s |
0.98 |
hlo_ffi / PartOpt / tpu / Primal |
9.50925e-7 s |
9.72375e-7 s |
0.98 |
hlo_ffi / IPartOpt / tpu / Primal |
9.09225e-7 s |
9.4025e-7 s |
0.97 |
hlo_ffi / DefOpt / tpu / Primal |
9.571e-7 s |
9.73025e-7 s |
0.98 |
hlo_ffi / IDefOpt / tpu / Primal |
9.08225e-7 s |
9.38325e-7 s |
0.97 |
hlo_ffi / JaXPipe / tpu / Forward |
9.48975e-7 s |
9.494e-7 s |
1.00 |
hlo_ffi / Jax / tpu / Forward |
9.8135e-7 s |
9.8205e-7 s |
1.00 |
hlo_ffi / HLOOpt / tpu / Forward |
9.743499999999998e-7 s |
9.744e-7 s |
1.00 |
hlo_ffi / PartOpt / tpu / Forward |
9.3415e-7 s |
9.5955e-7 s |
0.97 |
hlo_ffi / IPartOpt / tpu / Forward |
9.74025e-7 s |
9.74075e-7 s |
1.00 |
hlo_ffi / DefOpt / tpu / Forward |
9.335e-7 s |
9.59675e-7 s |
0.97 |
hlo_ffi / IDefOpt / tpu / Forward |
9.7445e-7 s |
9.74475e-7 s |
1.00 |
hlo_ffi / JaXPipe / tpu / PreRev |
9.38475e-7 s |
9.481e-7 s |
0.99 |
hlo_ffi / JaXPipe / tpu / PostRev |
9.6545e-7 s |
9.65425e-7 s |
1.00 |
hlo_ffi / JaXPipe / tpu / BothRev |
9.62225e-7 s |
9.942499999999998e-7 s |
0.97 |
hlo_ffi / Jax / tpu / BothRev |
9.64425e-7 s |
9.64775e-7 s |
1.00 |
hlo_ffi / HLOOpt / tpu / PreRev |
9.6195e-7 s |
9.950249999999998e-7 s |
0.97 |
hlo_ffi / HLOOpt / tpu / PostRev |
9.648e-7 s |
9.65125e-7 s |
1.00 |
hlo_ffi / HLOOpt / tpu / BothRev |
9.61375e-7 s |
9.94975e-7 s |
0.97 |
hlo_ffi / PartOpt / tpu / PreRev |
9.649e-7 s |
9.64675e-7 s |
1.00 |
hlo_ffi / PartOpt / tpu / PostRev |
9.612999999999998e-7 s |
9.9515e-7 s |
0.97 |
hlo_ffi / PartOpt / tpu / BothRev |
9.648e-7 s |
9.64725e-7 s |
1.00 |
hlo_ffi / IPartOpt / tpu / PreRev |
9.61725e-7 s |
9.946249999999998e-7 s |
0.97 |
hlo_ffi / IPartOpt / tpu / PostRev |
9.651e-7 s |
9.6535e-7 s |
1.00 |
hlo_ffi / IPartOpt / tpu / BothRev |
9.625e-7 s |
9.95075e-7 s |
0.97 |
hlo_ffi / DefOpt / tpu / PreRev |
9.64825e-7 s |
9.64825e-7 s |
1 |
hlo_ffi / DefOpt / tpu / PostRev |
9.618500000000002e-7 s |
9.948e-7 s |
0.97 |
hlo_ffi / DefOpt / tpu / BothRev |
9.64825e-7 s |
9.654e-7 s |
1.00 |
hlo_ffi / IDefOpt / tpu / PreRev |
9.6155e-7 s |
9.9475e-7 s |
0.97 |
hlo_ffi / IDefOpt / tpu / PostRev |
9.64675e-7 s |
9.646e-7 s |
1.00 |
hlo_ffi / IDefOpt / tpu / BothRev |
9.62175e-7 s |
9.949e-7 s |
0.97 |
hlo_ffi / JaXPipe / cpu / Primal |
0.000018173 s |
0.000011581580110942014 s |
1.57 |
hlo_ffi / Jax / cpu / Primal |
0.000017526 s |
0.000011033399787265809 s |
1.59 |
hlo_ffi / HLOOpt / cpu / Primal |
0.000017415 s |
0.000011139259877381846 s |
1.56 |
hlo_ffi / PartOpt / cpu / Primal |
0.000017746999999999998 s |
0.00001092610018531559 s |
1.62 |
hlo_ffi / IPartOpt / cpu / Primal |
0.000017413 s |
0.000011699480055540334 s |
1.49 |
hlo_ffi / DefOpt / cpu / Primal |
0.000017848 s |
0.000011188660319021437 s |
1.60 |
hlo_ffi / IDefOpt / cpu / Primal |
0.000017776 s |
0.000011229419942537787 s |
1.58 |
hlo_ffi / JaXPipe / cpu / Forward |
0.000025084 s |
0.00001648509998631198 s |
1.52 |
hlo_ffi / Jax / cpu / Forward |
0.00002387 s |
0.0000156106400754652 s |
1.53 |
hlo_ffi / HLOOpt / cpu / Forward |
0.000023866 s |
0.000016109379939734935 s |
1.48 |
hlo_ffi / PartOpt / cpu / Forward |
0.000024555 s |
0.00001583156001288444 s |
1.55 |
hlo_ffi / IPartOpt / cpu / Forward |
0.000023656 s |
0.00001557716004754184 s |
1.52 |
hlo_ffi / DefOpt / cpu / Forward |
0.00002411 s |
0.00001562955996632809 s |
1.54 |
hlo_ffi / IDefOpt / cpu / Forward |
0.000023745 s |
0.000015224300077534282 s |
1.56 |
hlo_ffi / JaXPipe / cpu / PreRev |
0.00002368 s |
0.00001584945996000897 s |
1.49 |
hlo_ffi / JaXPipe / cpu / PostRev |
0.000023558 s |
0.00001478201989812078 s |
1.59 |
hlo_ffi / JaXPipe / cpu / BothRev |
0.000023434 s |
0.000015412480024679098 s |
1.52 |
hlo_ffi / Jax / cpu / BothRev |
0.000023831000000000003 s |
0.000016434639983344825 s |
1.45 |
hlo_ffi / HLOOpt / cpu / PreRev |
0.000023874 s |
0.000015840260020922868 s |
1.51 |
hlo_ffi / HLOOpt / cpu / PostRev |
0.000024168 s |
0.00001723093995678937 s |
1.40 |
hlo_ffi / HLOOpt / cpu / BothRev |
0.000024111 s |
0.000015616500058968084 s |
1.54 |
hlo_ffi / PartOpt / cpu / PreRev |
0.000023619 s |
0.000017233819999091793 s |
1.37 |
hlo_ffi / PartOpt / cpu / PostRev |
0.000023751 s |
0.00001562794001074508 s |
1.52 |
hlo_ffi / PartOpt / cpu / BothRev |
0.000023997 s |
0.000015728639846201985 s |
1.53 |
hlo_ffi / IPartOpt / cpu / PreRev |
0.000023674 s |
0.000016466500019305384 s |
1.44 |
hlo_ffi / IPartOpt / cpu / PostRev |
0.000023993 s |
0.000015306339992093854 s |
1.57 |
hlo_ffi / IPartOpt / cpu / BothRev |
0.000024231 s |
0.000015431739957421086 s |
1.57 |
hlo_ffi / DefOpt / cpu / PreRev |
0.000023674 s |
0.000016104480055219027 s |
1.47 |
hlo_ffi / DefOpt / cpu / PostRev |
0.000023736 s |
0.0000156216798859532 s |
1.52 |
hlo_ffi / DefOpt / cpu / BothRev |
0.000023625 s |
0.000015211660211207345 s |
1.55 |
hlo_ffi / IDefOpt / cpu / PreRev |
0.000024441 s |
0.00001618880000023637 s |
1.51 |
hlo_ffi / IDefOpt / cpu / PostRev |
0.000023735 s |
0.000015539559790340718 s |
1.53 |
hlo_ffi / IDefOpt / cpu / BothRev |
0.000024226 s |
0.00001603133998287376 s |
1.51 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / JaXPipe / cpu / Primal |
0.0008727956001166 s |
0.0008776109996688 s |
0.99 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / Jax / cpu / Primal |
0.0009022051999636 s |
0.0008988255998701 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / HLOOpt / cpu / Primal |
0.0009360922001178 s |
0.0009498715997324 s |
0.99 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / PartOpt / cpu / Primal |
0.0008805690000372 s |
0.0008953810000093 s |
0.98 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IPartOpt / cpu / Primal |
0.0008812449999823 s |
0.0009070822001376 s |
0.97 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / DefOpt / cpu / Primal |
0.0009272497999518 s |
0.0009808615999645 s |
0.95 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IDefOpt / cpu / Primal |
0.0009334880001006 s |
0.000965224200263 s |
0.97 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / JaXPipe / cpu / Forward |
0.0021853263999219 s |
0.0021659400001226 s |
1.01 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / Jax / cpu / Forward |
0.0022222202000193 s |
0.0022857151994685 s |
0.97 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / HLOOpt / cpu / Forward |
0.0022111282000878 s |
0.002169238800343 s |
1.02 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / PartOpt / cpu / Forward |
0.002196060999995 s |
0.002201275599873 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IPartOpt / cpu / Forward |
0.0021652914000696 s |
0.0022307005994662 s |
0.97 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / DefOpt / cpu / Forward |
0.0022615739999309 s |
0.0022027423998224 s |
1.03 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IDefOpt / cpu / Forward |
0.0021877831999518 s |
0.0021937772002274 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / JaXPipe / cpu / PreRev |
0.0050008354000965 s |
0.0054776116001448 s |
0.91 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / JaXPipe / cpu / PostRev |
0.004994894200081 s |
0.0051365493993216 s |
0.97 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / JaXPipe / cpu / BothRev |
0.0048707785999795 s |
0.0044106954002927 s |
1.10 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / Jax / cpu / BothRev |
0.0058284432000618 s |
0.0037571628003206 s |
1.55 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / HLOOpt / cpu / PreRev |
0.0047370393999699 s |
0.0032911693997448 s |
1.44 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / HLOOpt / cpu / PostRev |
0.0030746492000616 s |
0.0034393395995721 s |
0.89 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / HLOOpt / cpu / BothRev |
0.0058646501999646 s |
0.0051362879999942 s |
1.14 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / PartOpt / cpu / PreRev |
0.0034704194000369 s |
0.0041970395999669 s |
0.83 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / PartOpt / cpu / PostRev |
0.0047814321999794 s |
0.003264855600355 s |
1.46 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / PartOpt / cpu / BothRev |
0.0048789258000397 s |
0.0033160025999677 s |
1.47 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IPartOpt / cpu / PreRev |
0.0032065564000731 s |
0.0032031649996497 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IPartOpt / cpu / PostRev |
0.006050337599936 s |
0.0036078554003324 s |
1.68 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IPartOpt / cpu / BothRev |
0.0038809566000963 s |
0.0031930009998177 s |
1.22 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / DefOpt / cpu / PreRev |
0.0046246357999734 s |
0.0035410561995377 s |
1.31 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / DefOpt / cpu / PostRev |
0.0047534366000036 s |
0.0030184160001226 s |
1.57 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / DefOpt / cpu / BothRev |
0.0031858473999818 s |
0.0034935592000692 s |
0.91 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IDefOpt / cpu / PreRev |
0.004561921000004 s |
0.0032312440001987 s |
1.41 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IDefOpt / cpu / PostRev |
0.004580826000074 s |
0.003169360999891 s |
1.45 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IDefOpt / cpu / BothRev |
0.0047023729999637 s |
0.0039165231999504 s |
1.20 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / JaXPipe / cuda / Primal |
0.000299263 s |
0.0002999339999999 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / Jax / cuda / Primal |
0.000299487 s |
0.00030019 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / HLOOpt / cuda / Primal |
0.000305535 s |
0.000306686 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / PartOpt / cuda / Primal |
0.000299231 s |
0.00029987 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IPartOpt / cuda / Primal |
0.000299071 s |
0.000300703 s |
0.99 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / DefOpt / cuda / Primal |
0.000306143 s |
0.000308191 s |
0.99 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IDefOpt / cuda / Primal |
0.000304896 s |
0.000306526 s |
0.99 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / JaXPipe / cuda / Forward |
0.000583166 s |
0.000583421 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / Jax / cuda / Forward |
0.000566142 s |
0.000567996 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / HLOOpt / cuda / Forward |
0.000582558 s |
0.000583261 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / PartOpt / cuda / Forward |
0.0005831349999999 s |
0.000584317 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IPartOpt / cuda / Forward |
0.0005832939999999 s |
0.000583612 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / DefOpt / cuda / Forward |
0.000583807 s |
0.000582877 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IDefOpt / cuda / Forward |
0.0005828459999999 s |
0.000583357 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / JaXPipe / cuda / PreRev |
0.0010578529999999 s |
0.001059611 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / JaXPipe / cuda / PostRev |
0.001012989 s |
0.001014075 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / JaXPipe / cuda / BothRev |
0.001052318 s |
0.001054907 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / Jax / cuda / BothRev |
0.001006877 s |
0.0010080579999999 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / HLOOpt / cuda / PreRev |
0.001036317 s |
0.0010398019999999 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / HLOOpt / cuda / PostRev |
0.001062973 s |
0.001065435 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / HLOOpt / cuda / BothRev |
0.001037758 s |
0.001042203 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / PartOpt / cuda / PreRev |
0.001052029 s |
0.0010528269999999 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / PartOpt / cuda / PostRev |
0.001000253 s |
0.001002811 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / PartOpt / cuda / BothRev |
0.001052093 s |
0.001053563 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IPartOpt / cuda / PreRev |
0.001053725 s |
0.001056218 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IPartOpt / cuda / PostRev |
0.0010005729999999 s |
0.001001754 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IPartOpt / cuda / BothRev |
0.001054012 s |
0.001055162 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / DefOpt / cuda / PreRev |
0.001051549 s |
0.001056987 s |
0.99 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / DefOpt / cuda / PostRev |
0.000986109 s |
0.000991611 s |
0.99 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / DefOpt / cuda / BothRev |
0.001052861 s |
0.0010569219999999 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IDefOpt / cuda / PreRev |
0.001051997 s |
0.0010560259999999 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IDefOpt / cuda / PostRev |
0.001053789 s |
0.00105913 s |
0.99 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IDefOpt / cuda / BothRev |
0.001054078 s |
0.001058234 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / JaXPipe / tpu / Primal |
0.000124571 s |
0.000131029 s |
0.95 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / Jax / tpu / Primal |
0.0001269539999999 s |
0.00012435075 s |
1.02 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / HLOOpt / tpu / Primal |
0.000152598 s |
0.000160308 s |
0.95 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / PartOpt / tpu / Primal |
0.00013434825 s |
0.0001308517499999 s |
1.03 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IPartOpt / tpu / Primal |
0.00013122275 s |
0.000138425 s |
0.95 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / DefOpt / tpu / Primal |
0.00014772825 s |
0.000145365 s |
1.02 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IDefOpt / tpu / Primal |
0.00015114425 s |
0.000158025 s |
0.96 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / JaXPipe / tpu / Forward |
0.0002117357499999 s |
0.00021332725 s |
0.99 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / Jax / tpu / Forward |
0.00026111375 s |
0.000262598 s |
0.99 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / HLOOpt / tpu / Forward |
0.00021204625 s |
0.00022060475 s |
0.96 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / PartOpt / tpu / Forward |
0.00021834825 s |
0.00021477225 s |
1.02 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IPartOpt / tpu / Forward |
0.0002117345 s |
0.000215888 s |
0.98 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / DefOpt / tpu / Forward |
0.00021829925 s |
0.0002181132499999 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IDefOpt / tpu / Forward |
0.0002119749999999 s |
0.0002159965 s |
0.98 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / JaXPipe / tpu / PreRev |
0.00035460125 s |
0.0003561094999999 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / JaXPipe / tpu / PostRev |
0.00025655575 s |
0.0002560205 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / JaXPipe / tpu / BothRev |
0.000354654 s |
0.00035627425 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / Jax / tpu / BothRev |
0.00025699525 s |
0.00025682575 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / HLOOpt / tpu / PreRev |
0.00035466125 s |
0.000355867 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / HLOOpt / tpu / PostRev |
0.00029118825 s |
0.00029123525 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / HLOOpt / tpu / BothRev |
0.0003546645 s |
0.00035601675 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / PartOpt / tpu / PreRev |
0.000355458 s |
0.0003562984999999 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / PartOpt / tpu / PostRev |
0.0002713919999999 s |
0.0002720207499999 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / PartOpt / tpu / BothRev |
0.00035549025 s |
0.0003566655 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IPartOpt / tpu / PreRev |
0.0003546425 s |
0.000355987 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IPartOpt / tpu / PostRev |
0.00027209375 s |
0.00027238875 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IPartOpt / tpu / BothRev |
0.00035473575 s |
0.000356185 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / DefOpt / tpu / PreRev |
0.00035783125 s |
0.0003581459999999 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / DefOpt / tpu / PostRev |
0.00028334675 s |
0.00028436275 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / DefOpt / tpu / BothRev |
0.000357901 s |
0.00035823625 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IDefOpt / tpu / PreRev |
0.0003567242499999 s |
0.00035828 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IDefOpt / tpu / PostRev |
0.00030093925 s |
0.00030107525 s |
1.00 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IDefOpt / tpu / BothRev |
0.00035688425 s |
0.00035868325 s |
0.99 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / JaXPipe / cpu / Primal |
0.002113162 s |
0.0008776109996688 s |
2.41 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / Jax / cpu / Primal |
0.00183245 s |
0.0008988255998701 s |
2.04 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / HLOOpt / cpu / Primal |
0.002195878 s |
0.0009498715997324 s |
2.31 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / PartOpt / cpu / Primal |
0.002086306 s |
0.0008953810000093 s |
2.33 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IPartOpt / cpu / Primal |
0.0017782009999999 s |
0.0009070822001376 s |
1.96 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / DefOpt / cpu / Primal |
0.001677778 s |
0.0009808615999645 s |
1.71 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IDefOpt / cpu / Primal |
0.001681002 s |
0.000965224200263 s |
1.74 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / JaXPipe / cpu / Forward |
0.005081218 s |
0.0021659400001226 s |
2.35 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / Jax / cpu / Forward |
0.005390594 s |
0.0022857151994685 s |
2.36 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / HLOOpt / cpu / Forward |
0.004539823 s |
0.002169238800343 s |
2.09 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / PartOpt / cpu / Forward |
0.004875755 s |
0.002201275599873 s |
2.21 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IPartOpt / cpu / Forward |
0.005059647 s |
0.0022307005994662 s |
2.27 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / DefOpt / cpu / Forward |
0.005465504 s |
0.0022027423998224 s |
2.48 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IDefOpt / cpu / Forward |
0.005162198 s |
0.0021937772002274 s |
2.35 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / JaXPipe / cpu / PreRev |
0.008817735 s |
0.0054776116001448 s |
1.61 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / JaXPipe / cpu / PostRev |
0.008867063 s |
0.0051365493993216 s |
1.73 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / JaXPipe / cpu / BothRev |
0.009667969 s |
0.0044106954002927 s |
2.19 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / Jax / cpu / BothRev |
0.008053285 s |
0.0037571628003206 s |
2.14 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / HLOOpt / cpu / PreRev |
0.009588276 s |
0.0032911693997448 s |
2.91 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / HLOOpt / cpu / PostRev |
0.007182244 s |
0.0034393395995721 s |
2.09 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / HLOOpt / cpu / BothRev |
0.008501049 s |
0.0051362879999942 s |
1.66 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / PartOpt / cpu / PreRev |
0.007810628 s |
0.0041970395999669 s |
1.86 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / PartOpt / cpu / PostRev |
0.008251922 s |
0.003264855600355 s |
2.53 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / PartOpt / cpu / BothRev |
0.00788562 s |
0.0033160025999677 s |
2.38 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IPartOpt / cpu / PreRev |
0.007469742 s |
0.0032031649996497 s |
2.33 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IPartOpt / cpu / PostRev |
0.0084722399999999 s |
0.0036078554003324 s |
2.35 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IPartOpt / cpu / BothRev |
0.007604206 s |
0.0031930009998177 s |
2.38 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / DefOpt / cpu / PreRev |
0.009127958 s |
0.0035410561995377 s |
2.58 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / DefOpt / cpu / PostRev |
0.006877182 s |
0.0030184160001226 s |
2.28 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / DefOpt / cpu / BothRev |
0.0093124749999999 s |
0.0034935592000692 s |
2.67 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IDefOpt / cpu / PreRev |
0.008663146 s |
0.0032312440001987 s |
2.68 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IDefOpt / cpu / PostRev |
0.0082441329999999 s |
0.003169360999891 s |
2.60 |
llama_dim_288_hidden_dim_768_n_layers_6_n_heads_6_n_kv_heads_6_vocab_size_32000_seq_len_256 / IDefOpt / cpu / BothRev |
0.007943999 s |
0.0039165231999504 s |
2.03 |
scatter_sum / JaXPipe / cpu / Primal |
0.00000752614005250507 s |
0.000008171760018740315 s |
0.92 |
scatter_sum / Jax / cpu / Primal |
0.0000074308400235167935 s |
0.000007970799961185549 s |
0.93 |
scatter_sum / HLOOpt / cpu / Primal |
0.000007615140011694166 s |
0.000007748800053377636 s |
0.98 |
scatter_sum / PartOpt / cpu / Primal |
0.000007615700023961836 s |
0.000007296480107470415 s |
1.04 |
scatter_sum / IPartOpt / cpu / Primal |
0.0000075831200501852435 s |
0.000007890339948062319 s |
0.96 |
scatter_sum / DefOpt / cpu / Primal |
0.000007356060013989917 s |
0.000007875180017435923 s |
0.93 |
scatter_sum / IDefOpt / cpu / Primal |
0.000007581100026072818 s |
0.00000782291990617523 s |
0.97 |
scatter_sum / JaXPipe / cpu / Forward |
0.00001171202001387428 s |
0.00001187668000056874 s |
0.99 |
scatter_sum / Jax / cpu / Forward |
0.000011680520001391417 s |
0.000012111020150769037 s |
0.96 |
scatter_sum / HLOOpt / cpu / Forward |
0.000012281220042495989 s |
0.000012101219908799976 s |
1.01 |
scatter_sum / PartOpt / cpu / Forward |
0.00001184676000775653 s |
0.000012010039936285466 s |
0.99 |
scatter_sum / IPartOpt / cpu / Forward |
0.00001199494002321444 s |
0.00001217569995787926 s |
0.99 |
scatter_sum / DefOpt / cpu / Forward |
0.000011832940008389417 s |
0.000011737019958673043 s |
1.01 |
scatter_sum / IDefOpt / cpu / Forward |
0.000011979159990005427 s |
0.000011661860007734503 s |
1.03 |
scatter_sum / JaXPipe / cpu / PreRev |
0.000011260920009590336 s |
0.000011499279898998794 s |
0.98 |
scatter_sum / JaXPipe / cpu / PostRev |
0.000011303319997750804 s |
0.00001180788007332012 s |
0.96 |
scatter_sum / JaXPipe / cpu / BothRev |
0.00001190994002172374 s |
0.000012415540113579482 s |
0.96 |
scatter_sum / Jax / cpu / BothRev |
0.000011323220014674007 s |
0.000011430880113039164 s |
0.99 |
scatter_sum / HLOOpt / cpu / PreRev |
0.000011752840036933775 s |
0.000012566820078063755 s |
0.94 |
scatter_sum / HLOOpt / cpu / PostRev |
0.000013000700009797583 s |
0.00001453337990824366 s |
0.89 |
scatter_sum / HLOOpt / cpu / BothRev |
0.000011535159983395716 s |
0.000012527280123322273 s |
0.92 |
scatter_sum / PartOpt / cpu / PreRev |
0.000011647460014501122 s |
0.000011573640003916807 s |
1.01 |
scatter_sum / PartOpt / cpu / PostRev |
0.000011373460065442488 s |
0.000012084200061508456 s |
0.94 |
scatter_sum / PartOpt / cpu / BothRev |
0.000012076640023224171 s |
0.000012801519987988286 s |
0.94 |
scatter_sum / IPartOpt / cpu / PreRev |
0.00001152347999777703 s |
0.000011545180132088718 s |
1.00 |
scatter_sum / IPartOpt / cpu / PostRev |
0.000011456920019554672 s |
0.000012068579999322537 s |
0.95 |
scatter_sum / IPartOpt / cpu / BothRev |
0.000011455280018708436 s |
0.000012279979855520651 s |
0.93 |
scatter_sum / DefOpt / cpu / PreRev |
0.000011356320001141283 s |
0.000011816779915534424 s |
0.96 |
scatter_sum / DefOpt / cpu / PostRev |
0.000011563680009203382 s |
0.000012119859893573449 s |
0.95 |
scatter_sum / DefOpt / cpu / BothRev |
0.000011308919956718455 s |
0.00001199918013298884 s |
0.94 |
scatter_sum / IDefOpt / cpu / PreRev |
0.00001153621997218579 s |
0.000011675239911710375 s |
0.99 |
scatter_sum / IDefOpt / cpu / PostRev |
0.000011533860006238684 s |
0.00001227513999765506 s |
0.94 |
scatter_sum / IDefOpt / cpu / BothRev |
0.000011358179999660934 s |
0.000012509239968494513 s |
0.91 |
scatter_sum / JaXPipe / cuda / Primal |
0.000010847 s |
0.000010784 s |
1.01 |
scatter_sum / Jax / cuda / Primal |
0.000011936 s |
0.000010528 s |
1.13 |
scatter_sum / HLOOpt / cuda / Primal |
0.00001248 s |
0.000010912 s |
1.14 |
scatter_sum / PartOpt / cuda / Primal |
0.000010752 s |
0.000010528 s |
1.02 |
scatter_sum / IPartOpt / cuda / Primal |
0.000010785 s |
0.000010976 s |
0.98 |
scatter_sum / DefOpt / cuda / Primal |
0.000010911 s |
0.000012 s |
0.91 |
scatter_sum / IDefOpt / cuda / Primal |
0.000010976 s |
0.000011008 s |
1.00 |
scatter_sum / JaXPipe / cuda / Forward |
0.000018016 s |
0.000018176 s |
0.99 |
scatter_sum / Jax / cuda / Forward |
0.000017664 s |
0.000017824 s |
0.99 |
scatter_sum / HLOOpt / cuda / Forward |
0.000018016 s |
0.00001744 s |
1.03 |
scatter_sum / PartOpt / cuda / Forward |
0.000017760000000000003 s |
0.000019712 s |
0.90 |
scatter_sum / IPartOpt / cuda / Forward |
0.000017023 s |
0.000017823 s |
0.96 |
scatter_sum / DefOpt / cuda / Forward |
0.000018112 s |
0.000017472 s |
1.04 |
scatter_sum / IDefOpt / cuda / Forward |
0.000017760000000000003 s |
0.000019808 s |
0.90 |
scatter_sum / JaXPipe / cuda / PreRev |
0.000017536 s |
0.000017375999999999998 s |
1.01 |
scatter_sum / JaXPipe / cuda / PostRev |
0.0000176 s |
0.000017406999999999998 s |
1.01 |
scatter_sum / JaXPipe / cuda / BothRev |
0.00001728 s |
0.000017664 s |
0.98 |
scatter_sum / Jax / cuda / BothRev |
0.000017664 s |
0.000017472 s |
1.01 |
scatter_sum / HLOOpt / cuda / PreRev |
0.000017664 s |
0.000017567 s |
1.01 |
scatter_sum / HLOOpt / cuda / PostRev |
0.000017408 s |
0.000017408 s |
1.00 |
scatter_sum / HLOOpt / cuda / BothRev |
0.000016896000000000002 s |
0.000017375999999999998 s |
0.97 |
scatter_sum / PartOpt / cuda / PreRev |
0.000018112 s |
0.000017793 s |
1.02 |
scatter_sum / PartOpt / cuda / PostRev |
0.00001712 s |
0.000017184 s |
1.00 |
scatter_sum / PartOpt / cuda / BothRev |
0.000017760000000000003 s |
0.000017568000000000002 s |
1.01 |
scatter_sum / IPartOpt / cuda / PreRev |
0.000018208 s |
0.000017952 s |
1.01 |
scatter_sum / IPartOpt / cuda / PostRev |
0.000017824 s |
0.000016352 s |
1.09 |
scatter_sum / IPartOpt / cuda / BothRev |
0.000017568000000000002 s |
0.000018048 s |
0.97 |
scatter_sum / DefOpt / cuda / PreRev |
0.00001808 s |
0.000017409000000000002 s |
1.04 |
scatter_sum / DefOpt / cuda / PostRev |
0.000017536 s |
0.000017632 s |
0.99 |
scatter_sum / DefOpt / cuda / BothRev |
0.000017663 s |
0.000017345 s |
1.02 |
scatter_sum / IDefOpt / cuda / PreRev |
0.0000176 s |
0.000017919999999999998 s |
0.98 |
scatter_sum / IDefOpt / cuda / PostRev |
0.000017696 s |
0.000016864 s |
1.05 |
scatter_sum / IDefOpt / cuda / BothRev |
0.000017344 s |
0.000017471 s |
0.99 |
scatter_sum / JaXPipe / tpu / Primal |
0.000001350275 s |
0.000001344225 s |
1.00 |
scatter_sum / Jax / tpu / Primal |
0.000001342825 s |
0.000001358175 s |
0.99 |
scatter_sum / HLOOpt / tpu / Primal |
0.00000135045 s |
0.000001344275 s |
1.00 |
scatter_sum / PartOpt / tpu / Primal |
0.000001343625 s |
0.0000013579 s |
0.99 |
scatter_sum / IPartOpt / tpu / Primal |
0.000001350275 s |
0.0000013447 s |
1.00 |
scatter_sum / DefOpt / tpu / Primal |
0.000001343575 s |
0.000001357375 s |
0.99 |
scatter_sum / IDefOpt / tpu / Primal |
0.00000135085 s |
0.000001344475 s |
1.00 |
scatter_sum / JaXPipe / tpu / Forward |
0.000002691225 s |
0.0000027374500000000004 s |
0.98 |
scatter_sum / Jax / tpu / Forward |
0.000002728 s |
0.000002752225 s |
0.99 |
scatter_sum / HLOOpt / tpu / Forward |
0.0000026902 s |
0.000002741175 s |
0.98 |
scatter_sum / PartOpt / tpu / Forward |
0.000002693975 s |
0.000002711925 s |
0.99 |
scatter_sum / IPartOpt / tpu / Forward |
0.00000268195 s |
0.000002735925 s |
0.98 |
scatter_sum / DefOpt / tpu / Forward |
0.00000269585 s |
0.00000271585 s |
0.99 |
scatter_sum / IDefOpt / tpu / Forward |
0.000002681825 s |
0.00000273715 s |
0.98 |
scatter_sum / JaXPipe / tpu / PreRev |
0.000002692575 s |
0.00000271315 s |
0.99 |
scatter_sum / JaXPipe / tpu / PostRev |
0.000002687325 s |
0.00000273235 s |
0.98 |
scatter_sum / JaXPipe / tpu / BothRev |
0.000002710525 s |
0.0000027247 s |
0.99 |
scatter_sum / Jax / tpu / BothRev |
0.000002743 s |
0.000002794075 s |
0.98 |
scatter_sum / HLOOpt / tpu / PreRev |
0.00000270295 s |
0.0000027256750000000004 s |
0.99 |
scatter_sum / HLOOpt / tpu / PostRev |
0.0000027362999999999995 s |
0.0000028006 s |
0.98 |
scatter_sum / HLOOpt / tpu / BothRev |
0.0000027107750000000003 s |
0.00000272225 s |
1.00 |
scatter_sum / PartOpt / tpu / PreRev |
0.00000274275 s |
0.000002789825 s |
0.98 |
scatter_sum / PartOpt / tpu / PostRev |
0.0000027067000000000003 s |
0.0000027312 s |
0.99 |
scatter_sum / PartOpt / tpu / BothRev |
0.0000027333750000000003 s |
0.000002793825 s |
0.98 |
scatter_sum / IPartOpt / tpu / PreRev |
0.000002703975 s |
0.00000272725 s |
0.99 |
scatter_sum / IPartOpt / tpu / PostRev |
0.000002735 s |
0.00000279215 s |
0.98 |
scatter_sum / IPartOpt / tpu / BothRev |
0.000002713425 s |
0.000002726525 s |
1.00 |
scatter_sum / DefOpt / tpu / PreRev |
0.0000027333750000000003 s |
0.0000027877 s |
0.98 |
scatter_sum / DefOpt / tpu / PostRev |
0.000002706975 s |
0.000002721 s |
0.99 |
scatter_sum / DefOpt / tpu / BothRev |
0.0000027349000000000003 s |
0.00000279375 s |
0.98 |
scatter_sum / IDefOpt / tpu / PreRev |
0.00000270485 s |
0.000002733225 s |
0.99 |
scatter_sum / IDefOpt / tpu / PostRev |
0.000002736425 s |
0.0000027860750000000003 s |
0.98 |
scatter_sum / IDefOpt / tpu / BothRev |
0.0000027026 s |
0.0000027353250000000004 s |
0.99 |
scatter_sum / JaXPipe / cpu / Primal |
0.000015504 s |
0.000008171760018740315 s |
1.90 |
scatter_sum / Jax / cpu / Primal |
0.000025121 s |
0.000007970799961185549 s |
3.15 |
scatter_sum / HLOOpt / cpu / Primal |
0.00001567 s |
0.000007748800053377636 s |
2.02 |
scatter_sum / PartOpt / cpu / Primal |
0.000015734000000000002 s |
0.000007296480107470415 s |
2.16 |
scatter_sum / IPartOpt / cpu / Primal |
0.000015835 s |
0.000007890339948062319 s |
2.01 |
scatter_sum / DefOpt / cpu / Primal |
0.000015913 s |
0.000007875180017435923 s |
2.02 |
scatter_sum / IDefOpt / cpu / Primal |
0.00001551 s |
0.00000782291990617523 s |
1.98 |
scatter_sum / JaXPipe / cpu / Forward |
0.000023416 s |
0.00001187668000056874 s |
1.97 |
scatter_sum / Jax / cpu / Forward |
0.000023243 s |
0.000012111020150769037 s |
1.92 |
scatter_sum / HLOOpt / cpu / Forward |
0.000022345 s |
0.000012101219908799976 s |
1.85 |
scatter_sum / PartOpt / cpu / Forward |
0.000023351 s |
0.000012010039936285466 s |
1.94 |
scatter_sum / IPartOpt / cpu / Forward |
0.00002389 s |
0.00001217569995787926 s |
1.96 |
scatter_sum / DefOpt / cpu / Forward |
0.000024197 s |
0.000011737019958673043 s |
2.06 |
scatter_sum / IDefOpt / cpu / Forward |
0.000022677 s |
0.000011661860007734503 s |
1.94 |
scatter_sum / JaXPipe / cpu / PreRev |
0.000023244 s |
0.000011499279898998794 s |
2.02 |
scatter_sum / JaXPipe / cpu / PostRev |
0.000022621 s |
0.00001180788007332012 s |
1.92 |
scatter_sum / JaXPipe / cpu / BothRev |
0.000023003 s |
0.000012415540113579482 s |
1.85 |
scatter_sum / Jax / cpu / BothRev |
0.000022681 s |
0.000011430880113039164 s |
1.98 |
scatter_sum / HLOOpt / cpu / PreRev |
0.000022888 s |
0.000012566820078063755 s |
1.82 |
scatter_sum / HLOOpt / cpu / PostRev |
0.000022797 s |
0.00001453337990824366 s |
1.57 |
scatter_sum / HLOOpt / cpu / BothRev |
0.000022891 s |
0.000012527280123322273 s |
1.83 |
scatter_sum / PartOpt / cpu / PreRev |
0.000022542 s |
0.000011573640003916807 s |
1.95 |
scatter_sum / PartOpt / cpu / PostRev |
0.000022521 s |
0.000012084200061508456 s |
1.86 |
scatter_sum / PartOpt / cpu / BothRev |
0.00002303 s |
0.000012801519987988286 s |
1.80 |
scatter_sum / IPartOpt / cpu / PreRev |
0.000023163 s |
0.000011545180132088718 s |
2.01 |
scatter_sum / IPartOpt / cpu / PostRev |
0.000023029 s |
0.000012068579999322537 s |
1.91 |
scatter_sum / IPartOpt / cpu / BothRev |
0.000022954 s |
0.000012279979855520651 s |
1.87 |
scatter_sum / DefOpt / cpu / PreRev |
0.00002246 s |
0.000011816779915534424 s |
1.90 |
scatter_sum / DefOpt / cpu / PostRev |
0.000022644 s |
0.000012119859893573449 s |
1.87 |
scatter_sum / DefOpt / cpu / BothRev |
0.000022933 s |
0.00001199918013298884 s |
1.91 |
scatter_sum / IDefOpt / cpu / PreRev |
0.000023248 s |
0.000011675239911710375 s |
1.99 |
scatter_sum / IDefOpt / cpu / PostRev |
0.000022711 s |
0.00001227513999765506 s |
1.85 |
scatter_sum / IDefOpt / cpu / BothRev |
0.000023079 s |
0.000012509239968494513 s |
1.84 |
slicing / JaXPipe / cpu / Primal |
0.000006327620030788239 s |
0.0000061953599652042615 s |
1.02 |
slicing / Jax / cpu / Primal |
0.000006110100021032849 s |
0.000006929080154804979 s |
0.88 |
slicing / HLOOpt / cpu / Primal |
0.00000673963999361149 s |
0.0000068508998811012135 s |
0.98 |
slicing / PartOpt / cpu / Primal |
0.000006227499998203712 s |
0.000006121540027379524 s |
1.02 |
slicing / IPartOpt / cpu / Primal |
0.000006236400022316957 s |
0.000006725820057909005 s |
0.93 |
slicing / DefOpt / cpu / Primal |
0.00000610848000178521 s |
0.000007069700077408925 s |
0.86 |
slicing / IDefOpt / cpu / Primal |
0.000006536679984492367 s |
0.000006266299933486152 s |
1.04 |
slicing / JaXPipe / cpu / Forward |
0.000009629480000512558 s |
0.0000099524600955192 s |
0.97 |
slicing / Jax / cpu / Forward |
0.00000934474000132468 s |
0.000009137279994320124 s |
1.02 |
slicing / HLOOpt / cpu / Forward |
0.000010158279983443207 s |
0.000009985880024032668 s |
1.02 |
slicing / PartOpt / cpu / Forward |
0.000008899780004867352 s |
0.000009016019794216845 s |
0.99 |
slicing / IPartOpt / cpu / Forward |
0.000010137659965039349 s |
0.000009696300039649942 s |
1.05 |
slicing / DefOpt / cpu / Forward |
0.000009306539977842476 s |
0.0000097391199960839 s |
0.96 |
slicing / IDefOpt / cpu / Forward |
0.00000947069996072969 s |
0.000009139280082308688 s |
1.04 |
slicing / JaXPipe / cpu / PreRev |
0.0000100505200043699 s |
0.000010525399957259652 s |
0.95 |
slicing / JaXPipe / cpu / PostRev |
0.00001006576003419468 s |
0.00000999433996184962 s |
1.01 |
slicing / JaXPipe / cpu / BothRev |
0.000010081459977300256 s |
0.000010574719926808027 s |
0.95 |
slicing / Jax / cpu / BothRev |
0.000010107100042660022 s |
0.000010336499981349334 s |
0.98 |
slicing / HLOOpt / cpu / PreRev |
0.000010265639966746676 s |
0.000010184479833696967 s |
1.01 |
slicing / HLOOpt / cpu / PostRev |
0.000011905739975190952 s |
0.000012298240035306662 s |
0.97 |
slicing / HLOOpt / cpu / BothRev |
0.00001016011997307942 s |
0.00001021929976559477 s |
0.99 |
slicing / PartOpt / cpu / PreRev |
0.00000985252000646142 s |
0.000010392360090918371 s |
0.95 |
slicing / PartOpt / cpu / PostRev |
0.000010124879991053605 s |
0.000010358800100220831 s |
0.98 |
slicing / PartOpt / cpu / BothRev |
0.000009897120044115582 s |
0.000010531820043979678 s |
0.94 |
slicing / IPartOpt / cpu / PreRev |
0.000009764179985722876 s |
0.000009982420051528606 s |
0.98 |
slicing / IPartOpt / cpu / PostRev |
0.000009950640014722012 s |
0.000010332759920856917 s |
0.96 |
slicing / IPartOpt / cpu / BothRev |
0.000009960500019587924 s |
0.000010125199878530111 s |
0.98 |
slicing / DefOpt / cpu / PreRev |
0.00001003487997877528 s |
0.00001015016001474578 s |
0.99 |
slicing / DefOpt / cpu / PostRev |
0.00000988970001344569 s |
0.000010507639999559614 s |
0.94 |
slicing / DefOpt / cpu / BothRev |
0.000010041919977084034 s |
0.000009575999865774065 s |
1.05 |
slicing / IDefOpt / cpu / PreRev |
0.000009768779982550768 s |
0.000010578259971225634 s |
0.92 |
slicing / IDefOpt / cpu / PostRev |
0.00000974606001364009 s |
0.000009717359898786524 s |
1.00 |
slicing / IDefOpt / cpu / BothRev |
0.000010449660012454842 s |
0.000010080320062115788 s |
1.04 |
slicing / JaXPipe / cuda / Primal |
0.000002272 s |
0.000002304 s |
0.99 |
slicing / Jax / cuda / Primal |
0.000002272 s |
0.000002303 s |
0.99 |
slicing / HLOOpt / cuda / Primal |
0.000002272 s |
0.000002303 s |
0.99 |
slicing / PartOpt / cuda / Primal |
0.000002271 s |
0.000002303 s |
0.99 |
slicing / IPartOpt / cuda / Primal |
0.000002271 s |
0.000002303 s |
0.99 |
slicing / DefOpt / cuda / Primal |
0.000002272 s |
0.000002303 s |
0.99 |
slicing / IDefOpt / cuda / Primal |
0.000002272 s |
0.000002303 s |
0.99 |
slicing / JaXPipe / cuda / Forward |
0.000011008 s |
0.000010528 s |
1.05 |
slicing / Jax / cuda / Forward |
0.000012256 s |
0.000010656 s |
1.15 |
slicing / HLOOpt / cuda / Forward |
0.000010816 s |
0.000011616 s |
0.93 |
slicing / PartOpt / cuda / Forward |
0.000010784 s |
0.000010464 s |
1.03 |
slicing / IPartOpt / cuda / Forward |
0.000010624 s |
0.000011455999999999998 s |
0.93 |
slicing / DefOpt / cuda / Forward |
0.000010432 s |
0.0000104 s |
1.00 |
slicing / IDefOpt / cuda / Forward |
0.000010848 s |
0.000010656 s |
1.02 |
slicing / JaXPipe / cuda / PreRev |
0.000010688 s |
0.00001024 s |
1.04 |
slicing / JaXPipe / cuda / PostRev |
0.000011776 s |
0.000010688 s |
1.10 |
slicing / JaXPipe / cuda / BothRev |
0.000011648 s |
0.000010944 s |
1.06 |
slicing / Jax / cuda / BothRev |
0.000010816 s |
0.000010304 s |
1.05 |
slicing / HLOOpt / cuda / PreRev |
0.000010944 s |
0.000010879 s |
1.01 |
slicing / HLOOpt / cuda / PostRev |
0.000010303 s |
0.000010624 s |
0.97 |
slicing / HLOOpt / cuda / BothRev |
0.00001024 s |
0.000010208 s |
1.00 |
slicing / PartOpt / cuda / PreRev |
0.000010976 s |
0.0000104 s |
1.06 |
slicing / PartOpt / cuda / PostRev |
0.000010751 s |
0.0000104 s |
1.03 |
slicing / PartOpt / cuda / BothRev |
0.00001072 s |
0.000010752 s |
1.00 |
slicing / IPartOpt / cuda / PreRev |
0.000010336 s |
0.00001088 s |
0.95 |
slicing / IPartOpt / cuda / PostRev |
0.000010847 s |
0.000010208 s |
1.06 |
slicing / IPartOpt / cuda / BothRev |
0.000010528 s |
0.000010496 s |
1.00 |
slicing / DefOpt / cuda / PreRev |
0.000011071 s |
0.000010369 s |
1.07 |
slicing / DefOpt / cuda / PostRev |
0.000010496 s |
0.000010848 s |
0.97 |
slicing / DefOpt / cuda / BothRev |
0.000010368 s |
0.000010272 s |
1.01 |
slicing / IDefOpt / cuda / PreRev |
0.000010687 s |
0.00001056 s |
1.01 |
slicing / IDefOpt / cuda / PostRev |
0.0000104 s |
0.000010688 s |
0.97 |
slicing / IDefOpt / cuda / BothRev |
0.000010496 s |
0.000010816 s |
0.97 |
slicing / JaXPipe / tpu / Primal |
9.6305e-7 s |
9.56525e-7 s |
1.01 |
slicing / Jax / tpu / Primal |
9.635249999999998e-7 s |
9.6695e-7 s |
1.00 |
slicing / HLOOpt / tpu / Primal |
9.674e-7 s |
9.62375e-7 s |
1.01 |
slicing / PartOpt / tpu / Primal |
9.66325e-7 s |
9.6405e-7 s |
1.00 |
slicing / IPartOpt / tpu / Primal |
9.67375e-7 s |
9.49075e-7 s |
1.02 |
slicing / DefOpt / tpu / Primal |
9.61175e-7 s |
9.61675e-7 s |
1.00 |
slicing / IDefOpt / tpu / Primal |
9.649e-7 s |
9.4925e-7 s |
1.02 |
slicing / JaXPipe / tpu / Forward |
0.0000014130000000000005 s |
0.000001397875 s |
1.01 |
slicing / Jax / tpu / Forward |
0.000001422775 s |
0.0000014018 s |
1.01 |
slicing / HLOOpt / tpu / Forward |
0.000001524225 s |
0.000001504925 s |
1.01 |
slicing / PartOpt / tpu / Forward |
0.0000014355499999999995 s |
0.0000014259250000000002 s |
1.01 |
slicing / IPartOpt / tpu / Forward |
0.0000015263 s |
0.000001502825 s |
1.02 |
slicing / DefOpt / tpu / Forward |
0.0000014351249999999997 s |
0.000001418525 s |
1.01 |
slicing / IDefOpt / tpu / Forward |
0.0000015184 s |
0.000001504 s |
1.01 |
slicing / JaXPipe / tpu / PreRev |
0.00000233075 s |
0.000002326175 s |
1.00 |
slicing / JaXPipe / tpu / PostRev |
0.0000024985 s |
0.000002511375 s |
0.99 |
slicing / JaXPipe / tpu / BothRev |
0.000002351175 s |
0.000002344625 s |
1.00 |
slicing / Jax / tpu / BothRev |
0.0000025301 s |
0.000002525925 s |
1.00 |
slicing / HLOOpt / tpu / PreRev |
0.000002356325 s |
0.00000234655 s |
1.00 |
slicing / HLOOpt / tpu / PostRev |
0.000002519675 s |
0.0000025209 s |
1.00 |
slicing / HLOOpt / tpu / BothRev |
0.000002353675 s |
0.000002349525 s |
1.00 |
slicing / PartOpt / tpu / PreRev |
0.00000252335 s |
0.000002525675 s |
1.00 |
slicing / PartOpt / tpu / PostRev |
0.0000023431 s |
0.0000023427 s |
1.00 |
slicing / PartOpt / tpu / BothRev |
0.0000025238500000000004 s |
0.0000025337 s |
1.00 |
slicing / IPartOpt / tpu / PreRev |
0.0000023512 s |
0.00000235335 s |
1.00 |
slicing / IPartOpt / tpu / PostRev |
0.00000252095 s |
0.000002518825 s |
1.00 |
slicing / IPartOpt / tpu / BothRev |
0.0000023483 s |
0.000002355 s |
1.00 |
slicing / DefOpt / tpu / PreRev |
0.00000252625 s |
0.0000025295750000000003 s |
1.00 |
slicing / DefOpt / tpu / PostRev |
0.00000234735 s |
0.00000234935 s |
1.00 |
slicing / DefOpt / tpu / BothRev |
0.0000025253250000000003 s |
0.0000025281 s |
1.00 |
slicing / IDefOpt / tpu / PreRev |
0.000002352 s |
0.000002341675 s |
1.00 |
slicing / IDefOpt / tpu / PostRev |
0.000002530325 s |
0.00000253535 s |
1.00 |
slicing / IDefOpt / tpu / BothRev |
0.0000023613750000000003 s |
0.00000234925 s |
1.01 |
slicing / JaXPipe / cpu / Primal |
0.00001286 s |
0.0000061953599652042615 s |
2.08 |
slicing / Jax / cpu / Primal |
0.000012651 s |
0.000006929080154804979 s |
1.83 |
slicing / HLOOpt / cpu / Primal |
0.000012756 s |
0.0000068508998811012135 s |
1.86 |
slicing / PartOpt / cpu / Primal |
0.000012528 s |
0.000006121540027379524 s |
2.05 |
slicing / IPartOpt / cpu / Primal |
0.000012618 s |
0.000006725820057909005 s |
1.88 |
slicing / DefOpt / cpu / Primal |
0.000012664 s |
0.000007069700077408925 s |
1.79 |
slicing / IDefOpt / cpu / Primal |
0.000012729 s |
0.000006266299933486152 s |
2.03 |
slicing / JaXPipe / cpu / Forward |
0.000017395999999999997 s |
0.0000099524600955192 s |
1.75 |
slicing / Jax / cpu / Forward |
0.000016831 s |
0.000009137279994320124 s |
1.84 |
slicing / HLOOpt / cpu / Forward |
0.000017214 s |
0.000009985880024032668 s |
1.72 |
slicing / PartOpt / cpu / Forward |
0.000016893 s |
0.000009016019794216845 s |
1.87 |
slicing / IPartOpt / cpu / Forward |
0.000017074999999999998 s |
0.000009696300039649942 s |
1.76 |
slicing / DefOpt / cpu / Forward |
0.000016945 s |
0.0000097391199960839 s |
1.74 |
slicing / IDefOpt / cpu / Forward |
0.000017001 s |
0.000009139280082308688 s |
1.86 |
slicing / JaXPipe / cpu / PreRev |
0.000017879 s |
0.000010525399957259652 s |
1.70 |
slicing / JaXPipe / cpu / PostRev |
0.000017512 s |
0.00000999433996184962 s |
1.75 |
slicing / JaXPipe / cpu / BothRev |
0.000017447 s |
0.000010574719926808027 s |
1.65 |
slicing / Jax / cpu / BothRev |
0.000017292 s |
0.000010336499981349334 s |
1.67 |
slicing / HLOOpt / cpu / PreRev |
0.000017497 s |
0.000010184479833696967 s |
1.72 |
slicing / HLOOpt / cpu / PostRev |
0.000017397 s |
0.000012298240035306662 s |
1.41 |
slicing / HLOOpt / cpu / BothRev |
0.00001749 s |
0.00001021929976559477 s |
1.71 |
slicing / PartOpt / cpu / PreRev |
0.000017364000000000002 s |
0.000010392360090918371 s |
1.67 |
slicing / PartOpt / cpu / PostRev |
0.00001749 s |
0.000010358800100220831 s |
1.69 |
slicing / PartOpt / cpu / BothRev |
0.000017513 s |
0.000010531820043979678 s |
1.66 |
slicing / IPartOpt / cpu / PreRev |
0.000017696 s |
0.000009982420051528606 s |
1.77 |
slicing / IPartOpt / cpu / PostRev |
0.000017459 s |
0.000010332759920856917 s |
1.69 |
slicing / IPartOpt / cpu / BothRev |
0.000017384 s |
0.000010125199878530111 s |
1.72 |
slicing / DefOpt / cpu / PreRev |
0.000017818 s |
0.00001015016001474578 s |
1.76 |
slicing / DefOpt / cpu / PostRev |
0.000017578 s |
0.000010507639999559614 s |
1.67 |
slicing / DefOpt / cpu / BothRev |
0.000017409000000000002 s |
0.000009575999865774065 s |
1.82 |
slicing / IDefOpt / cpu / PreRev |
0.000017547 s |
0.000010578259971225634 s |
1.66 |
slicing / IDefOpt / cpu / PostRev |
0.000017590000000000003 s |
0.000009717359898786524 s |
1.81 |
slicing / IDefOpt / cpu / BothRev |
0.000017479 s |
0.000010080320062115788 s |
1.73 |
sum / JaXPipe / cpu / Primal |
0.000007554379990324378 s |
0.000007981139933690429 s |
0.95 |
sum / Jax / cpu / Primal |
0.000007960660022945376 s |
0.000007507980044465512 s |
1.06 |
sum / HLOOpt / cpu / Primal |
0.000007627059949300019 s |
0.000008308500073326285 s |
0.92 |
sum / PartOpt / cpu / Primal |
0.000007592980009576422 s |
0.000007578700096928514 s |
1.00 |
sum / IPartOpt / cpu / Primal |
0.000008399579983233706 s |
0.00000764742013416253 s |
1.10 |
sum / DefOpt / cpu / Primal |
0.000007867059975978918 s |
0.000007487280381610617 s |
1.05 |
sum / IDefOpt / cpu / Primal |
0.000007898500034571044 s |
0.000007432820129906759 s |
1.06 |
sum / JaXPipe / cpu / Forward |
0.000011029500001313863 s |
0.000011446299795352389 s |
0.96 |
sum / Jax / cpu / Forward |
0.000011038420034310547 s |
0.000011345720013196116 s |
0.97 |
sum / HLOOpt / cpu / Forward |
0.000011531520012795228 s |
0.000011818420098279605 s |
0.98 |
sum / PartOpt / cpu / Forward |
0.000011068879985032254 s |
0.000010789639927679672 s |
1.03 |
sum / IPartOpt / cpu / Forward |
0.00001204667993079056 s |
0.000011666479986160994 s |
1.03 |
sum / DefOpt / cpu / Forward |
0.000011732200018741425 s |
0.00001131830027588876 s |
1.04 |
sum / IDefOpt / cpu / Forward |
0.000011114459948657896 s |
0.000011210779812245164 s |
0.99 |
sum / JaXPipe / cpu / PreRev |
0.000010913880032603627 s |
0.000011084600300819147 s |
0.98 |
sum / JaXPipe / cpu / PostRev |
0.000011153299992656684 s |
0.000010880019872274717 s |
1.03 |
sum / JaXPipe / cpu / BothRev |
0.00001089274001060403 s |
0.000011212040044483727 s |
0.97 |
sum / Jax / cpu / BothRev |
0.00001085040001271409 s |
0.000010437400087539572 s |
1.04 |
sum / HLOOpt / cpu / PreRev |
0.000011326420017212513 s |
0.000011505000002216549 s |
0.98 |
sum / HLOOpt / cpu / PostRev |
0.000012447360013538855 s |
0.000012359059837763198 s |
1.01 |
sum / HLOOpt / cpu / BothRev |
0.000010493820009287446 s |
0.000010557379791862331 s |
0.99 |
sum / PartOpt / cpu / PreRev |
0.000010942959997919388 s |
0.000010827380028786138 s |
1.01 |
sum / PartOpt / cpu / PostRev |
0.000011010499993062694 s |
0.000011040499921364243 s |
1.00 |
sum / PartOpt / cpu / BothRev |
0.000011881060045197955 s |
0.000011394000066502486 s |
1.04 |
sum / IPartOpt / cpu / PreRev |
0.00001094047999686154 s |
0.000010900399975071197 s |
1.00 |
sum / IPartOpt / cpu / PostRev |
0.000010565060010776506 s |
0.000011152820188726765 s |
0.95 |
sum / IPartOpt / cpu / BothRev |
0.000010662920058166492 s |
0.000010597239997878205 s |
1.01 |
sum / DefOpt / cpu / PreRev |
0.000010670539977581938 s |
0.00001111023982957704 s |
0.96 |
sum / DefOpt / cpu / PostRev |
0.000010902719986916054 s |
0.000010991519957315177 s |
0.99 |
sum / DefOpt / cpu / BothRev |
0.000010851880006157444 s |
0.00001119037999160355 s |
0.97 |
sum / IDefOpt / cpu / PreRev |
0.000010496120039533708 s |
0.000010858259993256067 s |
0.97 |
sum / IDefOpt / cpu / PostRev |
0.000010656240028765751 s |
0.000010737400079960937 s |
0.99 |
sum / IDefOpt / cpu / BothRev |
0.000011423879996073084 s |
0.000010744440078269693 s |
1.06 |
sum / JaXPipe / cuda / Primal |
0.000002463 s |
0.000002464 s |
1.00 |
sum / Jax / cuda / Primal |
0.000002464 s |
0.000002464 s |
1 |
sum / HLOOpt / cuda / Primal |
0.000002463 s |
0.000002464 s |
1.00 |
sum / PartOpt / cuda / Primal |
0.000002463 s |
0.000002463 s |
1 |
sum / IPartOpt / cuda / Primal |
0.000002464 s |
0.000002464 s |
1 |
sum / DefOpt / cuda / Primal |
0.000002463 s |
0.000002463 s |
1 |
sum / IDefOpt / cuda / Primal |
0.000002463 s |
0.000002463 s |
1 |
sum / JaXPipe / cuda / Forward |
0.000011072 s |
0.000010656 s |
1.04 |
sum / Jax / cuda / Forward |
0.000010656 s |
0.0000104 s |
1.02 |
sum / HLOOpt / cuda / Forward |
0.000011296 s |
0.00001104 s |
1.02 |
sum / PartOpt / cuda / Forward |
0.00001088 s |
0.00001104 s |
0.99 |
sum / IPartOpt / cuda / Forward |
0.00001072 s |
0.000011136 s |
0.96 |
sum / DefOpt / cuda / Forward |
0.000010848 s |
0.000010624 s |
1.02 |
sum / IDefOpt / cuda / Forward |
0.00001088 s |
0.000010783 s |
1.01 |
sum / JaXPipe / cuda / PreRev |
0.000010304 s |
0.00000992 s |
1.04 |
sum / JaXPipe / cuda / PostRev |
0.000010592 s |
0.000010176 s |
1.04 |
sum / JaXPipe / cuda / BothRev |
0.00001024 s |
0.000010208 s |
1.00 |
sum / Jax / cuda / BothRev |
0.000010848 s |
0.000010496 s |
1.03 |
sum / HLOOpt / cuda / PreRev |
0.000010816 s |
0.000010688 s |
1.01 |
sum / HLOOpt / cuda / PostRev |
0.000010432 s |
0.000010432 s |
1 |
sum / HLOOpt / cuda / BothRev |
0.000010624 s |
0.000009376 s |
1.13 |
sum / PartOpt / cuda / PreRev |
0.000010432 s |
0.000010752 s |
0.97 |
sum / PartOpt / cuda / PostRev |
0.000010368 s |
0.000010304 s |
1.01 |
sum / PartOpt / cuda / BothRev |
0.000010176 s |
0.00001024 s |
0.99 |
sum / IPartOpt / cuda / PreRev |
0.000010688 s |
0.000010399 s |
1.03 |
sum / IPartOpt / cuda / PostRev |
0.000009984 s |
0.000010368 s |
0.96 |
sum / IPartOpt / cuda / BothRev |
0.000010304 s |
0.000010432 s |
0.99 |
sum / DefOpt / cuda / PreRev |
0.000010656 s |
0.00001056 s |
1.01 |
sum / DefOpt / cuda / PostRev |
0.000010272 s |
0.00001056 s |
0.97 |
sum / DefOpt / cuda / BothRev |
0.000010624 s |
0.00001024 s |
1.04 |
sum / IDefOpt / cuda / PreRev |
0.000010816 s |
0.000010592 s |
1.02 |
sum / IDefOpt / cuda / PostRev |
0.000010656 s |
0.000010912 s |
0.98 |
sum / IDefOpt / cuda / BothRev |
0.000010528 s |
0.000010497 s |
1.00 |
sum / JaXPipe / tpu / Primal |
5.034750000000001e-7 s |
5.168749999999999e-7 s |
0.97 |
sum / Jax / tpu / Primal |
5.471999999999999e-7 s |
5.471999999999999e-7 s |
1 |
sum / HLOOpt / tpu / Primal |
5.03425e-7 s |
5.170249999999999e-7 s |
0.97 |
sum / PartOpt / tpu / Primal |
5.471999999999999e-7 s |
5.4715e-7 s |
1.00 |
sum / IPartOpt / tpu / Primal |
5.0325e-7 s |
5.17125e-7 s |
0.97 |
sum / DefOpt / tpu / Primal |
5.47175e-7 s |
5.470250000000001e-7 s |
1.00 |
sum / IDefOpt / tpu / Primal |
5.02925e-7 s |
5.169000000000001e-7 s |
0.97 |
sum / JaXPipe / tpu / Forward |
0.0000015574 s |
0.000001552425 s |
1.00 |
sum / Jax / tpu / Forward |
0.000001490275 s |
0.0000015041 s |
0.99 |
sum / HLOOpt / tpu / Forward |
0.000001536175 s |
0.000001532025 s |
1.00 |
sum / PartOpt / tpu / Forward |
0.000001488025 s |
0.0000015017 s |
0.99 |
sum / IPartOpt / tpu / Forward |
0.000001533275 s |
0.000001537425 s |
1.00 |
sum / DefOpt / tpu / Forward |
0.0000014934 s |
0.000001507025 s |
0.99 |
sum / IDefOpt / tpu / Forward |
0.0000015309 s |
0.000001533225 s |
1.00 |
sum / JaXPipe / tpu / PreRev |
9.9215e-7 s |
0.000001 s |
0.99 |
sum / JaXPipe / tpu / PostRev |
0.0000010353 s |
0.00000103075 s |
1.00 |
sum / JaXPipe / tpu / BothRev |
0.000001008625 s |
0.00000100355 s |
1.01 |
sum / Jax / tpu / BothRev |
0.000001036775 s |
0.00000103535 s |
1.00 |
sum / HLOOpt / tpu / PreRev |
0.000001007275 s |
0.000001011625 s |
1.00 |
sum / HLOOpt / tpu / PostRev |
0.000001040675 s |
0.00000104265 s |
1.00 |
sum / HLOOpt / tpu / BothRev |
9.97275e-7 s |
0.000001003775 s |
0.99 |
sum / PartOpt / tpu / PreRev |
0.00000104875 s |
0.000001035525 s |
1.01 |
sum / PartOpt / tpu / PostRev |
9.91025e-7 s |
0.0000010048749999999998 s |
0.99 |
sum / PartOpt / tpu / BothRev |
0.000001039875 s |
0.000001032275 s |
1.01 |
sum / IPartOpt / tpu / PreRev |
9.9225e-7 s |
0.000001011075 s |
0.98 |
sum / IPartOpt / tpu / PostRev |
0.0000010369250000000002 s |
0.0000010369250000000002 s |
1 |
sum / IPartOpt / tpu / BothRev |
9.9275e-7 s |
0.0000010005 s |
0.99 |
sum / DefOpt / tpu / PreRev |
0.000001047175 s |
0.000001044675 s |
1.00 |
sum / DefOpt / tpu / PostRev |
9.91425e-7 s |
0.0000010031249999999998 s |
0.99 |
sum / DefOpt / tpu / BothRev |
0.000001033275 s |
0.00000103255 s |
1.00 |
sum / IDefOpt / tpu / PreRev |
9.9125e-7 s |
0.0000010112000000000002 s |
0.98 |
sum / IDefOpt / tpu / PostRev |
0.000001042375 s |
0.000001041925 s |
1.00 |
sum / IDefOpt / tpu / BothRev |
9.95775e-7 s |
0.000001004525 s |
0.99 |
sum / JaXPipe / cpu / Primal |
0.000014628 s |
0.000007981139933690429 s |
1.83 |
sum / Jax / cpu / Primal |
0.00001465 s |
0.000007507980044465512 s |
1.95 |
sum / HLOOpt / cpu / Primal |
0.00001424 s |
0.000008308500073326285 s |
1.71 |
sum / PartOpt / cpu / Primal |
0.000014231 s |
0.000007578700096928514 s |
1.88 |
sum / IPartOpt / cpu / Primal |
0.000014729 s |
0.00000764742013416253 s |
1.93 |
sum / DefOpt / cpu / Primal |
0.00001461 s |
0.000007487280381610617 s |
1.95 |
sum / IDefOpt / cpu / Primal |
0.000014898 s |
0.000007432820129906759 s |
2.00 |
sum / JaXPipe / cpu / Forward |
0.000020428 s |
0.000011446299795352389 s |
1.78 |
sum / Jax / cpu / Forward |
0.000020552 s |
0.000011345720013196116 s |
1.81 |
sum / HLOOpt / cpu / Forward |
0.000019971 s |
0.000011818420098279605 s |
1.69 |
sum / PartOpt / cpu / Forward |
0.00002012 s |
0.000010789639927679672 s |
1.86 |
sum / IPartOpt / cpu / Forward |
0.000019861 s |
0.000011666479986160994 s |
1.70 |
sum / DefOpt / cpu / Forward |
0.000019925 s |
0.00001131830027588876 s |
1.76 |
sum / IDefOpt / cpu / Forward |
0.000020319 s |
0.000011210779812245164 s |
1.81 |
sum / JaXPipe / cpu / PreRev |
0.000019488 s |
0.000011084600300819147 s |
1.76 |
sum / JaXPipe / cpu / PostRev |
0.000019078000000000003 s |
0.000010880019872274717 s |
1.75 |
sum / JaXPipe / cpu / BothRev |
0.000019369 s |
0.000011212040044483727 s |
1.73 |
sum / Jax / cpu / BothRev |
0.000019337 s |
0.000010437400087539572 s |
1.85 |
sum / HLOOpt / cpu / PreRev |
0.000019543 s |
0.000011505000002216549 s |
1.70 |
sum / HLOOpt / cpu / PostRev |
0.000019368 s |
0.000012359059837763198 s |
1.57 |
sum / HLOOpt / cpu / BothRev |
0.000019006 s |
0.000010557379791862331 s |
1.80 |
sum / PartOpt / cpu / PreRev |
0.000019406 s |
0.000010827380028786138 s |
1.79 |
sum / PartOpt / cpu / PostRev |
0.000018651 s |
0.000011040499921364243 s |
1.69 |
sum / PartOpt / cpu / BothRev |
0.000019137 s |
0.000011394000066502486 s |
1.68 |
sum / IPartOpt / cpu / PreRev |
0.000018736 s |
0.000010900399975071197 s |
1.72 |
sum / IPartOpt / cpu / PostRev |
0.000018639 s |
0.000011152820188726765 s |
1.67 |
sum / IPartOpt / cpu / BothRev |
0.000018917 s |
0.000010597239997878205 s |
1.79 |
sum / DefOpt / cpu / PreRev |
0.000019038 s |
0.00001111023982957704 s |
1.71 |
sum / DefOpt / cpu / PostRev |
0.000018629 s |
0.000010991519957315177 s |
1.69 |
sum / DefOpt / cpu / BothRev |
0.000019094 s |
0.00001119037999160355 s |
1.71 |
sum / IDefOpt / cpu / PreRev |
0.000019054 s |
0.000010858259993256067 s |
1.75 |
sum / IDefOpt / cpu / PostRev |
0.000018765 s |
0.000010737400079960937 s |
1.75 |
sum / IDefOpt / cpu / BothRev |
0.000019194 s |
0.000010744440078269693 s |
1.79 |
value_and_grad / JaXPipe / cpu / Primal |
0.000014259099998525926 s |
0.000014093239915382585 s |
1.01 |
value_and_grad / Jax / cpu / Primal |
0.00001363746003335109 s |
0.000013464080038829709 s |
1.01 |
value_and_grad / HLOOpt / cpu / Primal |
0.000013362600011532775 s |
0.000013915920135332271 s |
0.96 |
value_and_grad / PartOpt / cpu / Primal |
0.000013150839977242868 s |
0.000013872219788026997 s |
0.95 |
value_and_grad / IPartOpt / cpu / Primal |
0.000013315900005181904 s |
0.0000147368999387254 s |
0.90 |
value_and_grad / DefOpt / cpu / Primal |
0.000013488180038621066 s |
0.000013849619972461368 s |
0.97 |
value_and_grad / IDefOpt / cpu / Primal |
0.000013784360025965726 s |
0.000013435360015137122 s |
1.03 |
value_and_grad / JaXPipe / cuda / Primal |
0.000032606999999999995 s |
0.000033472 s |
0.97 |
value_and_grad / Jax / cuda / Primal |
0.000033248 s |
0.000034527 s |
0.96 |
value_and_grad / HLOOpt / cuda / Primal |
0.00003264 s |
0.00003456 s |
0.94 |
value_and_grad / PartOpt / cuda / Primal |
0.000033279000000000004 s |
0.000034527 s |
0.96 |
value_and_grad / IPartOpt / cuda / Primal |
0.00003328 s |
0.000034144000000000004 s |
0.97 |
value_and_grad / DefOpt / cuda / Primal |
0.000033184 s |
0.000034048 s |
0.97 |
value_and_grad / IDefOpt / cuda / Primal |
0.000033632 s |
0.000033888 s |
0.99 |
value_and_grad / JaXPipe / tpu / Primal |
0 s |
0 s |
1 |
value_and_grad / Jax / tpu / Primal |
0 s |
0 s |
1 |
value_and_grad / HLOOpt / tpu / Primal |
0 s |
0 s |
1 |
value_and_grad / PartOpt / tpu / Primal |
0 s |
0 s |
1 |
value_and_grad / IPartOpt / tpu / Primal |
0 s |
0 s |
1 |
value_and_grad / DefOpt / tpu / Primal |
0 s |
0 s |
1 |
value_and_grad / IDefOpt / tpu / Primal |
0 s |
0 s |
1 |
value_and_grad / JaXPipe / cpu / Primal |
0.000022994 s |
0.000014093239915382585 s |
1.63 |
value_and_grad / Jax / cpu / Primal |
0.00002281 s |
0.000013464080038829709 s |
1.69 |
value_and_grad / HLOOpt / cpu / Primal |
0.000023392 s |
0.000013915920135332271 s |
1.68 |
value_and_grad / PartOpt / cpu / Primal |
0.000023 s |
0.000013872219788026997 s |
1.66 |
value_and_grad / IPartOpt / cpu / Primal |
0.000023049 s |
0.0000147368999387254 s |
1.56 |
value_and_grad / DefOpt / cpu / Primal |
0.000023094 s |
0.000013849619972461368 s |
1.67 |
value_and_grad / IDefOpt / cpu / Primal |
0.000023222 s |
0.000013435360015137122 s |
1.73 |
neuralgcm_v1/deterministic_2_8_deg_inner_steps_24_outer_steps_4 / JaXPipe / cuda / Primal |
1.7019069219999998 s |
1.703185884 s |
1.00 |
neuralgcm_v1/deterministic_2_8_deg_inner_steps_24_outer_steps_4 / Jax / cuda / Primal |
1.7052245339999998 s |
1.706024122 s |
1.00 |
neuralgcm_v1/deterministic_2_8_deg_inner_steps_24_outer_steps_4 / HLOOpt / cuda / Primal |
1.716587634 s |
1.71937241 s |
1.00 |
neuralgcm_v1/deterministic_2_8_deg_inner_steps_24_outer_steps_4 / PartOpt / cuda / Primal |
1.695886676 s |
1.696651472 s |
1.00 |
neuralgcm_v1/deterministic_2_8_deg_inner_steps_24_outer_steps_4 / IPartOpt / cuda / Primal |
1.693926152 s |
1.694673498 s |
1.00 |
neuralgcm_v1/deterministic_2_8_deg_inner_steps_24_outer_steps_4 / DefOpt / cuda / Primal |
1.665075052 s |
1.665913635 s |
1.00 |
neuralgcm_v1/deterministic_2_8_deg_inner_steps_24_outer_steps_4 / IDefOpt / cuda / Primal |
1.911157946 s |
1.912403586 s |
1.00 |
neuralgcm_v1/deterministic_2_8_deg_inner_steps_24_outer_steps_4 / JaXPipe / tpu / Primal |
3.038948373125 s |
3.03822168 s |
1.00 |
neuralgcm_v1/deterministic_2_8_deg_inner_steps_24_outer_steps_4 / Jax / tpu / Primal |
3.03943760875 s |
3.03861887625 s |
1.00 |
neuralgcm_v1/deterministic_2_8_deg_inner_steps_24_outer_steps_4 / HLOOpt / tpu / Primal |
3.121902861875 s |
3.1209755175 s |
1.00 |
neuralgcm_v1/deterministic_2_8_deg_inner_steps_24_outer_steps_4 / PartOpt / tpu / Primal |
3.06036302125 s |
3.059448075 s |
1.00 |
neuralgcm_v1/deterministic_2_8_deg_inner_steps_24_outer_steps_4 / IPartOpt / tpu / Primal |
3.060507205 s |
3.0596829975 s |
1.00 |
neuralgcm_v1/deterministic_2_8_deg_inner_steps_24_outer_steps_4 / DefOpt / tpu / Primal |
2.102436190625 s |
2.10219199125 s |
1.00 |
neuralgcm_v1/deterministic_2_8_deg_inner_steps_24_outer_steps_4 / IDefOpt / tpu / Primal |
4.35665004375 s |
4.355976231875 s |
1.00 |
neuralgcm_v1/deterministic_2_8_deg_inner_steps_2_outer_steps_2 / JaXPipe / cpu / Primal |
6.009743265 s |
6.146210463 s |
0.98 |
neuralgcm_v1/deterministic_2_8_deg_inner_steps_2_outer_steps_2 / Jax / cpu / Primal |
6.056264656000001 s |
6.310200541 s |
0.96 |
neuralgcm_v1/deterministic_2_8_deg_inner_steps_2_outer_steps_2 / HLOOpt / cpu / Primal |
6.09426268 s |
6.218582628 s |
0.98 |
neuralgcm_v1/deterministic_2_8_deg_inner_steps_2_outer_steps_2 / PartOpt / cpu / Primal |
6.224692692 s |
6.2440171950000005 s |
1.00 |
neuralgcm_v1/deterministic_2_8_deg_inner_steps_2_outer_steps_2 / IPartOpt / cpu / Primal |
6.224546395 s |
6.188685470999999 s |
1.01 |
neuralgcm_v1/deterministic_2_8_deg_inner_steps_2_outer_steps_2 / DefOpt / cpu / Primal |
2.379932348 s |
2.5103855910000004 s |
0.95 |
neuralgcm_v1/deterministic_2_8_deg_inner_steps_2_outer_steps_2 / IDefOpt / cpu / Primal |
6.598827396 s |
6.835344457 s |
0.97 |
This comment was automatically generated by workflow using github-action-benchmark.