04 Jun 01:31

hughperkins

701c11a

v1.0.2 Latest

Release v1.0.2

This release reverts "Fix ndarrays on data oriented" (#704), which caused a regression in Genesis.

What's Changed

[Type] Add unpacked form for qd.vector for indexed register access by @hughperkins in #718
[DataOriented] Revert "[DataOriented] Fix ndarrays on data oriented (#704)" by @hughperkins in #719

Full Changelog: v1.0.1...v1.0.2

Contributors

hughperkins

03 Jun 15:56

hughperkins

v1.0.1

d2fcd4c

v1.0.1

Release v1.0.1

This release adds axes= to ndrange, adds bitonic sort to the subgroup ops, and adds 32x32 Cholesky tiles.
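For readers unfamiliar with the factorization behind the Cholesky tiles, here is a minimal pure-Python sketch of a textbook Cholesky decomposition. It is not the quadrants implementation from #714: the function name `cholesky` is hypothetical, and the 32x32 register tiling mentioned above is not modeled. It only shows what is computed for one small symmetric positive-definite matrix.

```python
import math


def cholesky(a):
    """Return lower-triangular L such that L * L^T == a.

    `a` is a symmetric positive-definite matrix given as a list of rows.
    Illustrative reference only; names are hypothetical, not quadrants API.
    """
    n = len(a)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            # Dot product of the already-computed parts of rows i and j.
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                d = a[i][i] - s
                if d <= 0.0:
                    raise ValueError("matrix is not positive definite")
                L[i][j] = math.sqrt(d)
            else:
                L[i][j] = (a[i][j] - s) / L[j][j]
    return L


a = [
    [4.0, 12.0, -16.0],
    [12.0, 37.0, -43.0],
    [-16.0, -43.0, 98.0],
]
L = cholesky(a)
for row in L:
    print(row)
```

A tiled GPU version would apply the same recurrence to fixed-size blocks held in registers; how quadrants does that is described in the PR, not here.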

What's Changed

Perf

[Perf] Move register-tile Cholesky optimizations from Genesis back into quadrants by @hughperkins in #714
[Perf] Add bitonic sort to subgroup ops by @hughperkins in #713
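As background for the bitonic sort entry above, the following is a small pure-Python model of a bitonic sorting network. It is a sketch under assumptions, not the code added in #713: the name `bitonic_sort` is hypothetical, and list positions stand in for the lanes of a subgroup. Each lane compares with the partner lane `i ^ j`, which is the access pattern that makes the algorithm a natural fit for subgroup exchange operations.

```python
def bitonic_sort(values):
    """Sort a power-of-two-length list ascending with a bitonic network.

    Pure-Python model; list indices play the role of subgroup lanes.
    Hypothetical helper, not quadrants API.
    """
    n = len(values)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a non-zero power of two")
    lanes = list(values)
    k = 2
    while k <= n:  # size of the bitonic sequences being merged
        j = k // 2
        while j >= 1:  # distance to the partner lane
            for i in range(n):
                partner = i ^ j
                if partner > i:  # handle each pair once
                    ascending = (i & k) == 0
                    # Swap when the pair is out of order for its direction.
                    if (lanes[i] > lanes[partner]) == ascending:
                        lanes[i], lanes[partner] = lanes[partner], lanes[i]
            j //= 2
        k *= 2
    return lanes


print(bitonic_sort([5, 3, 8, 1, 9, 2, 7, 4]))
```

The network always performs the same comparisons regardless of the data, so every lane can run the same instruction sequence in lockstep.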

DataOriented

[DataOriented] Fix ndarrays on data oriented by @hughperkins in #704

Lang

[Lang] Add axes= to ndrange by @hughperkins in #710

CI

[CI] Add slow marker and remove un-necessary tests by @hughperkins in #711
[CI] Upgrade PR change report from composer 2 to composer 2.5 by @hughperkins in #716

Doc

[Doc] Fix spelling of 'Ying' to 'Yin' in README by @zhouxian in #715

Test

[Test] Drop taichi xdist fork, use stock pytest-xdist by @hughperkins in #556

Full Changelog: v1.0.0...v1.0.1

Contributors

hughperkins and zhouxian

19 May 18:38

hughperkins

v1.0.1b2

8a7ead4

v1.0.1b2 Pre-release

Pre-release v1.0.1b2

Changes:

Data-oriented

[DataOriented] Fastcache, perf, pruning by @hughperkins in #705

Full Changelog: v1.0.0...v1.0.1b2

Contributors

hughperkins

19 May 18:24

hughperkins

v1.0.1b1

4398af7

v1.0.1b1 Pre-release

Pre-release v1.0.1b1

Changes:

Data-oriented

[DataOriented] Fastcache, perf, pruning by @hughperkins in #705

Full Changelog: v1.0.0...v1.0.1b1

Contributors

hughperkins

19 May 16:07

hughperkins

v1.0.0

5d68fc4

v1.0.0

Release v1.0.0

This release adds new device-level ops for QIPC, and volatile_load.

What's Changed

GPU

[GPU] New device-level ops for QIPC by @hughperkins in #693

Cleaning

[Cleaning] PrefixSumExecutor: drop unused GRID_SZ local by @hughperkins in #701
[Cleaning] sync(): fix unsupported-arch error message by @hughperkins in #700

Atomics

[Atomics] add qd.volatile_load primitive (closes #648) by @hughperkins in #702

AutoDiff

[AutoDiff] Reject recycled identity_key in AdStackCache::register_adstack_sizing_info by @hughperkins in #708

Vulkan

[Vulkan] Declare GroupNonUniform SPIR-V caps and enable shaderSubgroupExtendedTypes by @hughperkins in #707

Full Changelog: v0.8.0...v1.0.0

Contributors

hughperkins

19 May 07:52

hughperkins

v0.8.1b2

f6c68d8

v0.8.1b2 Pre-release

Pre-release v0.8.1b2

This pre-release exists to test a new, faster, more streamlined data_oriented class on Genesis.

What's Changed

GPU

[GPU] New device-level ops for QIPC by @hughperkins in #693

Cleaning

[Cleaning] PrefixSumExecutor: drop unused GRID_SZ local by @hughperkins in #701
[Cleaning] sync(): fix unsupported-arch error message by @hughperkins in #700

Atomics

[Atomics] add qd.volatile_load primitive (closes #648) by @hughperkins in #702

AutoDiff

[AutoDiff] Reject recycled identity_key in AdStackCache::register_adstack_sizing_info by @hughperkins in #708

DataOriented

[DataOriented] Fastcache, perf, pruning by @hughperkins in #705

Full Changelog: v0.8.0...v0.8.1b2

Contributors

hughperkins

19 May 05:36

hughperkins

v0.8.1b1

8b71703

v0.8.1b1 Pre-release

Pre-release v0.8.1b1

This pre-release exists to test a new, faster, more streamlined data_oriented class on Genesis.

What's Changed

GPU

[GPU] New device-level ops for QIPC by @hughperkins in #693

Cleaning

[Cleaning] PrefixSumExecutor: drop unused GRID_SZ local by @hughperkins in #701
[Cleaning] sync(): fix unsupported-arch error message by @hughperkins in #700

Atomics

[Atomics] add qd.volatile_load primitive (closes #648) by @hughperkins in #702

AutoDiff

[AutoDiff] Reject recycled identity_key in AdStackCache::register_adstack_sizing_info by @hughperkins in #708

DataOriented

[DataOriented] Fastcache, perf, pruning by @hughperkins in #705

Full Changelog: v0.8.0...v0.8.1b1

Contributors

hughperkins

16 May 12:07

hughperkins

v0.8.0

a22cc2d

v0.8.0

Release v0.8.0

This release brings many cross-GPU SIMT primitives, at both subgroup and block level. Note that subgroup reductions no longer take a log2_size parameter. This is a breaking change, hence the minor version bump. In addition, AMD always uses wave64 going forward, to simplify testing.
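The notes do not show the new reduction signatures, so the following is only a pure-Python model of a subgroup sum reduction, not quadrants API. The name `subgroup_reduce_add` is hypothetical. It illustrates, under that assumption, how the number of exchange steps can be derived from the subgroup width itself (64 lanes in the wave64 case) instead of being passed in by the caller.

```python
def subgroup_reduce_add(lane_values):
    """Model an all-lanes sum using XOR (butterfly) exchanges.

    `lane_values` holds one value per lane; its length is the subgroup
    width and must be a power of two. Hypothetical helper for illustration.
    """
    width = len(lane_values)
    if width == 0 or width & (width - 1):
        raise ValueError("subgroup width must be a non-zero power of two")
    lanes = list(lane_values)
    offset = width // 2
    while offset >= 1:
        # Every lane reads its partner's value from the previous step,
        # then adds it to its own.
        lanes = [lanes[i] + lanes[i ^ offset] for i in range(width)]
        offset //= 2
    return lanes


# wave64-sized example: every lane ends up holding the full sum.
result = subgroup_reduce_add(list(range(64)))
print(result[0], len(set(result)))
```

The loop runs log2(width) times, so the step count follows from the width alone.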

What's Changed

GPU

[GPU] Cross-GPU for grid ops by @hughperkins in #670
[GPU] New bit ops for QIPC by @hughperkins in #679
[GPU] Subgroup ops cross-gpu by @hughperkins in #665
[BREAKING][GPU] New QIPC ops for subgroups by @hughperkins in #676
[GPU] New QIPC ops for block by @hughperkins in #684

Math

[Math] Make bitop operations portable cross-gpu by @hughperkins in #662
[Math] New QIPC ops for single-threaded linalg by @hughperkins in #683

AMDGPU

[AMDGPU] Always use wave64, on both RDNA and CDNA by @hughperkins in #687
[AMDGPU] Use syncscope("agent") for atomic xor to avoid CAS livelock by @hughperkins in #672

Graph

[Graph] Rename CUDA Graph to Graph in docs by @hughperkins in #691
[Graph] HIP graph runtime support for @qd.kernel(graph=True) by @hughperkins in #692

Metal

[Metal] Fix FIFO-queue ordering when sharing command queue. by @duburcqa in #694

Atomics

[Atomics] New QIPC ops for atomics by @hughperkins in #690

Structs

[Structs] Pass dataclass sub-structs into qd.func by @hughperkins in #698

CI

[CI] Add per-file timing report to Mac Metal test job by @hughperkins in #695
[CI] Enable kernel disk cache during tests by @hughperkins in #696

Full Changelog: v0.7.8...v0.8.0

Contributors

hughperkins and duburcqa

11 May 12:31

hughperkins

v0.7.8

367984e

v0.7.8

Release v0.7.8

This release contains further autodiff optimizations, and generalizes block and atomic operations across all GPU architectures.

What's Changed

Perf

[Perf] Adstack max-reducer: launch cache + zero-copy result map; content-stable registry_id by @duburcqa in #671
[Perf] CPU LLVM adstack-cache: skip per-launch bump-writes + ndarray_shapes capture on forward-only handles by @duburcqa in #685

SPIR-V

[SPIR-V] dispatch_max_reducers: register each task with the real kernel name by @duburcqa in #675

AutoDiff

[AutoDiff] Debug-mode field/grad/dual: dtype, layout, and access-time invariants by @duburcqa in #677

Docs

[Docs] Add user-guide page for qd.algorithms.* device-wide algorithms by @hughperkins in #642
[Docs] Doc for existing atomics: switch support table to per-backend columns by @hughperkins in #657

GPU

[GPU] Cross gpu atomics by @hughperkins in #666
[GPU] Make block operations portable cross-gpu by @hughperkins in #664

Full Changelog: v0.7.6...v0.7.8

Contributors

hughperkins and duburcqa

10 May 14:35

duburcqa

v0.7.7

a257ece

v0.7.7

This release mainly targets autodiff. It fixes the SPIR-V backends (Metal, Vulkan), significantly improves runtime speed (up to 30%), and adds full support for debug mode.

What's Changed

AutoDiff

[Perf] Adstack max-reducer: launch cache + zero-copy result map; content-stable registry_id by @duburcqa in #671
[SPIR-V] dispatch_max_reducers: register each task with the real kernel name by @duburcqa in #675
[AutoDiff] Debug-mode field/grad/dual: dtype, layout, and access-time invariants by @duburcqa in #677

Full Changelog: v0.7.6...v0.7.7

Contributors

duburcqa

Releases: Genesis-Embodied-AI/quadrants
