Releases: Genesis-Embodied-AI/quadrants

v1.0.2

04 Jun 01:31
701c11a

Release v1.0.2

This release reverts 'fix ndarrays on data oriented', which caused a regression in Genesis.

What's Changed

  • [Type] Add unpacked form for qd.vector for indexed register access by @hughperkins in #718
  • [DataOriented] Revert "[DataOriented] Fix ndarrays on data oriented (#704)" by @hughperkins in #719

Full Changelog: v1.0.1...v1.0.2

v1.0.1

03 Jun 15:56
d2fcd4c

Release v1.0.1

This release adds axes= to ndrange, bitonic sort, and 32x32 Cholesky tiles.

What's Changed

Perf

  • [Perf] Move register-tile Cholesky optimizations from Genesis back into quadrants by @hughperkins in #714
  • [Perf] Add bitonic sort to subgroup ops by @hughperkins in #713
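For readers unfamiliar with bitonic sort, the following is a minimal host-side sketch of the sorting network in plain Python. It is not the quadrants implementation and does not use any quadrants API: the function name `bitonic_sort` and the list `lanes` are hypothetical, and each list slot stands in for one lane of a subgroup. The point it illustrates is that every step is a compare-and-swap with the lane at index `i ^ j`, which is why the algorithm maps naturally onto subgroup shuffle operations.

```python
def bitonic_sort(values):
    """Sort a list whose length is a power of two with a bitonic network.

    Illustrative only: each list slot plays the role of one subgroup lane.
    """
    n = len(values)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a non-zero power of two")
    lanes = list(values)
    k = 2  # size of the bitonic sequences being merged
    while k <= n:
        j = k // 2  # distance to the partner lane in this step
        while j >= 1:
            for i in range(n):
                partner = i ^ j
                if partner > i:  # handle each pair once
                    # Blocks of size k alternate between ascending and descending.
                    ascending = (i & k) == 0
                    if (lanes[i] > lanes[partner]) == ascending:
                        lanes[i], lanes[partner] = lanes[partner], lanes[i]
            j //= 2
        k *= 2
    return lanes
```

For a 32- or 64-lane subgroup the network needs 15 or 21 compare-and-swap steps respectively (log2(n) * (log2(n) + 1) / 2), independent of the input data.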

DataOriented

Lang

CI

  • [CI] Add slow marker and remove unnecessary tests by @hughperkins in #711
  • [CI] Upgrade PR change report from composer 2 to composer 2.5 by @hughperkins in #716
    AI/quadrants/pull/710

Doc

  • [Doc] Fix spelling of 'Ying' to 'Yin' in README by @zhouxian in #715

Test

Full Changelog: v1.0.0...v1.0.1

v1.0.1b2

19 May 18:38

v1.0.1b2 Pre-release

Pre-release v1.0.1b2

Changes:

Data-oriented

Full Changelog: v1.0.0...v1.0.1b2

v1.0.1b1

19 May 18:24

v1.0.1b1 Pre-release

Pre-release v1.0.1b1

Changes:

Data-oriented

Full Changelog: v1.0.0...v1.0.1b1

v1.0.0

19 May 16:07
5d68fc4

Release v1.0.0

This release adds new device-level ops for QIPC, and volatile_load.

What's Changed

GPU

Cleaning

Atomics

AutoDiff

  • [AutoDiff] Reject recycled identity_key in AdStackCache::register_adstack_sizing_info by @hughperkins in #708

Vulkan

  • [Vulkan] Declare GroupNonUniform SPIR-V caps and enable shaderSubgroupExtendedTypes by @hughperkins in #707

Full Changelog: v0.8.0...v1.0.0

v0.8.1b2

19 May 07:52

v0.8.1b2 Pre-release

Pre-release v0.8.1b2

This pre-release tests a new, faster, more streamlined data_oriented class on Genesis.

What's Changed

GPU

Cleaning

Atomics

AutoDiff

  • [AutoDiff] Reject recycled identity_key in AdStackCache::register_adstack_sizing_info by @hughperkins in #708

DataOriented

Full Changelog: v0.8.0...v0.8.1b2

v0.8.1b1

19 May 05:36
8b71703

v0.8.1b1 Pre-release

Pre-release v0.8.1b1

This pre-release tests a new, faster, more streamlined data_oriented class on Genesis.

What's Changed

GPU

Cleaning

Atomics

AutoDiff

  • [AutoDiff] Reject recycled identity_key in AdStackCache::register_adstack_sizing_info by @hughperkins in #708

DataOriented

Full Changelog: v0.8.0...v0.8.1b1

v0.8.0

16 May 12:07
a22cc2d

Release v0.8.0

This release brings many cross-GPU SIMT primitives at both the subgroup and block level. Note that subgroup reductions no longer take a log2_size parameter; this is a breaking change, hence the minor version bump. In addition, AMD always uses wave64 going forward, to simplify testing.

What's Changed

GPU

Math

AMDGPU

  • [AMDGPU] Always use wave64, on both RDNA and CDNA by @hughperkins in #687
  • [AMDGPU] Use syncscope("agent") for atomic xor to avoid CAS livelock by @hughperkins in #672

Graph

Metal

  • [Metal] Fix FIFO-queue ordering when sharing command queue. by @duburcqa in #694

Atomics

Structs

CI

Full Changelog: v0.7.8...v0.8.0

v0.7.8

11 May 12:31
367984e

Release v0.7.8

This release contains further autodiff optimizations and generalizes block and atomic operations across all GPU architectures.

What's Changed

Perf

  • [Perf] Adstack max-reducer: launch cache + zero-copy result map; content-stable registry_id by @duburcqa in #671
  • [Perf] CPU LLVM adstack-cache: skip per-launch bump-writes + ndarray_shapes capture on forward-only handles by @duburcqa in #685

SPIR-V

  • [SPIR-V] dispatch_max_reducers: register each task with the real kernel name by @duburcqa in #675

AutoDiff

  • [AutoDiff] Debug-mode field/grad/dual: dtype, layout, and access-time invariants by @duburcqa in #677

Docs

  • [Docs] Add user-guide page for qd.algorithms.* device-wide algorithms by @hughperkins in #642
  • [Docs] Doc for existing atomics: switch support table to per-backend columns by @hughperkins in #657

GPU

Full Changelog: v0.7.6...v0.7.8

v0.7.7

10 May 14:35
a257ece

v0.7.7

This release mainly targets autodiff. It fixes the SPIR-V backends (Metal, Vulkan), significantly improves runtime speed (by up to 30%), and adds full support for debug mode.

What's Changed

AutoDiff

  • [Perf] Adstack max-reducer: launch cache + zero-copy result map; content-stable registry_id by @duburcqa in #671
  • [SPIR-V] dispatch_max_reducers: register each task with the real kernel name by @duburcqa in #675
  • [AutoDiff] Debug-mode field/grad/dual: dtype, layout, and access-time invariants by @duburcqa in #677

Full Changelog: v0.7.6...v0.7.7