Releases: awslabs/awsome-distributed-ai

v2.0.0-pre-reorg — pre-reorganization snapshot

03 Jun 08:24
0edaf15

Snapshot of main immediately before the repository reorganization (#1056).

This release preserves the legacy numbered directory layout (0.docs/, 1.architectures/, 2.ami_and_containers/, 3.test_cases/, 4.validation_and_observability/, including 1.architectures/5.sagemaker-hyperpod/) at a stable ref, so external consumers can pin to it while main is reorganized.

Why this exists

The SageMaker HyperPod console reads cluster lifecycle scripts directly from numbered paths on main (e.g. 1.architectures/5.sagemaker-hyperpod/LifecycleScripts/base-config/). Pinning the console's CloudFormation to this tag keeps it working with zero downtime while the reorganization (PR #1119) removes the numeric prefixes and restructures examples/ into training/ + use-cases/.

Pinning anchor

https://github.com/awslabs/awsome-distributed-ai/tree/v2.0.0-pre-reorg

Raw lifecycle scripts remain available under, e.g.:

https://raw.githubusercontent.com/awslabs/awsome-distributed-ai/v2.0.0-pre-reorg/1.architectures/5.sagemaker-hyperpod/LifecycleScripts/base-config/...
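A consumer that fetches files by URL could build pinned links from the tag name rather than from main. The following is a minimal sketch, not part of the repository: the helper name `pinned_raw_url` and its parameters are hypothetical, and only the repository name, tag, and base-config directory come from the notes above.

```python
from urllib.parse import quote

# Values taken from the release notes above.
REPO = "awslabs/awsome-distributed-ai"
TAG = "v2.0.0-pre-reorg"
BASE_CONFIG = "1.architectures/5.sagemaker-hyperpod/LifecycleScripts/base-config"


def pinned_raw_url(path: str, ref: str = TAG, repo: str = REPO) -> str:
    """Build a raw.githubusercontent.com URL for `path` at a fixed ref.

    Hypothetical helper for illustration; it only formats a string and
    does not check that the path exists at that ref.
    """
    clean = path.strip("/")  # avoid doubled or trailing slashes
    # quote() percent-encodes unsafe characters but leaves "/" intact.
    return f"https://raw.githubusercontent.com/{repo}/{ref}/{quote(clean)}"


if __name__ == "__main__":
    print(pinned_raw_url(BASE_CONFIG))
```

Passing a different `ref` (for example a branch name or a full commit SHA) yields the same path at that ref, which is what makes the later switch back to main a one-argument change.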

Lifecycle

Once the HyperPod service is repointed to the new paths on main (post-reorg), this tag can be retired.

Targets main at commit 0edaf1526bc47185ce5f827100416f99bd7b69e1 (HEAD prior to the reorg).

v1.2.0

14 Feb 01:15

What's Changed

Release before the mass migration work

31 Mar 08:03

This release points to the old directory structure and test cases.

This release adds an opt-in OpenZFS filesystem as the home directory on SageMaker HyperPod Slurm clusters. It addresses the Lots of Small Files (LoSF) issue that frequently occurs when creating Conda environments in default home directories that reside on Lustre.

Release before re-organize

08 Feb 01:13
