Expand distributed indexing, match numpy indexing scheme #938

ClaudiaComito · 2022-03-24T05:23:18Z

Description

DRAFT

Intro: main points from Numpy array indexing scheme

Heat: indexing massive, memory-distributed arrays

process-local indexing

distributed indexing

sorted key
non-sorted key

Memory footprint

Scaling behaviour

Issue/s resolved: #914 #918

Changes proposed:

feature extension in __process_key, getitem, and setitem methods
edge case handling
extensive comparison to numpy API in unittests

Type of change

Memory requirements

Performance

Due Diligence

All split configurations tested
Multiple dtypes tested in relevant functions
Documentation updated (if needed)
Updated changelog.md under the title "Pending Additions"

Does this change modify the behaviour of other functions? If so, which?

yes / no

skip ci


…ays (#937)

* Create ci.yaml
* Update ci.yaml
* Update ci.yaml
* Create CITATION.cff
* Update CITATION.cff
* Update ci.yaml different python and pytorch versions
* Update ci.yaml
* [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
* Delete pre-commit.yml
* Update ci.yaml
* Update CITATION.cff
* Update tutorial.ipynb delete example with different split axis
* Delete logo_heAT.pdf Removal of old logo
* ht.nonzero() returns tuple of 1-D arrays instead of n-D arrays
* Updated documentation and Unit-tests
* replace x.larray with local_x
* Code fixes
* Fix return type of nonzero function and gout value
* Made sure DNDarray meta-data is available to the tuple members
* Transpose before if-branching + adjustments to accomodate it
* Fixed global shape assignment
* Updated changelog

Co-authored-by: mtar <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Daniel Coquelin <[email protected]>
Co-authored-by: Markus Goetz <[email protected]>
Co-authored-by: Claudia Comito <[email protected]>


github-actions · 2025-12-15T14:13:17Z

Thank you for the PR!

mrfh92

Approving just to let the CI matrix run.
No further checking done (!)

JuanPedroGHM · 2026-01-09T16:17:28Z

heat/core/dndarray.py

@@ -879,6 +882,641 @@ def fill_diagonal(self, value: float) -> DNDarray:

        return self

+    def __process_key(


__process_key and __process_scalar_key do not use self, so they should not be declared inside DNDarray.

Possibly move to indexing.py?
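A minimal sketch of what this suggestion could look like, under stated assumptions: `Array` and `process_key` below are hypothetical stand-ins, not Heat's actual class or signature. The helper only pads a key with full slices, as the removed `key + [slice(None)] * (self.ndim - len(key))` line did, and does not handle `Ellipsis`. The point is only that a helper that never touches `self` can live at module level and be called from `__getitem__`.

```python
from typing import Any, Tuple


def process_key(shape: Tuple[int, ...], key: Any) -> Tuple[Any, ...]:
    """Hypothetical module-level helper: pad `key` with full slices up to len(shape)."""
    if not isinstance(key, tuple):
        key = (key,)
    # A negative multiplier yields an empty tuple, so over-long keys pass through unchanged.
    return key + (slice(None),) * (len(shape) - len(key))


class Array:
    """Toy stand-in for an array class; not Heat's DNDarray."""

    def __init__(self, shape: Tuple[int, ...]) -> None:
        self.shape = shape

    def __getitem__(self, key: Any) -> Tuple[Any, ...]:
        # Delegate to the free function instead of a name-mangled private method.
        return process_key(self.shape, key)
```

A free function like this can also be unit-tested without constructing an array, and it avoids the name mangling that double-underscore method names trigger inside a class body.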

JuanPedroGHM · 2026-01-09T16:25:01Z

heat/core/dndarray.py

@@ -879,6 +882,641 @@ def fill_diagonal(self, value: float) -> DNDarray:

        return self

+    def __process_key(
+        arr: DNDarray,
+        key: Union[Tuple[int, ...], List[int, ...]],


It would be good to define a key type to make type definitions easier.

Index = int | slice | Ellipsis | None
Indexer = Index | tuple[Index, ...]

And then apply it everywhere.
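A runnable sketch of such aliases, with one adjustment that is an assumption on my part rather than part of the suggestion: `Ellipsis` is an instance, not a type, so the alias uses its class instead (`type(Ellipsis)`, also available as `types.EllipsisType` on Python 3.10+). The helper names `is_index` and `is_indexer` are hypothetical and only serve to exercise the aliases at runtime.

```python
from typing import Tuple, Union

# `Ellipsis` itself is an instance; its class is what belongs in a type hint.
EllipsisType = type(Ellipsis)

Index = Union[int, slice, EllipsisType, None]
Indexer = Union[Index, Tuple[Index, ...]]


def is_index(obj: object) -> bool:
    """Runtime check mirroring the Index alias (hypothetical helper)."""
    return obj is None or isinstance(obj, (int, slice, EllipsisType))


def is_indexer(obj: object) -> bool:
    """Runtime check mirroring the Indexer alias (hypothetical helper)."""
    if isinstance(obj, tuple):
        return all(is_index(o) for o in obj)
    return is_index(obj)
```

As written, these aliases cover only basic indexing. Keys such as lists, arrays, or boolean masks, which this PR also handles, would need additional members in the union.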

JuanPedroGHM · 2026-01-09T16:28:06Z

heat/core/dndarray.py

-    def __getitem__(self, key: Union[int, Tuple[int, ...], List[int, ...]]) -> DNDarray:
+    def __process_key(
+        arr: "DNDarray",
+        key: tuple[int, ...] | list[int],


We should define Index and Indexer types for shorter type hints:

Index = int | slice | Ellipsis | None
Indexer = Index | tuple[Index, ...]

JuanPedroGHM · 2026-01-13T11:20:45Z

heat/core/dndarray.py

-            key = kst + slices + kend
-        else:
-            key = key + [slice(None)] * (self.ndim - len(key))
+        from .types import bool as ht_bool, uint8 as ht_uint8  # avoid circulars


We probably need a better solution; I am not sure what performance impact this function-local import could have in the long run.
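One way to put a number on this concern is to time a function-local import against a module-level one. The sketch below uses `math.sqrt` purely as a stand-in for the Heat types import, since the real module layout is not reproduced here; the function names are hypothetical. After the first call, a local `from ... import` no longer reloads the module, but it still runs the import machinery's cache lookup and name binding on every call.

```python
import timeit
from math import sqrt as _sqrt  # module-level import, bound once at load time


def with_local_import(x: float) -> float:
    # The import statement runs on every call; the module comes from the sys.modules cache.
    from math import sqrt
    return sqrt(x)


def with_module_import(x: float) -> float:
    return _sqrt(x)


def compare(n: int = 100_000):
    """Return (local_seconds, module_seconds) for n calls of each variant."""
    local = timeit.timeit(lambda: with_local_import(2.0), number=n)
    module = timeit.timeit(lambda: with_module_import(2.0), number=n)
    return local, module
```

Calling `compare()` on the target machine gives a rough per-call overhead. Whether it matters depends on how often the indexing path is hit.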

JuanPedroGHM · 2026-01-13T11:23:03Z

heat/core/dndarray.py

-                for i in range(len(key[: self.split + 1])):
-                    if self.__key_is_singular(key, i, self_proxy):
-                        new_split = None if i == self.split else new_split - 1
+        def _normalize_index_component(comp):


What is the reason for defining the function here?

It should probably be moved to the sanitation module.

JuanPedroGHM · 2026-01-13T11:26:59Z

heat/core/dndarray.py

+        if isinstance(key, DNDarray):
+            key = _normalize_index_component(key)
+        elif isinstance(key, (list, tuple)):
+            key = type(key)(_normalize_index_component(k) for k in key)


Double-check whether key is a tuple; the normalization function just returns the list/tuple unchanged in most cases, so the logic could be simplified.

Key might also always be a tuple? We need to actually check the entries of the first element.
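A possible simplification, sketched under assumptions: wrap any non-tuple key in a 1-tuple once, then normalize each component, so only one code path remains. `normalize_component` is a hypothetical stand-in for `_normalize_index_component`, whose real behaviour is not shown in this excerpt; here it merely converts integer-like objects to Python ints and leaves everything else alone.

```python
import operator
from typing import Any, Tuple


def normalize_component(comp: Any) -> Any:
    """Hypothetical stand-in for _normalize_index_component."""
    # Leave non-integer index kinds untouched; bools must not be turned into ints.
    if comp is None or comp is Ellipsis or isinstance(comp, (slice, bool)):
        return comp
    if isinstance(comp, (list, tuple)):
        return type(comp)(normalize_component(c) for c in comp)
    try:
        return operator.index(comp)  # any integer-like object becomes a plain int
    except TypeError:
        return comp


def normalize_key(key: Any) -> Tuple[Any, ...]:
    """Always return a tuple of normalized components."""
    if not isinstance(key, tuple):
        key = (key,)
    return tuple(normalize_component(k) for k in key)
```

Wrapping a list key as `([0, 1],)` keeps NumPy semantics, since `a[[0, 1]]` and `a[([0, 1],)]` select the same elements.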

ClaudiaComito and others added 11 commits February 17, 2022 13:40

Broken. __getitem__ refactoring in prep for distributed/non-ordered indexing

445fc94

Preprocess key, workaround torch_proxy for advanced indexing, simplify slice-indexing. UNTESTED

6641d1e

put advanced index shape in the dimensions name to get the correct position in the index_proxy

cd78ecb

first changes to setitem

7d97ea2

Expand __process_key() to address advanced indexing.

0c37abf

Address boolean indexing

b1508b9

separate advanced indexing on dim 0 from adv ind across dimensions

ae5af94

Merge branch 'main' into 914_adv-indexing-outshape-outsplit

ace900a

Replace sanitize_in with try:...except: construct

0a8cb35

nonzero(): do not assume input DNDarray is load-balanced

6c7c10a

Memory management

fb3524b

ClaudiaComito mentioned this pull request Mar 25, 2022

fix #925: ht.nonzero() returns tuple of 1-D arrays instead of n-D arrays #937

Merged

4 tasks

Mystic-Slice and others added 10 commits April 8, 2022 10:56

Merge branch 'main' into 914_adv-indexing-outshape-outsplit

8485a31

calculate output_shape, split axis bookkeeping for advanced indexing

a52e518

__process_key() to return expanded array, expanded key, output gshape and new split axis

5995639

in , copy before manipulations

3830e62

nonzero() to return tuple of 1D arrays, stable distributed results

82b2508

update __process_key(), get rid of recursive calls, __getitem__ broken

aafaf99

deal with scalar key, local and distributed cases

b746872

test getitem separately, follow numpy Indexing on ndarray examples

00fe538

test for 0-dim DNDarray key

4360bd1

This was referenced Aug 30, 2022

[Bug]: Indexing with 0-dimensional key #1019

Closed

[Bug]: Slice error when array contains an axis of length 0 #1012

Closed

ClaudiaComito added 6 commits August 31, 2022 09:31

Expand __process_key() to deal with distributed boolean mask

231c1de

Expand test_getitem for distributed single-element indexing, non-distr boolean mask

f19f902

Add check for matching boolean index / indexed array shapes

7ed435f

Only sort result if input.split != 0

0da7f56

BROKEN: distributed boolean indexing to return stable result for all splits

e55c7f9

Add tests for distributed boolean indexing

75d9314

Merge branch '914_adv-indexing-outshape-outsplit' of github.com:helmholtz-analytics/heat into 914_adv-indexing-outshape-outsplit

e6256d5

Improved code coverage in tests

6da8259

Hakdag97 added the PR talk label Dec 15, 2025

Merge branch 'main' into 914_adv-indexing-outshape-outsplit

3c9b989

mrfh92 previously approved these changes Dec 17, 2025

github-project-automation bot moved this from In Progress to Merge queue in Roadmap Dec 17, 2025

Hakdag97 added 3 commits December 20, 2025 11:27

Test debugging advanced indexing for dmd

2bfdbc5

Merge branch 'main' into 914_adv-indexing-outshape-outsplit

0149260

Fixed bug in process_key leading to failing dmd test

17446a2

Hakdag97 dismissed mrfh92’s stale review via 17446a2 December 20, 2025 12:02

Robustified edge cases in __process_key

9aa581e

Merge branch 'main' into 914_adv-indexing-outshape-outsplit

75483a7

Merge branch 'main' into 914_adv-indexing-outshape-outsplit

477aa2f

Merge branch 'main' into 914_adv-indexing-outshape-outsplit

4d79b0b

chore: minor type hints improvements for dndarray.py

df6714d

JuanPedroGHM reviewed Jan 9, 2026

JuanPedroGHM reviewed Jan 13, 2026

ClaudiaComito modified the milestones: 1.7.0, 1.8.0 Jan 14, 2026

JuanPedroGHM removed the benchmark PR label Jan 14, 2026
