Skip to content

Conversation

@ClaudiaComito
Copy link
Contributor

@ClaudiaComito ClaudiaComito commented Mar 24, 2022

Description

DRAFT

Intro: main points from Numpy array indexing scheme

Heat: indexing massive, memory-distributed arrays

process-local indexing

distributed indexing

  • sorted key
  • non-sorted key

Memory footprint

Scaling behaviour

Issue/s resolved: #914 #918

Changes proposed:

  • feature extension in __process_key, getitem, and setitem methods
  • edge case handling
  • extensive comparison to numpy API in unittests

Type of change

Memory requirements

Performance

Due Diligence

  • All split configurations tested
  • Multiple dtypes tested in relevant functions
  • Documentation updated (if needed)
  • Updated changelog.md under the title "Pending Additions"

Does this change modify the behaviour of other functions? If so, which?

yes / no

skip ci

Mystic-Slice and others added 10 commits April 8, 2022 10:56
…ays (#937)

* Create ci.yaml

* Update ci.yaml

* Update ci.yaml

* Create CITATION.cff

* Update CITATION.cff

* Update ci.yaml

different python and pytorch versions

* Update ci.yaml

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Delete pre-commit.yml

* Update ci.yaml

* Update CITATION.cff

* Update tutorial.ipynb

delete example with different split axis

* Delete logo_heAT.pdf

Removal of old logo

* ht.nonzero() returns tuple of 1-D arrays instead of n-D arrays

* Updated documentation and Unit-tests

* replace x.larray with local_x

* Code fixes

* Fix return type of nonzero function and gout value

* Made sure DNDarray meta-data is available to the tuple members

* Transpose before if-branching + adjustments to accomodate it

* Fixed global shape assignment

* Updated changelog

Co-authored-by: mtar <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Daniel Coquelin <[email protected]>
Co-authored-by: Markus Goetz <[email protected]>
Co-authored-by: Claudia Comito <[email protected]>
…oltz-analytics/heat into 914_adv-indexing-outshape-outsplit
@github-actions
Copy link
Contributor

Thank you for the PR!

@github-actions
Copy link
Contributor

Thank you for the PR!

@github-actions
Copy link
Contributor

Thank you for the PR!

mrfh92
mrfh92 previously approved these changes Dec 17, 2025
Copy link
Collaborator

@mrfh92 mrfh92 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

approve just to let CI matrix run.
No further checking done (!)

@github-project-automation github-project-automation bot moved this from In Progress to Merge queue in Roadmap Dec 17, 2025
@github-actions
Copy link
Contributor

Thank you for the PR!

@github-actions
Copy link
Contributor

Thank you for the PR!

@github-actions
Copy link
Contributor

github-actions bot commented Jan 5, 2026

Thank you for the PR!

@github-actions
Copy link
Contributor

github-actions bot commented Jan 9, 2026

Thank you for the PR!

@github-actions
Copy link
Contributor

github-actions bot commented Jan 9, 2026

Thank you for the PR!

@@ -879,6 +882,641 @@ def fill_diagonal(self, value: float) -> DNDarray:

return self

def __process_key(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

__process_key and __process_scalar_key do not use self, so they should not be declared inside the DNDArray.

Possibly move to indexing.py?

@@ -879,6 +882,641 @@ def fill_diagonal(self, value: float) -> DNDarray:

return self

def __process_key(
arr: DNDarray,
key: Union[Tuple[int, ...], List[int, ...]],
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would be good to define a key type to make type definitions easier.

Index = int | slice | Ellipsis | None
Indexer = Index | tuple[Index, ...]

And the apply it everywhere.

def __getitem__(self, key: Union[int, Tuple[int, ...], List[int, ...]]) -> DNDarray:
def __process_key(
arr: "DNDarray",
key: tuple[int, ...] | list[int],
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should define an index, indexer type for shorter type hints

Index = int | slice | Ellipsis | None
Indexer = Index | tuple[Index, ...]

key = kst + slices + kend
else:
key = key + [slice(None)] * (self.ndim - len(key))
from .types import bool as ht_bool, uint8 as ht_uint8 # avoid circulars
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We probably need a better solution, not sure what performance impact this could have over the long run.

for i in range(len(key[: self.split + 1])):
if self.__key_is_singular(key, i, self_proxy):
new_split = None if i == self.split else new_split - 1
def _normalize_index_component(comp):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the reason for the defining the function here?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably should be moved to the sanitation module.

Comment on lines +1575 to +1578
if isinstance(key, DNDarray):
key = _normalize_index_component(key)
elif isinstance(key, (list, tuple)):
key = type(key)(_normalize_index_component(k) for k in key)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Double check if key is a tuple, and the normalization function just returns the list/tuple in most cases. Logic could be simplified.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Key also might always be a tuple? Need to actually check the entries of the first element.

@ClaudiaComito ClaudiaComito modified the milestones: 1.7.0, 1.8.0 Jan 14, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

core enhancement New feature or request indexing linalg PR talk testing Implementation of tests, or test-related issues

Projects

Status: Merge queue

Development

Successfully merging this pull request may close these issues.

Address distributed non-ordered indexing

6 participants