[query] remove uses of HailContext.backend
#14964
Conversation
def parseVCFMetadata(fs: FS, file: String): Map[String, Map[String, Map[String, String]]] =
  LoadVCF.parseHeaderMetadata(fs, Set.empty, TFloat64, file)

def pyParseVCFMetadataJSON(fs: FS, file: String): String = {
  val metadata = LoadVCF.parseHeaderMetadata(fs, Set.empty, TFloat64, file)
  implicit val formats = defaultJSONFormats
  JsonMethods.compact(Extraction.decompose(metadata))
}
Unused/implemented in BackendRpc
chrisvittal left a comment:
I had a thought to avoid the 'hack' you put in. Fantastic work though.
def pathsUsed: Seq[String] = FastSeq(params.path)

- val getNumPartitions: Int = params.nPartitions.getOrElse(HailContext.backend.defaultParallelism)
+ val getNumPartitions: Int = params.nPartitions.getOrElse(4)
Thought: make this parameter required, and then supply it with a default from Python.
This is where we construct this node in python. What if we were to get the parallelism from the branching_factor flag here?
hail/hail/python/hail/linalg/blockmatrix.py, lines 1773 to 1827 in e899546:
@typecheck_method(n_partitions=nullable(int), maximum_cache_memory_in_bytes=nullable(int))
def to_table_row_major(self, n_partitions=None, maximum_cache_memory_in_bytes=None):
    """Returns a table where each row represents a row in the block matrix.

    The resulting table has the following fields:

    - **row_idx** (:py:data:`.tint64`, key field) -- Row index
    - **entries** (:py:class:`.tarray` of :py:data:`.tfloat64`) -- Entries for the row

    Examples
    --------
    >>> import numpy as np
    >>> block_matrix = BlockMatrix.from_numpy(np.array([[1, 2], [3, 4], [5, 6]]), 2)
    >>> t = block_matrix.to_table_row_major()
    >>> t.show()
    +---------+---------------------+
    | row_idx | entries             |
    +---------+---------------------+
    |   int64 | array<float64>      |
    +---------+---------------------+
    |       0 | [1.00e+00,2.00e+00] |
    |       1 | [3.00e+00,4.00e+00] |
    |       2 | [5.00e+00,6.00e+00] |
    +---------+---------------------+

    Parameters
    ----------
    n_partitions : int or None
        Number of partitions of the table.
    maximum_cache_memory_in_bytes : int or None
        The amount of memory to reserve, per partition, to cache rows of the
        matrix in memory. This value must be at least large enough to hold
        one row of the matrix in memory. If this value is exactly the size of
        one row, then a partition makes a network request for every row of
        every block. Larger values reduce the number of network requests. If
        memory permits, setting this value to the size of one output
        partition permits one network request per block per partition.

    Notes
    -----
    Does not support block-sparse matrices.

    Returns
    -------
    :class:`.Table`
        Table where each row corresponds to a row in the block matrix.
    """
    path = new_temp_file()
    if maximum_cache_memory_in_bytes and maximum_cache_memory_in_bytes > (1 << 31) - 1:
        raise ValueError(
            f'maximum_cache_memory_in_bytes must be less than 2^31 -1, was: {maximum_cache_memory_in_bytes}'
        )
    self.write(path, overwrite=True, force_row_major=True)
    reader = TableFromBlockMatrixNativeReader(path, n_partitions, maximum_cache_memory_in_bytes)
    return Table(TableRead(reader))
val remainingPartitions =
  contexts.indices.filterNot(k => cachedResults.containsOrdered[Int](k, _ < _, _._2))

val backend = HailContext.backend
I see you get rid of this upstack, but I'm curious if there's a simple rule for when we get the backend via an ExecuteContext, vs when we still need to get a HailContext (for now, later using a different mechanism). Is it just a compile-time vs runtime distinction?
I've been proceeding on the basis of avoiding "global" mutable fields entirely, instead favouring dependency injection in code that we maintain (via ExecuteContext in this case). For generated code, using a constant pool or some such is probably the right thing to do, so long as it doesn't depend on non-generated code.
For Backend specifically, my intention is that the ref in the upcoming change should only be used by BackendUtils.collectDArray. My hope is to remove the mutable ref eventually, either by
- code-generating parallelizeAndComputeWithIndex, or
- initialising a "constant" field in the generated code with either
  - the backend, in a similar way to reference genomes etc., or
  - the BackendContext that's passed to collectDArray.
I'm not sure if that answers your question properly...
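The dependency-injection approach described here can be illustrated with a minimal sketch. The `ExecuteContext` and `Backend` names mirror the discussion, but the shapes below are invented for illustration, not Hail's real classes.

```python
# Minimal sketch of the two styles discussed: a global mutable backend
# reference versus threading the backend through an execution context.

class Backend:
    def __init__(self, parallelism):
        self.default_parallelism = parallelism

# Global-singleton style (what the PR is removing): any function can
# reach for this mutable module-level reference.
GLOBAL_BACKEND = None

def num_partitions_global(requested=None):
    return requested if requested is not None else GLOBAL_BACKEND.default_parallelism

# Dependency-injection style (what the PR moves toward): the backend
# arrives as part of an explicitly passed context.
class ExecuteContext:
    def __init__(self, backend):
        self.backend = backend

def num_partitions_injected(ctx, requested=None):
    # No global state is consulted; the dependency is visible in the signature.
    return requested if requested is not None else ctx.backend.default_parallelism
```

The injected version makes the dependency explicit and testable: two contexts with different backends can coexist in one process, which is impossible with the singleton.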
I might just do that last one now...
> I'm not sure if that answers your question properly...
It does, thanks!
patrick-schultz left a comment:
Great change!

This change removes uses of HailContext.backend as part of an effort to remove the HailContext singleton. Instead, the current Backend is accessed by threading ExecuteContext. This change cannot impact the Hail Batch instance as deployed by the Broad Institute in GCP.
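As a rough illustration of what "threading" ExecuteContext means (all names below are invented for the sketch): the context is constructed once at the entry point and passed explicitly through each layer that previously reached for the singleton.

```python
# Invented sketch of threading a context through a call chain instead of
# each layer reaching for a global HailContext.backend.

class ExecuteContext:
    def __init__(self, backend_name):
        self.backend_name = backend_name

def read_table(ctx, path):
    # Previously this layer would consult the HailContext singleton;
    # now the context is an explicit parameter it passes along.
    return plan_partitions(ctx, path)

def plan_partitions(ctx, path):
    # The leaf of the chain still has access to the backend, without
    # any global lookup in between.
    return f'{ctx.backend_name}:{path}'

# Entry point constructs the context once and threads it down:
ctx = ExecuteContext('spark')
result = read_table(ctx, '/data/table.ht')  # -> 'spark:/data/table.ht'
```

The cost is a wider parameter list at every level; the benefit is that each function's dependencies are visible in its signature and no hidden mutable global is required.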
HailContext.backend as part of an effort to remove theHailContext singleton. Instead, the currentBackend is accessed by threadingExecuteContext.This change cannot impact the Hail Batch instance as deployed by Broad Institute in GCP