feat: Rust internals by johnchildren · Pull Request #266 · Quantinuum/tierkreis

johnchildren · 2026-01-02T13:47:10Z

No description provided.

tierkreis_visualization/tierkreis_visualization/main.py

johnchildren · 2026-01-06T10:31:13Z

tierkreis_core/rust/value.rs

+    }
+
+    #[derive(Clone, Debug, PartialEq, Serialize, Deserialize, FromPyObject, IntoPyObject)]
+    pub enum Value {


This is quite non-exhaustive and should realistically match PType (though it technically works as a subset of PType that can be used as constants in nodes)

johnchildren · 2026-01-06T10:33:08Z

tierkreis_core/rust/identifiers.rs

+
+    #[derive(Clone, Debug, PartialEq, Serialize, Deserialize, FromPyObject, IntoPyObject)]
+    pub enum ExteriorOrValueRef {
+        Exterior(ExteriorRef),


I would potentially like to remove Exterior as an option, as it would simplify some of the code a bit. However, this requires some changes to the controller code itself.

johnchildren · 2026-01-06T10:35:27Z

tierkreis_core/rust/bin/tierkreis-core-stubs-gen.rs

+
+use regex::Regex;
+
+pub fn main() {


This code is admittedly quite janky; PyO3 has a tracking issue to do this better: PyO3/pyo3#5137

I think other projects inside the company are using a solution that is a bit more advanced, but it seemed like I would need to use more proc macros which seemed unappealing initially.

johnchildren · 2026-01-06T10:36:37Z

tierkreis_core/python/tierkreis_core/aliases.py

+from tierkreis_core._tierkreis_core.graph import GraphData
+
+type PortID = str
+type Value = (


This is kind of a workaround for not getting a simple enum via introspection of the Rust types; I'm not sure how long it will stick around.

This might need a longer comment

tierkreis_core/rust/location.rs

tierkreis/tests/controller/main.py

johnchildren · 2026-01-06T10:42:42Z

tierkreis/tests/controller/test_graphdata.py


 def test_only_one_output():
-    with pytest.raises(TierkreisError):
+    with pytest.raises(ValueError):


Philosophical question here about whether we should use the standard python errors or custom ones. I can raise a custom exception in the rust code instead but this was easier for me to do at least initially.

Personally I prefer errors that reflect what the problem was - "MalformedGraphError" or "GraphTypingError", say, or "TierkreisRuntimeError" (/GraphExecutionError, say). I think it's probably more useful to be able to distinguish between kinds of error (even if (re)using existing python Error classes) rather than to switch all the errors from one type to another. (Ok, there is some value in switching everything from ValueError to TierkreisError because the latter is more distinct from ValueErrors from code that isn't tierkreis at all, but even that would be better done selectively.)

There's certainly some value in custom errors for making the errors more explanatory as well I suppose and custom errors can always inherit from the "pythonic" error base classes.

Just to double check, something like MalformedGraphError would be a subclass of TierkreisError? If so then I'm in favour.

How bad would multiple inheritance be? I could imagine something like

class MalformedGraphError(TierkreisError, ValueError): ...
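
A minimal sketch of how that multiple-inheritance layout would behave, assuming TierkreisError derives directly from Exception (not confirmed in this thread). The add_output helper is hypothetical and only exists to raise the error:

```python
class TierkreisError(Exception):
    """Assumed base class for errors raised by tierkreis itself."""


class MalformedGraphError(TierkreisError, ValueError):
    """Hypothetical: the graph structure is invalid."""


def add_output(existing_outputs: int) -> None:
    # Illustrative check only; the real validation lives in the graph code.
    if existing_outputs >= 1:
        raise MalformedGraphError("graph already has an output node")


try:
    add_output(1)
except ValueError as err:
    # Callers catching the builtin still see the tierkreis-specific type.
    caught = err
```

With this shape, existing tests using pytest.raises(ValueError) and callers catching TierkreisError would both keep working, which is the main argument for inheriting from both.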

tierkreis/tests/controller/test_graphdata_storage.py

tierkreis/tierkreis/controller/data/core.py

johnchildren · 2026-01-06T10:46:25Z

tierkreis/tierkreis/controller/data/models.py

        args: list[TKR] = []
        for ref in refs:
-            key = ref[1].replace("-*", "")
+            key = ref.port_id.replace("-*", "")


I guess one thing we could do at some point is have an explicit separate type for wildcard ports?

What is the type of key here? PortId? Node? It looks like it does not (conceptually) refer to a port anymore...

I think this is replacing some "wildcard" suffix ports from foo-* -> foo, but admittedly I am unsure why this is needed in this case or what the difference between * and foo-* is.

The existing code isn't very nice here. The situation is that we have a map node for which the return type of the body graph is a portmapping. (I.e. multiple wires out.) This means that we need to be able to distinguish wildcards corresponding to each of these output wires. Hopefully the example typed_destructuring is helpful.

Relatedly this is where TList comes in, which is another not very nice aspect of the existing code.

johnchildren · 2026-01-06T10:47:06Z

tierkreis/tierkreis/controller/storage/adjacency.py

-    ins = [x for x in ins if x[0] >= 0]  # inputs at -1 already finished
-    return [x for x in ins if not storage.is_node_finished(loc.N(x[0]))]
+    ins = node.in_edges.values()
+    ins = [x for x in ins if isinstance(x, ValueRef)]  # Only look an Values on Edges


I think this comment is both incorrect and needs expansion

johnchildren · 2026-01-06T10:50:10Z

tierkreis/tierkreis/controller/storage/protocol.py

    def read_node_def(self, node_location: Loc) -> NodeDef:
        bs = self.read(self._nodedef_path(node_location))
-        return NodeDefModel(**json.loads(bs)).root
+        return NodeDef.model_load_json(bs.decode())


I wonder if it would be better if the Rust data structures could read bytes directly so we don't need to do the decoding here, but I was copying the API from pydantic.

tierkreis/tierkreis/controller/start.py

johnchildren · 2026-01-08T10:37:57Z

tierkreis/tests/controller/test_graphdata_storage.py

 @pytest.mark.parametrize(
    ["node_location_str", "graph", "target"],
    [
-        ("-.N0", simple_eval(), Const(0, outputs=set(["value"]))),


I guess I should test the full definition as this was also looking at outputs before

tierkreis/tests/controller/test_locs.py

johnchildren · 2026-01-08T10:41:38Z

tierkreis/tierkreis/controller/storage/graphdata.py

 class NodeData(BaseModel):
    """Internal storage class to store all necessary node information."""

+    model_config = {"arbitrary_types_allowed": True}


It wouldn't be completely unreasonable to make NodeDef pydantic compatible. I'm unclear how much of an advantage there would be to doing so, however, as it's quite an error-prone process.

johnchildren · 2026-01-08T10:44:12Z

tierkreis_core/rust/location.rs

+            // Used to validate the output of the schema
+            let any_schema = core_schema.call_method0("any_schema")?;
+
+            let serialization = core_schema.call_method0("to_string_ser_schema")?;


I could probably add an example to the pydantic schema here if we want Locs to appear properly in the API docs. A bit of a nice-to-have though.

tierkreis_visualization/tierkreis_visualization/data/eval.py

tierkreis_visualization/tierkreis_visualization/data/loop.py

tierkreis_visualization/tierkreis_visualization/data/map.py

tierkreis/tierkreis/controller/storage/protocol.py

acl-cqc

Thanks John, I can see you've done a lot of hard work here in particular introducing maturin etc. into what has been an all-python project. A few thoughts, mostly on the rust/python boundary and the new graph structure...

As you say, quite a big PR. I wonder if you can split it in two, or indeed perhaps even remove half of it. Specifically, can you remove the porting of Loc into Rust? I think the new API is much nicer, but (a) you could implement that in python, as an orthogonal change; or (b) delay having a rust Loc until you really need it (i.e. some amount of controller code moves into Rust too).
- The only use of Loc in Rust is for query_node_description which I reckon you could implement in python just querying the Rust graph structure "one level at a time". (Really I think the Loc is kind of a concept of the controller, not the graph; the latter needs to deal only with individual NodeId + PortId's)
ExteriorRef. You hint at this in one of your own comments. I think in an actual graph, all NodeDefs would contain ValueRef's (not ExteriorRefs); what would it take to remove ExteriorRef?
- Ah oh no, you allow a graph to refer to itself via def ref(), which is only valid because all the nodes in which the graph might be run have a graph input called body. Eeeek....can I say that again, eek? Ok - I think the thing to do here is to add a new input-less node, called say RecursiveGraphSelfReference (longer = more likely to alert the reader that something funny is going on....no ok, I jest, find a shorter name) that's a lot like Input i.e. produces the graph on its own output port. (Another advantage here is that the graph building does not rely on "all nodes that might run this graph must take it on an inport called body" - instead the actual graph is provided by the controller.)
- new_eval_root creates a NodeDef::Eval containing a load of ExteriorRefs. So the thing here is that NodeDef is used both to represent a node in a graph (with its edges - not the usual idea of a graph being nodes + edges; really this is NodeAndInEdges rather than NodeRef), and also a description of a task (actual work to do). So (given the controller here is in python) I think a good thing to do would be to define a python struct for a task, which is basically a node that can be executed - has its inputs available - so actually this is just NodeRunData, but give it a map from PortId -> actual values/data (I think that's PortId -> (Loc,PortId)). There could be an easy constructor for NodeRunData from a NodeDef and a Loc which builds the input map by looking up the source of each incoming edge in the NodeRef, in the parent Loc, as start currently does; and then start uses the input in the NodeRunData. (I'm keeping NodeDef as a field of NodeRunData even though the inputs/in-edges of the NodeDef will not be used, because it's not worth duplicating the enum). Then new_eval_root builds a task with the input map being exterior Loc+PortId
  ...then, ok, you need ExteriorRef, but only in python as part of NodeRunData, not in Rust or in the graph.
As below, I think it'd be cleaner and have less risk of confusion if NodeDef didn't have both in_edges and inputs
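
The NodeRunData suggestion above can be sketched roughly as follows. Everything here is a simplified stand-in: Loc is modelled as a plain string, ValueRef and NodeDef are reduced to the fields the sketch needs, and the `.N<index>` path format is assumed from the `-.N0` strings in the tests. The from_node constructor is a hypothetical name:

```python
from dataclasses import dataclass

# Simplified stand-ins; the real types live in tierkreis_core.
Loc = str
PortID = str


@dataclass(frozen=True)
class ValueRef:
    node_index: int
    port_id: PortID


@dataclass(frozen=True)
class NodeDef:
    kind: str
    in_edges: dict[PortID, ValueRef]


@dataclass(frozen=True)
class NodeRunData:
    """A task: a node plus the resolved location of each of its inputs."""

    node: NodeDef
    inputs: dict[PortID, tuple[Loc, PortID]]

    @classmethod
    def from_node(cls, node: NodeDef, parent: Loc) -> "NodeRunData":
        # Resolve every in-edge against the parent Loc, as start does today.
        inputs = {
            port: (f"{parent}.N{ref.node_index}", ref.port_id)
            for port, ref in node.in_edges.items()
        }
        return cls(node, inputs)
```

Under this sketch a root eval task would be built by passing an `inputs` map of exterior (Loc, PortID) pairs directly, so no exterior reference needs to appear in the graph or in Rust.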

tierkreis/tests/controller/loop_graphdata.py

tierkreis/tests/controller/test_eagerifelse.py

tierkreis/tests/controller/test_graphdata_storage.py

acl-cqc · 2026-01-08T13:30:18Z

tierkreis/tests/controller/test_locs.py

 def test_pop_empty() -> None:
    loc = Loc("")
-    with pytest.raises(TierkreisError):
+    with pytest.raises(ValueError):


fwiw, this is an example where I think ValueError is an improvement over TierkreisError. (Well, ok, ValueError is extremely broad, but so is the idea of "an error from anything in a particular software package")

acl-cqc · 2026-01-08T13:37:41Z

tierkreis/tests/controller/test_graphdata.py


 def test_only_one_output():
-    with pytest.raises(TierkreisError):
+    with pytest.raises(ValueError):


Personally I prefer errors that reflect what the problem was - "MalformedGraphError" or "GraphTypingError", say, or "TierkreisRuntimeError" (/GraphExecutionError, say). I think it's probably more useful to be able to distinguish between kinds of error (even if (re)using existing python Error classes) rather than to switch all the errors from one type to another. (Ok, there is some value in switching everything from ValueError to TierkreisError because the latter is more distinct from ValueErrors from code that isn't tierkreis at all, but even that would be better done selectively.)

acl-cqc · 2026-01-08T15:54:45Z

tierkreis/tierkreis/controller/start.py

+        raise TierkreisError(f"{type(node)} node must have parent Loc.")
+
+    ins = {
+        k: (parent.extend_from_ref(ref), ref.port_id) for k, ref in node.inputs.items()


If you used in_edges rather than inputs here, I think that would save you the work in loop and eval. You'd still have to do something fancy for map but I think just by copying from ins["body"]

acl-cqc · 2026-01-08T16:02:57Z

tierkreis_core/rust/graph.rs

+
+            let actual_inputs: IndexSet<PortID> =
+                fixed_inputs.union(&provided_inputs).cloned().collect();
+            self.graph_inputs


This looks like you pass in the non-fixed-inputs that you have; you get back the list of fixed-and-non-fixed inputs you still need. So if you move some of those into the provided inputs....then you might end up in the unimplemented. (Which I took to be a corner case, maybe it's not so much....and I'm not really sure what these fixed_inputs are anyway....)

I think in the original python code it did nothing in this case, I can double check but I think I asked @mwpb at the time and he said this feature was never fully implemented.

AFAIK the intention was to be able to "curry" nodes, but now I am more familiar with the codebase I think introducing something like a "thunk" that can be used for the "virtual" nodes of map and loop as well as for currying could make a lot of sense.
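
To make the "thunk" idea concrete, here is a small sketch under stated assumptions. Thunk, bind and missing are hypothetical names and nothing like this exists in the PR; the point is only that currying becomes "bind some inputs now, ask what is still missing later":

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Thunk:
    """Hypothetical: a body awaiting inputs, some of which may be bound."""

    required: frozenset[str]
    bound: dict[str, object] = field(default_factory=dict)

    def bind(self, **values: object) -> "Thunk":
        # Reject names the body does not declare rather than ignoring them.
        unknown = set(values) - self.required
        if unknown:
            raise ValueError(f"unknown inputs: {sorted(unknown)}")
        return Thunk(self.required, {**self.bound, **values})

    def missing(self) -> frozenset[str]:
        # Inputs that must still be supplied before the body can run.
        return self.required - frozenset(self.bound)


partial = Thunk(frozenset({"a", "b"})).bind(a=1)
```

A thunk with nothing missing is ready to run, which is roughly the state the "virtual" map and loop nodes would be started in.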

acl-cqc · 2026-01-08T17:32:06Z

tierkreis/tierkreis/controller/storage/graphdata.py

        raise NotImplementedError("GraphDataStorage is read only storage.")

-    def read_node_def(self, node_location: Loc) -> NodeDef:
+    def read_node_description(self, node_location: Loc) -> NodeDescription:


I don't think this is used? All the calls to read_node_description are to the storage layer (the one in protocol.py)?

It's used pretty heavily in the visualiser for inspecting graphs that haven't run yet. I feel like the existence of the GraphDataStorage is probably a hack and we should be able to ask storage for a graph and query that rather than using this wrapping object.

Ah sorry think I may have got confused. Nonetheless there are two identical-signature methods read_node_description...does the visualiser use both?

Right, yes in a specific development mode it will use this GraphDataStorage instead:

tierkreis/tierkreis_visualization/tierkreis_visualization/storage.py

Line 44 in e63445c

return GraphDataStorage(UUID(int=0), graph=graph)

acl-cqc · 2026-01-08T18:12:28Z

tierkreis/tierkreis/controller/storage/walk.py

+    ins: dict[PortID, ValueRef | ExteriorRef] = {
+        k: ExteriorRef(k) for k in loop.inputs.keys()
+    }
+    # Update with the outputs of the previous iteration.


Suggested change

-    # Update with the outputs of the previous iteration.
+    # Replace loop variants with the outputs of the previous iteration.

(Or loop-variant inputs/values, perhaps)

acl-cqc · 2026-01-08T18:21:16Z

tierkreis_core/rust/graph.rs

+                                inputs: inputs.clone(),
+                            },
+                            outputs: outputs.clone(),
+                            outer_graph: Some(const_graph.clone()),


Is that right? As per PR comment I think you should probably move this into python, but say I have a graph G containing a const C which is a graph containing a const D which is also a graph....and I query with a Loc that identifies a node inside D...I think I'll get back graph C (not graph D, the one containing the node in question). Is that what you want?

Admittedly it is unclear to me why this behavior exists, I just copied over what it seemed to be doing previously. Possibly I've given it a bad name due to a misunderstanding?
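
The G / C / D scenario can be illustrated with a toy model that walks the nesting "one level at a time". This is a sketch, not the PR's implementation: Graph here only records which node indices hold a nested graph, and the path is a plain list of node indices rather than a real Loc:

```python
from dataclasses import dataclass, field


@dataclass
class Graph:
    name: str
    # node index -> nested graph held by that node (None for other nodes)
    nodes: dict[int, "Graph | None"] = field(default_factory=dict)


def innermost_graph(root: Graph, path: list[int]) -> Graph:
    """Return the graph that directly contains the node at the end of path."""
    if not path:
        raise ValueError("path must name at least one node")
    graph = root
    for index in path[:-1]:
        nested = graph.nodes[index]
        if nested is None:
            raise ValueError(f"node {index} in {graph.name} holds no graph")
        graph = nested  # descend exactly one level per path component
    if path[-1] not in graph.nodes:
        raise ValueError(f"no node {path[-1]} in {graph.name}")
    return graph


d = Graph("D", {0: None})
c = Graph("C", {0: d})
g = Graph("G", {0: None, 1: c})
```

In this model a path ending inside D yields D, so if outer_graph is meant to be "the graph containing the node in question", returning C for that query would be the behaviour to double check.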

johnchildren · 2026-01-09T09:42:45Z

As you say, quite a big PR. I wonder if you can split it in two, or indeed perhaps even remove half of it. Specifically, can you remove the porting of Loc into Rust? I think the new API is much nicer, but (a) you could implement that in python, as an orthogonal change; or (b) delay having a rust Loc until you really need it (i.e. some amount of controller code moves into Rust too).

I think my primary motivation here was just trying to understand the existing code a bit better with respect to exterior refs and the removal of the -1 semantic for what I am now calling "exterior" Loc components. Some of the API changes I made to understand that have since been factored out so it does look a bit more out of place as a result. I can experiment with trying to split adding the Loc into a separate PR however.

johnchildren · 2026-01-09T09:50:41Z

As below, I think it'd be cleaner and have less risk of confusion if NodeDef didn't have both in_edges and inputs

Agreed, this tripped me up quite a lot, it's a reasonably direct port of existing behavior but it would be good if we decide to use one of them and remove the other.

acl-cqc · 2026-01-09T10:40:14Z

The only use of Loc in Rust is for query_node_description which I reckon you could implement in python just querying the Rust graph structure "one level at a time". (Really I think the Loc is kind of a concept of the controller, not the graph; the latter needs to deal only with individual NodeId + PortId's)

I also have to think....do we actually need to query_node_description or read_outputs/read_output_ports on an arbitrary Loc (nested), rather than of a node in a flat graph? The latter seems the common case, so is it, say, eval nodes needing to know the type of the graph constant to determine their own output ports?? I guess there might be cases like that - Hugr kinda solves this by forcing enough type information to be cached locally in the graph that the deep/nonlocal-traversal needs be done only in validation (and only by comparing each level to the next level down, rather than arbitrarily deep)

mwpb · 2026-01-09T11:22:31Z

Thanks all. Whilst I recognise a lot of the concepts being talked about here (and agree with quite a few of the proposed improvements), I'm getting confused as multiple things are moving at once. Does the following list of potential improvements look roughly correct?

Remove the -1 nodes. A few strategies here but one simple one might be: when starting an EVAL node, put the inputs into the relevant input nodes directly.
Allow exterior references? I.e. to a path that isn't fully described by a NodeIndex.
Consistently use one of in_edges vs inputs.
Rework/remove recursion and self references.
Rework the situation when the return type of the body of a MAP is a portmapping. Rework TList alongside.
Improve the Loc datastructure. (+ port to Rust)
Change graph representation. Maybe separate the representation (currently adjacency list) from the data required to run? (+ port to Rust)
Output nodes of graphs. My guess is that we want to ensure that every graph always has one. Also I'd rather the author had to explicitly add it but don't have a very strong opinion about that.
Remove the fixed_inputs.
Implement a node for partial evaluation?

If so then how would we prefer to group this work?

feat: Continue adding rust internals
chore: Try to get the tests working
chore: Try to get rust internals working
chore: Get tests running
chore: Remove extra print statements
chore: Further debugging
chore: Refactor rust code into modules
chore: Fix a bug related to loop scoping
chore: Fix many type errors and simplify API
chore: More docs and type fixes
chore: Fix up more code drift
chore: fix more type issues
chore: Fix up more of type issues

johnchildren · 2026-01-09T16:50:35Z

Remove the -1 nodes. A few strategies here but one simple one might be: when starting an EVAL node, put the inputs into the relevant input nodes directly.

Seems reasonable, though I'm not sure in what order it should be done. Hopefully the API I've added for them will make removal easier.

* Allow exterior references? I.e. to a path that isn't fully described by a `NodeIndex`.

Do you mean allowing fully resolved paths to appear in the controller? I think that would be a good idea and would tie into

* Consistently use one of `in_edges` vs `inputs`.

I've dropped my usage of inputs in the branch and it seems to work reasonably well now. Possibly we can do some further code simplification as I'm sure there are a few dict updates I have missed.

* Rework/remove recursion and self references.

I don't have as much of an opinion on this, I will need to re-read @acl-cqc 's comments

* Rework the situation when the return type of the body of a MAP is a portmapping. Rework TList alongside.

I'm not sure I understand this well enough to comment still

* Improve the `Loc` datastructure. (+ port to Rust)

I think the main improvements would be the ones we've discussed a bit on slack to do with making invalid Locs unrepresentable. I think it shouldn't be too bad to do, but it could interfere with the string representation of them.

* Change graph representation. Maybe separate the representation (currently adjacency list) from the data required to run? (+ port to Rust)

I think my PR should now treat GraphData as immutable which should make this a bit easier.

* Output nodes of graphs. My guess is that we want to ensure that every graph always has one. Also I'd rather the author had to explicitly add it but don't have a very strong opinion about that.

My suspicion now is that this is probably a GraphBuilder/GraphData distinction and when someone tries to produce GraphData from a GraphBuilder we should error if there is no Output node.
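
A rough sketch of that GraphBuilder/GraphData split, with the check happening at the point of conversion. The add and build method names and the string node kinds are illustrative assumptions, not the API in this PR:

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class GraphData:
    """Immutable result; can only be obtained through GraphBuilder.build."""

    nodes: tuple[str, ...]


@dataclass
class GraphBuilder:
    nodes: list[str] = field(default_factory=list)

    def add(self, kind: str) -> int:
        # Return the index of the new node, mirroring a node-index handle.
        self.nodes.append(kind)
        return len(self.nodes) - 1

    def build(self) -> GraphData:
        outputs = self.nodes.count("output")
        if outputs != 1:
            raise ValueError(f"expected exactly one output node, found {outputs}")
        return GraphData(tuple(self.nodes))


builder = GraphBuilder()
builder.add("const")
```

Calling builder.build() at this point raises, because the author has not added an output node yet; the builder stays free to be in an invalid intermediate state while GraphData never is.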

* Remove the `fixed_inputs`.

Seems reasonable

* Implement a node for partial evaluation?

Unclear on this one

tierkreis/tests/cli/test_tkr.py

tierkreis/tests/controller/test_locs.py

acl-cqc · 2026-02-10T11:11:30Z

tierkreis_core/rust/graph.rs

+                        } => graph,
+                        _ => {
+                            return Err(PyValueError::new_err(
+                                "Const node connected to body port does not contain a graph",


In this case it's not necessarily a Const node, right? (so any other node producing a graph, e.g. if, connected to an eval)

acl-cqc · 2026-02-10T11:13:21Z

tierkreis_core/rust/graph.rs

+
+        /// Query a NodeDescription from a Loc (which describes a location on the graph.)
+        ///
+        /// Useful for visualisation and debugging.


Yah ok, so this is used only for "graphdata storage" (readonly), not if we are using filestorage. (Hence it can't return nodes within dynamically constructed graphs, because they aren't stored.)

Good to have. Could be done in python, but since we're not trying to integrate with filestorage, ok to have it here too.

johnchildren force-pushed the rust-internals branch 5 times, most recently from 951fc55 to 25682e5 · January 5, 2026 16:29

johnchildren force-pushed the rust-internals branch from 052cf26 to 5283d5d · January 7, 2026 14:26