feat: Rust internals #266

Open
johnchildren wants to merge 6 commits into main from rust-internals

@johnchildren
Collaborator

No description provided.

@johnchildren johnchildren force-pushed the rust-internals branch 5 times, most recently from 951fc55 to 25682e5 January 5, 2026 16:29
}

#[derive(Clone, Debug, PartialEq, Serialize, Deserialize, FromPyObject, IntoPyObject)]
pub enum Value {

Collaborator Author

This is far from exhaustive and should realistically match PType (though it technically works as the subset of PType that can be used as constants in nodes).


#[derive(Clone, Debug, PartialEq, Serialize, Deserialize, FromPyObject, IntoPyObject)]
pub enum ExteriorOrValueRef {
Exterior(ExteriorRef),

Collaborator Author

I would potentially like to remove Exterior as an option, as it would simplify some of the code a bit. This requires some changes to the controller code itself, however.


use regex::Regex;

pub fn main() {

Collaborator Author

This code is admittedly quite janky. PyO3 has a tracking issue to do this better: PyO3/pyo3#5137

I think other projects inside the company are using a solution that is a bit more advanced, but it seemed like I would need to use more proc macros which seemed unappealing initially.

from tierkreis_core._tierkreis_core.graph import GraphData

type PortID = str
type Value = (

Collaborator Author

This is kind of a workaround for not getting a simple enum via introspection of the rust types, not sure how long it will stick around.

Collaborator Author

This might need a longer comment


 def test_only_one_output():
-    with pytest.raises(TierkreisError):
+    with pytest.raises(ValueError):

Collaborator Author

Philosophical question here about whether we should use the standard python errors or custom ones. I can raise a custom exception in the rust code instead but this was easier for me to do at least initially.

Contributor

Personally I prefer errors that reflect what the problem was - "MalformedGraphError" or "GraphTypingError", say, or "TierkreisRuntimeError" (/GraphExecutionError, say). I think it's probably more useful to be able to distinguish between kinds of error (even if (re)using existing python Error classes) rather than to switch all the errors from one type to another. (Ok, there is some value in switching everything from ValueError to TierkreisError because the latter is more distinct from ValueErrors from code that isn't tierkreis at all, but even that would be better done selectively.)

Collaborator Author

There's certainly also some value in custom errors for making the errors more explanatory, I suppose, and custom errors can always inherit from the "pythonic" error base classes.

Collaborator

Just to double check, something like MalformedGraphError would be a subclass of TierkreisError? If so then I'm in favour.

Collaborator Author

How bad would multiple inheritance be? I could imagine something like

class MalformedGraphError(TierkreisError, ValueError):
    ...
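
For illustration, a minimal sketch of how that multiple-inheritance idea would behave. TierkreisError is defined locally here as a stand-in for the package's base error, and check_single_output is a hypothetical helper that exists only for this example; neither is taken from the PR.

```python
class TierkreisError(Exception):
    """Stand-in for the package-wide base error (assumed, defined locally)."""


class MalformedGraphError(TierkreisError, ValueError):
    """Raised when a graph is structurally invalid."""


def check_single_output(output_count: int) -> None:
    # Hypothetical validation helper, for illustration only.
    if output_count != 1:
        raise MalformedGraphError(
            f"expected exactly one output, got {output_count}"
        )


# The same exception can be caught by its specific type, by the
# package-wide base class, or by the builtin ValueError.
for expected in (MalformedGraphError, TierkreisError, ValueError):
    try:
        check_single_output(2)
    except expected as err:
        print(f"caught via {expected.__name__}: {err}")
```

Under this arrangement, existing tests that use pytest.raises(ValueError) and tests that use pytest.raises(TierkreisError) would both keep passing, which may ease a gradual migration.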

 args: list[TKR] = []
 for ref in refs:
-    key = ref[1].replace("-*", "")
+    key = ref.port_id.replace("-*", "")

Collaborator Author

I guess one thing we could do at some point is have an explicit separate type for wildcard ports?

Contributor

What is the type of key here? PortId? Node? It looks like it does not (conceptually) refer to a port anymore...

Collaborator Author

I think this is replacing some "wildcard" suffix ports from foo-* -> foo, but admittedly I am unsure why this is needed in this case or what the difference between * and foo-* is.

Collaborator

The existing code isn't very nice here. The situation is that we have a map node for which the return type of the body graph is a portmapping. (I.e. multiple wires out.) This means that we need to be able to distinguish wildcards corresponding to each of these output wires. Hopefully the example typed_destructuring is helpful.

Collaborator

Relatedly this is where TList comes in, which is another not very nice aspect of the existing code.
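
One way the "explicit separate type for wildcard ports" idea from this thread could look, as a rough sketch only. WildcardPort, parse_port and port_key are hypothetical names, and the sketch assumes that only a trailing "-*" marks a wildcard; how a bare "*" should be treated is left open, as it is in the discussion.

```python
from dataclasses import dataclass
from typing import Union

WILDCARD_SUFFIX = "-*"


@dataclass(frozen=True)
class WildcardPort:
    """A port standing for a family of ports, e.g. "foo-*"."""

    prefix: str


# A parsed port is either a plain port name or an explicit wildcard.
ParsedPort = Union[str, WildcardPort]


def parse_port(port_id: str) -> ParsedPort:
    # Only a trailing "-*" counts as a wildcard here; str.replace("-*", "")
    # would also strip the marker from the middle of a name.
    if port_id.endswith(WILDCARD_SUFFIX):
        return WildcardPort(prefix=port_id[: -len(WILDCARD_SUFFIX)])
    return port_id


def port_key(port: ParsedPort) -> str:
    # The key used for lookups: the prefix for wildcards,
    # the name itself otherwise.
    return port.prefix if isinstance(port, WildcardPort) else port


print(port_key(parse_port("foo-*")))
```

With a distinct type, the question "is key still a port?" gets an explicit answer at the call site instead of being hidden in a string replacement.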

ins = [x for x in ins if x[0] >= 0] # inputs at -1 already finished
return [x for x in ins if not storage.is_node_finished(loc.N(x[0]))]
ins = node.in_edges.values()
ins = [x for x in ins if isinstance(x, ValueRef)] # Only look an Values on Edges

Collaborator Author

I think this comment is both incorrect and needs expansion

 def read_node_def(self, node_location: Loc) -> NodeDef:
     bs = self.read(self._nodedef_path(node_location))
-    return NodeDefModel(**json.loads(bs)).root
+    return NodeDef.model_load_json(bs.decode())

Collaborator Author

I wonder if it would be better if the rust data structures could read bytes directly instead so we don't need to do the decoding here, but I was copying the API from pydantic
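
As a point of comparison, a small sketch of a loader that accepts bytes as well as str, so the caller can skip the explicit decode. NodeDefStub and its load_json method are hypothetical stand-ins written in pure Python; they are not the Rust-backed NodeDef from this PR. The sketch relies only on the fact that the standard library's json.loads accepts str, bytes and bytearray.

```python
import json
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class NodeDefStub:
    """Hypothetical stand-in for the Rust-backed NodeDef."""

    kind: str

    @classmethod
    def load_json(cls, data: Union[bytes, bytearray, str]) -> "NodeDefStub":
        # json.loads accepts str, bytes and bytearray, so the caller
        # does not need to call .decode() first.
        payload = json.loads(data)
        return cls(kind=payload["kind"])


raw = b'{"kind": "const"}'
print(NodeDefStub.load_json(raw) == NodeDefStub.load_json(raw.decode()))
```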

@pytest.mark.parametrize(
["node_location_str", "graph", "target"],
[
("-.N0", simple_eval(), Const(0, outputs=set(["value"]))),

Collaborator Author

I guess I should test the full definition as this was also looking at outputs before

class NodeData(BaseModel):
"""Internal storage class to store all necessary node information."""

model_config = {"arbitrary_types_allowed": True}

Collaborator Author

It wouldn't be completely unreasonable to make NodeDef pydantic compatible. I'm unclear how much of an advantage there would be to doing so, however, as it's quite an error-prone process.

// Used to validate the output of the schema
let any_schema = core_schema.call_method0("any_schema")?;

let serialization = core_schema.call_method0("to_string_ser_schema")?;

Collaborator Author

I could probably add an example to the pydantic schema here if we want Locs to appear properly in the API docs. A bit of a nice-to-have, though.

@johnchildren johnchildren changed the base branch from main to minor-test-improvements January 8, 2026 17:12
@johnchildren johnchildren changed the base branch from minor-test-improvements to update-devenv January 8, 2026 17:29
@johnchildren johnchildren force-pushed the rust-internals branch 2 times, most recently from b798594 to 19c9e5e January 8, 2026 17:54

Contributor

@acl-cqc acl-cqc left a comment

Thanks John, I can see you've done a lot of hard work here in particular introducing maturin etc. into what has been an all-python project. A few thoughts, mostly on the rust/python boundary and the new graph structure...

  • As you say, quite a big PR. I wonder if you can split it in two, or indeed perhaps even remove half of it. Specifically, can you remove the porting of Loc into Rust? I think the new API is much nicer, but (a) you could implement that in python, as an orthogonal change; or (b) delay having a rust Loc until you really need it (i.e. some amount of controller code moves into Rust too).

    • The only use of Loc in Rust is for query_node_description which I reckon you could implement in python just querying the Rust graph structure "one level at a time". (Really I think the Loc is kind of a concept of the controller, not the graph; the latter needs to deal only with individual NodeId + PortId's)
  • ExteriorRef. You hint at this in one of your own comments. I think in an actual graph, all NodeDefs would contain ValueRef's (not ExteriorRefs); what would it take to remove ExteriorRef?

    • Ah oh no, you allow a graph to refer to itself via def ref(), which is only valid because all the nodes in which the graph might be run have a graph input called body. Eeeek....can I say that again, eek? Ok - I think the thing to do here is to add a new input-less node, called say RecursiveGraphSelfReference (longer = more likely to alert the reader that something funny is going on....no ok, I jest, find a shorter name) that's a lot like Input i.e. produces the graph on its own output port. (Another advantage here is that the graph building does not rely on "all nodes that might run this graph must take it on an inport called body" - instead the actual graph is provided by the controller.)
    • ExteriorRefs also appear in new_eval_root, which creates a NodeDef::Eval containing a load of them. The thing here is that NodeDef is used both to represent a node in a graph (with its edges - not the usual idea of a graph being nodes + edges; really this is NodeAndInEdges rather than NodeRef), and also as a description of a task (actual work to do). So (given the controller here is in python) I think a good thing to do would be to define a python struct for a task, which is basically a node that can be executed, i.e. one that has its inputs available. That is actually just NodeRunData, but given a map from PortId -> actual values/data (I think that's PortId -> (Loc, PortId)). There could be an easy constructor for NodeRunData from a NodeDef and a Loc, which builds the input map by looking up the source of each incoming edge in the NodeRef, in the parent Loc, as start currently does; start then uses the inputs in the NodeRunData. (I'm keeping NodeDef as a field of NodeRunData even though the inputs/in-edges of the NodeDef will not be used, because it's not worth duplicating the enum.) Then new_eval_root builds a task with the input map being exterior Loc+PortId
      ...then, ok, you need ExteriorRef, but only in python as part of NodeRunData, not in Rust or in the graph.
  • As below, I think it'd be cleaner and have less risk of confusion if NodeDef didn't have both in_edges and inputs
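
The NodeRunData suggestion above might be sketched roughly as follows. Everything here is a simplified stand-in: Loc is modelled as a plain string, ValueRef and NodeDef are small local dataclasses rather than the Rust-backed types, and the "parent.N<index>" string format is an assumption made only so the example runs.

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple

PortID = str
Loc = str  # stand-in: the real Loc is a richer type


@dataclass(frozen=True)
class ValueRef:
    node_index: int
    port_id: PortID


@dataclass(frozen=True)
class NodeDef:
    kind: str
    in_edges: Dict[PortID, ValueRef] = field(default_factory=dict)


@dataclass(frozen=True)
class NodeRunData:
    """A task: a node plus the resolved location of each of its inputs."""

    node: NodeDef
    inputs: Dict[PortID, Tuple[Loc, PortID]]

    @classmethod
    def from_node_def(cls, node: NodeDef, parent: Loc) -> "NodeRunData":
        # Resolve each incoming edge against the parent location.
        inputs = {
            port: (f"{parent}.N{ref.node_index}", ref.port_id)
            for port, ref in node.in_edges.items()
        }
        return cls(node=node, inputs=inputs)


task = NodeRunData.from_node_def(
    NodeDef("function", {"a": ValueRef(0, "value")}), parent="-"
)
print(task.inputs)
```

A root task (the new_eval_root case) would then construct NodeRunData directly with an input map pointing at exterior locations, so exterior references never need to appear in the graph itself.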

 def test_pop_empty() -> None:
     loc = Loc("")
-    with pytest.raises(TierkreisError):
+    with pytest.raises(ValueError):

Contributor

fwiw, this is an example where I think ValueError is an improvement over TierkreisError. (Well, ok, ValueError is extremely broad, but so is the idea of "an error from anything in a particular software package")


raise TierkreisError(f"{type(node)} node must have parent Loc.")

ins = {
k: (parent.extend_from_ref(ref), ref.port_id) for k, ref in node.inputs.items()

Contributor

If you used in_edges rather than inputs here, I think that would save you the work in loop and eval. You'd still have to do something fancy for map but I think just by copying from ins["body"]


let actual_inputs: IndexSet<PortID> =
fixed_inputs.union(&provided_inputs).cloned().collect();
self.graph_inputs

Contributor

@acl-cqc acl-cqc Jan 8, 2026

This looks like you pass in the non-fixed-inputs that you have; you get back the list of fixed-and-non-fixed inputs you still need. So if you move some of those into the provided inputs....then you might end up in the unimplemented. (Which I took to be a corner case, maybe it's not so much....and I'm not really sure what these fixed_inputs are anyway....)

Collaborator Author

I think the original python code did nothing in this case. I can double check, but I think I asked @mwpb at the time and he said this feature was never fully implemented.

Collaborator Author

AFAIK the intention was to be able to "curry" nodes, but now that I am more familiar with the codebase I think introducing something like a "thunk", usable for the "virtual" nodes of map and loop as well as for currying, could make a lot of sense.
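
A rough sketch of what such a "thunk" could look like, purely as an illustration of the currying idea. Thunk and its methods are hypothetical; nothing here reflects how fixed_inputs is implemented in the PR.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, FrozenSet


@dataclass(frozen=True)
class Thunk:
    """A deferred computation that can be supplied with inputs in stages."""

    required: FrozenSet[str]
    bound: Dict[str, Any] = field(default_factory=dict)

    def bind(self, **values: Any) -> "Thunk":
        # Binding returns a new thunk, leaving the original untouched.
        unknown = set(values) - self.required
        if unknown:
            raise ValueError(f"unknown inputs: {sorted(unknown)}")
        return Thunk(self.required, {**self.bound, **values})

    def remaining(self) -> FrozenSet[str]:
        # Inputs that still have to be provided before running.
        return self.required - frozenset(self.bound)

    def is_ready(self) -> bool:
        return not self.remaining()


partial = Thunk(frozenset({"a", "b"})).bind(a=1)
print(sorted(partial.remaining()), partial.is_ready())
```

Here "provide some inputs, get back what is still needed" is a single operation on one object, which avoids tracking fixed and provided inputs as two separate sets.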

raise NotImplementedError("GraphDataStorage is read only storage.")

-def read_node_def(self, node_location: Loc) -> NodeDef:
+def read_node_description(self, node_location: Loc) -> NodeDescription:

Contributor

I don't think this is used? All the calls to read_node_description are to the storage layer (the one in protocol.py)?

Collaborator Author

It's used pretty heavily in the visualiser for inspecting graphs that haven't run yet. I feel like the existence of the GraphDataStorage is probably a hack and we should be able to ask storage for a graph and query that rather than using this wrapping object.

Contributor

Ah sorry think I may have got confused. Nonetheless there are two identical-signature methods read_node_description...does the visualiser use both?

Collaborator Author

Right, yes in a specific development mode it will use this GraphDataStorage instead:

return GraphDataStorage(UUID(int=0), graph=graph)

ins: dict[PortID, ValueRef | ExteriorRef] = {
k: ExteriorRef(k) for k in loop.inputs.keys()
}
# Update with the outputs of the previous iteration.

Contributor

Suggested change
-# Update with the outputs of the previous iteration.
+# Replace loop variants with the outputs of the previous iteration.

(Or loop-variant inputs/values, perhaps)

inputs: inputs.clone(),
},
outputs: outputs.clone(),
outer_graph: Some(const_graph.clone()),

Contributor

@acl-cqc acl-cqc Jan 8, 2026

Is that right? As per PR comment I think you should probably move this into python, but say I have a graph G containing a const C which is a graph containing a const D which is also a graph....and I query with a Loc that identifies a node inside D...I think I'll get back graph C (not graph D, the one containing the node in question). Is that what you want?

Collaborator Author

Admittedly it is unclear to me why this behavior exists, I just copied over what it seemed to be doing previously. Possibly I've given it a bad name due to a misunderstanding?
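
To make the nested-constant question concrete, here is a sketch of a "one level at a time" lookup that returns the graph directly containing the queried node. Graph, Node and resolve are hypothetical and heavily simplified, and a Loc is reduced to a list of node indices; this illustrates the behaviour the reviewer describes expecting (graph D), not what the PR currently does.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Node:
    label: str
    # Set only for const nodes whose value is itself a graph.
    const_graph: Optional["Graph"] = None


@dataclass
class Graph:
    name: str
    nodes: List[Node] = field(default_factory=list)


def resolve(root: Graph, path: List[int]) -> Tuple[Node, Graph]:
    """Return the node at `path` and the graph that directly contains it."""
    if not path:
        raise ValueError("path must name at least one node")
    graph = root
    # Descend one level per path component except the last.
    for index in path[:-1]:
        nested = graph.nodes[index].const_graph
        if nested is None:
            raise ValueError(
                f"node {index} in {graph.name} does not hold a graph"
            )
        graph = nested
    return graph.nodes[path[-1]], graph


d = Graph("D", [Node("leaf")])
c = Graph("C", [Node("const D", const_graph=d)])
g = Graph("G", [Node("const C", const_graph=c)])

node, owner = resolve(g, [0, 0, 0])
print(node.label, owner.name)  # prints "leaf D"
```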

@johnchildren
Collaborator Author

  • As you say, quite a big PR. I wonder if you can split it in two, or indeed perhaps even remove half of it. Specifically, can you remove the porting of Loc into Rust? I think the new API is much nicer, but (a) you could implement that in python, as an orthogonal change; or (b) delay having a rust Loc until you really need it (i.e. some amount of controller code moves into Rust too).

I think my primary motivation here was just trying to understand the existing code a bit better with respect to exterior refs and the removal of the -1 semantic for what I am now calling "exterior" Loc components. Some of the API changes I made to understand that have since been factored out so it does look a bit more out of place as a result. I can experiment with trying to split adding the Loc into a separate PR however.

@johnchildren
Collaborator Author

  • As below, I think it'd be cleaner and have less risk of confusion if NodeDef didn't have both in_edges and inputs

Agreed, this tripped me up quite a lot. It's a reasonably direct port of existing behavior, but it would be good if we decide to use one of them and remove the other.

@acl-cqc
Contributor

acl-cqc commented Jan 9, 2026

The only use of Loc in Rust is for query_node_description which I reckon you could implement in python just querying the Rust graph structure "one level at a time". (Really I think the Loc is kind of a concept of the controller, not the graph; the latter needs to deal only with individual NodeId + PortId's)

I also have to think....do we actually need to query_node_description or read_outputs/read_output_ports on an arbitrary Loc (nested), rather than on a node in a flat graph? The latter seems the common case, so is it, say, eval nodes needing to know the type of the graph constant to determine their own output ports?? I guess there might be cases like that - Hugr kinda solves this by forcing enough type information to be cached locally in the graph that the deep/nonlocal traversal need be done only in validation (and only by comparing each level to the next level down, rather than arbitrarily deep).

@mwpb
Collaborator

mwpb commented Jan 9, 2026

Thanks all. Whilst I recognise a lot of the concepts being talked about here (and agree with quite a few of the proposed improvements), I'm getting confused as multiple things are moving at once. Does the following list of potential improvements look roughly correct?

  • Remove the -1 nodes. A few strategies here but one simple one might be: when starting an EVAL node, put the inputs into the relevant input nodes directly.
  • Allow exterior references? I.e. to a path that isn't fully described by a NodeIndex.
  • Consistently use one of in_edges vs inputs.
  • Rework/remove recursion and self references.
  • Rework the situation when the return type of the body of a MAP is a portmapping. Rework TList alongside.
  • Improve the Loc datastructure. (+ port to Rust)
  • Change graph representation. Maybe separate the representation (currently adjacency list) from the data required to run? (+ port to Rust)
  • Output nodes of graphs. My guess is that we want to ensure that every graph always has one. Also I'd rather the author had to explicitly add it but don't have a very strong opinion about that.
  • Remove the fixed_inputs.
  • Implement a node for partial evaluation?

If so then how would we prefer to group this work?

@johnchildren johnchildren force-pushed the rust-internals branch 2 times, most recently from 7daf20f to 685e701 January 9, 2026 15:41
@johnchildren johnchildren changed the base branch from update-devenv to main January 9, 2026 15:41
@johnchildren johnchildren marked this pull request as ready for review January 9, 2026 15:44
feat: Continue adding rust internals

chore: Try to get the tests working

chore: Try to get rust internals working

chore: Get tests running

chore: Remove extra print statements

chore: Further debugging

chore: Refactor rust code into modules

chore: Fix a bug related to loop scoping

chore: Fix many type errors and simplify API

chore: More docs and type fixes

chore: Fix up more code drift

chore: fix more type issues

chore: Fix up more of type issues
@johnchildren
Collaborator Author

johnchildren commented Jan 9, 2026

  • Remove the -1 nodes. A few strategies here but one simple one might be: when starting an EVAL node, put the inputs into the relevant input nodes directly.

Seems reasonable, though I'm not sure in what order it should be done. Hopefully the API I've added for them will make removal easier.

  • Allow exterior references? I.e. to a path that isn't fully described by a NodeIndex.

Do you mean allowing fully resolved paths to appear in the controller? I think that would be a good idea and would tie into

  • Consistently use one of in_edges vs inputs.

I've dropped my usage of inputs in the branch and it seems to work reasonably well now. Possibly we can do some further code simplification as I'm sure there are a few dict updates I have missed.

  • Rework/remove recursion and self references.

I don't have as much of an opinion on this; I will need to re-read @acl-cqc's comments.

  • Rework the situation when the return type of the body of a MAP is a portmapping. Rework TList alongside.

I'm not sure I understand this well enough to comment still

  • Improve the Loc datastructure. (+ port to Rust)

I think the main improvements would be the ones we've discussed a bit on slack to do with making invalid Locs unrepresentable. I think it shouldn't be too bad to do, but it could interfere with their string representation.

  • Change graph representation. Maybe separate the representation (currently adjacency list) from the data required to run? (+ port to Rust)

I think my PR should now treat GraphData as immutable which should make this a bit easier.

  • Output nodes of graphs. My guess is that we want to ensure that every graph always has one. Also I'd rather the author had to explicitly add it but don't have a very strong opinion about that.

My suspicion now is that this is probably a GraphBuilder/GraphData distinction and when someone tries to produce GraphData from a GraphBuilder we should error if there is no Output node.

  • Remove the fixed_inputs.

Seems reasonable

  • Implement a node for partial evaluation?

Unclear on this one
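
On the Output-node point above, a sketch of how a GraphBuilder/GraphData split could enforce the rule at build time. Both classes are hypothetical simplifications (nodes are just kind strings), and requiring exactly one output node is an assumption based on the discussion rather than settled behaviour.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class GraphData:
    """Immutable, validated graph (nodes reduced to their kind names)."""

    nodes: Tuple[str, ...]


class GraphBuilder:
    """Mutable builder; validation happens only when build() is called."""

    def __init__(self) -> None:
        self._nodes: List[str] = []

    def add(self, kind: str) -> int:
        # Returns the index of the newly added node.
        self._nodes.append(kind)
        return len(self._nodes) - 1

    def build(self) -> GraphData:
        outputs = self._nodes.count("output")
        if outputs != 1:
            raise ValueError(
                f"graph must have exactly one output node, found {outputs}"
            )
        return GraphData(tuple(self._nodes))


builder = GraphBuilder()
builder.add("input")
builder.add("output")
print(builder.build())
```

Because GraphData can only be obtained through build(), every consumer of GraphData could then rely on the output node being present.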

} => graph,
_ => {
return Err(PyValueError::new_err(
"Const node connected to body port does not contain a graph",

Contributor

@acl-cqc acl-cqc Feb 10, 2026

In this case it's not necessarily a Const node, right? (so any other node producing a graph, e.g. if, connected to an eval)


/// Query a NodeDescription from a Loc (which describes a location on the graph.)
///
/// Useful for visualisation and debugging.

Contributor

Yah ok, so this is used only for "graphdata storage" (readonly), not if we are using filestorage. (Hence it can't return nodes within dynamically constructed graphs, because they aren't stored.)

Good to have. Could be done in python, but since we're not trying to integrate with filestorage, ok to have it here too.
