Conversation
951fc55 to
25682e5
Compare
| } | ||
|
|
||
| #[derive(Clone, Debug, PartialEq, Serialize, Deserialize, FromPyObject, IntoPyObject)] | ||
| pub enum Value { |
There was a problem hiding this comment.
This is quite non-exhaustive and should realistically match PType (though it technically works as a subset of PType that can be used as constants in nodes)
|
|
||
| #[derive(Clone, Debug, PartialEq, Serialize, Deserialize, FromPyObject, IntoPyObject)] | ||
| pub enum ExteriorOrValueRef { | ||
| Exterior(ExteriorRef), |
There was a problem hiding this comment.
I would like to remove Exterior as an option potentially as it would simplify some of the code a bit, this requires some changes to the controller code itself however.
|
|
||
| use regex::Regex; | ||
|
|
||
| pub fn main() { |
There was a problem hiding this comment.
This code is admittedly quite janky, pyO3 has a tracking issue to do this better: PyO3/pyo3#5137
I think other projects inside the company are using a solution that is a bit more advanced, but it seemed like I would need to use more proc macros which seemed unappealing initially.
| from tierkreis_core._tierkreis_core.graph import GraphData | ||
|
|
||
| type PortID = str | ||
| type Value = ( |
There was a problem hiding this comment.
This is kind of a workaround for not getting a simple enum via introspection of the rust types, not sure how long it will stick around.
There was a problem hiding this comment.
This might need a longer comment
|
|
||
| def test_only_one_output(): | ||
| with pytest.raises(TierkreisError): | ||
| with pytest.raises(ValueError): |
There was a problem hiding this comment.
Philosophical question here about whether we should use the standard python errors or custom ones. I can raise a custom exception in the rust code instead but this was easier for me to do at least initially.
There was a problem hiding this comment.
Personally I prefer errors that reflect what the problem was - "MalformedGraphError" or "GraphTypingError", say, or "TierkreisRuntimeError" (/GraphExecutionError, say). I think it's probably more useful to be able to distinguish between kinds of error (even if (re)using existing python Error classes) rather than to switch all the errors from one type to another. (Ok, there is some value in switching everything from ValueError to TierkreisError because the latter is more distinct from ValueErrors from code that isn't tierkreis at all, but even that would be better done selectively.)
There was a problem hiding this comment.
There's certainly some value in custom errors for making the errors more explanatory as well I suppose and custom errors can always inherit from the "pythonic" error base classes.
There was a problem hiding this comment.
Just to double check, something like MalformedGraphError would be a subclass of TierkreisError? If so then I'm in favour.
There was a problem hiding this comment.
How bad would multiple inheritance be? I could imagine something like
class MalformedGraphError(TierkreisError, ValueError):
...| args: list[TKR] = [] | ||
| for ref in refs: | ||
| key = ref[1].replace("-*", "") | ||
| key = ref.port_id.replace("-*", "") |
There was a problem hiding this comment.
I guess one thing we could do at some point is have an explicit separate type for wildcard ports?
There was a problem hiding this comment.
What is the type of key here? PortId? Node? It looks like it does not (conceptually) refer to a port anymore...
There was a problem hiding this comment.
I think this is replacing some "wildcard" suffix ports from foo-* -> foo, but admittedly I am unsure why this is needed in this case or what the difference between * and foo-* is.
There was a problem hiding this comment.
The existing code isn't very nice here. The situation is that we have a map node for which the return type of the body graph is a portmapping. (I.e. multiple wires out.) This means that we need to be able to distinguish wildcards corresponding to each of these output wires. Hopefully the example typed_destructuring is helpful.
There was a problem hiding this comment.
Relatedly this is where TList comes in, which is another not very nice aspect of the existing code.
| ins = [x for x in ins if x[0] >= 0] # inputs at -1 already finished | ||
| return [x for x in ins if not storage.is_node_finished(loc.N(x[0]))] | ||
| ins = node.in_edges.values() | ||
| ins = [x for x in ins if isinstance(x, ValueRef)] # Only look an Values on Edges |
There was a problem hiding this comment.
I think this comment is both incorrect and needs expansion
| def read_node_def(self, node_location: Loc) -> NodeDef: | ||
| bs = self.read(self._nodedef_path(node_location)) | ||
| return NodeDefModel(**json.loads(bs)).root | ||
| return NodeDef.model_load_json(bs.decode()) |
There was a problem hiding this comment.
I wonder if it would be better if the rust data structures could read bytes directly instead so we don't need to do the decoding here, but I was copying the API from pydantic
052cf26 to
5283d5d
Compare
| @pytest.mark.parametrize( | ||
| ["node_location_str", "graph", "target"], | ||
| [ | ||
| ("-.N0", simple_eval(), Const(0, outputs=set(["value"]))), |
There was a problem hiding this comment.
I guess I should test the full definition as this was also looking at outputs before
| class NodeData(BaseModel): | ||
| """Internal storage class to store all necessary node information.""" | ||
|
|
||
| model_config = {"arbitrary_types_allowed": True} |
There was a problem hiding this comment.
It wouldn't be completely unreasonable to make NodeDef pydantic compatible, I'm unclear how much of an advantage there would be to doing so however as it's quite an error prone process
| // Used to validate the output of the schema | ||
| let any_schema = core_schema.call_method0("any_schema")?; | ||
|
|
||
| let serialization = core_schema.call_method0("to_string_ser_schema")?; |
There was a problem hiding this comment.
I probably could add an example to the pydantic schema here if we want Loc's to appear properly in the API docs. A bit of a nice to have though.
0edbaea to
94fab49
Compare
94fab49 to
05611b2
Compare
b798594 to
19c9e5e
Compare
There was a problem hiding this comment.
Thanks John, I can see you've done a lot of hard work here in particular introducing maturin etc. into what has been an all-python project. A few thoughts, mostly on the rust/python boundary and the new graph structure...
-
As you say, quite a big PR. I wonder if you can split it in two, or indeed perhaps even remove half of it. Specifically, can you remove the porting of
Locinto Rust? I think the new API is much nicer, but (a) you could implement that in python, as an orthogonal change; or (b) delay having a rust Loc until you really need it (i.e. some amount of controller code moves into Rust too).- The only use of
Locin Rust is forquery_node_descriptionwhich I reckon you could implement in python just querying the Rust graph structure "one level at a time". (Really I think theLocis kind of a concept of the controller, not the graph; the latter needs to deal only with individual NodeId + PortId's)
- The only use of
-
ExteriorRef. You hint at this in one of your own comments. I think in an actual graph, all NodeDefs would contain ValueRef's (not ExteriorRefs); what would it take to remove ExteriorRef?
- Ah oh no, you allow a graph to refer to itself via
def ref(), which is only valid because all the nodes in which the graph might be run have a graph input calledbody. Eeeek....can I say that again, eek? Ok - I think the thing to do here is to add a new input-less node, called say RecursiveGraphSelfReference (longer = more likely to alert the reader that something funny is going on....no ok, I jest, find a shorter name) that's a lot like Input i.e. produces the graph on its own output port. (Another advantage here is that the graph building does not rely on "all nodes that might run this graph must take it on an inport called body" - instead the actual graph is provided by the controller.) - In
new_eval_rootwhich creates aNodeDef::Evalcontaining a load ofExteriorRefs. So the thing here is that NodeDef is used both to represent a node in a graph (with its edges - not the usual idea of a graph being nodes + edges; really this is NodeAndInEdges rather than NodeRef), and also a description of a task (actual work to do). So (given the controller here is in python) I think a good thing to do would be to define a python struct for a task, which is basically a node that can be executed - has its inputs available - so actually this is just NodeRunData, but give it a map from PortId -> actual values/data (I think that's PortId -> (Loc,PortId)). There could be an easy constructor for NodeRunData from a NodeDef and a Loc which builds the input map by looking up the source, of each incoming edge in the NodeRef, in the parent Loc, asstartcurrently does; and thenstartuses the input in the NodeRunData. (I'm keeping NodeDef as a field of NodeRunData even though the inputs/in-edges of the NodeDef will not be used, because it's not worth duplicating the enum). Then new_eval_root builds a task with the input map being exterior Loc+PortId
...then, ok, you need ExteriorRef, but only in python as part of NodeRunData, not in Rust or in the graph.
- Ah oh no, you allow a graph to refer to itself via
-
As below, I think it'd be cleaner and have less risk of confusion if
NodeDefdidn't have both in_edges and inputs
| def test_pop_empty() -> None: | ||
| loc = Loc("") | ||
| with pytest.raises(TierkreisError): | ||
| with pytest.raises(ValueError): |
There was a problem hiding this comment.
fwiw, this is an example where I think ValueError is an improvement over TierkreisError. (Well, ok, ValueError is extremely broad, but so is the idea of "an error from anything in a particular software package")
|
|
||
| def test_only_one_output(): | ||
| with pytest.raises(TierkreisError): | ||
| with pytest.raises(ValueError): |
There was a problem hiding this comment.
Personally I prefer errors that reflect what the problem was - "MalformedGraphError" or "GraphTypingError", say, or "TierkreisRuntimeError" (/GraphExecutionError, say). I think it's probably more useful to be able to distinguish between kinds of error (even if (re)using existing python Error classes) rather than to switch all the errors from one type to another. (Ok, there is some value in switching everything from ValueError to TierkreisError because the latter is more distinct from ValueErrors from code that isn't tierkreis at all, but even that would be better done selectively.)
| raise TierkreisError(f"{type(node)} node must have parent Loc.") | ||
|
|
||
| ins = { | ||
| k: (parent.extend_from_ref(ref), ref.port_id) for k, ref in node.inputs.items() |
There was a problem hiding this comment.
If you used in_edges rather than inputs here, I think that would save you the work in loop and eval. You'd still have to do something fancy for map but I think just by copying from ins["body"]
|
|
||
| let actual_inputs: IndexSet<PortID> = | ||
| fixed_inputs.union(&provided_inputs).cloned().collect(); | ||
| self.graph_inputs |
There was a problem hiding this comment.
This looks like you pass in the non-fixed-inputs that you have; you get back the list of fixed-and-non-fixed inputs you still need. So if you move some of those into the provided inputs....then you might end up in the unimplemented. (Which I took to be a corner case, maybe it's not so much....and I'm not really sure what these fixed_inputs are anyway....)
There was a problem hiding this comment.
I think in the original python code it did nothing in this case, I can double check but I think I asked @mwpb at the time and he said this feature was never fully implemented.
There was a problem hiding this comment.
AFAIK the intention was to be able to "curry" nodes, but now I am more familiar with the codebase I think introducing something like a "thunk" that can be used for the "virtual" nodes of map and loop as well as for currying could make a lot of sense.
| raise NotImplementedError("GraphDataStorage is read only storage.") | ||
|
|
||
| def read_node_def(self, node_location: Loc) -> NodeDef: | ||
| def read_node_description(self, node_location: Loc) -> NodeDescription: |
There was a problem hiding this comment.
I don't think this is used? All the calls to read_node_description are to the storage layer (the one in protocol.py)?
There was a problem hiding this comment.
It's used pretty heavily in the visualiser for inspecting graphs that haven't run yet. I feel like the existence of the GraphDataStorage is probably a hack and we should be able to ask storage for a graph and query that rather than using this wrapping object.
There was a problem hiding this comment.
Ah sorry think I may have got confused. Nonetheless there are two identical-signature methods read_node_description...does the visualiser use both?
There was a problem hiding this comment.
Right, yes in a specific development mode it will use this GraphDataStorage instead:
| ins: dict[PortID, ValueRef | ExteriorRef] = { | ||
| k: ExteriorRef(k) for k in loop.inputs.keys() | ||
| } | ||
| # Update with the outputs of the previous iteration. |
There was a problem hiding this comment.
| # Update with the outputs of the previous iteration. | |
| # Replace loop variants with the outputs of the previous iteration. |
(Or loop-variant inputs/values, perhaps)
| inputs: inputs.clone(), | ||
| }, | ||
| outputs: outputs.clone(), | ||
| outer_graph: Some(const_graph.clone()), |
There was a problem hiding this comment.
Is that right? As per PR comment I think you should probably move this into python, but say I have a graph G containing a const C which is a graph containing a const D which is also a graph....and I query with a Loc that identifies a node inside D...I think I'll get back graph C (not graph D, the one containing the node in question). Is that what you want?
There was a problem hiding this comment.
Admittedly it is unclear to me why this behavior exists, I just copied over what it seemed to be doing previously. Possibly I've given it a bad name due to a misunderstanding?
I think my primary motivation here was just trying to understand the existing code a bit better with respect to exterior refs and the removal of the |
Agreed, this tripped me up quite a lot, it's a reasonably direct port of existing behavior but it would be good if we decide to use one of them and remove the other. |
I also have to think....do we actually need to |
836614c to
1e66992
Compare
|
Thanks all. Whilst I recognise a lot of the concepts being talked about here (and agree with quite a few of the proposed improvements), I'm getting confused as multiple things are moving at once. Does the following list of potential improvements look roughly correct?
If so then how would we prefer to group this work? |
7daf20f to
685e701
Compare
1e66992 to
55c1296
Compare
feat: Continue adding rust internals chore: Try to get the tests working chore: Try to get rust internals working chore: Get tests running chore: Remove extra print statements chore: Further debugging chore: Refactor rust code into modules chore: Fix a bug related to loop scoping chore: Fix many type errors and simplify API chore: More docs and type fixes chore: Fix up more code drift chore: fix more type issues chore: Fix up more of type issues
685e701 to
e8f1415
Compare
Seems reasonable, I'm not sure what order it should be done however. Hopefully the API I've added for them should make removal easier I believe.
Do you mean allowing fully resolved paths to appear in the controller? I think that would be a good idea and would tie into
I've dropped my usage of
I don't have as much of an opinion on this, I will need to re-read @acl-cqc 's comments
I'm not sure I understand this well enough to comment still
I think the main improvements would be the ones we've discussed a bit on slack to do with making invalid Locs unrepresentatable, I think it shouldn't be too bad to do, but could interfere with the string representation of them.
I think my PR should now treat GraphData as immutable which should make this a bit easier.
My suspicion now is that this is probably a GraphBuilder/GraphData distinction and when someone tries to produce GraphData from a GraphBuilder we should error if there is no Output node.
Seems reasonable
Unclear on this one |
| } => graph, | ||
| _ => { | ||
| return Err(PyValueError::new_err( | ||
| "Const node connected to body port does not contain a graph", |
There was a problem hiding this comment.
In this case it's not necessarily a Const node, right? (so any other node producing a graph, e.g. if, connected to an eval)
|
|
||
| /// Query a NodeDescription from a Loc (which describes a location on the graph.) | ||
| /// | ||
| /// Useful for visualisation and debugging. |
There was a problem hiding this comment.
Yah ok, so this is used only for "graphdata storage" (readonly), not if we are using filestorage. (Hence it can't return nodes within dynamically constructed graphs, because they aren't stored.)
Good to have. Could be done in python, but since we're not trying to integrate with filestorage, ok to have it here too.
No description provided.