Add config object to keep config in sync at all times #704
base: main
Conversation
@brynpickering this sounds really nice. I'll give it a thorough look. One thing, based on your description of the additions (and before I check the code): perhaps we should alter things so that the mode extends the configuration, rather than hiding it? This would avoid extra work on our end down the line, and move the code of these modes towards a 'plug-in' approach. I'll share more thoughts or suggestions in a while.
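The 'extend rather than hide' idea could look something like this sketch, where a mode is a subclass of the core build config rather than a nested section. All class and field names here are hypothetical, not taken from the PR:

```python
from pydantic import BaseModel


class BuildConfig(BaseModel):
    """Core build options shared by all modes (hypothetical fields)."""

    backend: str = "pyomo"
    mode: str = "plan"


class OperateBuildConfig(BuildConfig):
    """A mode 'plug-in' extends the core config instead of nesting inside it."""

    mode: str = "operate"
    window: str = "24h"
    horizon: str = "48h"


# Core options remain available; mode-specific ones are only defined here.
cfg = OperateBuildConfig(backend="gurobi")
```

With this shape, third-party modes could ship their own subclass without the core schema knowing about them.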
This is a comment, for now, since the code is not 100% ready.
Generally, I like this proposal.
Some of the positives I see:
- The capacity to inherit validation models could do wonders to streamline our code. In particular, it could enable us to make non-standard modes 'plug-ins' that people can choose to use.
- With a bit of context, `pydantic` makes the configuration quite easy to follow. Its IntelliSense suggestions are very nice too.
Some concerns, though:
- I see this approach as a duplicate of YAML schemas. We should only use one. Keeping both only makes the code harder to understand, imo.
- I do not think this really solves #626 (invalid configuration showing for un-activated modes due to schema defaults). The configuration is still 'tangled'. But it does provide a way to solve it.
- More of an open question than a concern: would `pydantic` help in our efforts to make parameters and dimensions part of the input YAML files?
```diff
 match build_config.backend:
     case "pyomo":
-        return PyomoBackendModel(data, math, **kwargs)
+        return PyomoBackendModel(data, math, build_config)
     case "gurobi":
-        return GurobiBackendModel(data, math, **kwargs)
+        return GurobiBackendModel(data, math, build_config)
     case _:
-        raise BackendError(f"Incorrect backend '{name}' requested.")
+        raise BackendError(f"Incorrect backend '{build_config.backend}' requested.")
```
This is an area where the old approach and the new `pydantic` one may be at odds. `case _` is spurious, since pydantic should catch wrong settings beforehand, no?
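If the `backend` field were declared as a `Literal` (an assumption on my part; the PR may well use a plain `str`), pydantic would indeed reject unknown backends before the `match` statement is ever reached:

```python
from typing import Literal

from pydantic import BaseModel, ValidationError


class BuildConfig(BaseModel):
    """Hypothetical sketch: with a Literal field, pydantic validates
    the backend name at config-construction time."""

    backend: Literal["pyomo", "gurobi"] = "pyomo"


try:
    BuildConfig(backend="cplex")  # not a permitted value
    caught = False
except ValidationError:
    caught = True
```

If the field stays a plain `str`, though, `case _` is still the only guard against typos.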
```diff
 from calliope import config, exceptions
 from calliope.attrdict import AttrDict
 from calliope.backend import helper_functions, parsing
 from calliope.exceptions import warn as model_warn
 from calliope.io import load_config
 from calliope.preprocess.model_math import ORDERED_COMPONENTS_T, CalliopeMath
-from calliope.util.schema import (
-    MODEL_SCHEMA,
-    extract_from_schema,
-    update_then_validate_config,
-)
+from calliope.util.schema import MODEL_SCHEMA, extract_from_schema
```
An issue here is that `config.py` and the `config/**.yaml` files are at odds, since both provide similar functionality. Do we expect `pydantic` to replace our approach completely?
```python
if TYPE_CHECKING:
    from calliope import config
    from calliope.backend.backend_model import BackendModel
```
The need for `TYPE_CHECKING` may indicate that `config.py` is not placed sensibly (a possible cyclic import?). Would it make sense to move it into `src/calliope/config/`? The dependencies of `config.py` do not seem to conflict with anything else, and this would make things easier to maintain.
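For reference, the `TYPE_CHECKING` pattern under discussion breaks a runtime import cycle by importing a name only for the type checker. A minimal sketch (the `calliope` import is illustrative and is never executed at runtime, which is exactly the point):

```python
from __future__ import annotations

from typing import TYPE_CHECKING

if TYPE_CHECKING:
    # Only evaluated by static type checkers, so no circular import at runtime.
    from calliope import config  # hypothetical cycle participant


def describe(build_config: config.CalliopeConfig) -> str:
    """The annotation resolves lazily; runtime never touches `config`."""
    return type(build_config).__name__
```

It works, but needing it is often a hint that the module boundary could be redrawn instead.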
```python
@model_validator(mode="before")
@classmethod
def update_solve_mode(cls, data):
    """Solve mode should match build mode."""
    data["solve"]["mode"] = data["build"]["mode"]
    return data
```
This seems to do nothing?
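For what it's worth, a standalone reduction (field names invented, validator body copied from the hunk above) suggests the `mode="before"` validator does copy the build mode across when both sections arrive as plain dicts, though it raises `KeyError` when either key is absent, which may be what makes it look inert in practice:

```python
from pydantic import BaseModel, model_validator


class Build(BaseModel):
    mode: str = "plan"


class Solve(BaseModel):
    mode: str = "plan"


class FullConfig(BaseModel):
    build: Build = Build()
    solve: Solve = Solve()

    @model_validator(mode="before")
    @classmethod
    def update_solve_mode(cls, data):
        """Solve mode should match build mode."""
        data["solve"]["mode"] = data["build"]["mode"]
        return data


# Raw dict input: the before-validator sees and mutates it pre-validation.
cfg = FullConfig(**{"build": {"mode": "operate"}, "solve": {}})
```

Omitting the `"solve"` key entirely (e.g. `FullConfig(**{"build": {"mode": "operate"}})`) would fail with `KeyError` rather than silently doing nothing.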
```python
@overload
def model_yaml_schema(self, filepath: str | Path) -> None: ...

@overload
def model_yaml_schema(self, filepath: None = None) -> str: ...

def model_yaml_schema(self, filepath: str | Path | None = None) -> None | str:
    """Generate a YAML schema for the class.

    Args:
        filepath (str | Path | None, optional): If given, save schema to given path. Defaults to None.

    Returns:
        None | str: If `filepath` is given, returns None. Otherwise, returns the YAML string.
    """
    # By default, the schema uses $ref/$def cross-referencing for each pydantic model class,
    # but this isn't very readable when rendered in our documentation.
    # So, we resolve references and then delete all the `$defs`
    schema_dict = AttrDict(jsonref.replace_refs(self.model_json_schema()))
    schema_dict.del_key("$defs")
    return schema_dict.to_yaml(filepath)
```
'Nice to have' comment: the need for these overloads stems from `to_yaml` wanting to be two functions:
- one saves a YAML schema file
- the other returns a YAML string
An easy fix to make our code leaner would be to split this into two simpler functions: the first generates the string (`to_yaml`), the second uses the first to save it (`save_yaml`).
The code in `config.py` seems to only want the first functionality, generally.
```python
if "applied_math" in model_data.attrs:
    self.applied_math = preprocess.CalliopeMath.from_dict(
        model_data.attrs.pop("applied_math")
    )
if "config" in model_data.attrs:
    self.config = config.CalliopeConfig(**model_data.attrs.pop("config"))
    self.config.update(model_data.attrs.pop("config_kwarg_overrides"))
```
Can you detail why updating with the overrides is necessary?
Wouldn't this cause ambiguity in what the pre-existing results relate to?
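For context, my reading of the `update` semantics described in this PR (a guess at the behaviour, not the actual implementation) is that it re-validates into a fresh object and leaves the base config untouched, which is what makes the override question above matter:

```python
from pydantic import BaseModel


class Init(BaseModel):
    name: str = "model"


class ModelConfig(BaseModel):
    """Hypothetical sketch of an immutable-style update."""

    init: Init = Init()

    def update(self, overrides: dict) -> "ModelConfig":
        """Return a new, re-validated config; `self` is left unchanged."""
        merged = self.model_dump()
        for section, values in overrides.items():
            merged[section] = {**merged[section], **values}
        return ModelConfig(**merged)


base = ModelConfig()
updated = base.update({"init": {"name": "scenario_a"}})
```

Under those semantics, re-applying stored overrides on load would change what the config reports relative to the results already in the dataset, hence the ambiguity question.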
```python
this_build_config = self.config.update({"build": kwargs}).build
mode = this_build_config.mode
```
Kind of an odd name... `this_build_config` -> `new_build_config`, or just `build_config`???
```python
this_solve_config = self.config.update({"solve": kwargs}).solve
```
Same comment as above: this naming feels a bit... off? `solve_config` is perfectly fine, IMO.
```diff
 def _prepare_operate_mode_inputs(
-    self, start_window_idx: int = 0, **config_kwargs
+    self, operate_config: config.BuildOperate
 ) -> xr.Dataset:
     """Slice the input data to just the length of operate mode time horizon.
```
As an additional comment to my remarks on moving towards configuration setups that do not 'tangle': this is the kind of function I'd hope moves into a different file or plug-in, which is only possible if the most basic configuration is not 'tainted' by a plethora of modes.
Ditto for all the other `if mode == 'operate'` cases we have lying around.
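One way to untangle the scattered `if mode == 'operate'` branches would be a small registry, sketched here with hypothetical names; the core code then stays mode-agnostic and a mode plug-in registers its own input-preparation step:

```python
from typing import Callable

# Registry mapping a mode name to its input-preparation hook.
MODE_PREPARERS: dict[str, Callable[[dict], dict]] = {}


def register_mode(name: str):
    """Decorator that a mode plug-in uses to register itself."""

    def decorator(func: Callable[[dict], dict]) -> Callable[[dict], dict]:
        MODE_PREPARERS[name] = func
        return func

    return decorator


@register_mode("operate")
def prepare_operate_inputs(inputs: dict) -> dict:
    """Hypothetical stand-in for `_prepare_operate_mode_inputs`."""
    return {**inputs, "windowed": True}


def prepare_inputs(mode: str, inputs: dict) -> dict:
    # Core code never names specific modes; unknown modes pass through.
    return MODE_PREPARERS.get(mode, lambda x: x)(inputs)
```

Deleting the `operate` module would then remove its branch everywhere at once, instead of leaving dead `if mode == 'operate'` checks behind.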
```diff
     table_name: str,
     data_table: DataTableDict,
     data_table_dfs: dict[str, pd.DataFrame] | None = None,
-    model_definition_path: Path | None = None,
+    model_definition_path: Path = Path("."),
```
Given that we always pass `model_definition_path`, I'd remove the default entirely. It does not seem to be 'optional'. Less complexity / fewer bugs down the line.
Fixes #626
Partially fixes #619
Thought I'd already upload this so you can contribute to the attempt @irm-codebase. Tests haven't been cleaned up, so I expect many will fail.
I'm quite liking pydantic, @sjpfenninger. I know we questioned using it some time ago, but now that I've spent more time with it I do wonder whether it might make other parts of the code and input validation cleaner...
Summary of changes in this pull request
- Config content can be dumped with `model.config.model_dump(exclude_defaults=True)`
- `build` and `solve` steps have isolated configs that account for ad-hoc `kwargs`, which are stored in the config class as `applied_keyword_overrides` (might want something snappier)
- `operate_[...]` and `spores_[...]` config options have returned to being sub-dicts (`build.operate.[...]` and `build.spores.[...]`) so the options can be easily isolated and passed around as necessary
- The config has an `update` method, which returns an updated config object but keeps the base config object unchanged, except for content in `applied_keyword_overrides`. So you can't change config options accidentally (e.g. `model.config.init.name = "new_name"` won't work).
Reviewer checklist