Add config object to keep config in sync at all times #704

brynpickering · 2024-11-06T23:51:45Z

Fixes #626
Partially fixes #619

Thought I'd already upload this so you can contribute to the attempt @irm-codebase. Tests haven't been cleaned up so I expect many will fail.

I'm quite liking pydantic @sjpfenninger. I know we questioned using it some time ago, but now that I've spent more time with it I do wonder whether it might make other parts of the code and input validation cleaner...

Summary of changes in this pull request

Config is a pydantic model, replacing the config schema (we can dump a yaml schema at any time, though!)
Config repr hides operate/spores options data if those modes aren't activated
For debugging, pydantic methods can be used to hide defaults (model.config.model_dump(exclude_defaults=True))
build and solve steps have isolated configs that account for ad-hoc kwargs, which are stored in the config class as applied_keyword_overrides (might want something snappier)
operate_[...] and spores_[...] config options have returned to being sub-dicts (build.operate.[...] and build.spores.[...]) so the options can be easily isolated and passed around as necessary
config options are "frozen" unless using the update method, which returns an updated config object, but keeps the base config object unchanged except for content in applied_keyword_overrides. So you can't change config options accidentally (e.g. model.config.init.name = "new_name" won't work).
Intellisense picks up the config option docstrings which is useful when doing development and probably also for users writing scripts!
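
To make the "frozen unless you call update" behaviour concrete, here is a minimal sketch. It uses frozen stdlib dataclasses as a stand-in for the pydantic models in this PR, so it is not the actual implementation; the class names `Init`, `Build`, and `Config` and the `backend` default are illustrative assumptions.

```python
from __future__ import annotations

from dataclasses import FrozenInstanceError, dataclass, field, replace
from typing import Any


@dataclass(frozen=True)
class Init:
    # Hypothetical stand-in for the `init` config section.
    name: str | None = None


@dataclass(frozen=True)
class Build:
    # Hypothetical stand-in for the `build` config section.
    mode: str = "plan"
    backend: str = "pyomo"


@dataclass(frozen=True)
class Config:
    init: Init = field(default_factory=Init)
    build: Build = field(default_factory=Build)

    def update(self, overrides: dict[str, dict[str, Any]]) -> Config:
        """Return a new Config with `overrides` applied; `self` is left untouched."""
        new_sections = {
            section: replace(getattr(self, section), **values)
            for section, values in overrides.items()
        }
        return replace(self, **new_sections)


base = Config()

# Ad-hoc kwargs produce an isolated build config without touching `base`.
build_config = base.update({"build": {"mode": "operate"}}).build

# Direct assignment on a frozen object is rejected.
try:
    base.init.name = "new_name"
except FrozenInstanceError:
    assignment_rejected = True
```

Unlike the PR, this sketch does not record the overrides anywhere (there is no equivalent of applied_keyword_overrides), and it does no type validation of the new values.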

Reviewer checklist

Test(s) added to cover contribution
Documentation updated
Changelog updated
Coverage maintained or improved


irm-codebase · 2024-11-12T10:35:24Z

@brynpickering this sounds really nice. I'll give a thorough look at it.

One thing, based on your description of the additions (and before I check the code): perhaps we should alter things so that the mode extends the configuration, rather than hiding it? This would avoid extra work on our end down the line, and move the code of these modes towards a 'plug-in' approach.
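
A rough sketch of what "the mode extends the configuration" could look like, using plain dataclass inheritance as a stand-in for pydantic model inheritance. The option names (`ensure_feasibility`, `window`, `horizon`) and the `MODE_CONFIGS` registry are assumptions made for illustration, not taken from this PR.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BuildBase:
    # Options every mode shares (hypothetical names).
    backend: str = "pyomo"
    ensure_feasibility: bool = False


@dataclass(frozen=True)
class BuildOperate(BuildBase):
    # Options that only exist once operate mode is chosen (hypothetical names).
    window: str = "24h"
    horizon: str = "48h"


# A 'plug-in' would register its own config class here.
MODE_CONFIGS = {"plan": BuildBase, "operate": BuildOperate}


def make_build_config(mode: str, **options):
    """Build the config class that belongs to `mode`, passing user options through."""
    return MODE_CONFIGS[mode](**options)


plan = make_build_config("plan")
operate = make_build_config("operate", window="12h")
```

With this layout the plan-mode config has no operate options at all, so there is nothing to hide in the repr, and passing an operate-only option in plan mode fails at construction.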

I'll share more thoughts or suggestions in a while.

irm-codebase

This is a comment, for now, since the code is not 100% ready.

Generally, I like this proposal.

Some of the positives I see.

The capacity to inherit validation models could do wonders to streamline our code. In particular, it could enable us to make non-standard modes 'plug-ins' that people can choose to use.
With a bit of context, pydantic makes the configuration quite easy to follow. The intellisense suggestions are very nice too.

Some concerns, though:

I see this approach as a duplicate of yaml schemas. We should only use one. Keeping both only makes the code harder to understand, imo.
I do not think this really solves #626 (Invalid configuration showing for un-activated modes due to schema defaults). The configuration is still 'tangled'. But it does provide a way to solve it.
More of an open question than a concern: would pydantic help in our efforts to make parameters and dimensions part of the input yaml files?

irm-codebase · 2024-11-12T17:05:17Z

src/calliope/backend/__init__.py

+    match build_config.backend:
        case "pyomo":
-            return PyomoBackendModel(data, math, **kwargs)
+            return PyomoBackendModel(data, math, build_config)
        case "gurobi":
-            return GurobiBackendModel(data, math, **kwargs)
+            return GurobiBackendModel(data, math, build_config)
        case _:
-            raise BackendError(f"Incorrect backend '{name}' requested.")
+            raise BackendError(f"Incorrect backend '{build_config.backend}' requested.")


This is an area where the old approach and the new pydantic may be at odds.
Case _ is spurious, since pydantic should catch wrong settings beforehand, no?

irm-codebase · 2024-11-12T17:11:16Z

src/calliope/backend/backend_model.py

+from calliope import config, exceptions
 from calliope.attrdict import AttrDict
 from calliope.backend import helper_functions, parsing
 from calliope.exceptions import warn as model_warn
 from calliope.io import load_config
 from calliope.preprocess.model_math import ORDERED_COMPONENTS_T, CalliopeMath
-from calliope.util.schema import (
-    MODEL_SCHEMA,
-    extract_from_schema,
-    update_then_validate_config,
-)
+from calliope.util.schema import MODEL_SCHEMA, extract_from_schema


An issue here is that config.py and config/**.yaml files are at odds, since both provide similar functionality. Do we expect pydantic to replace our approach completely?

irm-codebase · 2024-11-12T17:23:20Z

src/calliope/backend/where_parser.py

 if TYPE_CHECKING:
+    from calliope import config
    from calliope.backend.backend_model import BackendModel


The need for TYPE_CHECKING may indicate that config.py is not being placed sensibly (a possible cyclic import?). Would it make sense to move this into src/calliope/config/? The dependencies of config.py do not seem to conflict with anything else, and this would make things easier to maintain.
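
For context, a minimal sketch of what the TYPE_CHECKING guard does, using a stdlib module in place of calliope.config. The function `halve` is made up for the example.

```python
from __future__ import annotations

import fractions
from typing import TYPE_CHECKING

if TYPE_CHECKING:
    # Evaluated only by static type checkers. At runtime this branch is
    # skipped, so the import cannot take part in an import cycle.
    from fractions import Fraction


def halve(value: Fraction) -> Fraction:
    # The annotation is never evaluated at runtime because of the
    # `from __future__ import annotations` line above.
    return value / 2


result = halve(fractions.Fraction(1, 2))
```

The guard is harmless when the name is only needed for annotations, but if it is there to work around a cycle, relocating the module would remove the cause rather than the symptom.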

irm-codebase · 2024-11-12T17:37:10Z

src/calliope/config.py

+    @model_validator(mode="before")
+    @classmethod
+    def update_solve_mode(cls, data):
+        """Solve mode should match build mode."""
+        data["solve"]["mode"] = data["build"]["mode"]
+        return data


This seems to do nothing?
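
For reference, this is one reading of what the hook does, written as a plain function on the raw input dict (which is what a mode="before" validator receives, before any field validation). The sample dicts are made up.

```python
def update_solve_mode(data: dict) -> dict:
    """Copy the build mode into the solve section of the raw config dict."""
    data["solve"]["mode"] = data["build"]["mode"]
    return data


# The solve mode is overwritten by whatever the build mode says.
raw = {"build": {"mode": "operate"}, "solve": {"mode": "plan"}}
synced = update_solve_mode(raw)

# If a section is left out (relying on defaults), the plain-dict version fails.
try:
    update_solve_mode({"build": {"mode": "operate"}})
    missing_section_ok = True
except KeyError:
    missing_section_ok = False
```

So it has an effect only when both sections and the build mode are present in the input; whether that is always the case here depends on how the config is constructed elsewhere in the PR.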

irm-codebase · 2024-11-13T12:18:06Z

src/calliope/config.py

+    @overload
+    def model_yaml_schema(self, filepath: str | Path) -> None: ...
+
+    @overload
+    def model_yaml_schema(self, filepath: None = None) -> str: ...
+
+    def model_yaml_schema(self, filepath: str | Path | None = None) -> None | str:
+        """Generate a YAML schema for the class.
+
+        Args:
+            filepath (str | Path | None, optional): If given, save schema to given path. Defaults to None.
+
+        Returns:
+            None | str: If `filepath` is given, returns None. Otherwise, returns the YAML string.
+        """
+        # By default, the schema uses $ref/$def cross-referencing for each pydantic model class,
+        # but this isn't very readable when rendered in our documentation.
+        # So, we resolve references and then delete all the `$defs`
+        schema_dict = AttrDict(jsonref.replace_refs(self.model_json_schema()))
+        schema_dict.del_key("$defs")
+        return schema_dict.to_yaml(filepath)


'Nice to have' comment: the need for these overloads stems from to_yaml wanting to be two functions:

One saves a YAML schema file

The other returns a YAML string

An easy fix to make our code leaner would be to split this into two simpler functions: first one generates the string (to_yaml), the second uses the first to save it (save_yaml).

The code in config.py seems to only want the first functionality, generally.
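
A sketch of the split, assuming hypothetical names `schema_to_string` and `save_schema`, and using json from the stdlib as a stand-in serialiser so the example runs without a YAML dependency.

```python
from __future__ import annotations

import json
import tempfile
from pathlib import Path


def schema_to_string(schema: dict) -> str:
    """Serialise the schema; always returns a string."""
    return json.dumps(schema, indent=2, sort_keys=True)


def save_schema(schema: dict, filepath: str | Path) -> None:
    """Write the serialised schema to `filepath`; always returns None."""
    Path(filepath).write_text(schema_to_string(schema), encoding="utf-8")


schema = {"type": "object", "properties": {"mode": {"type": "string"}}}
text = schema_to_string(schema)

with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "schema.json"
    save_schema(schema, target)
    saved = target.read_text(encoding="utf-8")
```

Each function then has a single return type, so neither needs @overload.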

irm-codebase · 2024-11-13T13:28:19Z

src/calliope/model.py

        if "applied_math" in model_data.attrs:
            self.applied_math = preprocess.CalliopeMath.from_dict(
                model_data.attrs.pop("applied_math")
            )
+        if "config" in model_data.attrs:
+            self.config = config.CalliopeConfig(**model_data.attrs.pop("config"))
+            self.config.update(model_data.attrs.pop("config_kwarg_overrides"))


Can you detail why updating with the overrides is necessary?
Wouldn't this cause ambiguity in what the pre-existing results relate to?

irm-codebase · 2024-11-13T13:30:16Z

src/calliope/model.py

+        this_build_config = self.config.update({"build": kwargs}).build
+        mode = this_build_config.mode


Kind of an odd name...
this_build_config -> new_build_config, or just build_config???

irm-codebase · 2024-11-13T13:34:30Z

src/calliope/model.py

+
+        this_solve_config = self.config.update({"solve": kwargs}).solve


Same comment as above: this naming feels a bit... off?
solve_config is perfectly fine, IMO

irm-codebase · 2024-11-13T13:37:34Z

src/calliope/model.py

    def _prepare_operate_mode_inputs(
-        self, start_window_idx: int = 0, **config_kwargs
+        self, operate_config: config.BuildOperate
    ) -> xr.Dataset:
        """Slice the input data to just the length of operate mode time horizon.


Following up on my earlier comments about moving towards configuration setups that do not 'tangle': this is the kind of function I'd hope moves into a different file or plug-in, which is only possible if the most basic configuration is not 'tainted' by a plethora of modes.

Ditto for all the other if mode == 'operate' cases we have lying around.

irm-codebase · 2024-11-13T13:40:43Z

src/calliope/preprocess/data_tables.py

        table_name: str,
        data_table: DataTableDict,
        data_table_dfs: dict[str, pd.DataFrame] | None = None,
-        model_definition_path: Path | None = None,
+        model_definition_path: Path = Path("."),


Given that we always pass model_definition_path, I'd remove the default entirely. It does not seem to be 'optional'.

Less complexity / bugs down the line.

brynpickering added 3 commits October 30, 2024 22:02

Update to using pydantic for config

fe421a5

Update config to have operate and spores as subdicts; fix use of config throughout src

1f96c47

Minor cleanup

4f81684
