Aggressive boundary serialization is surprising in practice #78

Description

@elinscott

I was surprised by some of the serialization that PyFunctions perform at the task boundary, and I think it's probably overly aggressive.

For example, a @task PyFunction declared as def f(structure: StructureData) does not receive a StructureData at runtime. aiida_pythonjob/data/deserializer.py::structure_data_to_atoms is registered as the default deserializer, so the body actually sees an ase.Atoms. Common AiiDA patterns then fail, for example:

  • family.get_pseudos(structure=structure) raises a type-validation error.
  • for site in structure.sites: raises AttributeError.

The function signature says StructureData, which totally misrepresents what the body actually receives. I can understand serializers that take orm.Int to int, since the two behave the same way. But StructureData to ase.Atoms?
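To make the mismatch concrete, here is a minimal sketch that needs neither AiiDA nor ASE. The StructureData and Atoms classes, the DEFAULT_DESERIALIZERS dict, and run_task are hypothetical stand-ins. They are not the real aiida-pythonjob API and only mimic the behaviour described above, as far as I understand it.

```python
from dataclasses import dataclass, field


@dataclass
class StructureData:
    """Stand-in for the AiiDA structure node (hypothetical, not the real class)."""
    sites: list = field(default_factory=list)


@dataclass
class Atoms:
    """Stand-in for ase.Atoms (hypothetical); note it has no `sites` attribute."""
    symbols: list = field(default_factory=list)


def structure_data_to_atoms(node: StructureData) -> Atoms:
    # Toy conversion; the real deserializer does considerably more.
    return Atoms(symbols=list(node.sites))


# Assumed shape of a default-deserializer registry, keyed by input type.
DEFAULT_DESERIALIZERS = {StructureData: structure_data_to_atoms}


def run_task(func, **inputs):
    """Convert each input with its registered deserializer, then call func."""
    converted = {
        name: DEFAULT_DESERIALIZERS.get(type(value), lambda v: v)(value)
        for name, value in inputs.items()
    }
    return func(**converted)


def f(structure: StructureData):
    # The annotation promises StructureData; report what actually arrives.
    return type(structure).__name__, hasattr(structure, "sites")


received_type, has_sites = run_task(f, structure=StructureData(sites=["Si", "Si"]))
print(received_type, has_sites)
```

In this toy version the body of f sees the converted object, so anything written against the annotated type (such as iterating over structure.sites) breaks.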

As a second example, aiida_pseudo.data.pseudo.upf.UpfData has no registered deserializer and no .value. Passing a dict[str, UpfData] into a @task PyFunction raises ValueError: Cannot deserialize AiiDA data of type ...

The result is that every AiiDA Data type that isn't in builtin_serializers requires explicit per-task deserializers plumbing, even when the natural intent (simply carrying the node through) is, in my opinion, the more intuitive approach.
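As a rough sketch of what "carry the node through" could look like, compare a strict lookup with a pass-through fallback. Everything here is a hypothetical stand-in (Data, Int, UpfData, REGISTRY, deserialize_strict, deserialize_passthrough). The lookup order of registry first, then .value, is my assumption about the current logic, not a copy of the actual aiida-pythonjob code.

```python
class Data:
    """Stand-in base class for AiiDA data nodes (hypothetical)."""


class Int(Data):
    def __init__(self, value: int):
        self.value = value


class UpfData(Data):
    """Stand-in pseudopotential node: no registered deserializer, no `.value`."""

    def __init__(self, filename: str):
        self.filename = filename


# Assumed registry of type-specific deserializers; empty in this toy example.
REGISTRY = {}


def deserialize_strict(node):
    """Assumed current behaviour: fail when nothing knows how to convert."""
    if type(node) in REGISTRY:
        return REGISTRY[type(node)](node)
    if hasattr(node, "value"):
        return node.value
    raise ValueError(f"Cannot deserialize AiiDA data of type {type(node).__name__}")


def deserialize_passthrough(node):
    """Suggested behaviour: hand the node over unchanged when no converter exists."""
    try:
        return deserialize_strict(node)
    except ValueError:
        return node


pseudos = {"Si": UpfData("Si.upf")}

# Pass-through keeps each UpfData node as-is inside the dict.
carried = {kind: deserialize_passthrough(node) for kind, node in pseudos.items()}
```

With the pass-through variant, simple wrappers like Int still unwrap to plain Python values, while node types with no converter reach the function body untouched.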
