Conversation

@bmaranville
Member

These versions of FunctionalProfile and FunctionalMagnetism differ from the original:

  • if the classes are initialized with a profile function, the source is read with inspect.getsource.
  • if the classes are initialized with a profile_source string, the string is executed in a context with scipy and np defined, and the resulting function is pulled out. (The function is expected to be the only name defined in the locals context when the exec completes, so the string should contain a single function definition and nothing else.)
  • profile functions can only use basic math operators and anything in the np and scipy namespaces (these are embedded in the execution context), e.g.
def blobby(z, a, b):
    return scipy.special.erf(b) * np.sin(a * z)
  • parameters for the function are no longer added to the class as attributes
  • instead, they are added to a dict attribute, FunctionalProfile.profile_params
  • new attributes are added, each a Parameter object with a Calculation object in its slot attribute:
    • rho_start
    • irho_start
    • rho_end
    • irho_end
  • the start and end attributes are now properties that create an SLD object, e.g. start == SLD(rho=rho_start, irho=irho_start)
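To make the profile_source mechanism concrete, here is a minimal sketch of the exec-based loading described above. The helper name function_from_source is hypothetical (not the actual class method); it just shows the idea of executing the string with np and scipy pre-defined and pulling out the single function it defines.

```python
import numpy as np
import scipy.special

def function_from_source(profile_source: str):
    # Hypothetical helper: exec the source string with np and scipy
    # available, then pull out the single function it defines.
    context = {"np": np, "scipy": scipy}
    local_context: dict = {}
    exec(profile_source, context, local_context)
    if len(local_context) != 1:
        raise ValueError("profile_source must define exactly one function")
    (fn,) = local_context.values()
    return fn

source = """
def blobby(z, a, b):
    return scipy.special.erf(b) * np.sin(a * z)
"""
blobby = function_from_source(source)
```

Inside the exec'd function, name lookups for np and scipy resolve in the globals dict passed to exec, which is why the profile function can use those namespaces without importing them itself.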

For FunctionalMagnetism, a further difference is that the total thickness is stored as an expression on the class. You can set it with a set_anchor function as before (that function now constructs a thickness expression), or you can pass in an existing Parameter or Expression, e.g. layer[3].thickness + layer[4].thickness
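As a toy illustration of the idea (not refl1d's actual Parameter/Expression classes), the point of storing an expression rather than a number is that the total thickness keeps tracking the underlying layer parameters:

```python
# Toy sketch: an expression remembers its operand parameters,
# so the total thickness reflects later changes to the layers.

class Parameter:
    def __init__(self, name, value):
        self.name, self.value = name, value
    def __add__(self, other):
        return Expression(self, other)

class Expression:
    def __init__(self, a, b):
        self.a, self.b = a, b
    @property
    def value(self):
        # Recomputed on access, not a snapshot taken at construction.
        return self.a.value + self.b.value

t3 = Parameter("layer3.thickness", 20.0)
t4 = Parameter("layer4.thickness", 35.0)
total = t3 + t4          # an expression, not a fixed number
t4.value = 40.0
# total.value now reflects the update (60.0)
```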

@bmaranville
Member Author

Given that these changes will probably break the way that e.g. molgroups use FunctionalProfile, we should probably offer them as a new type rather than replacing the existing classes.

@acaruana2009
Contributor

Given that these changes will probably break the way that e.g. molgroups use FunctionalProfile, we should probably offer them as a new type rather than replacing the existing classes.

I am happy with this. How would it work in principle? Would we just have a different name for the class/type?

@pkienzle
Member

Not loving the increased verbosity of layer.profile_params["period"] but I don't have a counterproposal. The class already has a parameters attribute, so simplifying it to pars or params would be confusing.

Does the existing layer.period use too much behind-the-scenes magic? Do the dynamic attributes interfere with dataclass serialization?

Maybe have a profile_parameters attribute to support serialization/deserialization and document it as an internal parameter that should be ignored in the manual.

start/end as properties could be problematic since they create a new SLD object each time. That is, layer.start == layer.start will be false. I'm not sure this matters, though, since the underlying parameters are shared.

Is there a case for end users providing {rho,irho}_{start,end} parameters when creating a profile? Or are these also noise required for dataclass deserialization?

@bmaranville
Member Author

Actually, the way dataclass equality comparison works the result is

(layer.start == layer.start) is True

(it compares attributes, and they're all equal)
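This is standard dataclass behavior: the generated __eq__ compares the declared fields, not object identity, so two freshly constructed SLD objects with the same parameter values compare equal. A minimal sketch (using a stand-in SLD, not the real class):

```python
from dataclasses import dataclass

@dataclass
class SLD:
    rho: float = 0.0
    irho: float = 0.0

# Distinct objects, equal by field values: dataclass __eq__
# compares fields, not identity.
a, b = SLD(rho=2.07), SLD(rho=2.07)
assert a == b and a is not b
```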

The dynamic attributes don't specifically interfere with serialization, but add maintenance overhead for the developers (weird bugs!) and cognitive overhead for the users (not a common pattern in python anymore, and full of footguns).

The rho_start etc. are by default None, and I agree most users won't initialize them directly.

@pkienzle
Member

This doesn't capture the variables from the script used inside the function.

Here's some code to run in a jupyter cell to explore the space:

import numpy as np
from dataclasses import dataclass
from typing import Any
@dataclass
class CapturedFunction:
    name: str
    source: str
    context: dict[str, Any]
    # Custom __init__ (instead of the generated one) so the compiled-function
    # cache _fn can be set up without being a dataclass field; the generated
    # __eq__/__repr__ still use only name, source, and context.
    def __init__(self, name, source, context):
        self.name = name
        self.source = source
        self.context = context
        self._fn = None

    @property
    def fn(self):
        if self._fn is None:
            import numba
            global_context = {
                'np': np,
                'njit': numba.njit,
                'prange': numba.prange,
                **self.context,
            }
            local_context = {}
            #print(f"compiling <{self.source}> in", self.context)
            exec(self.source, global_context, local_context)
            #print("local context", local_context)
            self._fn = local_context[self.name]
        return self._fn

    def __call__(self, *args, **kw):
        return self.fn(*args, **kw)
    
def capture_source(fn):
    if isinstance(fn, CapturedFunction):
        return fn
    
    import inspect
    from textwrap import dedent
    #from copy import deepcopy
    #import bumps.pmath
    #whitelist = bumps.pmath.__all__
    base_context = {'np': np} #{k: getattr(np, k) for k in whitelist}

    name = fn.__name__
    print("type fn", type(fn))
    source = dedent(inspect.getsource(fn))
    print("source =>", source)
    context = {}
    try:
        closure = inspect.getclosurevars(fn)
        print("closure =>", closure)
        # build context from closure
        # ignoring builtins and unbound for now
        for k, v in closure.globals.items():
            if k not in base_context:
                context[k] = v # TODO: deepcopy(v) ?
        for k, v in closure.nonlocals.items():
            if k not in base_context:
                context[k] = v # TODO: deepcopy(v) ?
    except Exception as exc:
        print("exception capturing closure", exc)
    
    capture = CapturedFunction(name, source, context)
    capture._fn = fn # already have the function; don't need to recompile
    return capture

cutoff = 5
def topf(x, a=1, b=5):
    return np.minimum(a*x + b, cutoff)
def do():

    def roundtrip(fn, kwargs):
        import inspect
        value = fn(**kwargs)

        #print("1")
        capture = capture_source(fn)
        print("source => ", capture.source)
        print("context => ", capture.context)
        #assert (value == capture.fn(**kwargs)).all()

        # Force recompile
        capture._fn = None
        assert (value == capture.fn(**kwargs)).all()

    kwargs = dict(a=15, b=3, x=np.arange(5))

    print("== simple ==")
    def f(x, a=1, b=5): return a*x + b
    roundtrip(f, kwargs)
    
    # Can't do lambda because the function object doesn't have a name
    # Because inspect.getsource is so stupid, we would have to extract the
    # lambda expression from a source line such as f = lambda x, a, b: ...
    # or roundtrip(lambda x, a, b: ..., kwargs). Can't do either of these
    # without a full parser.
    if 0:
        print("== lambda ==")
        f = lambda x, a=1, b=5: a*x + b
        roundtrip(f, kwargs)
        print("== lambda ==")
        roundtrip(
            lambda x, a=1, b=5: \
            a*x + b,
            kwargs,
        )
    
    print("== function ==")
    def f(x, a=1, b=5): return np.sin(a*x + b)
    roundtrip(f, kwargs)
        
    print("== closure over locals ==")
    cutoff = 36
    def f(x, a=1, b=5): return np.minimum(a*x + b, cutoff)
    roundtrip(f, kwargs)

    print("== closure over globals ==")
    roundtrip(topf, kwargs)

    print("== numba jit with prange ==")
    from numba import njit, prange
    # Numba doesn't work with closures. It uses the value
    # as defined when njit was called. I don't see anything
    # obvious in the jitted object that contains the context.
    #cutoff = 4
    @njit(parallel=True)
    def f(x, a=1, b=5):
        cutoff = 4
        result = np.empty_like(x)
        for k in prange(len(x)):
            if x[k] < cutoff:
                result[k] = a*x[k] + b
            else:
                result[k] = cutoff
        return result
    #print(f(**kwargs))
    #print("jit", dir(f))
    #print(f.py_func.__code__)
    roundtrip(f, kwargs)

    # helper functions
    # class method
    # instance method
    # helper class
    # third party packages like pandas

do()
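The closure-capture step above hinges on inspect.getclosurevars, which splits the names a function references into nonlocals (closure cells), globals, builtins, and unbound. A small standalone demo of what capture_source sees:

```python
import inspect

cutoff = 5  # module-level global referenced by the inner function

def make_f():
    scale = 2  # captured as a closure cell
    def f(x):
        return min(scale * x, cutoff)
    return f

cv = inspect.getclosurevars(make_f())
# cv.nonlocals -> {'scale': 2}; cv.globals -> {'cutoff': 5}
# min lands in cv.builtins, which capture_source ignores.
```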

