Skip to content

Data array input for zt and zu works, but does it do the right thing? #81

@jbusecke

Description

@jbusecke

I just ran this example:

import xarray as xr
from aerobulk.flux import noskin_np, skin_np, noskin, skin

def wrap_da(input:np.ndarray) -> xr.DataArray:
    return xr.DataArray([np.float64(i) for i in input], dims={'time':np.arange(len(input))})
    
sst = wrap_da([270, 270])
t_zt = wrap_da([190, 190])                           
hum_zt = wrap_da([0.05, 0.05])
u_zu = wrap_da([50.0, 50.0])                      
v_zu = wrap_da([50.0, 50.0])
slp = wrap_da([110000, 110000])
zt = wrap_da([10, 1])                              
zu = wrap_da([1, 10])

algo = 'coare3p6'
ql, qh, taux, tauy, evap = noskin(
    sst=sst,
    t_zt=t_zt, 
    hum_zt=hum_zt, 
    u_zu=u_zu, 
    v_zu=v_zu, 
    slp=slp, 
    algo=algo, 
    zt=zt, 
    zu=zu, 
)

Which worked, but I am suspicious of how multiple values for zt and zu are used. This behavior is not documented since we say the only allowed int types for those inputs. From the above we get two output values that are identical (here for the latent heatflux as example)

image

This seems fishy to start with, as I would expect the vastly different values for the height to produce different outputs (when all other inputs are kept identical).

Indeed when running some further tests:

# evaluate each point separately with integer input for height values
ql_0, qh_0, tau_0, tauy_0, evap_0 = noskin(
    sst=sst,
    t_zt=t_zt, 
    hum_zt=hum_zt, 
    u_zu=u_zu, 
    v_zu=v_zu, 
    slp=slp, 
    algo=algo, 
    zt=10, 
    zu=1, 
    niter=10, 
    input_range_check=True
)

ql_1, qh_1, tau_1, tauy_1, evap_1 = noskin(
    sst=sst,
    t_zt=t_zt, 
    hum_zt=hum_zt, 
    u_zu=u_zu, 
    v_zu=v_zu, 
    slp=slp, 
    algo=algo, 
    zt=1, 
    zu=10, 
    niter=10, 
    input_range_check=True
)

We actually get different results for the different height inputs

image

What seems to happen is that in the original case something in our code is silently taking the first value of zt and zu and applying that to all data.

For context this was brought up in the context of #79 by @juliasimpson97 who wants to use aerobulk-python for observational data. This data can be described as a timeseries of inputs where the height of measurement is also varying in time (hence is provided as a dataarray of equal length).

I think that for that specific use case, we might possibly run up agains the way aerobulk (the fortran code) is set up - it is very much designed for model datasets where the height for a large 2/3 d array is going to be a constant.

A short solution for @juliasimpson97 could be to loop over all the values, calculate them one by one, and concatenate the results. This will be slow, but the other way of doing this seems to produce results that are incorrect.

In the longer run we need to see if there is a way to replicate this solution as part of the .apply_ufunc call and thus allow xr.Dataarray input for zt/zu. If that is not possible/desired we might consider at least warning the user about this and adding an explicit check for those inputs.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions