@wxtim
Member

@wxtim wxtim commented Oct 17, 2025

Closes #6053

Datapoint is to be switched off from the first of December.

The data is available from the Met Office via the Amazon Sustainability Data Initiative. Amazon link.

Annoyingly, the metadata there seems quite limited, so I asked a bunch of contacts in Obs R&D, who insisted that the data has the following properties:

image

Which raises 2 problems:

  1. The image is in a different projection to previous input data (Transverse rather than obverse Mercator projection)
  2. The domain doesn't seem quite right with the values given (though this could just be a side effect of (1)).

Rather than re-jigging the mathematics I've made something vaguely plausible by fiddling with domain values. Hopefully this creates a product good enough for the training purpose for which it is built.

Finally, I have fixed a bug I introduced when I added the SYNOP collecting routine: these wind observations use the meteorological convention (the direction the wind is blowing from), but we need the direction the wind is blowing towards, so all wind directions were 180° off!
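
For illustration, a minimal sketch of that conversion, assuming directions are bearings in degrees. The function name `to_heading` is hypothetical and is not taken from the tutorial code.

```python
def to_heading(from_direction_deg: float) -> float:
    """Convert a meteorological wind direction into a heading.

    The input is the bearing the wind is blowing FROM; the result is the
    bearing the wind is travelling TOWARDS, wrapped into [0, 360).
    """
    return (from_direction_deg + 180.0) % 360.0


# A westerly wind (blowing from 270 degrees) travels towards the east.
print(to_heading(270.0))
```

The modulo keeps the result in the 0 to 360 range, so a wind from 180° maps to 0° rather than 360°.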

  • I have read CONTRIBUTING.md and added my name as a Code Contributor.
  • Contains logically grouped changes (else tidy your branch by rebase).
  • Does not contain off-topic changes (use other PRs for other changes).
  • Applied any dependency changes to both setup.cfg (and conda-environment.yml if present).
  • Tests are included (or explain why tests are not needed).
  • Changelog entry included if this is a change that can affect users
  • Cylc-Doc pull request opened if required at cylc/cylc-doc/pull/XXXX.
  • If this is a bug fix, PR should be raised against the relevant ?.?.x branch.

@oliver-sanders oliver-sanders changed the title from "Killdatapoint" to "tutorial: replace deprecated DataPoint API" Oct 17, 2025
@oliver-sanders oliver-sanders marked this pull request as draft October 17, 2025 09:22
RESOLUTION = 0.2
# The area to generate forecasts for (lng1, lat1, lng2, lat2)
DOMAIN = -12,48,5,61 # Do not change!

Member Author

@wxtim wxtim Oct 17, 2025

What Obs R&D Claim the Domain Should be

image

Member Author

But I'm not convinced -

@wxtim wxtim force-pushed the killdatapoint branch 3 times, most recently from d109fc3 to f684f1d Compare October 17, 2025 10:05
# TODO - work out why this isn't working
scale = get_scale(domain, width)
offset = get_offset(domain, scale)
scale = (1672.2334443981335, 3344.466888796267)
Member Author

@wxtim wxtim Oct 17, 2025

@oliver-sanders - These numbers seem to work (roughly). Might I poke you to have a look at why the functions do or don't?

If you run the workflow in debug mode you get pictures.

Compare my images

Actually, only get_offset isn't giving a sensible answer.

NVM - I'll keep poking - I think that this is a problem with the projection.

[y[1] for y in z_coords],
color='red')
plt.savefig('wind.png')
plt.savefig(f'{os.environ["CYLC_TASK_LOG_DIR"]}/wind.png')
Member Author

We're only saving this for people to look at, so it ought to go here?
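
A sketch of one way to build that path, assuming `CYLC_TASK_LOG_DIR` is set in the task environment as the changed line above expects. The helper name `debug_plot_path` and the fall-back to the current directory are illustrative assumptions, not part of this PR.

```python
import os
from pathlib import Path


def debug_plot_path(filename: str) -> Path:
    """Return a path for a debug plot inside the task's log directory.

    Falls back to the current working directory when CYLC_TASK_LOG_DIR is
    not set, e.g. when the script is run by hand outside of a workflow.
    """
    log_dir = os.environ.get('CYLC_TASK_LOG_DIR', '.')
    return Path(log_dir) / filename
```

It could then be used as `plt.savefig(debug_plot_path('wind.png'))`, which avoids a `KeyError` when the variable is missing.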

return req.json()['SiteRep']['DV']['Location']


def get_archived_data(site_id, time):
Member Author

I can't see any evidence of actual archived data - Perhaps we should save some at some point?

Member

I think we are using archive data as part of the Rose tutorial. It gets bundled with the app for use in the test config.

I think this test data will need to be re-generated from the new data source (check if this is true) otherwise the Rose app might fail due to a resolution mismatch?

Member Author

@wxtim wxtim Nov 5, 2025

Weirdly no - we're using archived output of this task for the forecast task. This looks like legacy. We still need a Rose PR to follow this up though: PR at metomi/rose#2959

I think we should have a follow up issue to check and fix.

@oliver-sanders
Member

@wxtim, FYI, this will need to go into a bugfix release.

@wxtim wxtim marked this pull request as ready for review October 21, 2025 12:36
Member

@oliver-sanders oliver-sanders left a comment

PR is on the wrong branch.

Some small comments so far:

Comment on lines 78 to 83
# A template Map file
MAP_FILE = "${CYLC_TASK_LOG_ROOT}-map.html"
# Create the html map file in the task's log directory.
MAP_FILE = "${CYLC_TASK_LOG_ROOT}-map.html"
# The path to the template file used to generate the html map.
MAP_TEMPLATE = "$CYLC_WORKFLOW_RUN_DIR/lib/template/map.html"
Member

We didn't need to specify these here before.

Might be easier to stick with using the work or share directory as appropriate?

Member Author

I did this because I thought that it was quite nice to view the intermediate stages from Cylc Review. Happy to remove it if you think I should, though.

@oliver-sanders
Member

It looks like we have multiple options for reading HDF5 files. h5py is one, but pandas and xarray are apparently alternatives.

Pandas might be an appealing option as it is widely used and is already an optional dependency.

Can we have a quick review of the options to work out which is the most lightweight, easiest to support, least likely to cause problems, etc.?

@wxtim
Member Author

wxtim commented Oct 23, 2025

> It looks like we have multiple options for reading HDF5 files. h5py is one, but pandas and xarray are apparently alternatives.
>
> Pandas might be an appealing option as it is widely used and is already an optional dependency.
>
> Can we have a quick review of the options to work out which is the most lightweight, easiest to support, least likely to cause problems, etc.?

I think that they may all be the same: the documentation certainly suggests that pandas.read_hdf uses pytables, whose docs suggest that it uses h5py. Looking at the source of pandas shows h5py is an optional dependency. I think it's going to be h5py whatever we choose.

@wxtim wxtim requested a review from oliver-sanders October 29, 2025 10:22
@oliver-sanders
Member

FYI: There's a wrap-around problem with rainfall leaking out of the west of the domain into the east:

(possible that this was a pre-existing issue?)

Screenshot from 2025-10-30 12-48-25

@oliver-sanders
Member

oliver-sanders commented Oct 30, 2025

The get-rainfall step is really slow (~1:40). After a bit of fiddling, I managed to get that down to ~0:30, which is reasonable.

Note: It uses ~25% CPU (but negligible RAM) on my box.

wxtim#75

@wxtim
Member Author

wxtim commented Oct 30, 2025

> FYI: There's a wrap-around problem with rainfall leaking out of the west of the domain into the east:

Not me guv.

image

I'm going to leave this. It's probably not important enough to matter.

'NNW': '157.5'
}

class Meteorology:
Member Author

This is probably a bit bonkers. I won't mind (very much) if you tell me to take it out again.

wxtim and others added 3 commits November 5, 2025 14:52
Remove unwated ref to datapoint API key

remove all references to datapoint api keys

Remove api keys from cylc get-resources

Update conda-environment.yml

Co-authored-by: Oliver Sanders <[email protected]>
Add a meteorological tweak

remove unwanted env vars
dim_x, dim_y, resolution, resolution,
spline_level)

domain = util.parse_domain(os.environ['DOMAIN'])
Member Author

Seems odd to run this every single loop.
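
A minimal sketch of hoisting the call out of the loop. `parse_domain` here is a stand-in written for this example (the real `util.parse_domain` may return a different structure), and `process_all` with its loop body is a placeholder rather than the tutorial's code.

```python
import os


def parse_domain(domain: str) -> dict:
    """Stand-in parser: 'lng1,lat1,lng2,lat2' -> dict of floats."""
    lng1, lat1, lng2, lat2 = (float(value) for value in domain.split(','))
    return {'lng1': lng1, 'lat1': lat1, 'lng2': lng2, 'lat2': lat2}


def process_all(items, domain_str):
    """Process every item against the same domain."""
    # Parse once, before the loop: the domain does not change per iteration.
    domain = parse_domain(domain_str)
    return [(item, domain['lng1'], domain['lat2']) for item in items]


# The default mirrors the DOMAIN value shown earlier in this PR.
results = process_all(['a', 'b'], os.environ.get('DOMAIN', '-12,48,5,61'))
```

Since the environment variable is fixed for the lifetime of the task, parsing it once is enough and keeps the loop body focused on per-item work.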

@wxtim wxtim added this to the 7.8.x milestone Nov 5, 2025
@wxtim wxtim added doc Documentation dependencies labels Nov 5, 2025

Development

Successfully merging this pull request may close these issues.

tutorial workflow: move away from datapoint

2 participants