tutorial: replace deprecated DataPoint API #7044

wxtim · 2025-10-17T09:17:06Z

DataPoint is to be switched off on 1 December.

The data is available from the Met Office via the Amazon Sustainability Data Initiative. Amazon link.

Annoyingly the metadata there seems quite limited, so I asked a few contacts in Obs R&D, who insisted that the data has the following properties:

Which raises two problems:

1. The image is in a different projection from the previous input data (Transverse rather than obverse Mercator projection).
2. The domain doesn't seem quite right with the values given (though this could just be a side effect of (1)).

Rather than re-jigging the mathematics I've made something vaguely plausible by fiddling with domain values. Hopefully this creates a product good enough for the training purpose for which it is built.

Finally, I have fixed a bug I introduced when I added the SYNOP collecting routine - these wind observations use the meteorological convention (the direction the wind is blowing from), but we need the direction the wind is blowing to, so all wind directions were 180° off!
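For anyone checking the wind fix, the conversion amounts to adding 180° and wrapping at 360°. Below is a minimal sketch, not the code in this PR: `COMPASS_POINTS`, `blowing_to` and `compass_to_heading` are names made up for illustration, and it assumes a 16-point compass at 22.5° spacing.

```python
# Illustrative sketch only; these names are hypothetical, not from the PR.
# 16-point compass, clockwise from north, 22.5 degrees apart.
COMPASS_POINTS = [
    'N', 'NNE', 'NE', 'ENE', 'E', 'ESE', 'SE', 'SSE',
    'S', 'SSW', 'SW', 'WSW', 'W', 'WNW', 'NW', 'NNW',
]


def blowing_to(from_degrees):
    """Convert a 'blowing from' bearing to a 'blowing to' bearing."""
    return (from_degrees + 180.0) % 360.0


def compass_to_heading(point):
    """Return the bearing the wind is heading towards for a compass point
    given in meteorological (blowing from) convention."""
    from_degrees = COMPASS_POINTS.index(point) * 22.5
    return blowing_to(from_degrees)


print(compass_to_heading('NNW'))  # NNW is 337.5 degrees "from"
```

With this convention a NNW wind (337.5° "from") heads towards 157.5°.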

I have read CONTRIBUTING.md and added my name as a Code Contributor.
Contains logically grouped changes (else tidy your branch by rebase).
Does not contain off-topic changes (use other PRs for other changes).
Applied any dependency changes to both setup.cfg (and conda-environment.yml if present).
Tests are included (or explain why tests are not needed).
Changelog entry included if this is a change that can affect users
Cylc-Doc pull request opened if required at cylc/cylc-doc/pull/XXXX.
If this is a bug fix, PR should be raised against the relevant ?.?.x branch.

wxtim · 2025-10-17T09:32:17Z

cylc/flow/etc/tutorial/cylc-forecasting-workflow/flow.cylc

            RESOLUTION = 0.2
            # The area to generate forecasts for (lng1, lat1, lng2, lat2)
-            DOMAIN = -12,48,5,61  # Do not change!
+


What Obs R&D Claim the Domain Should be

But I'm not convinced -

wxtim · 2025-10-17T10:06:36Z

cylc/flow/etc/tutorial/cylc-forecasting-workflow/bin/get-rainfall

+    # TODO - work out why this isn't working
+    scale = get_scale(domain, width)
    offset = get_offset(domain, scale)
+    scale = (1672.2334443981335, 3344.466888796267)


@oliver-sanders - These numbers seem to work (roughly). Might I poke you to have a look at why the functions do or don't?

If you run the workflow in debug mode you get pictures.

Compare my images

With these lines

With these lines commented

Actually, only get_offset isn't giving a sensible answer.

NVM - I'll keep poking - I think that this is a problem with the projection.

wxtim · 2025-10-20T10:48:43Z

cylc/flow/etc/tutorial/cylc-forecasting-workflow/bin/consolidate-observations

               [y[1] for y in z_coords],
               color='red')
-    plt.savefig('wind.png')
+    plt.savefig(f'{os.environ["CYLC_TASK_LOG_DIR"]}/wind.png')


We're only saving this for people to look at, so it ought to go here?
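One thing to watch with the change above: `os.environ["CYLC_TASK_LOG_DIR"]` raises `KeyError` if the script is run by hand outside a Cylc task. A minimal sketch of a more forgiving lookup follows; `plot_path` is a hypothetical helper, and falling back to the current directory is my assumption rather than anything agreed in this PR.

```python
import os
from pathlib import Path


def plot_path(filename, environ=os.environ):
    """Return where a debug plot should be written.

    Uses CYLC_TASK_LOG_DIR when it is set. Falling back to the current
    working directory is an assumption for runs outside of Cylc.
    """
    log_dir = environ.get('CYLC_TASK_LOG_DIR')
    base = Path(log_dir) if log_dir else Path.cwd()
    return base / filename


# Passing a plain dict makes the behaviour easy to check by hand.
print(plot_path('wind.png', {'CYLC_TASK_LOG_DIR': '/tmp/log'}))
```

The result could then be passed straight to `plt.savefig`.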

wxtim · 2025-10-20T10:50:32Z

cylc/flow/etc/tutorial/cylc-forecasting-workflow/bin/get-observations

-    return req.json()['SiteRep']['DV']['Location']
-
-
 def get_archived_data(site_id, time):


I can't see any evidence of actual archived data - Perhaps we should save some at some point?

I think we are using archive data as part of the Rose tutorial. It gets bundled with the app for use in the test config.

I think this test data will need to be re-generated from the new data source (check if this is true), otherwise the Rose app might fail due to a resolution mismatch?

Weirdly no, we're using archived output of this task for the forecast task. This looks like legacy. Still need a Rose PR to follow this up though. PR at metomi/rose#2959

I think we should have a follow up issue to check and fix.

cylc/flow/etc/tutorial/cylc-forecasting-workflow/bin/get-observations

oliver-sanders · 2025-10-21T12:04:22Z

@wxtim, FYI, this will need to go into a bugfix release.

oliver-sanders

PR is on the wrong branch.

Some small comments so far:

cylc/flow/etc/tutorial/cylc-forecasting-workflow/bin/get-observations

cylc/flow/etc/tutorial/cylc-forecasting-workflow/bin/get-rainfall

oliver-sanders · 2025-10-23T10:30:20Z

cylc/flow/etc/tutorial/cylc-forecasting-workflow/flow.cylc

+            # A template Map file
+            MAP_FILE = "${CYLC_TASK_LOG_ROOT}-map.html"
+            # Create the html map file in the task's log directory.
+            MAP_FILE = "${CYLC_TASK_LOG_ROOT}-map.html"
+            # The path to the template file used to generate the html map.
+            MAP_TEMPLATE = "$CYLC_WORKFLOW_RUN_DIR/lib/template/map.html"


We didn't need to specify these here before.

Might be easier to stick with using the work or share directory as appropriate?

I did this because I thought that it was quite nice to view the intermediate stages from Cylc Review. Happy to remove if you think I should though.

cylc/flow/etc/tutorial/cylc-forecasting-workflow/lib/python/util.py

oliver-sanders · 2025-10-23T10:38:58Z

It looks like we have multiple options for reading HDF5 files. h5py is one, but Pandas and xarray are apparently alternatives.

Pandas might be an appealing option as it is widely used and is already an optional dependency.

Can we have a quick review of the options to work out which is the most lightweight, easiest to support, least likely to cause problems, etc.?

wxtim · 2025-10-23T12:59:00Z

> It looks like we have multiple options for reading HDF5 files. h5py is one, but Pandas and xarray are apparently alternatives.
>
> Pandas might be an appealing option as it is widely used and is already an optional dependency.
>
> Can we have a quick review of the options to work out which is the most lightweight, easiest to support, least likely to cause problems, etc.?

I think that they may all be the same: the documentation certainly suggests that pandas.read_hdf uses pytables, whose docs suggest that it uses h5py. Looking at the source of pandas shows h5py is an optional dependency. I think it's going to be h5py whatever we choose.

conda-environment.yml

oliver-sanders · 2025-10-30T12:49:22Z

FYI: There's a wrap-around problem with rainfall leaking out of the west of the domain into the east:

(possible that this was a pre-existing issue?)
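I have not traced where this happens in the forecast code, so the following only illustrates one common cause: moving cells with modular indexing, which treats the domain as periodic. `shift_row` is a hypothetical helper, not a function from the tutorial.

```python
def shift_row(row, shift, wrap):
    """Shift rainfall values along one west-to-east row by `shift` cells.

    wrap=True uses modular indexing (periodic boundary), so rain leaving
    the west edge reappears in the east. wrap=False drops any value that
    moves outside the domain (open boundary).
    """
    width = len(row)
    out = [0.0] * width
    for x, value in enumerate(row):
        new_x = x + shift
        if wrap:
            new_x %= width
        elif not 0 <= new_x < width:
            continue  # the rain has left the domain
        out[new_x] = value
    return out


row = [5.0, 0.0, 0.0, 0.0]  # rain in the westernmost cell
print(shift_row(row, -1, wrap=True))   # rain reappears in the east
print(shift_row(row, -1, wrap=False))  # rain is discarded
```

If the real code does something like the `wrap=True` branch, dropping out-of-domain values may be enough to stop the leak.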

oliver-sanders · 2025-10-30T13:50:03Z

The get-rainfall step is really slow (~1:40). After a bit of fiddling, I managed to get that down to ~0:30, which is reasonable.

Note: It uses ~25% CPU (but negligible RAM) on my box.

wxtim#75

wxtim · 2025-10-30T15:23:04Z

> FYI: There's a wrap-around problem with rainfall leaking out of the west of the domain into the east:

Not me guv.

I'm going to leave this. It's probably not important enough to matter.

wxtim · 2025-11-05T14:43:37Z

cylc/flow/etc/tutorial/cylc-forecasting-workflow/bin/get-observations

    'NNW': '157.5'
 }

+class Meteorology:


This is probably a bit bonkers. I won't mind (very much) if you tell me to take it out again.

wxtim · 2025-11-05T14:53:15Z

cylc/flow/etc/tutorial/cylc-forecasting-workflow/bin/forecast

                                dim_x, dim_y, resolution, resolution,
                                spline_level)

+    domain = util.parse_domain(os.environ['DOMAIN'])


Seems odd to run this every single loop.
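Since `DOMAIN` does not change during the task, the parse could be hoisted above the loop. A minimal sketch of the idea: `parse_domain` here is a hypothetical stand-in for `util.parse_domain` (whose real return type I have not reproduced), and `run_steps` is invented for illustration.

```python
def parse_domain(domain):
    """Hypothetical stand-in for util.parse_domain.

    Parses 'lng1,lat1,lng2,lat2' into a dict of floats.
    """
    lng1, lat1, lng2, lat2 = (float(value) for value in domain.split(','))
    return {'lng1': lng1, 'lat1': lat1, 'lng2': lng2, 'lat2': lat2}


def run_steps(environ, steps):
    """Parse the domain once, then reuse it on every iteration."""
    domain = parse_domain(environ['DOMAIN'])  # hoisted out of the loop
    results = []
    for step in range(steps):
        width = domain['lng2'] - domain['lng1']
        results.append((step, width))
    return results


print(run_steps({'DOMAIN': '-12,48,5,61'}, 2))
```

The parse is cheap, so this is more about readability than speed.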

oliver-sanders changed the title ~~Killdatapoint~~ tutorial: replace deprecated DataPoint API Oct 17, 2025

oliver-sanders marked this pull request as draft October 17, 2025 09:22

wxtim force-pushed the killdatapoint branch 3 times, most recently from d109fc3 to f684f1d Compare October 17, 2025 10:05

wxtim force-pushed the killdatapoint branch from a06a09b to f0ed18e Compare October 20, 2025 10:33

wxtim requested a review from oliver-sanders October 20, 2025 10:48

wxtim force-pushed the killdatapoint branch from 9d4c799 to 47af945 Compare October 20, 2025 11:03

oliver-sanders assigned wxtim Oct 21, 2025

wxtim marked this pull request as ready for review October 21, 2025 12:36

wxtim requested a review from oliver-sanders October 29, 2025 10:22

wxtim mentioned this pull request Nov 5, 2025

Update Rose Tutorial to use Canned Amazon Data metomi/rose#2959

Open

wxtim force-pushed the killdatapoint branch from cc2fcec to ce95bf1 Compare November 5, 2025 14:41

wxtim and others added 3 commits November 5, 2025 14:52

Use Radar Data from Obs Via Amazon S3 bucket

94df3b2

Remove unwated ref to datapoint API key remove all references to datapoint api keys Remove api keys from cylc get-resources Update conda-environment.yml Co-authored-by: Oliver Sanders <[email protected]>

tutorial: speedup get-resources

761dbbb

Add a meteorological tweak

096ae04

Add a meteorological tweak remove unwanted env vars

wxtim force-pushed the killdatapoint branch from ce95bf1 to 096ae04 Compare November 5, 2025 14:52

wxtim added this to the 7.8.x milestone Nov 5, 2025

wxtim added doc Documentation dependencies labels Nov 5, 2025
