Commit 5bb7660

update notebook to be compatible with newer releases
1 parent ab310f8 commit 5bb7660

1 file changed

Lines changed: 15 additions & 19 deletions

docs/source/tutorial_notebooks/query_gcr_datasets.ipynb

@@ -2,7 +2,7 @@
 "cells": [
 {
 "cell_type": "markdown",
-"id": "302fd57e-ae4f-4a22-95ad-a69573212a98",
+"id": "509e7882-7bc8-4180-9634-0be59e4baad7",
 "metadata": {},
 "source": [
 "<div style=\"overflow: hidden;\">\n",
@@ -22,7 +22,7 @@
 "\n",
 "### Before we begin\n",
 "\n",
-"Currently (November, 2024) the required versions of gcr-catalogs and dataregistry are only available in the `desc-python-bleed` kernel. Make sure you have selected that kernel while running this tutorial.\n",
+"As of September, 2025, the required versions of gcr-catalogs and dataregistry are available in the `desc-python` and `desc-python-bleed` kernels. Make sure you have selected one of those kernels while running this tutorial.\n",
 "\n",
 "If you haven't done so already, check out the [getting setup](https://lsstdesc.org/dataregistry/tutorial_setup.html) page from the documentation if you want to run this tutorial interactively."
 ]
@@ -143,10 +143,11 @@
 "outputs": [],
 "source": [
 "from dataregistry import DataRegistry\n",
-"from dataregistry.schema import DEFAULT_SCHEMA_PRODUCTION\n",
+"from dataregistry.schema import DEFAULT_NAMESPACE\n",
 "\n",
 "# Establish connection to the production schema\n",
-"datareg = DataRegistry(schema=DEFAULT_SCHEMA_PRODUCTION)"
+"prod_schema = DEFAULT_NAMESPACE + \"_production\"\n",
+"datareg = DataRegistry(schema=prod_schema)"
 ]
 },
 {
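
Note on the hunk above: the removed lines import `DEFAULT_SCHEMA_PRODUCTION`, while the added lines build the schema name from `DEFAULT_NAMESPACE`. A notebook that has to run under both older and newer dataregistry releases could resolve the name defensively, roughly as sketched below. This is only an illustration under stated assumptions: the helper names and `FALLBACK_NAMESPACE` are hypothetical and do not appear in the commit, and the only dataregistry names used are the two imports visible in the diff.

```python
# Sketch only: resolve the production schema name across dataregistry releases.
# FALLBACK_NAMESPACE is a hypothetical placeholder, not a value from the commit.
FALLBACK_NAMESPACE = "example_namespace"


def production_schema(namespace: str) -> str:
    """Build the schema name the same way the updated notebook cell does."""
    return namespace + "_production"


def resolve_production_schema() -> str:
    """Return a production schema name, whichever release is installed."""
    try:
        # Import used by the added lines in the hunk above.
        from dataregistry.schema import DEFAULT_NAMESPACE
        return production_schema(DEFAULT_NAMESPACE)
    except ImportError:
        pass
    try:
        # Import used by the removed lines in the hunk above.
        from dataregistry.schema import DEFAULT_SCHEMA_PRODUCTION
        return DEFAULT_SCHEMA_PRODUCTION
    except ImportError:
        # dataregistry is not importable at all: use the placeholder.
        return production_schema(FALLBACK_NAMESPACE)
```

The result could then be passed as `DataRegistry(schema=resolve_production_schema())`, mirroring the call in the added lines.
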
@@ -176,10 +177,11 @@
 },
 {
 "cell_type": "markdown",
-"id": "fa586592-2c2e-428b-b443-33ca26038add",
+"id": "f08bc754-4fee-44c0-8468-f2d82d2a9283",
 "metadata": {},
 "source": [
-"That is a list of __all__ columns from __all__ tables, maybe more than we bargained for. Let's restrict it to columns in the `dataset` table."
+"By default that prints only the columns in the `dataset` table, which is the most interesting for most purposed.\n",
+"Datasets can be associated with an \"execution\" - in practice this could be a run of a script or a job step in a pipeline. Here are the columns for that table:"
 ]
 },
 {
@@ -191,16 +193,16 @@
 },
 "outputs": [],
 "source": [
-"dataset_columns = [col for col in all_columns if col.startswith('dataset.')]\n",
-"print(dataset_columns)"
+"execution_columns = datareg.Query.get_all_columns(table=\"execution\")\n",
+"print(execution_columns)"
 ]
 },
 {
 "cell_type": "markdown",
-"id": "ad32a278-694a-4364-8dcd-39cdc702039c",
+"id": "b43623bd-d903-4fab-881c-ec41a81e46b7",
 "metadata": {},
 "source": [
-"Among the more interesting for our purposes are `name`, `relative_path`, `access_api`, `access_api_configuration` and `location_type`. In the case of catalogs registered with GCRCatalogs, `name` in the data registry is the same name GCRCatalogs uses to refer to it: the basename of the corresponding config file, not including the suffix `.yaml`. But keep in mind that, unlike GCRCatalog, the dataregistry always respects case in names\n",
+"Among the more interesting dataset columns for our purposes are `name`, `relative_path`, `access_api`, `access_api_configuration` and `location_type`. In the case of catalogs registered with GCRCatalogs, `name` in the data registry is the same name GCRCatalogs uses to refer to it: the basename of the corresponding config file, not including the suffix `.yaml`. But keep in mind that, unlike GCRCatalog, the dataregistry always respects case in names\n",
 "\n",
 "Let's look at those properties for the dataset `cosmoDC2_v1.1.4`."
 ]
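
Note on the hunk above: the removed cell narrowed a list of `table.column` names by prefix, while the added cell asks the registry for one table's columns directly. For readers who still hold a combined list, the prefix approach might look like the sketch below. The helper name and the sample list are hypothetical; only `dataset.name` and `dataset.relative_path` correspond to columns named in the notebook text.

```python
def columns_for_table(all_columns, table):
    """Keep only the qualified column names that belong to `table`."""
    # Include the trailing dot so that "dataset" does not also match
    # a table whose name merely starts with "dataset".
    prefix = table + "."
    return [col for col in all_columns if col.startswith(prefix)]


# Hypothetical sample standing in for a list of columns from all tables.
sample_columns = [
    "dataset.name",
    "dataset.relative_path",
    "execution.name",
    "dataset_alias.alias",
]

dataset_columns = columns_for_table(sample_columns, "dataset")
print(dataset_columns)
```

Matching on `"dataset."` rather than `"dataset"` is the detail that matters here, since a bare prefix would also pick up names from similarly named tables.
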
@@ -294,19 +296,13 @@
 "source": [
 "It all looks pretty much as you would expect, except what happened to the value of `dataset.relative_path`? That doesn't look like a path. You can see the reason in the catalog's configuration: it's based on another catalog. Or you can see it in the value for `dataset.location_type`. \"meta_only\" means that the data registry is only storing metadata for the catalog; it is not keeping track of the (indirectly) associated files. The same thing would happen for a composite catalog: the data registry just stores the catalog's configuration. It doesn't know how to parse it as GCRCatalogs would."
 ]
-},
-{
-"cell_type": "markdown",
-"id": "5721858e-8e42-4285-9ef0-ead3d780e918",
-"metadata": {},
-"source": []
 }
 ],
 "metadata": {
 "kernelspec": {
-"display_name": "desc-python-bleed",
+"display_name": "desc-python",
 "language": "python",
-"name": "desc-python-bleed"
+"name": "desc-python"
 },
 "language_info": {
 "codemirror_mode": {
@@ -318,7 +314,7 @@
 "name": "python",
 "nbconvert_exporter": "python",
 "pygments_lexer": "ipython3",
-"version": "3.12.7"
+"version": "3.12.11"
 }
 },
 "nbformat": 4,