[V2 Migration] Move metadata modals in tile charts to V2 by nick-nlb · Pull Request #6078 · datacommonsorg/website

nick-nlb · 2026-03-13T18:27:30Z

Issue

b/491842059

Description

This is part one of the conversion of the website metadata handling from V1 to V2.

Previously, the front end fetched metadata via metadata_fetcher.ts by making a sequence of discrete endpoint calls and, once they resolved, collating the results into a final metadata object. Most importantly for this task, that approach leaned heavily on the V1 endpoints /v1/bulk/info/variable-group and /v1/bulk/info/variable.

Solution

We have moved what used to be multiple frontend calls into a single Flask endpoint, /metadata, which uses only V2 to fetch the data. The endpoint consolidates that data and returns it to the frontend.

Because the V2 methods used to pull metadata require entities, these entities had to be plumbed up from each tile into the metadata modal.

The changes comprise the following distinct areas:

The metadata.py endpoint. This is a new endpoint that takes as parameters: entities, statVars, a statVar-to-facet map, and an optional list of actively selected facets. Aside from the addition of entities, these are the same inputs that the old, frontend-based V1 facet interface took.
A new, much lighter "V2" version of the frontend facet fetching function.
Plumbing of the entities from the scatter, highlight, bar and timeline charts into the metadata modal.
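
As a rough illustration of the inputs listed above, a request body for the new endpoint might be assembled and sanity-checked as follows. This is only a sketch: the helper `build_metadata_request` and the JSON field names (`entities`, `statVars`, `statVarToFacets`, `facets`) are assumptions made for illustration, since the description does not spell out the wire format.

```python
from typing import Optional


def build_metadata_request(
    entities: list[str],
    stat_vars: list[str],
    stat_var_to_facets: dict[str, list[str]],
    facets: Optional[list[str]] = None,
) -> dict:
    """Assembles a hypothetical request body for the metadata endpoint."""
    if not entities:
        # V2 lookups are entity-scoped, so an empty list cannot be served.
        raise ValueError("at least one entity is required for V2 metadata")
    if not stat_vars:
        raise ValueError("at least one stat var is required")

    unique_stat_vars = sorted(set(stat_vars))
    body = {
        # Sort and dedupe so that equivalent requests produce equal bodies.
        "entities": sorted(set(entities)),
        "statVars": unique_stat_vars,
        "statVarToFacets": {
            sv: sorted(set(stat_var_to_facets.get(sv, [])))
            for sv in unique_stat_vars
        },
    }
    if facets:
        # The actively selected facets are optional.
        body["facets"] = sorted(set(facets))
    return body
```

Rejecting an empty entity list up front mirrors the point made in this description: unlike the V1 path, the V2 path cannot resolve metadata without entities.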

Notes

An important aspect of the conversion from V1 to V2 is that V1 did not require entity information to function, whereas V2 does. With the bulk/info/variable-group endpoint, entities did not need to be passed up from the charts to the metadata modal. However, this also meant that the old metadata fetch could not scope dates to the entities involved (dates were scoped to the entire facet), so the dates shown were often far broader than they are now.

With the V2 version of the metadata fetch, we now use the entities and can get dates that are location-scoped. Note that these do not necessarily correspond exactly to what is shown in the chart (although they usually do); rather, they reflect which dates are available for the entity/facet/stat var combinations.
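
The difference between facet-wide and location-scoped dates can be sketched with a toy example. The `SeriesSummary` record and the `date_range` helper below are hypothetical and exist only to illustrate the idea; they are not the data structures used by the endpoint.

```python
from typing import Iterable, NamedTuple, Optional


class SeriesSummary(NamedTuple):
    """Hypothetical summary of one entity/facet series."""
    entity: str
    facet_id: str
    earliest: str  # ISO 8601 date string, e.g. "2010"
    latest: str


def date_range(
    summaries: Iterable[SeriesSummary],
    facet_id: str,
    entities: Optional[set[str]] = None,
) -> Optional[tuple[str, str]]:
    """Returns (earliest, latest) for a facet, optionally limited to entities."""
    matching = [
        s for s in summaries
        if s.facet_id == facet_id and (entities is None or s.entity in entities)
    ]
    if not matching:
        return None
    # ISO 8601 strings of the same granularity sort chronologically.
    return (min(s.earliest for s in matching), max(s.latest for s in matching))


summaries = [
    SeriesSummary("country/USA", "facetA", "1900", "2023"),
    SeriesSummary("geoId/06", "facetA", "2010", "2022"),
]
facet_wide = date_range(summaries, "facetA")  # old, facet-scoped behaviour
scoped = date_range(summaries, "facetA", {"geoId/06"})  # entity-scoped behaviour
```

In this toy data the facet-wide range spans 1900 to 2023, while the range scoped to `geoId/06` spans only 2010 to 2022, which is the kind of narrowing described above.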

Testing

For the most part, the metadata modal should show the same information as it does in production. There are two areas where this can be expected to diverge:

Dates: Dates will mostly be the same. However, because entities are now included in how we reckon dates, these sometimes differ between the V2 and V1 versions of the metadata. In those cases, V2 should provide narrower (and more accurate) dates.
Topics: The topic section of the metadata will sometimes show more than one vertical (e.g., Demographics, Economy) in the new version. As we walk up the graph to each vertical, we sometimes encounter multiple memberOf or specializationOf entries that resolve to different verticals. The original metadata modal usually showed only a single vertical.
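
To see why a single stat var can end up under several verticals, consider a minimal sketch that walks a child-to-parents map upward. The function name, the example DCIDs, and the rule that a top-level node is simply one with no recorded parents are all assumptions for illustration, not the endpoint's actual logic.

```python
def top_level_ancestors(start: str, parents: dict[str, list[str]]) -> set[str]:
    """Collects every ancestor of `start` that has no parents of its own."""
    tops: set[str] = set()
    visited: set[str] = set()
    stack = [start]
    while stack:
        node = stack.pop()
        if node in visited:
            # Guards against cycles and repeated work on shared ancestors.
            continue
        visited.add(node)
        node_parents = parents.get(node, [])
        if not node_parents and node != start:
            tops.add(node)
        stack.extend(node_parents)
    return tops


# Hypothetical graph: one stat var with two parent groups (for example via
# memberOf and specializationOf) that lead to different verticals.
example_parents = {
    "Count_Person_Employed": ["dc/g/Employment", "dc/g/Person"],
    "dc/g/Employment": ["dc/g/Economy"],
    "dc/g/Person": ["dc/g/Demographics"],
}
verticals = top_level_ancestors("Count_Person_Employed", example_parents)
```

Because both branches are followed, `verticals` contains both top-level nodes rather than just one.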

These changes only apply to the tiles (and so the explore page). The changes can be tested by running the same query (e.g., "Demographics in the United States") side by side locally and in production, to verify that, across the various chart configurations (and facet selections), the data is consistent between the two versions (aside from the exceptions listed above).

Upcoming

This is the first change in a series. Pending changes coming in later PRs are:

The extension of this functionality to the standardized vis tools.
The moving of facet selection dialog metadata to V2.

gemini-code-assist · 2026-03-13T18:27:59Z

Summary of Changes

Hello, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request initiates the migration of the website's metadata handling from a V1-based approach to a more efficient V2 system, specifically targeting tile charts. The core change involves centralizing metadata retrieval through a new backend endpoint, which processes requests using V2 APIs. This transition enhances the accuracy of date ranges and the granularity of topic information presented in metadata modals by incorporating entity-specific data, which was not a requirement in the previous V1 implementation.

Highlights

New V2 Metadata Endpoint: Introduced a new Flask endpoint /api/shared/metadata that consolidates multiple frontend calls into a single backend request, utilizing only V2 APIs for metadata fetching.
Frontend V2 Facet Fetching: Implemented a new, lighter frontend function (fetchMetadataV2) for fetching facets, which interacts with the new V2 backend endpoint.
Entity Plumbing for Metadata Modals: Plumbed entity information from various tile charts (scatter, highlight, bar, timeline, map, ranking) up to the metadata modal, enabling more accurate, location-scoped date resolution and potentially richer topic categorization in V2.

Changelog

server/__init__.py
- Registered the new shared_metadata blueprint to expose the /api/shared/metadata endpoint.
server/routes/shared_api/metadata.py
- Added a new Python file defining the /api/shared/metadata endpoint, which handles V2 metadata fetching, category traversal, facet processing, and provenance lookups.
server/services/datacommons.py
- Modified the v2observation function to accept an optional filter parameter, allowing for filtering by facetIds in V2 observation requests.
static/js/components/tiles/bar_tile.tsx
- Passed the entities prop to the ChartTileContainer component.
static/js/components/tiles/chart_tile.tsx
- Added an entities prop to the ChartTileContainerProp interface and propagated it to TileSources and ChartEmbed components.
static/js/components/tiles/highlight_tile.tsx
- Passed the entities prop, containing the place DCID, to the ChartTileContainer component.
static/js/components/tiles/line_tile.tsx
- Passed the entities prop, derived from place DCIDs, to the ChartTileContainer component.
static/js/components/tiles/map_tile.tsx
- Added a useMemo hook to derive entities from mapChartData and passed it to the ChartTileContainer component.
static/js/components/tiles/ranking_tile.tsx
- Added a useMemo hook to derive entities from rankingData and passed it to the ChartTileContainer component.
static/js/components/tiles/scatter_tile.tsx
- Added a useMemo hook to derive entities from scatterChartData and passed it to the ChartTileContainer component.
static/js/components/tiles/sv_ranking_units.tsx
- Derived entities from ranking points and passed it to the TileSources component.
static/js/place/chart_embed.tsx
- Imported fetchMetadataV2, added an entities prop to ChartEmbedPropsType, and conditionally called fetchMetadataV2 if entities are present, otherwise fetchMetadata.
static/js/tools/shared/metadata/metadata_fetcher.ts
- Added fetchMetadataV2 function to fetch metadata using the new V2 backend endpoint, consolidating V1 calls.
static/js/tools/shared/metadata/tile_metadata_modal.tsx
- Imported fetchMetadataV2, added an entities prop to TileMetadataModalPropType, and conditionally called fetchMetadataV2 if entities are present, otherwise fetchMetadata.
static/js/tools/shared/metadata/tile_sources.tsx
- Added an entities prop to the TileSources component and passed it to the TileMetadataModal.

¹ Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

gemini-code-assist

Code Review

The pull request introduces a new backend API endpoint (/api/shared/metadata) to fetch comprehensive metadata for statistical variables and entities, consolidating existing metadata fetching logic and enhancing the v2observation function to support facet filtering. Frontend chart tile components are updated to collect and pass entity DCIDs to this new, more efficient metadata API. Review comments suggest addressing a potential IndexError when accessing licenseType if the list is empty and removing an unused node_data variable for improved code clarity in the new metadata API implementation.
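
The IndexError concern raised in this review can be illustrated with a small defensive accessor. This is a sketch under assumptions: the helper name `first_value` is hypothetical, and the node shape (a list of dicts with a `value` key) is inferred from the `nodes[0].get('value')` pattern visible elsewhere in the diff.

```python
from typing import Any, Optional


def first_value(nodes: Optional[list[dict[str, Any]]],
                key: str = "value") -> Optional[str]:
    """Returns the first node's value, or None when there is nothing to read."""
    if not nodes:
        # Covers both None and an empty list, so nodes[0] is never evaluated.
        return None
    return nodes[0].get(key)
```

With a guard of this kind, a provenance that has no license information yields `None` instead of raising.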

server/routes/shared_api/metadata.py

juliawu

Great changes! Left some questions inline. Generally would just like some more inline comments in metadata.py for the new functions, to help future maintainers.

juliawu · 2026-03-16T16:10:20Z

server/routes/shared_api/metadata.py

+
+from server.services import datacommons as dc
+
+bp = Blueprint("metadata", __name__, url_prefix='/api/shared/metadata')


Do we want /shared/ in the API path? I see that this file is in the shared_api/ folder, but none of the other files in it use this convention. E.g. shared_api/stats.py serves /api/stats API routes.

What do you think of making the url prefix /api/metadata instead?

Very good call - it definitely should not have the "shared" in the route!

juliawu · 2026-03-16T16:11:04Z

server/routes/shared_api/metadata.py

+
+bp = Blueprint("metadata", __name__, url_prefix='/api/shared/metadata')
+
+MAX_CATEGORY_DEPTH = 50


Please add a comment on what MAX_CATEGORY_DEPTH does.

juliawu · 2026-03-16T16:11:36Z

server/routes/shared_api/metadata.py

+
+MAX_CATEGORY_DEPTH = 50
+
+MEASUREMENT_METHODS_SUPPRESSION_PROVENANCES: set[str] = {"WikipediaStatsData"}


Please add a comment on what MEASUREMENT_METHODS_SUPPRESSION_PROVENANCES does. E.g., what happens if I add a provenance to this set?

juliawu · 2026-03-16T16:32:21Z

static/js/tools/shared/metadata/metadata_fetcher.ts

+ * This version utilizes a consolidated backend API endpoint that contains no
+ * V1 calls.


When you go back to remove the old endpoint, could you remove or update this line as well? In the future once we're completely off of V1, this note might be confusing about what it's referring to.

I removed this right away (as the old endpoint will be removed relatively shortly anyway).

juliawu · 2026-03-16T16:47:38Z

server/routes/shared_api/metadata.py

+async def fetch_categories_async(stat_vars: list[str]) -> dict[str, list[str]]:
+  """Traverses the category hierarchy tree up to top-level topics."""


This function is quite long, could you add a little more to the docstring about what this function is doing and how? E.g. what the str -> list[str] dictionary represents (I think statVar -> top-level topics) and the type of traversal implemented (I see both BFS and DFS here)

Both BFS and DFS! (one for fetching, one for traversal). I've updated the documentation.
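
For future readers, the breadth-first fetching phase mentioned in this reply might look roughly like the sketch below: one batched lookup per level, capped by a maximum depth. The `fetch_parents_batch` callable is a hypothetical stand-in for the real V2 node call, and the returned child-to-parents map is an assumed shape, not the function's actual return value.

```python
import asyncio
from typing import Callable

MAX_CATEGORY_DEPTH = 50  # Same constant name as in the PR; used here as a loop guard.


async def fetch_parent_graph(
    stat_vars: list[str],
    fetch_parents_batch: Callable[[list[str]], dict[str, list[str]]],
) -> dict[str, list[str]]:
    """Builds a child -> parents map, one batched lookup per level (BFS)."""
    graph: dict[str, list[str]] = {}
    frontier = set(stat_vars)
    for _ in range(MAX_CATEGORY_DEPTH):
        if not frontier:
            break
        # Run the blocking lookup off the event loop, as the PR does for dc.v2node.
        batch = await asyncio.to_thread(fetch_parents_batch, sorted(frontier))
        next_frontier: set[str] = set()
        for node in frontier:
            node_parents = batch.get(node, [])
            graph[node] = node_parents
            next_frontier.update(p for p in node_parents if p not in graph)
        # Drop anything that was resolved during this level.
        frontier = next_frontier - set(graph)
    return graph
```

Batching per level keeps the number of backend calls proportional to the depth of the hierarchy rather than to the number of nodes, and the depth cap stops the loop even if the data contains an unexpectedly long chain.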

juliawu · 2026-03-16T17:03:00Z

server/routes/shared_api/metadata.py

+  for sv in stat_vars:
+    tops = set()
+
+    def traverse(n: str, curr_visited: set[str]) -> None:


Can you add a short docstring on what traverse() does?

juliawu · 2026-03-16T17:06:44Z

server/routes/shared_api/metadata.py

+  sv_top_levels = collections.defaultdict(list)
+  all_top_level_dcids = set()
+
+  for sv in stat_vars:


Similarly, can you add a note on what this for loop does?

juliawu · 2026-03-16T17:08:45Z

server/routes/shared_api/metadata.py

+  category_map: dict[str, list[str]] = {}
+  if all_top_level_dcids:
+    parent_name_resp = await asyncio.to_thread(dc.v2node,
+                                               list(all_top_level_dcids),
+                                               '->name')
+    parent_name_map = {}
+    for pid in all_top_level_dcids:
+      nodes = _get_arc_nodes(parent_name_resp, pid, 'name')
+      if nodes:
+        parent_name_map[pid] = nodes[0].get('value')
+
+    for sv in stat_vars:
+      category_map[sv] = [
+          parent_name_map.get(p) or p.split('/')[-1]
+          for p in sv_top_levels.get(sv, [])
+      ]
+  else:
+    category_map = {sv: [] for sv in stat_vars}


Similarly, could you add a comment on what this code block does? I see fetching names for top level dcids and building the final map with them.

juliawu · 2026-03-16T17:09:23Z

server/routes/shared_api/metadata.py

+
+    for sv in stat_vars:
+      category_map[sv] = [
+          parent_name_map.get(p) or p.split('/')[-1]


Took me a bit to get why you used p.split('/')[-1] here, could you add a quick explanation comment?
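
For reference, the expression in question prefers the fetched name and otherwise falls back to the last path segment of the DCID. A standalone sketch (the helper name `display_name` and the example DCIDs are illustrative only):

```python
def display_name(dcid: str, name_map: dict[str, str]) -> str:
    """Returns the looked-up name, or the last DCID segment as a fallback."""
    # `or` also triggers the fallback when the stored name is None or empty.
    # For example, "dc/g/Demographics" falls back to "Demographics".
    return name_map.get(dcid) or dcid.split('/')[-1]
```

A DCID without any slash is returned unchanged by the fallback, since `split('/')` then yields a single element.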

juliawu · 2026-03-16T17:12:49Z

static/js/place/chart_embed.tsx

+        let metadataResp;
+        if (this.props.entities && this.props.entities.length > 0) {
+          metadataResp = await fetchMetadataV2(
+            this.props.entities,
+            statVarSet,
+            statVarToFacets,
+            apiRoot,
+            facets
+          );
+        } else {
+          metadataResp = await fetchMetadata(
+            statVarSet,
+            facets,
+            dataCommonsClient,
+            statVarToFacets,
+            apiRoot
+          );
+        }


For both here and in tile_metadata_modal, what is the migration plan for the "else" case here? Would we ever reach that code block going forward?

The migration plan (which will happen as soon as we've converted all calls to V2 by supplying entities) will be to:

Remove the else

Remove the old fetchMetadata function completely

Rename fetchMetadataV2 to fetchMetadata

juliawu

Thank you for the updates!

## Issue

[b/491842059](https://buganizer.corp.google.com/issues/491842059)

## Description

This is part two of the conversion of the website metadata handling from V1 to V2. See [6078](#6078) for part 1, which describes the overall purpose.

This part converts metadata calls related to the three tools (timeline, scatter and map) to use the V2 metadata endpoint.

## Solution

The primary change was to send entities up into the metadata modal and the chart embed for each of the tools.

However, this update (and the general move to V2) highlighted some changes that needed to be made to all three of the tools in order to supply the exact numerator and denominator facets used in each of the charts. This makes up the majority of the changes in this PR.

## Testing

For the most part, the metadata modal should show the same information as it does in production. However, there are certain areas where we can expect divergences.

The production metadata modals do not recognize denominator values (and so will not display the chosen denominator). Additionally, the production metadata modals will sometimes show only a single facet when more facets were actually used (ultimately for the same reason).

The following chart demonstrates both of these issues: [Literacy in India](https://datacommons.org/tools/scatter#svx%3DCount_Person_BelowPovertyLevelInThePast12Months_AsFractionOf_Count_Person%26dx%3DCount_Person%26svy%3DCount_Person_Literate%26pcy%3D1%26dy%3DCount_Person%26epd%3Dcountry%2FIND%26ept%3DAdministrativeArea1)

## Upcoming

This is the second change in a series. Pending changes coming in later PRs are:

* The moving of facet selection dialog metadata to V2.
* Final cleanup and removal of the non-V2 endpoints.

nick-nlb added 13 commits March 9, 2026 11:38

Implement initial metadata endpoint, moving functionality from V1 calls made directly via the frontend to a single collated Flask endpoint that returns the same data. Entities, now required, and now plumbed up from the various tile sources (except for the highlight).

6698c76

Entities now being passed up from the highlight tile.

783b4ce

Merge branch 'master' into metadata_v2_tiles

6ca71c1

Merge branch 'master' into metadata_v2_tiles

e311505

Merge branch 'master' into metadata_v2_tiles

7dd1613

Update facet handling so that the frontend metadata entries sent to the backend are used in a filter applied directly to the V2 call.

4c4e1d0

Update the metadata endpoint to accept a list of facet ids rather than the full facet to attribute mapping (as the endpoint is just discarding that information).

21c4618

Cleanup of some of the entity collation code to dedup.

c27b0ba

Apply lint to metadata endpoint and small improvements.

b77fa0d

Remove need for sleep via a wrapper function

fc2ece2

Add type hints to metadata.py

bcc0c07

Update the scatter tile so that all plotted entities are collated into entities, rather than just the parent

73e04fc

Move the max depth stopgap to a constant and increased it (10 seemed a little too small).

6663c89

gemini-code-assist bot reviewed Mar 13, 2026

nick-nlb added 2 commits March 13, 2026 13:15

Add safety check on license type get.

aa405cf

Remove unused deconstructed variable.

6940c22

nick-nlb requested a review from juliawu March 13, 2026 20:27

nick-nlb marked this pull request as ready for review March 13, 2026 20:27

nick-nlb added 3 commits March 15, 2026 20:01

Merge branch 'master' into metadata_v2_tiles

f5d8674

Remove stray items that was causing issues.

963800e

Merge branch 'master' into metadata_v2_tiles

b52c3e6

juliawu reviewed Mar 16, 2026

Remove the "shared" from the route, refactor the traverse to top function out from a closure to a separate top-level function, and more thoroughly document the more complex areas of the metadata endpoint.

4347460

nick-nlb requested a review from juliawu March 16, 2026 20:56

juliawu approved these changes Mar 16, 2026

nick-nlb merged commit 2214b89 into datacommonsorg:master Mar 16, 2026
12 checks passed

nick-nlb mentioned this pull request Mar 17, 2026

[v0/v1 migration] Move metadata modals in tools to V2 #6090

Merged

nick-nlb mentioned this pull request Mar 18, 2026

[v0/v1 migration] Move facet metadata enrichment to V2 #6104

Open

