Skip to content

[draft] Support multi-entity in v2/observation#1867

Draft
rohitkumarbhagat wants to merge 2 commits into
datacommonsorg:masterfrom
rohitkumarbhagat:multi-entity-prototype
Draft

[draft] Support multi-entity in v2/observation#1867
rohitkumarbhagat wants to merge 2 commits into
datacommonsorg:masterfrom
rohitkumarbhagat:multi-entity-prototype

Conversation

@rohitkumarbhagat
Copy link
Copy Markdown
Contributor

No description provided.

Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces support for multi-entity observations within the Spanner data source, including necessary protobuf updates and logic for querying and reconstructing observations from the normalized schema. The implementation also adds support for hydrating dimension nodes. Feedback focuses on improving data integrity and efficiency by removing hardcoded property exclusions that lead to data loss, eliminating redundant in-memory sorting of results already ordered by the database, and addressing potential data truncation during node hydration by properly handling result limits.

Comment on lines +38 to +48
var multiEntityDimensionExcludedProperties = map[string]struct{}{
"facetId": {},
"importName": {},
"isDcAggregate": {},
"measurementMethod": {},
"observationAbout": {},
"observationPeriod": {},
"provenanceUrl": {},
"scalingFactor": {},
"unit": {},
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The property observationAbout is hardcoded in the multiEntityDimensionExcludedProperties list, which leads to dimension information being lost in the response. Following repository guidelines, avoid hardcoding property-specific behaviors such as special value filtering. The API or schema should dictate the type of value and how it is handled.

References
  1. Avoid hardcoding property-specific behaviors, such as special value filtering. Instead, the API/schema should dictate the type of value and how it's handled.

Comment on lines +196 to +198
sort.Slice(pointStats, func(i, j int) bool {
return pointStats[i].GetDate() < pointStats[j].GetDate()
})
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The explicit sort of pointStats in Go is redundant because the underlying Spanner query already includes an ORDER BY svo.date ASC clause within the ARRAY subquery (see statements_normalized.go), and Spanner preserves the order of elements in the resulting array.

Suggested change
sort.Slice(pointStats, func(i, j int) bool {
return pointStats[i].GetDate() < pointStats[j].GetDate()
})
if len(pointStats) == 0 {
return nil
}

Comment on lines +270 to +273
nodeResp, err := sds.Node(ctx, &pbv2.NodeRequest{
Nodes: nodeDcids,
Property: nodePropertyExpression(nodeProperties),
}, datasources.DefaultPageSize)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

Using datasources.DefaultPageSize as a limit for sds.Node might result in incomplete hydration of dimension nodes if the number of unique DCIDs in the multi-entity observations exceeds the default page size. Unless there is a strong, documented domain constraint ensuring the number of results will not exceed this limit, pagination should be handled or the limit should be adjusted to len(nodeDcids).

References
  1. Pagination for API calls can be omitted if there's a strong, documented domain constraint ensuring the number of results will not exceed the page size limit.

@rohitkumarbhagat rohitkumarbhagat force-pushed the multi-entity-prototype branch from d943e78 to c76a0e5 Compare May 14, 2026 10:11
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant