-
Notifications
You must be signed in to change notification settings - Fork 606
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Lasso with point field #5534
base: release/v1.4.0
Are you sure you want to change the base?
Lasso with point field #5534
Conversation
WalkthroughThis pull request introduces updates to the selection handling and state management within the embeddings package. The UI now passes additional parameters (lasso points) during selection/deselection events. A new hook, Changes
Sequence Diagram(s)sequenceDiagram
participant User
participant UI
participant PlotSelection
participant ExtendedEffect
participant Server
User->>UI: Initiates lasso selection
UI->>PlotSelection: onSelected(pointIds, lassoPoints)
PlotSelection->>PlotSelection: Update state with lassoPoints
PlotSelection->>ExtendedEffect: Notify state change
ExtendedEffect->>Server: fetchExtendedStage(pointsField, lassoPoints)
Server->>Server: Process lassoPoints via _get_fiftyone_geowithin()
Server-->>ExtendedEffect: Return filtered query/result
ExtendedEffect-->>UI: Update plot data
sequenceDiagram
participant UI
participant ViewEffect
participant Server
UI->>ViewEffect: Trigger view change
ViewEffect->>Server: Fetch new plot data
Server-->>ViewEffect: Return response with points_field
ViewEffect->>ViewEffect: setPointsField(response.points_field)
Suggested reviewers
Poem
✨ Finishing Touches
🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
Documentation and Community
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (5)
fiftyone/server/routes/embeddings.py (2)
244-245
: Use consistent naming
lassoPoints = data.get("lassoPoints", None)
is descriptive, but consider consistently naming variables (e.g.,lasso_points
) throughout the codebase to match Python conventions for clarity.
246-263
: Ensure spatial indexing for efficient queriesWhen creating a Mongo
$geoWithin
query, confirm that an appropriate geospatial index (e.g., a 2dsphere index) is in place on thepoint_field
. Otherwise, queries might be slow or fail.app/packages/embeddings/src/usePlotSelection.tsx (3)
1-6
: Unify import styleImports look fine here, but confirm no default or named import collisions exist elsewhere. Also ensure Recoil dependencies match the project’s version constraints.
35-35
: Leverage typed state usage
const [lassoPoints, setLassoPoints] = useRecoilState(atoms.lassoPoints);
works. If you want stronger type safety, consider specifying the atom’s state interface.
47-50
: Clarify “unknown cases” and fallback behaviorThe TODO mentions that
handleSelected
might be called withoutlassoPoints
. Provide a fallback iflassoPoints
isundefined
, or investigate and document those “unknown cases” for maintainability.
📜 Review details
Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (6)
app/packages/embeddings/src/EmbeddingsPlot.tsx
(1 hunks)app/packages/embeddings/src/useBrainResult.tsx
(1 hunks)app/packages/embeddings/src/useExtendedStageEffect.tsx
(4 hunks)app/packages/embeddings/src/usePlotSelection.tsx
(2 hunks)app/packages/embeddings/src/useViewChangeEffect.tsx
(2 hunks)fiftyone/server/routes/embeddings.py
(2 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
`**/*.{ts,tsx}`: Review the Typescript and React code for co...
**/*.{ts,tsx}
: Review the Typescript and React code for conformity with best practices in React, Recoil, Graphql, and Typescript. Highlight any deviations.
app/packages/embeddings/src/useBrainResult.tsx
app/packages/embeddings/src/useViewChangeEffect.tsx
app/packages/embeddings/src/EmbeddingsPlot.tsx
app/packages/embeddings/src/useExtendedStageEffect.tsx
app/packages/embeddings/src/usePlotSelection.tsx
⏰ Context from checks skipped due to timeout of 90000ms (8)
- GitHub Check: e2e / test-e2e
- GitHub Check: test / test-app
- GitHub Check: test / test-python (ubuntu-latest-m, 3.11)
- GitHub Check: test / test-python (ubuntu-latest-m, 3.10)
- GitHub Check: test / test-python (ubuntu-latest-m, 3.9)
- GitHub Check: lint / eslint
- GitHub Check: build / build
- GitHub Check: build
🔇 Additional comments (14)
app/packages/embeddings/src/useBrainResult.tsx (1)
9-9
: Clean implementation of the point field hook.The new
usePointField
hook follows the same well-established pattern as the existinguseBrainResult
hook, ensuring consistency in state management throughout the application.app/packages/embeddings/src/EmbeddingsPlot.tsx (2)
95-95
: Correctly extended selection handling to include lasso points.The updated call to
handleSelected
now passes the selected point IDs along with the lasso points from the plotly selection event, enabling the spatial querying feature mentioned in the PR objectives.
98-98
: Properly handles deselection case with the updated parameters.The deselection handler now passes two
null
values instead of one, maintaining consistency with the updated function signature forhandleSelected
. This change ensures that both the point IDs and lasso points are properly cleared on deselection.app/packages/embeddings/src/useViewChangeEffect.tsx (3)
6-6
: Import updated to include the new hook.The import statement has been updated to include the newly added
usePointField
hook, correctly maintaining the module's existing imports.
14-14
: Correctly implemented point field state management.The new hook is used to create state for tracking the point field, following the same pattern as other state variables in this component.
76-76
: Properly updates point field state from API response.The point field value is correctly extracted from the plot data response and stored in state using the setter function from the hook. This ensures the value is available for use in other components.
app/packages/embeddings/src/useExtendedStageEffect.tsx (4)
6-7
: Added necessary imports for lasso points and point field functionality.The imports for selection atoms and the point field hook are correctly added to support the new features.
21-22
: Properly initialized state values for spatial querying.The component now correctly retrieves both the lasso points from Recoil state and the point field using the new hook, making them available for use in the effect.
32-33
: Enhanced fetchExtendedStage with spatial query parameters.The function call has been updated to include the lasso points and point field parameters, enabling the MongoDB-specific spatial querying capability described in the PR objectives.
42-49
: Correctly updated effect dependencies.The dependency array for the useEffect hook has been properly expanded to include the new state values, ensuring the effect re-runs when the lasso points or point field change. The array is also well-formatted for readability.
fiftyone/server/routes/embeddings.py (2)
238-238
: ValidatepointField
before usageCurrently, the code assumes the
"pointField"
key always exists indata
. If there's any chance it could be missing, consider handling aKeyError
or providing a default value.
361-401
: Confirm coordinate ordering for MongoDB$geoWithin
This function combines
x
andy
into[x, y]
pairs, which might require[longitude, latitude]
ordering for a 2dsphere index. Verify that this matches the expected coordinate system.app/packages/embeddings/src/usePlotSelection.tsx (2)
13-18
: Atom definition is clearDefining
lassoPoints
in a dedicated atom is straightforward. Good approach for global state. No concerns here.
38-41
: Signature includes optional parameter
function handleSelected(selectedResults, lassoPoints)
is a good addition. Ensure calls tohandleSelected
always provide eithernull
or a valid object for the second argument to avoid runtime errors.
Lasso with point field (updates)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (2)
fiftyone/server/routes/embeddings.py (2)
268-286
: Consider validating lasso_points structure before use.The code assumes that
lasso_points
has the expected structure with 'x' and 'y' keys. Without validation, this could lead to KeyErrors if malformed data is provided.if ( lasso_points is not None and points_field is not None and (patches_field is None or is_patches_view) ): + # Validate lasso_points structure + if not (isinstance(lasso_points, dict) and "x" in lasso_points and "y" in lasso_points): + raise ValueError("Invalid lasso_points format. Expected dict with 'x' and 'y' keys.") + # Lasso points via spatial index # Unfortunately we can't use $geoWithin to filter nested arrays. # The user must have switched to a patches view for this
274-281
: Consider closing the polygon for complete MongoDB compatibility.MongoDB requires polygons to be closed (first point = last point) for certain geospatial operations. While $geoWithin may work with unclosed polygons, it's safer to explicitly close the polygon.
points_field: { "$geoWithin": { "$polygon": [ [x, y] for x, y in zip( lasso_points["x"], lasso_points["y"], - ) + ) + ] + [[lasso_points["x"][0], lasso_points["y"][0]]] # Close the polygon ] } }
📜 Review details
Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (5)
app/packages/embeddings/src/useBrainResult.tsx
(1 hunks)app/packages/embeddings/src/useExtendedStageEffect.tsx
(4 hunks)app/packages/embeddings/src/useViewChangeEffect.tsx
(2 hunks)fiftyone/core/stages.py
(1 hunks)fiftyone/server/routes/embeddings.py
(2 hunks)
🚧 Files skipped from review as they are similar to previous changes (3)
- app/packages/embeddings/src/useExtendedStageEffect.tsx
- app/packages/embeddings/src/useBrainResult.tsx
- app/packages/embeddings/src/useViewChangeEffect.tsx
⏰ Context from checks skipped due to timeout of 90000ms (6)
- GitHub Check: test / test-python (ubuntu-latest-m, 3.10)
- GitHub Check: test / test-app
- GitHub Check: e2e / test-e2e
- GitHub Check: build / build
- GitHub Check: lint / eslint
- GitHub Check: build
🔇 Additional comments (6)
fiftyone/server/routes/embeddings.py (5)
243-244
: LGTM! Good use of safe dictionary access for new parameters.Using
data.get()
with a default value ofNone
is a robust way to extract potentially missing keys from the request data, ensuring backward compatibility with existing clients.
255-263
: Clear explanation of limitations with nested arrays.The comment effectively explains why the lasso filtering can't work with nested arrays in MongoDB and the requirement for the user to switch to a patches view for this functionality to work. This is important context for future maintainers.
263-267
: Correct field path resolution for patches.This block correctly handles the resolution of the points field path when working with patches, ensuring proper integration with the existing field path resolution mechanism.
287-287
: Correct modification of conditional.The change from
if
toelif
is necessary to make this part of the conditional chain and avoid executing this block if the lasso points condition is met. This ensures the correct logical flow through the various selection scenarios.
255-287
:❓ Verification inconclusive
Implementation of spatial queries using MongoDB's $geoWithin looks good.
The implementation correctly uses MongoDB's spatial query capabilities to filter documents based on lasso points. The code properly converts the lasso points into a polygon format suitable for MongoDB's $geoWithin operator. The conditional logic ensures this feature is only used in supported scenarios (when not dealing with nested arrays).
A few considerations:
- There's no validation that
lasso_points
has the expected structure with 'x' and 'y' keys- For optimal performance, ensure that
points_field
has a geospatial index in MongoDB
🏁 Script executed:
#!/bin/bash # Check if points_field is used in other parts of the codebase and if it's indexed # Look for index creation on the points field rg -A 3 -B 3 "create_index.*point" --glob "*.py" # Check if points_field is used in other parts of the codebase rg "points_field" --glob "*.py" | grep -v "embeddings.py"Length of output: 106
Spatial Query Implementation in
embeddings.py
- The spatial query setup using
$geoWithin
is correctly implemented for converting lasso points into a polygon.- There is no validation to ensure that
lasso_points
contains the expected keys ('x'
and'y'
). Adding such validation would help avoid potential runtime errors.- It is unclear whether the field used in the geospatial query (
points_field
) is indexed appropriately (e.g., a 2dsphere index). Manual verification or an explicit index creation in the codebase should be ensured for optimal performance.fiftyone/core/stages.py (1)
3298-3299
: Documentation addition looks goodThe addition of the
create_index
parameter documentation is well-formatted and consistent with the rest of the docstring. This parameter is important for users to understand as it allows control over whether a spherical index is automatically created for geospatial queries, which can significantly impact performance.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (1)
fiftyone/server/routes/embeddings.py (1)
248-248
: Remove or replace this debug print statementIt’s preferable to use proper logging or remove the debug statement before committing to production to avoid unintentional output.
- print(points_field) + # import logging + # logging.debug(points_field)
📜 Review details
Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
fiftyone/server/routes/embeddings.py
(4 hunks)
⏰ Context from checks skipped due to timeout of 90000ms (9)
- GitHub Check: test / test-python (ubuntu-latest-m, 3.11)
- GitHub Check: test / test-python (ubuntu-latest-m, 3.10)
- GitHub Check: test / test-python (ubuntu-latest-m, 3.9)
- GitHub Check: build / build
- GitHub Check: build / changes
- GitHub Check: test / test-app
- GitHub Check: e2e / test-e2e
- GitHub Check: lint / eslint
- GitHub Check: build
🔇 Additional comments (4)
fiftyone/server/routes/embeddings.py (4)
79-79
: No issues found with referencingpoints_field
This addition looks good and follows the existing pattern of reading configuration fields.
155-155
: Exposingpoints_field
in the response is beneficialIncluding
"points_field"
in the final JSON allows consumers to easily fetch and use this data in the client side.
245-247
: Validate the presence and shape oflassoPoints
Be sure to verify that
lassoPoints
keys ("x"
,"y"
) exist and contain matching-length arrays. Malformed inputs could lead to runtime errors when building the$polygon
.Do you want me to generate a shell script or propose a code snippet to validate
lassoPoints
and guard against malformed data?
259-291
: Add geometry validation for lasso-based$geoWithin
queriesWhile constructing a MongoDB
$polygon
, a mismatch or missing data inlasso_points["x"]
orlasso_points["y"]
will trigger errors. Additionally, consider closing the polygon if needed by your geometry convention.Here is a suggested snippet to validate input before building the
$polygon
:if ( lasso_points is not None and points_field is not None and (patches_field is None or is_patches_view) ): + # Validate lassoPoints data + if ( + not isinstance(lasso_points, dict) + or "x" not in lasso_points + or "y" not in lasso_points + or not isinstance(lasso_points["x"], list) + or not isinstance(lasso_points["y"], list) + or len(lasso_points["x"]) != len(lasso_points["y"]) + ): + raise ValueError("Invalid lassoPoints data: mismatch in x,y arrays") if patches_field is not None: _, points_field = view._get_label_field_path(patches_field, points_field)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (1)
app/packages/embeddings/src/tracesToData.tsx (1)
41-46
: Consider adding type annotations to arraysWhile pre-allocating arrays is efficient, consider adding TypeScript type annotations to these arrays for better type safety:
-const x = new Array(trace.length); -const y = new Array(trace.length); -const ids = new Array(trace.length); -const labelsForColors = - isContinuous && !isUncolored ? new Array(trace.length) : null; +const x: number[] = new Array(trace.length); +const y: number[] = new Array(trace.length); +const ids: string[] = new Array(trace.length); +const labelsForColors: number[] | null = + isContinuous && !isUncolored ? new Array(trace.length) : null;
📜 Review details
Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
app/packages/embeddings/src/tracesToData.tsx
(2 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
`**/*.{ts,tsx}`: Review the Typescript and React code for co...
**/*.{ts,tsx}
: Review the Typescript and React code for conformity with best practices in React, Recoil, Graphql, and Typescript. Highlight any deviations.
app/packages/embeddings/src/tracesToData.tsx
⏰ Context from checks skipped due to timeout of 90000ms (6)
- GitHub Check: test / test-python (ubuntu-latest-m, 3.10)
- GitHub Check: test / test-app
- GitHub Check: e2e / test-e2e
- GitHub Check: lint / eslint
- GitHub Check: build / build
- GitHub Check: build
🔇 Additional comments (6)
app/packages/embeddings/src/tracesToData.tsx (6)
23-26
: Good selection optimization with Set data structureUsing a Set for fast O(1) lookups when checking if an item is selected is an efficient approach, especially with potentially large datasets in embeddings visualizations.
27-30
: Performance improvement: pre-computing colorscale mappingMoving the colorscale mapping outside the trace iteration is a good optimization. This prevents redundant calculations for each trace when the colorscale is the same for all traces.
47-64
: Efficient imperative approach for data processingReplacing functional programming methods with a direct loop is more efficient for large datasets as it avoids creating intermediate arrays. This change aligns well with handling spatial queries for the lasso selection feature.
59-63
: Selection logic implementation looks correctThe selection logic correctly checks both
id
andsample_id
against the selection set, which is important for handling various identifier types in the data.
67-80
: Direct array assignment in return object is more efficientUsing pre-populated arrays directly in the return object rather than creating them inline with mapping functions is a good performance optimization.
91-91
: Good handling of empty selectedpointsSetting
selectedpoints
tonull
when empty rather than an empty array is a good practice for Plotly, as it helps the library distinguish between "no selection" and "empty selection" states.
What changes are proposed in this pull request?
Uses the new
points_field
param added in https://github.com/voxel51/fiftyone-brain/pull/236/files to allow for mongodb only spatial querying when lassoing in the embeddings plot.Depends on this fiftyone.brain PR
Summary by CodeRabbit
Summary by CodeRabbit
New Features
Documentation
GeoWithin
class to clarify the newcreate_index
parameter.