@Jibola Jibola commented Sep 8, 2025

Summary

An addendum to INTPYTHON-739. This PR implements support for $getField expressions in the expression converter system, so that MongoDB aggregation expressions that use $getField can be optimized into simpler query conditions. This is especially useful when optimizing queries against embedded model fields.

A $getField call must meet these conditions to be converted:

{
    $getField: {
        input: "$path" | { $getField: {...} },  # A string starting with "$", or a nested $getField
        field: "path"  # A string with no "$" or "."
    }
}

This supports nested $getField operations.
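
As a rough illustration of the conditions above, the following sketch turns a (possibly nested) $getField expression into a dot notation path. The function name getfield_to_path is hypothetical and is not taken from the PR; the rejection of "$$" variables is an added assumption.

```python
def getfield_to_path(expr):
    """Return a dotted path for a (possibly nested) $getField, or None.

    Hypothetical helper for illustration; not the PR's implementation.
    """
    # Base case: a plain field reference such as "$value".
    # Assumption: "$$" variables (e.g. "$$ROOT") are not field paths.
    if isinstance(expr, str):
        if expr.startswith("$") and not expr.startswith("$$") and len(expr) > 1:
            return expr[1:]
        return None
    # Only a dict of the exact shape {"$getField": {...}} qualifies.
    if not (isinstance(expr, dict) and set(expr) == {"$getField"}):
        return None
    spec = expr["$getField"]
    if not (isinstance(spec, dict) and set(spec) == {"input", "field"}):
        return None
    field = spec["field"]
    # "field" must be a non-empty string with no "$" or ".".
    if not isinstance(field, str) or not field or "$" in field or "." in field:
        return None
    # Resolve the input first so nested calls build the path left to right.
    parent = getfield_to_path(spec["input"])
    return f"{parent}.{field}" if parent else None


nested = {
    "$getField": {
        "input": {"$getField": {"input": "$value", "field": "a"}},
        "field": "b",
    }
}
print(getfield_to_path(nested))
```

Anything that does not match the expected shape returns None, so a caller can leave that expression untouched.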

Changes in this PR

  • Adds helper methods to detect and convert $getField expressions to dot notation paths
  • Updates binary and $in operators to handle $getField expressions alongside simple field references
  • Adds comprehensive test coverage for $getField conversion scenarios
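
To make the operator handling concrete, here is a minimal sketch of how a binary operator or $in inside $expr might be turned into a $match condition once the left-hand side resolves to a path. The names convert_op, _simple_path, and _is_literal are hypothetical, and the set of operators and the rules for what counts as a literal are assumptions rather than the PR's actual behavior. None is deliberately left unconverted here because null checks need separate handling.

```python
_BINARY_OPS = {"$eq", "$gt", "$gte", "$lt", "$lte"}  # assumed operator set


def _simple_path(expr):
    """Stand-in resolver: handles only plain "$field" references."""
    if isinstance(expr, str) and expr.startswith("$") and not expr.startswith("$$"):
        return expr[1:] or None
    return None


def _is_literal(value):
    """Conservative check: reject None, containers, and "$" references."""
    if value is None or isinstance(value, (dict, list, tuple, set)):
        return False
    return not (isinstance(value, str) and value.startswith("$"))


def convert_op(expr, to_path=_simple_path):
    """Convert {op: [lhs, rhs]} into a $match condition, or return None."""
    if not (isinstance(expr, dict) and len(expr) == 1):
        return None
    op, args = next(iter(expr.items()))
    if not (isinstance(args, list) and len(args) == 2):
        return None
    path = to_path(args[0])
    if path is None:
        return None
    lhs_value = args[1]
    if op in _BINARY_OPS and _is_literal(lhs_value):
        # Equality uses the short form; other operators keep their name.
        return {path: lhs_value} if op == "$eq" else {path: {op: lhs_value}}
    if (
        op == "$in"
        and isinstance(lhs_value, (list, tuple, set))
        and all(_is_literal(item) for item in lhs_value)
    ):
        return {path: {"$in": list(lhs_value)}}
    return None


print(convert_op({"$gt": ["$price", 5]}))
```

Passing a $getField-aware resolver as to_path would let the same function accept $getField expressions alongside simple field references.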

Test Plan

  • Manual testing through python manage.py shell. See the example query below.
  • Added test cases.

Screenshots (grabbed text)

# Before
>>> NullableJSONModel.objects.filter(value__a="b", value__j=None)
{
    "$match": {
        "$and": [
            {
                "$expr": {
                    "$eq": [
                        {
                            "$getField": {
                                "input": "$value",
                                "field": "a"
                            }
                        },
                        "b"
                    ]
                }
            },
            {
                "$expr": {
                    "$eq": [
                        {
                            "$getField": {
                                "input": "$value",
                                "field": "j"
                            }
                        },
                        null
                    ]
                }
            }
        ]
    }
}

# After 
>>> NullableJSONModel.objects.filter(value__a="b", value__j=None)
{
    "$match": {
        "$and": [
            {
                "value.a": "b"
            },
            {
                "$and": [
                    {
                        "value.j": {
                            "$exists": true
                        }
                    },
                    {
                        "value.j": null
                    }
                ]
            }
        ]
    }
}

Callouts

  • This may be the last addition to the optimizer, as it extends the conversion of simple lookups to embedded model field paths.
  • I decided not to convert $getField calls unless they feed into a $match conversion, because silently mutating $getField is best dealt with in the larger refactor.
  • NULL checks have to explicitly check for both field existence and equality. As such, any time None is used in a lookup, the optimization needs to wrap two statements in an $and query.
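
The null handling in the last callout can be sketched as follows. In a MongoDB query filter, {"path": null} on its own also matches documents where the field is missing, which is why the existence check is paired with it. The function name eq_condition is hypothetical.

```python
def eq_condition(path, value):
    """Build a $match equality condition for a dotted path.

    Hypothetical helper: a None value is wrapped in $and so that the
    field must exist and equal null.
    """
    if value is None:
        return {"$and": [{path: {"$exists": True}}, {path: None}]}
    return {path: value}


print(eq_condition("value.j", None))
print(eq_condition("value.a", "b"))
```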

@Jibola Jibola changed the base branch from main to expr-conversion-poc September 8, 2025 11:13
@timgraham timgraham force-pushed the expr-conversion-poc branch 2 times, most recently from b99aba3 to 94eefad Compare September 8, 2025 11:44
Base automatically changed from expr-conversion-poc to main September 8, 2025 12:21
expr = {
"$expr": {
"$gt": [
{"$getField": {"input": "$price", "field": "value"}},
Contributor

Can we optimize this down to remove the $getField when the field names don't contain $ or .? Something like:

{  
  "$match": {  
    "$expr": {  
      "$gt": ["$price.value", "$discounted_price.value"]  
    }  
  }  
}  
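
For reference, the rewrite suggested in this comment could look roughly like the sketch below, which walks an expression and replaces each qualifying $getField with a "$a.b" style reference while staying inside $expr. The function name inline_getfield is hypothetical, the second operand in the usage example is assumed from the suggestion above, and skipping $literal is an added assumption about one case where a blind rewrite would change meaning.

```python
def inline_getfield(expr):
    """Replace qualifying $getField calls with "$path.field" strings.

    Hypothetical sketch of the reviewer's suggestion; not part of the PR.
    """
    if isinstance(expr, list):
        return [inline_getfield(item) for item in expr]
    if not isinstance(expr, dict):
        return expr
    # Assumption: contents of $literal are data, not expressions.
    if "$literal" in expr:
        return expr
    if set(expr) == {"$getField"}:
        spec = expr["$getField"]
        if isinstance(spec, dict) and set(spec) == {"input", "field"}:
            source = inline_getfield(spec["input"])
            field = spec["field"]
            if (
                isinstance(source, str)
                and source.startswith("$")
                and not source.startswith("$$")
                and isinstance(field, str)
                and field
                and "$" not in field
                and "." not in field
            ):
                return f"{source}.{field}"
        # Anything that does not qualify is returned unchanged.
        return expr
    return {key: inline_getfield(value) for key, value in expr.items()}


example = {
    "$expr": {
        "$gt": [
            {"$getField": {"input": "$price", "field": "value"}},
            {"$getField": {"input": "$discounted_price", "field": "value"}},
        ]
    }
}
print(inline_getfield(example))
```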

Contributor Author

We could. I'm trying to think of any case where resolving it this way would mistakenly change the result.

Contributor Author

@Jibola Jibola Sep 8, 2025

This would have to be a two-phase conversion. I think it may even be best to still change it in our actual query code.

Contributor

I'm assuming here that this change ends up being needed. Holding off until we either get customer complaints or the refactor can't solve this makes more sense to me for now.

Contributor Author

@Jibola Jibola Sep 8, 2025

Yeah, I tried to make it work (and I think I managed to), but the silent mutation of $getField to $field worries me. Even if I manage to make the tests pass, it feels out of scope for the optimizer's goal, which is to explicitly change things from $expr to $match.

@Jibola Jibola requested a review from Copilot September 10, 2025 23:56
@Copilot Copilot AI left a comment

Pull Request Overview

This PR implements support for $getField expressions in the expression converter system, allowing MongoDB aggregation expressions using $getField to be optimized into simpler query conditions.

  • Adds helper methods to detect and convert $getField expressions to dot notation paths
  • Updates binary and $in operators to handle $getField expressions alongside simple field references
  • Adds comprehensive test coverage for $getField conversion scenarios

Reviewed Changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 1 comment.

File: Description
  • django_mongodb_backend/query_conversion/expression_converters.py: Core implementation of $getField support with path conversion methods and updated operator handling
  • tests/expression_converter_/test_op_expressions.py: Test cases for $getField conversion across all binary operators and $in operator
  • tests/expression_converter_/test_match_conversion.py: Integration tests for $getField optimization in match expressions

# Check if first argument is a simple field reference
# Check if second argument is a list of simple values
if (field_name := cls.convert_path_name(field_expr)) and (
isinstance(values, list | tuple | set)

Copilot AI Sep 10, 2025

Use explicit tuple syntax (list, tuple, set) instead of union operator | for better compatibility with older Python versions.

Suggested change
- isinstance(values, list | tuple | set)
+ isinstance(values, (list, tuple, set))
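
As background for this suggestion: isinstance() accepts an X | Y union as its second argument only on Python 3.10 and later, while the tuple form works on every Python 3 version. The short check below shows both spellings giving the same answer where the union form is available. The helper name is_simple_container is hypothetical.

```python
import sys


def is_simple_container(values):
    """Tuple form of the check; works on all Python 3 versions."""
    return isinstance(values, (list, tuple, set))


values = ("a", "b")
print(is_simple_container(values))

# The union form is evaluated only where the interpreter supports it.
if sys.version_info >= (3, 10):
    print(isinstance(values, list | tuple | set) == is_simple_container(values))
```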

@Jibola Jibola changed the title from "WIP: Expr support getfield" to "INTPYTHON-739: (Addendum) Support converting $expr with $getField paths" Sep 11, 2025
@Jibola Jibola marked this pull request as ready for review September 11, 2025 01:38