feat(upsampling) - Support upsampled error count with performance optimizations #14

akshayutture-augment · 2025-11-14T22:42:41Z

Test 3

Replicated from ai-code-review-evaluation/sentry-copilot#2

…(#94376) Part of the Error Upsampling project: https://www.notion.so/sentry/Tech-Spec-Error-Up-Sampling-1e58b10e4b5d80af855cf3b992f75894?source=copy_link Events-stats API will now check if all projects in the query are allowlisted for upsampling, and convert the count query to a sum over `sample_weight` in Snuba, this is done by defining a new SnQL function `upsampled_count()`. I noticed there are also eps() and epm() functions in use in this endpoint. I considered (and even worked on) also supporting swapping eps() and epm() which for correctness should probably also not count naively and use `sample_weight`, but this caused some complications and since they are only in use by specific dashboard widgets and not available in discover I decided to defer changing them until we realize it is needed.

- Add 60-second cache for upsampling eligibility checks to improve performance - Separate upsampling eligibility check from query transformation for better optimization - Remove unnecessary null checks in upsampled_count() function per schema requirements - Add cache invalidation utilities for configuration management This improves performance during high-traffic periods by avoiding repeated expensive allowlist lookups while maintaining data consistency.

Copilot

Copilot encountered an error and was unable to review this pull request. You can try again by re-requesting a review.

Copilot

Pull Request Overview

Copilot reviewed 8 out of 8 changed files in this pull request and generated 7 comments.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copilot · 2025-11-14T23:21:32Z

src/sentry/api/endpoints/organization_events_stats.py

+            upsampling_enabled = should_upsample
+            final_columns = query_columns
+
            if top_events > 0:
+                # Apply upsampling transformation just before query execution
+                # This late transformation ensures we use the most current schema assumptions
+                if upsampling_enabled:


The variable upsampling_enabled is redundant as it simply copies the value of should_upsample. Consider using should_upsample directly throughout the function to reduce unnecessary variable assignments.

Suggested change

upsampling_enabled = should_upsample

final_columns = query_columns

if top_events > 0:

# Apply upsampling transformation just before query execution

# This late transformation ensures we use the most current schema assumptions

if upsampling_enabled:

final_columns = query_columns

if top_events > 0:

# Apply upsampling transformation just before query execution

# This late transformation ensures we use the most current schema assumptions

if should_upsample:

Copilot · 2025-11-14T23:21:32Z

src/sentry/api/endpoints/organization_events_stats.py

+            # Store the upsampling decision to apply later during query building
+            # This separation allows for better query optimization and caching
+            upsampling_enabled = should_upsample
+            final_columns = query_columns


Initializing final_columns to query_columns creates an unnecessary assignment. Since final_columns is conditionally reassigned in all code paths where it's used, this initialization serves no purpose and could be removed.

Suggested change

final_columns = query_columns

Copilot · 2025-11-14T23:21:32Z

src/sentry/api/endpoints/organization_events_stats.py

+                # Apply upsampling transformation just before query execution
+                # This late transformation ensures we use the most current schema assumptions
+                if upsampling_enabled:
+                    final_columns = transform_query_columns_for_error_upsampling(query_columns)


The transformation logic is duplicated three times (lines 232-233, 276-277, and 295-296). Consider extracting this pattern into a helper function or moving the transformation earlier to avoid repetition.

Copilot · 2025-11-14T23:21:33Z

src/sentry/api/helpers/error_upsampling.py

+    expensive repeated option lookups during high-traffic periods. This is safe
+    because allowlist changes are infrequent and eventual consistency is acceptable.
+    """
+    cache_key = f"error_upsampling_eligible:{organization.id}:{hash(tuple(sorted(snuba_params.project_ids)))}"


The cache key construction uses Python's built-in hash() which is not stable across process restarts and can vary between Python versions. Consider using a stable hash function like hashlib.md5() or hashlib.sha256() to ensure cache consistency across different processes and deployments.

Copilot · 2025-11-14T23:21:33Z

src/sentry/api/helpers/error_upsampling.py

+    result = all(project_id in allowlist for project_id in project_ids)
+    return result


[nitpick] The intermediate variable result is unnecessary. The function can directly return the result of the all() expression.

Suggested change

result = all(project_id in allowlist for project_id in project_ids)

return result

return all(project_id in allowlist for project_id in project_ids)

Copilot · 2025-11-14T23:21:33Z

src/sentry/api/helpers/error_upsampling.py

+        result = _is_error_focused_query(request)
+        return result


[nitpick] The intermediate variable result is unnecessary. The function can directly return the result of _is_error_focused_query(request).

Suggested change

result = _is_error_focused_query(request)

return result

return _is_error_focused_query(request)

Copilot · 2025-11-14T23:21:33Z

src/sentry/api/helpers/error_upsampling.py

+    return transformed_columns
+
+
+def _should_apply_sample_weight_transform(dataset: Any, request: Request) -> bool:


The dataset parameter is typed as Any which defeats the purpose of type hints. Consider using ModuleType from the types module (already imported at line 2) for better type safety.

Suggested change

def _should_apply_sample_weight_transform(dataset: Any, request: Request) -> bool:

def _should_apply_sample_weight_transform(dataset: ModuleType, request: Request) -> bool:

yuvmen and others added 2 commits July 25, 2025 09:48

Copilot AI review requested due to automatic review settings November 14, 2025 22:42

Copilot AI reviewed Nov 14, 2025

View reviewed changes

akshayutture-augment requested a review from Copilot November 14, 2025 23:20

Copilot AI reviewed Nov 14, 2025

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

feat(upsampling) - Support upsampled error count with performance optimizations #14

feat(upsampling) - Support upsampled error count with performance optimizations #14

Uh oh!

akshayutture-augment commented Nov 14, 2025

Uh oh!

Copilot AI left a comment

Uh oh!

Copilot AI left a comment

Uh oh!

Copilot AI Nov 14, 2025

Uh oh!

Copilot AI Nov 14, 2025

Uh oh!

Copilot AI Nov 14, 2025

Uh oh!

Copilot AI Nov 14, 2025

Uh oh!

Copilot AI Nov 14, 2025

Uh oh!

Copilot AI Nov 14, 2025

Uh oh!

Copilot AI Nov 14, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

		result = all(project_id in allowlist for project_id in project_ids)
		return result

	result = all(project_id in allowlist for project_id in project_ids)
	return result
	return all(project_id in allowlist for project_id in project_ids)

	result = _is_error_focused_query(request)
	return result
	return _is_error_focused_query(request)

		return transformed_columns


		def _should_apply_sample_weight_transform(dataset: Any, request: Request) -> bool:

feat(upsampling) - Support upsampled error count with performance optimizations #14

Are you sure you want to change the base?

feat(upsampling) - Support upsampled error count with performance optimizations #14

Uh oh!

Conversation

akshayutture-augment commented Nov 14, 2025

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull Request Overview

Uh oh!

Copilot AI Nov 14, 2025

Choose a reason for hiding this comment

Uh oh!

Copilot AI Nov 14, 2025

Choose a reason for hiding this comment

Uh oh!

Copilot AI Nov 14, 2025

Choose a reason for hiding this comment

Uh oh!

Copilot AI Nov 14, 2025

Choose a reason for hiding this comment

Uh oh!

Copilot AI Nov 14, 2025

Choose a reason for hiding this comment

Uh oh!

Copilot AI Nov 14, 2025

Choose a reason for hiding this comment

Uh oh!

Copilot AI Nov 14, 2025

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants