OccurrenceViewSet list: N+1 queries on detections, best_prediction, best_identification

## Problem

`/api/v2/occurrences/` (list) issues N+1 queries per row in the response. Under low traffic this is mostly hidden; under concurrent multi-user load it can pin gunicorn workers for tens of seconds per request and stall every other endpoint sharing the pool.

Tracked in Sentry as `AMI-PLATFORM-API-BMY` (performance issue type: N+1, ongoing). Most-recent observation 2026-04-28; first seen 2025-09-16. ~2,800 events.

## Evidence

Sentry's representative span shows the offending query repeated per occurrence row:

```sql
SELECT "main_detection"."path"
FROM "main_detection"
WHERE ("main_detection"."occurrence_id" = %s
       AND NOT ("main_detection"."path" IS NULL))
ORDER BY "main_detection"."frame_num" ASC, "main_detection"."timestamp" ASC
```

Same trace shows interleaved redis (cachalot/cache-key) calls per row, amplifying the cost.

In a recent multi-user load period (4-worker gunicorn pool, ~30 concurrent UI sessions), 158 hits to this endpoint clustered into a 10-minute window with individual span durations of **34–40 seconds** (max 40,417 ms). With 4 workers, ~3 concurrent occurrences requests was enough to fully exhaust the pool. UI list/detail endpoints sharing the pool (`/captures/`, `/jobs/`, `/deployments/{pk}/`) backed up behind it; participants experienced "every page feels frozen" during this window.

## Where

`ami/main/api/views.py:1217` — `OccurrenceViewSet.get_queryset`:

```python
def get_queryset(self) -> QuerySet["Occurrence"]:
    ...
    qs = qs.select_related("determination", "deployment", "event")
    qs = qs.with_detections_count().with_timestamps()
    qs = qs.with_identifications()
    ...
    if self.action != "list":
        qs = qs.prefetch_related(
            Prefetch("detections", queryset=Detection.objects.order_by("-timestamp").select_related("source_image"))
        )
    return qs
```

The detections prefetch is gated on `self.action != "list"`, so the list path has no prefetch on `detections`. The list serializer (`OccurrenceListSerializer`, `ami/main/api/serializers.py:1316`) then triggers per-row queries through:

- `detection_images` field → `Occurrence.detection_images()` (`ami/main/models.py:3221`) — does `Detection.objects.filter(occurrence=self).exclude(path=None).values_list("path")`. **This is the exact query Sentry detected.**
- `determination_details` (`ami/main/api/serializers.py:1367`) — already has a `# @TODO convert this to query methods to avoid N+1 queries. Currently at 100+ queries per page of 10 occurrences.` comment.
- `best_machine_prediction` → `Occurrence.best_prediction` (`models.py:3231`) — runs `self.predictions().order_by(...).first()` per row.
- `identifications` (many=True) — needs `prefetch_related("identifications")` on the list queryset; right now the comment block above suggests `with_identifications()` annotates a count but doesn't prefetch the related rows.

`best_identification` (`models.py:3243`) is called from `get_determination_details` and is another per-row hit.

## Proposed fix

Make the list path explicitly prefetch what the list serializer touches. Sketch:

```python
def get_queryset(self) -> QuerySet["Occurrence"]:
    ...
    qs = qs.select_related("determination", "deployment", "event")
    qs = qs.with_detections_count().with_timestamps()

    if self.action == "list":
        # Detections needed only for `detection_images` field — limit to the columns we read.
        qs = qs.prefetch_related(
            Prefetch(
                "detections",
                queryset=Detection.objects.exclude(path=None)
                                          .order_by("frame_num", "timestamp")
                                          .only("id", "occurrence_id", "path", "frame_num", "timestamp"),
                to_attr="prefetched_detections_for_images",
            ),
            "identifications",
            # Plus whatever predictions/best_prediction needs — likely:
            # Prefetch("detections__classifications", queryset=Classification.objects.order_by("-score").select_related("taxon", "algorithm")),
        )
        # Have `detection_images()` consume `prefetched_detections_for_images` when present
        # to avoid re-querying.
    else:
        qs = qs.prefetch_related(
            Prefetch("detections", queryset=Detection.objects.order_by("-timestamp").select_related("source_image"))
        )

    qs = qs.with_identifications()
    qs = qs.apply_default_filters(project, self.request)
    return qs
```

Plus update `Occurrence.detection_images()` to prefer the prefetched attribute when present, and audit `best_prediction` / `best_identification` for whether they can be replaced by query annotations or prefetches.

## Verification

- Under `assertNumQueries`, fetching a page of 25 occurrences should land at a small constant number of queries (single-digit), not `25 + N×detections + N×predictions + N×identifications`.
- Sentry's own profiles for this transaction (sampled at high rate on staging) should show `view.response.render` time drop sharply.

## Related

- Sentry issue ID: `AMI-PLATFORM-API-BMY` (internal)
- Existing TODO comment in `OccurrenceListSerializer.get_determination_details`: "Currently at 100+ queries per page of 10 occurrences"
- Issue #252 ("Change `determination_details` to `determination`") touches the same surface; can be sequenced against this if convenient.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

OccurrenceViewSet list: N+1 queries on detections, best_prediction, best_identification #1271

Problem

Evidence

Where

Proposed fix

Verification

Related

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

OccurrenceViewSet list: N+1 queries on detections, best_prediction, best_identification #1271

Description

Problem

Evidence

Where

Proposed fix

Verification

Related

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions