## Summary
The `GET /project/{id}/page/{pageId}/resolved` endpoint is the single largest performance bottleneck in the TPEN3 stack. It triggers a deep dependency chain across all three services and degrades rapidly under concurrent load: RERUM fails under the amplified request volume, cascading 502s through the stack.
## How it works
When called, `/resolved` fetches the page data, then makes N parallel RERUM requests (one per page item/annotation) to resolve each annotation's full content. For a page with 700 items, that's 700 outgoing HTTP requests fanning out through TinyPEN to RERUM.
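The fan-out described above can be sketched as follows. This is illustrative only — the function and field names are assumptions, not the actual TPEN3 source:

```javascript
// Sketch of the current /resolved behavior (names are illustrative):
// one outgoing HTTP request per annotation, all fired at once.
async function resolvePage(page, fetchAnnotation) {
  // For a 700-item page this launches 700 concurrent requests,
  // which fan out through TinyPEN to RERUM with no upper bound.
  return Promise.all(page.items.map((item) => fetchAnnotation(item.id)));
}

// Usage with a stub resolver standing in for the RERUM /id/{id} call:
const page = {
  items: Array.from({ length: 700 }, (_, i) => ({ id: `anno-${i}` })),
};
resolvePage(page, async (id) => ({ "@id": id })).then((resolved) => {
  console.log(resolved.length); // 700
});
```

Because `Promise.all` starts every request immediately, request count scales linearly with page size and there is no backpressure — which is why the endpoint degrades as the test page grows.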
## Evidence
From load testing (Run 4):
| Phase | VUs | /resolved p95 | Impact |
|---|---|---|---|
| Phase 1 (baseline) | 1 | 2.8-4.5s | Slow even at single user |
| Phase 2 (load) | 20 | ~4.7s | Drives overall read p95 above threshold |
| Phase 3 (stress) | 50+ | 14.5s+ | Causes queuing that affects ALL endpoints |
| Phase 4 (spike) | 80 | 25s+ | Queue saturation, 54s timeouts at 150 VUs |
Key observations:
- At 80+ VUs, `/resolved` causes a cascading queue that blocks unrelated endpoints (simple GETs that would normally take <100ms start taking 5-10s)
- The page used for testing has grown to 700+ items across test runs. Each run adds lines via the load/stress/conflict phases, making `/resolved` progressively slower (see the finding in "/save high level route and test" #15 in the test plan)
- This is the primary reason the stack can't handle 30+ concurrent users
## Recommendations

In priority order:

1. **Cache resolved responses.** Even a 30-second TTL cache would dramatically reduce RERUM load; annotations rarely change between page loads.
2. **Add pagination.** Return resolved items in pages (e.g., 50 at a time) so the client can render progressively rather than waiting for all 700 items to resolve.
3. **Batch RERUM lookups.** Instead of N individual `/id/{id}` requests, use a single query such as `POST /query` with `{"@id": {"$in": [...ids]}}` to fetch many annotations at once.
4. **Limit concurrent outgoing requests.** If N=700, don't fire all 700 simultaneously; use a concurrency pool (e.g., 20 at a time) to avoid overwhelming RERUM.
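The batch-query and concurrency-pool recommendations above can be combined. A minimal sketch, assuming a hypothetical `queryRerum` helper (not from the TPEN3 codebase) that POSTs an `$in` query to RERUM's `/query` endpoint:

```javascript
// Split annotation ids into fixed-size batches for POST /query.
function chunk(ids, size) {
  const out = [];
  for (let i = 0; i < ids.length; i += size) out.push(ids.slice(i, i + size));
  return out;
}

// Bounded concurrency pool: at most `limit` workers run at once;
// each worker pulls the next item as soon as it finishes.
async function mapWithConcurrency(items, limit, worker) {
  const results = new Array(items.length);
  let next = 0;
  async function run() {
    while (next < items.length) {
      const i = next++; // claim the next index (safe: JS is single-threaded)
      results[i] = await worker(items[i], i);
    }
  }
  await Promise.all(
    Array.from({ length: Math.min(limit, items.length) }, run)
  );
  return results; // same order as the input
}

// Hypothetical usage: 700 ids become 7 batch queries, at most 3 in flight.
// `queryRerum` is an assumed helper that POSTs
// {"@id": {"$in": batch}} to RERUM's /query endpoint.
//
// const batches = chunk(ids, 100);
// const results = await mapWithConcurrency(batches, 3, queryRerum);
// const annotations = results.flat();
```

This turns 700 outgoing requests into a handful of batched queries with a hard cap on in-flight work, so load on RERUM no longer scales one-to-one with page size.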