This document describes the methodology and philosophy behind performance testing across all packages in the dx-toolkit monorepo.
Processing lag tests measure the overhead introduced by our abstraction layers compared to raw You.com API calls: they quantify how much time our code adds while wrapping the APIs.
Important: We cannot improve the You.com API performance itself. These tests measure our code's overhead, not the underlying API speed.
**Processing Lag**
Definition: The absolute time difference between a raw API call and the same call made through our abstraction layer.
- Measured in: milliseconds (ms)
- Formula: Processing lag = Abstraction time - Raw API time
- Example: If the raw API takes 500ms and our wrapper takes 535ms, the processing lag is 35ms
- Interpretation:
- Negative values mean our code is faster (caching, optimizations)
- Positive values mean our code adds overhead (validation, transformation)
**Overhead Percentage**
Definition: The relative overhead as a percentage of raw API time.
- Measured in: percentage (%)
- Formula: Overhead = (Processing lag / Raw API time) × 100
- Example: 35ms lag on a 500ms raw API call = 7% overhead
- Interpretation:
- Fast APIs (50-100ms) will show high % overhead due to fixed costs
- Slow APIs (500ms+) will show low % overhead
- Absolute lag (ms) is often more meaningful than relative %
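Taken together, the two formulas can be sketched as a small helper. This is an illustrative sketch, not toolkit code; `computeMetrics`, `lagMs`, and `overheadPct` are hypothetical names.

```typescript
// Illustrative only: computes both metrics defined above from a raw API
// timing and a wrapper timing. Names are hypothetical, not toolkit APIs.
interface LagMetrics {
  lagMs: number;       // processing lag: wrapper time minus raw time
  overheadPct: number; // relative overhead as a % of raw API time
}

const computeMetrics = (rawMs: number, wrapperMs: number): LagMetrics => {
  const lagMs = wrapperMs - rawMs;           // Abstraction time - Raw API time
  const overheadPct = (lagMs / rawMs) * 100; // (Processing lag / Raw API time) × 100
  return { lagMs, overheadPct };
};

// The document's example: 500ms raw, 535ms wrapper gives 35ms lag (7% overhead).
// A negative lagMs would indicate the wrapper is faster (e.g. due to caching).
```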
**Memory Overhead**
Definition: The heap memory growth attributable to our abstraction layer.
- Measured in: kilobytes (KB)
- Method: Compare heap size before and after operations with forced GC
- Includes: Memory for data transformation, validation, schemas, buffers
- Interpretation: Sustained memory usage, not temporary allocations
**Warmup Phase**
Purpose: Eliminate cold-start effects before measurements.
Actions:
- Run each operation once before measuring
- Ensures JIT compilation is complete
- Loads all modules into memory
- Initializes connection pools
Why: First-run operations are 2-10x slower due to:
- JIT compilation
- Module loading
- DNS resolution
- TLS handshake establishment
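The warmup phase described above might look like the following sketch; `operations` stands in for the real raw and wrapper calls (hypothetical names, not toolkit code).

```typescript
// Sketch of a warmup pass: run each operation once and discard the result,
// so JIT compilation, module loading, DNS resolution, and TLS setup happen
// before any measurement. The operations themselves are placeholders.
const warmup = async (operations: Array<() => Promise<unknown>>): Promise<void> => {
  for (const op of operations) {
    await op(); // result and timing intentionally ignored
  }
};

// Usage (assuming rawApiCall and wrapperCall exist):
// await warmup([rawApiCall, wrapperCall]);
```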
**Measurement Phase**
Purpose: Collect statistically reliable timing data.
Process:
```typescript
const rawTimes: number[] = [];
const wrapperTimes: number[] = [];

for (let i = 0; i < iterations; i++) {
  // Raw API call (baseline)
  const rawStart = performance.now();
  await rawApiCall();
  rawTimes.push(performance.now() - rawStart);

  // Abstraction layer call (with overhead)
  const wrapperStart = performance.now();
  await wrapperCall();
  wrapperTimes.push(performance.now() - wrapperStart);
}

// Calculate averages
const avgRaw = average(rawTimes);
const avgWrapper = average(wrapperTimes);
const processingLag = avgWrapper - avgRaw;
```

Iterations:
- Fast operations (< 200ms): 10 iterations
- Slow operations (> 500ms): 5 iterations
- Reason: More iterations smooth out network variability
Timing Tool: `performance.now()` provides microsecond-level precision
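The `average` used in the measurement loop, plus a variance check, might be sketched like this; the `relativeStdDev` helper is our own addition for spotting noisy runs, not part of the documented process.

```typescript
// Hypothetical aggregation helpers for the per-iteration samples.
const average = (xs: number[]): number =>
  xs.reduce((sum, x) => sum + x, 0) / xs.length;

// Relative standard deviation (%). A large value suggests network
// variability or too few iterations; see the troubleshooting section.
const relativeStdDev = (xs: number[]): number => {
  const mean = average(xs);
  const variance = average(xs.map((x) => (x - mean) ** 2));
  return (Math.sqrt(variance) / mean) * 100;
};
```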
**Memory Measurement**
Purpose: Track sustained heap growth from abstraction overhead.
Process:
```typescript
import { heapStats } from "bun:jsc";

// Force GC and wait for stabilization
Bun.gc(true);
await new Promise((resolve) => setTimeout(resolve, 100));
const heapBefore = heapStats().heapSize;

// Run operations
for (let i = 0; i < iterations; i++) {
  await operation();
}

// Force GC again
Bun.gc(true);
await new Promise((resolve) => setTimeout(resolve, 100));
const heapAfter = heapStats().heapSize;
const heapGrowth = heapAfter - heapBefore;
```

Why Force GC: Ensures we measure sustained memory, not temporary allocations.
Processing lag: 15ms
Overhead: 3%
Memory: 50KB
✅ Minimal overhead, efficient implementation
Processing lag: 80ms
Overhead: 40%
Memory: 350KB
✅ Within thresholds, acceptable for complex abstractions
Processing lag: 150ms
Overhead: 75%
Memory: 500KB
⚠️ At the threshold limits, review for optimization opportunities
Processing lag: 500ms
Overhead: 200%
Memory: 2MB
🚨 Significant overhead requiring immediate investigation
Each package should set thresholds based on its architecture. Different package types have different overhead characteristics:
| Package Type | Lag Threshold | Overhead Threshold | Memory Threshold | Rationale |
|---|---|---|---|---|
| Thin library wrappers | < 50ms | < 10% | < 300KB | Minimal transformation, no transport overhead |
| SDK integrations | < 80ms | < 35% | < 350KB | Moderate data transformation and validation |
| MCP servers | < 100ms | < 50% | < 400KB | Includes stdio/JSON-RPC transport + protocol overhead |
| Complex frameworks | < 150ms | < 75% | < 500KB | Multiple abstraction layers, state management |
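One way to encode the table above as configuration; the shape and names here are hypothetical, not the toolkit's actual config format.

```typescript
// Hypothetical threshold configuration mirroring the table above.
interface Thresholds {
  lagMs: number;       // max acceptable processing lag
  overheadPct: number; // max acceptable relative overhead
  memoryKb: number;    // max acceptable heap growth
}

const thresholdsByPackageType: Record<string, Thresholds> = {
  "thin-wrapper":      { lagMs: 50,  overheadPct: 10, memoryKb: 300 },
  "sdk-integration":   { lagMs: 80,  overheadPct: 35, memoryKb: 350 },
  "mcp-server":        { lagMs: 100, overheadPct: 50, memoryKb: 400 },
  "complex-framework": { lagMs: 150, overheadPct: 75, memoryKb: 500 },
};

const withinThresholds = (
  t: Thresholds,
  lagMs: number,
  overheadPct: number,
  memoryKb: number,
): boolean => lagMs < t.lagMs && overheadPct < t.overheadPct && memoryKb < t.memoryKb;
```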
MCP servers (like @youdotcom-oss/mcp) have higher thresholds (100ms/50%/400KB) compared to library packages (50ms/10%/300KB) because of architectural differences:
MCP Server Overhead Sources:
- Stdio transport: Process IPC adds 20-40ms latency
- JSON-RPC protocol: Serialization/deserialization adds 10-20ms
- Client SDK: `@modelcontextprotocol/sdk` protocol overhead
- Process spawning: Each test spawns the MCP server as a subprocess
- State management: Client state, connection pools, schemas
Library Package Overhead Sources:
- Data transformation: Converting between formats
- Validation: Zod schema validation (5-15ms)
- Error handling: Try/catch and error formatting
Example Comparison:
```typescript
// Library wrapper (50ms threshold)
export const callApi = async (params: unknown) => {
  // Just validation + fetch + transform
  const validated = schema.parse(params); // 5ms
  const response = await fetch(url, { method: "POST", body: JSON.stringify(validated) }); // API time (not counted)
  return transform(response); // 10ms
  // Total overhead: ~15ms
};

// MCP server (100ms threshold)
const result = await client.callTool({ name: 'api', arguments: params });
// Overhead includes:
// - Stdio IPC: 30ms
// - JSON-RPC: 15ms
// - Validation: 5ms
// - Transform: 10ms
// - Client SDK: 20ms
// Total overhead: ~80ms
```

- Fixed overhead: Serialization, validation, setup (same regardless of data size)
- Proportional overhead: Data transformation, parsing (grows with data size)
- Note: Fixed overhead shows high % on fast operations
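The fixed/proportional split can be estimated by modeling overhead(size) ≈ fixed + perUnit × size and solving from measurements at two payload sizes. A sketch under that linear-model assumption; all names are illustrative.

```typescript
// Illustrative sketch: fit overhead(size) ≈ fixed + perUnit × size from
// two measurements at different payload sizes and return both components.
const decomposeOverhead = (
  smallSize: number, smallOverheadMs: number,
  largeSize: number, largeOverheadMs: number,
): { fixedMs: number; perUnitMs: number } => {
  const perUnitMs = (largeOverheadMs - smallOverheadMs) / (largeSize - smallSize);
  const fixedMs = smallOverheadMs - perUnitMs * smallSize;
  return { fixedMs, perUnitMs };
};

// e.g. 20ms overhead at a 1KB payload and 35ms at 16KB implies roughly
// 19ms of fixed overhead plus 1ms per KB of proportional overhead.
```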
Human perception of latency:
- Imperceptible: < 50ms
- Perceivable: 50-150ms
- Noticeable: 150-300ms
- Annoying: > 300ms
Aim for thresholds that keep total lag below perception thresholds.
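The perception bands above can be expressed as a small classifier. Illustrative only; how the exact boundary values (50, 150, 300ms) are binned is our interpretation of the listed ranges.

```typescript
// Maps a processing lag to the perception bands listed above. The binning
// of exact boundary values is our choice, not specified by the document.
const perceivedAs = (
  lagMs: number,
): "imperceptible" | "perceivable" | "noticeable" | "annoying" => {
  if (lagMs < 50) return "imperceptible";
  if (lagMs <= 150) return "perceivable";
  if (lagMs <= 300) return "noticeable";
  return "annoying";
};
```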
Last Updated: 2026-05-04T13:32:09.948Z
| Package | Processing Lag | Overhead % | Memory | Status |
|---|---|---|---|---|
| @youdotcom-oss/mcp | ✅ 18.00ms (< 100.00ms) | ✅ 4.48% (< 50.00%) | ✅ 105.24KB (< 400.00KB) | ✅ Pass |
| @youdotcom-oss/ai-sdk-plugin | ✅ 38.31ms (< 80.00ms) | ✅ 10.25% (< 35.00%) | ✅ 15.62KB (< 350.00KB) | ✅ Pass |
Detailed metrics:
**@youdotcom-oss/mcp**
Timestamp: 2026-05-04T13:31:46.880Z
Processing Lag:
- Raw API avg: 401.59ms
- Wrapper avg: 419.59ms
- Processing lag: 18.00ms
- Threshold: < 100.00ms
- Status: ✅ Pass
Overhead:
- Percentage: 4.48%
- Threshold: < 50.00%
- Status: ✅ Pass
Memory:
- Heap before: 11804.10KB
- Heap after: 11909.34KB
- Growth: 105.24KB
- Threshold: < 400.00KB
- Status: ✅ Pass
Test Configuration:
- Iterations: 20
**@youdotcom-oss/ai-sdk-plugin**
Timestamp: 2026-05-04T13:32:09.205Z
Processing Lag:
- Raw API avg: 373.85ms
- Wrapper avg: 412.16ms
- Processing lag: 38.31ms
- Threshold: < 80.00ms
- Status: ✅ Pass
Overhead:
- Percentage: 10.25%
- Threshold: < 35.00%
- Status: ✅ Pass
Memory:
- Heap before: 11376.56KB
- Heap after: 11392.18KB
- Growth: 15.62KB
- Threshold: < 350.00KB
- Status: ✅ Pass
Test Configuration:
- Iterations: 20
To measure performance for all packages manually:
```shell
# From repository root
bun scripts/performance/measure.ts > results.json
```

This measures all packages in a single run and outputs results as JSON. The weekly workflow uses this same script.
**@youdotcom-oss/mcp**
- Processing lag: < 100ms
- Overhead percentage: < 50%
- Memory overhead: < 400KB
**@youdotcom-oss/ai-sdk-plugin**
- Processing lag: < 80ms
- Overhead percentage: < 35%
- Memory overhead: < 350KB
**High Processing Lag**
Symptoms: Consistently exceeds the lag threshold by 2x or more
Common Causes:
- Unnecessary data transformations
- Redundant validation passes
- Inefficient serialization
- Synchronous blocking operations
- N+1 query patterns
Debug Approach:
```shell
# Profile with CPU profiler
bun --cpu-prof test src/tests/processing-lag.spec.ts

# Identify hotspots in the generated .cpuprofile file
```

**High Memory Overhead**
Symptoms: Heap growth exceeds threshold
Common Causes:
- Large schema objects not being reused
- Response data not being garbage collected
- Caching without bounds
- Circular references preventing GC
- Memory leaks in event listeners
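One of the causes above, caching without bounds, can be avoided with a size-capped cache. A minimal sketch, not the toolkit's implementation; evicting the oldest insertion rather than maintaining a true LRU order keeps it short.

```typescript
// Minimal size-bounded cache: once maxEntries is reached, inserting a new
// key evicts the oldest entry. (Map preserves insertion order, so the
// first key returned by keys() is the oldest.)
class BoundedCache<K, V> {
  private readonly map = new Map<K, V>();

  constructor(private readonly maxEntries: number) {}

  get(key: K): V | undefined {
    return this.map.get(key);
  }

  set(key: K, value: V): void {
    if (!this.map.has(key) && this.map.size >= this.maxEntries) {
      const oldest = this.map.keys().next().value as K;
      this.map.delete(oldest);
    }
    this.map.set(key, value);
  }

  get size(): number {
    return this.map.size;
  }
}
```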
Debug Approach:
```shell
# Profile with heap profiler
bun --heap-prof test src/tests/processing-lag.spec.ts

# Analyze the .heapprofile for large objects
```

**Inconsistent Results**
Symptoms: Large variance between test runs (±50% or more)
Common Causes:
- Network variability (use stable connection)
- System load (close other applications)
- Insufficient iterations (increase iteration count)
- Skipped warmup phase
- Background processes interfering
Solutions:
- Run tests on stable network without VPN
- Close resource-intensive applications
- Increase iterations (10 → 20 or 5 → 10)
- Verify warmup phase executes
- Run multiple times and compare results
**Test Timeouts**
Symptoms: Tests fail with timeout errors
Common Causes:
- AI processing APIs (Express agent)
- Rate limiting from too many requests
- Network latency spikes
- Insufficient timeout configuration
Solutions:
```typescript
test.serial('Slow operation', async () => {
  // test body
}, { timeout: 60000 }); // 60s timeout
```

Add processing lag tests to packages that:
- ✅ Wrap You.com APIs - Quantify abstraction overhead
- ✅ Provide SDK interfaces - Track integration costs
- ✅ Transform data - Measure transformation overhead
- ✅ Add middleware layers - Quantify middleware costs
Skip processing lag tests for packages that:
- ❌ Pure utilities - No API wrapping
- ❌ CLI tools - User interaction dominates timing
- ❌ Documentation - No runtime code
- ❌ Configuration - Static data only
- Always include warmup phase
- Use serial execution for consistency (`test.serial()`)
- Run multiple iterations to smooth variance
- Measure both time and memory
- Add delays between iterations for rate limiting
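Two of the practices above (serial iterations, delays between them for rate limiting) combined in a sketch; the 250ms default delay is our assumption, not a documented value.

```typescript
// Runs an operation serially with a short pause between iterations to
// stay under API rate limits. The default delay value is illustrative.
const sleep = (ms: number): Promise<void> =>
  new Promise((resolve) => setTimeout(resolve, ms));

const runSerially = async (
  operation: () => Promise<unknown>,
  iterations: number,
  delayMs = 250,
): Promise<void> => {
  for (let i = 0; i < iterations; i++) {
    await operation();
    await sleep(delayMs);
  }
};
```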
- Start conservative, adjust based on empirical data
- Document rationale for each threshold
- Review thresholds quarterly
- Update when architecture changes significantly
- Run tests in CI on every PR
- Alert on threshold violations
- Review failures as optimization opportunities
- Keep tests updated with API changes
For general development guidelines, see: