-
-
Notifications
You must be signed in to change notification settings - Fork 792
chore(webapp): More false errors turned to logs #2422
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
WalkthroughChanges across the repo adjust logging behavior and error propagation without altering functional control flow or public APIs. Logger.error now checks for an ignoreError flag before invoking Logger.onError. Prisma writer error logs attach ignoreError: true. Multiple modules (runtimeEnvironment, eventRepository, machinePresets, runEngineHandlers, deliverAlert, createCheckpoint, cancelTaskRunV1, shared queue consumer, realtime streams, alerts worker entries, route loader) change log levels or the log method and, in one route, convert a rethrow to returning a structured 422 JSON error. Realtime ingest adds a special ECONNRESET branch. The Redis worker worker.ts adds a logErrors?: boolean catalog flag and centralizes per-item logging behavior and messages. Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Tip 🔌 Remote MCP (Model Context Protocol) integration is now available!Pro plan users can now connect to remote MCP servers from the Integrations page. Connect with popular remote MCPs such as Notion and Linear to add more context to your reviews and chats. ✨ Finishing Touches
🧪 Generate unit tests
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
SupportNeed help? Create a ticket on our support page for assistance with any issues or questions. CodeRabbit Commands (Invoked using PR/Issue comments)Type Other keywords and placeholders
CodeRabbit Configuration File (
|
… have errors for complete failures
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 3
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (3)
apps/webapp/app/db.server.ts (2)
308-317
: Parity: also gate Prisma replica error events with ignoreErrorReplica “error” events can be just as noisy as writer events. For consistency and to fully achieve the PR’s objective of reducing false alerts, add
ignoreError: true
to the replica$on("error")
path as well.Apply this diff:
replicaClient.$on("error", (log) => { logger.error("PrismaClient error", { clientType: "reader", event: { timestamp: log.timestamp, message: log.message, target: log.target, }, + ignoreError: true, }); });
1-372
: Add missingignoreError: true
to replica PrismaClient error logThe writer client’s error handler correctly sets
ignoreError: true
, but the replica client’s handler on line 309 does not. For consistency and to prevent alert noise, addignoreError: true
there.• File: apps/webapp/app/db.server.ts
– Line 309: includeignoreError: true
in the replicalogger.error
callSuggested diff:
replicaClient.$on("error", (log) => { - logger.error("PrismaClient error", { + logger.error("PrismaClient error", { clientType: "reader", event: { timestamp: log.timestamp, message: log.message, target: log.target, }, + ignoreError: true, }); });apps/webapp/app/models/runtimeEnvironment.server.ts (1)
33-44
: Potential null dereference when no environment is found
environment
can be null fromfindFirst
. Accessingenvironment.type
will throw instead of returningnull
as per the function signature. Add an early null return before checkingtype
.Apply this diff:
- //don't return deleted projects - if (environment?.project.deletedAt !== null) { - return null; - } - - if (environment.type === "PREVIEW") { + // Don't return deleted projects + if (environment?.project.deletedAt !== null) { + return null; + } + + // If no environment found, return null + if (!environment) { + return null; + } + + if (environment.type === "PREVIEW") {
🧹 Nitpick comments (16)
internal-packages/run-engine/src/engine/machinePresets.ts (1)
28-31
: Prefer logging parse issues and a minimal summary instead of the full configGood downgrade to info. To improve signal while avoiding noisy/PII-prone payloads, log Zod issues and a compact config summary rather than the entire raw config.
Apply this diff:
- logger.info("Failed to parse machine config", { config }); + logger.info("Failed to parse machine config", { + // Log Zod issues for debuggability + issues: parsedConfig.error?.issues, + // Avoid dumping the entire config. Keep a tiny, useful summary. + configSummary: + typeof config === "object" && config !== null + ? { + hasPreset: "preset" in (config as any), + cpu: (config as any)?.cpu, + memory: (config as any)?.memory, + } + : { type: typeof config }, + });apps/webapp/app/v3/services/cancelTaskRunV1.server.ts (1)
46-49
: Include friendlyId to ease log correlation (minor)Including the friendly run ID alongside the internal ID helps correlate with user-facing references and other logs.
Apply this diff:
- logger.info("Task run is not in a cancellable state", { - runId: taskRun.id, - status: taskRun.status, - }); + logger.info("Task run is not in a cancellable state", { + runId: taskRun.id, + friendlyRunId: taskRun.friendlyId, + status: taskRun.status, + });packages/core/src/logger.ts (1)
69-79
: Optional: document and surface ignoreError behaviorConsider a brief doc comment on
Logger.error
explaining theignoreError
flag, since it now materially affects forwarding viaonError
. This reduces surprise for consumers adding structured args.apps/webapp/app/services/realtime/redisRealtimeStreams.server.ts (1)
187-194
: Make ECONNRESET handling type-safe and add stream context to the logThe current
"code" in error && error.code === "ECONNRESET"
will still trip TS sinceError
doesn't declarecode
. Replace with a narrow type guard and includestreamKey
(andcode
) in the log context for faster triage.Apply this diff within the catch:
- if (error instanceof Error) { - if ("code" in error && error.code === "ECONNRESET") { - logger.info("[RealtimeStreams][ingestData] Connection reset during ingestData:", { - error, - }); - return new Response(null, { status: 500 }); - } - } + if (isConnResetError(error)) { + logger.info("[RealtimeStreams][ingestData] Connection reset during ingestData", { + streamKey, + code: error.code, + error, + }); + return new Response(null, { status: 500 }); + }And add this helper (outside the function, e.g., near the top of the module):
function isConnResetError(e: unknown): e is NodeJS.ErrnoException { return e instanceof Error && "code" in (e as any) && (e as any).code === "ECONNRESET"; }apps/webapp/app/v3/machinePresets.server.ts (1)
9-9
: Downgrade to info makes sense; consider logging Zod issues at debugKeeping this at info reduces noise. For debuggability, adding the Zod issues at debug level is low risk and helpful.
if (!parsedConfig.success) { logger.info("Failed to parse machine config", { config }); + logger.debug("Machine config parse issues", { issues: parsedConfig.error.issues });
apps/webapp/app/v3/eventRepository.server.ts (1)
1249-1251
: Good downgrade; add flush/context fields to the log for easier correlationSince this is a retriable path, the info level is appropriate. Add
flushId
,depth
, andeventsCount
to the structured log to correlate with spans and metrics.- logger.info("Failed to insert events, will attempt bisection", { - error: errorDetails, - }); + logger.info("[EventRepository][doFlushBatch] Retriable insert failure; bisection next", { + error: errorDetails, + flushId, + depth, + eventsCount: events.length, + });apps/webapp/app/v3/runEngineHandlers.server.ts (1)
385-409
: Right-severity logging for oversized metadata; consider adding status code to contextUsing warn for a known 413-style validation is appropriate. Consider including an explicit
httpStatus: 413
(or equivalent) in the log context to make filtering/searching easier.- logger.warn("[runMetadataUpdated] Failed to update metadata, too large", { - taskRun: run.id, + logger.warn("[runMetadataUpdated] Failed to update metadata, too large", { + taskRun: run.id, + httpStatus: 413, error: e instanceof Error ? { name: e.name, message: e.message, stack: e.stack, } : e, });apps/webapp/app/routes/engine.v1.worker-actions.runs.$runFriendlyId.snapshots.$snapshotFriendlyId.continue.ts (1)
30-36
: Fix misleading log message; consider error payload typing for 422 responses
- The log message says "Failed to suspend run" but this route is a "continue" endpoint.
- Optional: the loader’s declared return type is TypedResponse, but you throw a { error: string } body on failures. If your route builder surfaces thrown JSON directly, this is fine at runtime; otherwise, you may want to widen the return type or return json(...) instead of throwing.
Suggested tweak:
- logger.warn("Failed to suspend run", { runFriendlyId, snapshotFriendlyId, error }); + logger.warn("Failed to continue run", { runFriendlyId, snapshotFriendlyId, error });apps/webapp/app/v3/marqs/sharedQueueConsumer.server.ts (1)
624-626
: Downgrading to info for expected “no matching deployment” is appropriate; consider adding run id for quicker triageLGTM on severity. For easier log querying, you might include the run id alongside messageId (it’s already included in later attrs, just not in this log call).
- logger.info("No matching deployment found for task run", { - queueMessage: message.data, - messageId: message.messageId, - }); + logger.info("No matching deployment found for task run", { + queueMessage: message.data, + messageId: message.messageId, + runId: existingTaskRun.id, + });packages/redis-worker/src/worker.ts (5)
39-41
: Clarify JSDoc: errors are downgraded to info when logErrors is falseThe current comment implies “no logging,” but the implementation still logs at info. Clarify intent to avoid confusion.
- /** Defaults to true. If false, errors will not be logged. */ + /** Defaults to true. If false, errors will be logged at info level instead of error. */
546-552
: Improve “no handler/catalog item found” logs (grammar and context)Minor polish for readability and triage. Include the job id to make searching easier and adjust wording.
- this.logger.error(`Worker no handler found for job type: ${job}`); + this.logger.error("Worker: No handler found for job type", { job, id }); ... - this.logger.error(`Worker no catalog item found for job type: ${job}`); + this.logger.error("Worker: No catalog item found for job type", { job, id });
595-606
: Centralized error logging looks good; enrich attributesGood consolidation. Add attempt and deduplicationKey to logAttributes to make downstream analysis simpler.
const logAttributes = { name: this.options.name, id, job, item, visibilityTimeoutMs, error, errorMessage, + attempt, + deduplicationKey, };
646-655
: Reuse logAttributes for consistency (optional)These attributes mirror logAttributes; consider reusing and extending it to keep logs uniform and reduce drift.
- this.logger.info(`Worker requeuing failed item with delay`, { - name: this.options.name, - id, - job, - item, - retryDate, - retryDelay, - visibilityTimeoutMs, - attempt: newAttempt, - }); + this.logger.info("Worker requeuing failed item with delay", { + ...logAttributes, + retryDate, + retryDelay, + attempt: newAttempt, + });
666-675
: Consider applying logErrors gating to requeue failure as wellIf requeue fails, you currently always log at error. If the catalog item opted out of error-level logs, mirror that choice here too.
- this.logger.error( - `Worker failed to requeue item. It will be retried after the visibility timeout.`, - { - name: this.options.name, - id, - job, - item, - visibilityTimeoutMs, - error: requeueError, - } - ); + const requeueLog = { + ...logAttributes, + error: requeueError, + }; + if (shouldLogError) { + this.logger.error( + "Worker failed to requeue item. It will be retried after the visibility timeout.", + requeueLog + ); + } else { + this.logger.info( + "Worker failed to requeue item. It will be retried after the visibility timeout.", + requeueLog + ); + }apps/webapp/app/v3/services/alerts/deliverAlert.server.ts (1)
929-935
: Log the hex signature (or omit), not the raw ArrayBufferYou compute signatureHex but log signature (ArrayBuffer), which is noisy and often useless in logs. Prefer logging signatureHex or drop it entirely.
- logger.info("[DeliverAlert] Failed to send alert webhook", { + logger.info("[DeliverAlert] Failed to send alert webhook", { status: response.status, statusText: response.statusText, url: webhook.url, body: payload, - signature, + signature: signatureHex, });apps/webapp/app/v3/alertsWorker.server.ts (1)
46-46
: Consider logging only on final attempt (feature follow-up)If the current noise is mostly transient retries, consider an enhancement in the redis-worker API to support logging only on the last attempt (or providing an onError hook with attempt metadata) instead of a blanket suppression. This preserves signal for hard failures without spamming logs on transient ones.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
💡 Knowledge Base configuration:
- MCP integration is disabled by default for public repositories
- Jira integration is disabled by default for public repositories
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (16)
apps/webapp/app/db.server.ts
(1 hunks)apps/webapp/app/models/runtimeEnvironment.server.ts
(1 hunks)apps/webapp/app/routes/engine.v1.worker-actions.runs.$runFriendlyId.snapshots.$snapshotFriendlyId.continue.ts
(1 hunks)apps/webapp/app/services/realtime/redisRealtimeStreams.server.ts
(1 hunks)apps/webapp/app/v3/alertsWorker.server.ts
(3 hunks)apps/webapp/app/v3/eventRepository.server.ts
(1 hunks)apps/webapp/app/v3/machinePresets.server.ts
(1 hunks)apps/webapp/app/v3/marqs/sharedQueueConsumer.server.ts
(1 hunks)apps/webapp/app/v3/runEngineHandlers.server.ts
(2 hunks)apps/webapp/app/v3/services/alerts/deliverAlert.server.ts
(1 hunks)apps/webapp/app/v3/services/cancelTaskRunV1.server.ts
(1 hunks)apps/webapp/app/v3/services/createCheckpoint.server.ts
(2 hunks)internal-packages/run-engine/src/engine/machinePresets.ts
(1 hunks)internal-packages/run-engine/src/engine/systems/dequeueSystem.ts
(1 hunks)packages/core/src/logger.ts
(1 hunks)packages/redis-worker/src/worker.ts
(6 hunks)
🧰 Additional context used
📓 Path-based instructions (4)
**/*.{ts,tsx}
📄 CodeRabbit Inference Engine (.github/copilot-instructions.md)
**/*.{ts,tsx}
: Always prefer using isomorphic code like fetch, ReadableStream, etc. instead of Node.js specific code
For TypeScript, we usually use types over interfaces
Avoid enums
No default exports, use function declarations
Files:
internal-packages/run-engine/src/engine/machinePresets.ts
apps/webapp/app/db.server.ts
apps/webapp/app/v3/machinePresets.server.ts
apps/webapp/app/v3/services/cancelTaskRunV1.server.ts
apps/webapp/app/v3/alertsWorker.server.ts
apps/webapp/app/models/runtimeEnvironment.server.ts
apps/webapp/app/v3/eventRepository.server.ts
packages/core/src/logger.ts
apps/webapp/app/v3/services/createCheckpoint.server.ts
apps/webapp/app/v3/marqs/sharedQueueConsumer.server.ts
internal-packages/run-engine/src/engine/systems/dequeueSystem.ts
apps/webapp/app/v3/runEngineHandlers.server.ts
apps/webapp/app/v3/services/alerts/deliverAlert.server.ts
apps/webapp/app/services/realtime/redisRealtimeStreams.server.ts
apps/webapp/app/routes/engine.v1.worker-actions.runs.$runFriendlyId.snapshots.$snapshotFriendlyId.continue.ts
packages/redis-worker/src/worker.ts
{packages/core,apps/webapp}/**/*.{ts,tsx}
📄 CodeRabbit Inference Engine (.github/copilot-instructions.md)
We use zod a lot in packages/core and in the webapp
Files:
apps/webapp/app/db.server.ts
apps/webapp/app/v3/machinePresets.server.ts
apps/webapp/app/v3/services/cancelTaskRunV1.server.ts
apps/webapp/app/v3/alertsWorker.server.ts
apps/webapp/app/models/runtimeEnvironment.server.ts
apps/webapp/app/v3/eventRepository.server.ts
packages/core/src/logger.ts
apps/webapp/app/v3/services/createCheckpoint.server.ts
apps/webapp/app/v3/marqs/sharedQueueConsumer.server.ts
apps/webapp/app/v3/runEngineHandlers.server.ts
apps/webapp/app/v3/services/alerts/deliverAlert.server.ts
apps/webapp/app/services/realtime/redisRealtimeStreams.server.ts
apps/webapp/app/routes/engine.v1.worker-actions.runs.$runFriendlyId.snapshots.$snapshotFriendlyId.continue.ts
apps/webapp/**/*.{ts,tsx}
📄 CodeRabbit Inference Engine (.cursor/rules/webapp.mdc)
apps/webapp/**/*.{ts,tsx}
: In the webapp, all environment variables must be accessed through theenv
export ofenv.server.ts
, instead of directly accessingprocess.env
.
When importing from@trigger.dev/core
in the webapp, never import from the root@trigger.dev/core
path; always use one of the subpath exports as defined in the package's package.json.
Files:
apps/webapp/app/db.server.ts
apps/webapp/app/v3/machinePresets.server.ts
apps/webapp/app/v3/services/cancelTaskRunV1.server.ts
apps/webapp/app/v3/alertsWorker.server.ts
apps/webapp/app/models/runtimeEnvironment.server.ts
apps/webapp/app/v3/eventRepository.server.ts
apps/webapp/app/v3/services/createCheckpoint.server.ts
apps/webapp/app/v3/marqs/sharedQueueConsumer.server.ts
apps/webapp/app/v3/runEngineHandlers.server.ts
apps/webapp/app/v3/services/alerts/deliverAlert.server.ts
apps/webapp/app/services/realtime/redisRealtimeStreams.server.ts
apps/webapp/app/routes/engine.v1.worker-actions.runs.$runFriendlyId.snapshots.$snapshotFriendlyId.continue.ts
apps/webapp/app/services/**/*.server.ts
📄 CodeRabbit Inference Engine (.cursor/rules/webapp.mdc)
For testable services, separate service logic and configuration, as exemplified by
realtimeClient.server.ts
(service) andrealtimeClientGlobal.server.ts
(configuration).
Files:
apps/webapp/app/services/realtime/redisRealtimeStreams.server.ts
🧠 Learnings (1)
📚 Learning: 2025-07-18T17:49:24.468Z
Learnt from: CR
PR: triggerdotdev/trigger.dev#0
File: .github/copilot-instructions.md:0-0
Timestamp: 2025-07-18T17:49:24.468Z
Learning: Applies to internal-packages/database/**/*.{ts,tsx} : We use prisma in internal-packages/database for our database interactions using PostgreSQL
Applied to files:
apps/webapp/app/db.server.ts
🧬 Code Graph Analysis (7)
apps/webapp/app/v3/machinePresets.server.ts (1)
packages/core/src/v3/tracer.ts (1)
logger
(56-64)
apps/webapp/app/models/runtimeEnvironment.server.ts (1)
apps/webapp/app/v3/tracer.server.ts (1)
logger
(101-106)
apps/webapp/app/v3/eventRepository.server.ts (3)
packages/core/src/v3/tracer.ts (1)
logger
(56-64)apps/webapp/app/services/logger.server.ts (1)
logger
(49-59)apps/webapp/app/v3/tracer.server.ts (1)
logger
(101-106)
apps/webapp/app/v3/services/createCheckpoint.server.ts (1)
apps/webapp/app/services/logger.server.ts (1)
logger
(49-59)
apps/webapp/app/v3/marqs/sharedQueueConsumer.server.ts (1)
apps/webapp/app/services/logger.server.ts (1)
logger
(49-59)
apps/webapp/app/v3/runEngineHandlers.server.ts (1)
apps/webapp/app/utils/packets.ts (1)
MetadataTooLargeError
(4-9)
apps/webapp/app/v3/services/alerts/deliverAlert.server.ts (1)
packages/trigger-sdk/src/v3/index.ts (1)
logger
(39-39)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (23)
- GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (3, 8)
- GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (8, 8)
- GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (4, 8)
- GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (6, 8)
- GitHub Check: units / internal / 🧪 Unit Tests: Internal (5, 8)
- GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (7, 8)
- GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (5, 8)
- GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (1, 8)
- GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (2, 8)
- GitHub Check: units / internal / 🧪 Unit Tests: Internal (6, 8)
- GitHub Check: units / internal / 🧪 Unit Tests: Internal (7, 8)
- GitHub Check: units / internal / 🧪 Unit Tests: Internal (4, 8)
- GitHub Check: units / internal / 🧪 Unit Tests: Internal (8, 8)
- GitHub Check: units / internal / 🧪 Unit Tests: Internal (2, 8)
- GitHub Check: units / internal / 🧪 Unit Tests: Internal (3, 8)
- GitHub Check: units / internal / 🧪 Unit Tests: Internal (1, 8)
- GitHub Check: units / packages / 🧪 Unit Tests: Packages (1, 1)
- GitHub Check: e2e / 🧪 CLI v3 tests (windows-latest - pnpm)
- GitHub Check: e2e / 🧪 CLI v3 tests (windows-latest - npm)
- GitHub Check: e2e / 🧪 CLI v3 tests (ubuntu-latest - pnpm)
- GitHub Check: e2e / 🧪 CLI v3 tests (ubuntu-latest - npm)
- GitHub Check: typecheck / typecheck
- GitHub Check: Analyze (javascript-typescript)
🔇 Additional comments (9)
apps/webapp/app/db.server.ts (1)
196-206
: LGTM on gating Prisma writer error with ignoreErrorAdding
ignoreError: true
to the Prisma writer error event aligns with the new Logger.onError gating and should reduce false alerts without losing error visibility.apps/webapp/app/v3/services/cancelTaskRunV1.server.ts (1)
46-49
: LGTM on log level downgradeDowngrading this non-actionable path to info makes sense. It’s a frequent state check, not an exception.
apps/webapp/app/models/runtimeEnvironment.server.ts (1)
40-43
: LGTM on downgrading to warnUsing warn here is appropriate: it’s a client input gap rather than a server-side failure.
apps/webapp/app/v3/runEngineHandlers.server.ts (1)
20-20
: Import addition looks correctImporting MetadataTooLargeError from packets aligns with the new conditional logging path.
internal-packages/run-engine/src/engine/systems/dequeueSystem.ts (2)
138-144
: Structured logging improvement is a winSwitching to a concise message with structured context (runId, snapshotId, executionStatus) improves log queryability and reduces duplication.
147-153
: Consistent structured context on warn pathSame here—moving details to the context payload keeps messages readable and machine-parsable.
apps/webapp/app/v3/services/createCheckpoint.server.ts (2)
135-139
: LGTM: log level downgrade for already-resumed child runLogging as info for an expected pre-check is the right call; return shape remains consistent.
171-175
: LGTM: log level downgrade for already-resumed batchSame here—appropriate severity reduction for a non-exceptional condition.
packages/redis-worker/src/worker.ts (1)
624-634
: LGTM: DLQ transition logging respects per-item logErrorsConditionally downgrading to info when logErrors is false matches the catalog contract and reduces noise while preserving context.
1bac20b
to
d402945
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
internal-packages/run-engine/src/engine/machinePresets.ts (1)
69-74
: “Smallest preset” selection is order-dependent; sort to actually choose the smallest.Current iteration depends on object insertion order and may pick a larger-than-necessary machine. Sort by cpu then memory to honor the function’s contract and reduce cost.
function derivePresetNameFromValues( machines: Record<string, MachinePreset>, cpu: number, memory: number ): MachinePresetName | undefined { - for (const [name, preset] of Object.entries(machines)) { + const sorted = Object.entries(machines).sort(([, a], [, b]) => { + if (a.cpu === b.cpu) return a.memory - b.memory; + return a.cpu - b.cpu; + }); + for (const [name, preset] of sorted) { if (preset.cpu >= cpu && preset.memory >= memory) { return name as MachinePresetName; } } }
🧹 Nitpick comments (13)
internal-packages/run-engine/src/engine/machinePresets.ts (7)
28-31
: Avoid logging the entire config; log structured validation errors and high-level shape instead.Full config objects can be large and may contain sensitive data. Prefer Zod issue summaries and config keys for context.
Apply this diff to improve signal and reduce risk:
- if (!parsedConfig.success) { - logger.info("Failed to parse machine config", { config }); + if (!parsedConfig.success) { + logger.info("Failed to parse machine config", { + errorIssues: parsedConfig.error.issues.map((i) => ({ + path: i.path.join("."), + code: i.code, + })), + configKeys: + typeof config === "object" && config !== null + ? Object.keys(config as Record<string, unknown>) + : undefined, + });
20-21
: Consider downgrading this parse failure to info for consistency.You’ve treated config parse failures as informational. The preset parse failure here seems similar and may also be a frequent, non-exceptional case.
Apply this diff if you agree:
- logger.error("Failed to parse machine preset", { machinePreset: run.machinePreset }); + logger.info("Failed to parse machine preset", { machinePreset: run.machinePreset });
31-31
: Prefer the provided defaultMachine over a hardcoded fallback.Using the caller-provided default keeps behavior consistent with environments that may not include "small-1x".
Proposed change:
- return machinePresetFromName(machines, "small-1x"); + return machinePresetFromName(machines, defaultMachine);If "small-1x" is the intentional canonical fallback across the platform, feel free to ignore.
51-51
: Same here: use defaultMachine instead of hardcoding "small-1x".Keeps fallbacks consistent across all branches.
- return machinePresetFromName(machines, "small-1x"); + return machinePresetFromName(machines, defaultMachine);
38-49
: Don’t use truthiness for numeric checks (0 is falsy).Be explicit about undefined/null to avoid skipping valid values.
- if (parsedConfig.data.cpu && parsedConfig.data.memory) { + if (parsedConfig.data.cpu !== undefined && parsedConfig.data.memory !== undefined) {
58-61
: Add a guard to produce a clear error if the preset name is unknown.Spreading an undefined value throws a confusing TypeError. Fail fast with a clear message.
export function machinePresetFromName( machines: Record<string, MachinePreset>, name: MachinePresetName ): MachinePreset { - return { - ...machines[name], - }; + const preset = machines[name]; + if (!preset) { + throw new Error(`Unknown machine preset: ${name}`); + } + return { ...preset }; }
4-4
: Nit: align logger name with the exported function for easier log tracing.Optional, but it helps when correlating call sites and logs.
-const logger = new Logger("machinePresetFromConfig"); +const logger = new Logger("getMachinePreset");apps/webapp/app/v3/marqs/sharedQueueConsumer.server.ts (1)
624-629
: Enrich the info log and tighten the inline commentAdd commonly-needed fields to the log for quicker triage, and reword the comment to reflect the actual flow (we’re about to mark it as WAITING_FOR_DEPLOY here).
Apply this diff:
- // This happens when a run is "WAITING_FOR_DEPLOY" and is expected - logger.info("No matching deployment found for task run", { - queueMessage: message.data, - messageId: message.messageId, - }); + // No matching deployment found; mark run as "WAITING_FOR_DEPLOY" (expected during deploy gaps) + logger.info("No matching deployment found for task run", { + queueMessage: message.data, + messageId: message.messageId, + runId: existingTaskRun.id, + runStatus: existingTaskRun.status, + lockedById: existingTaskRun.lockedById ?? undefined, + lockedToVersionId: existingTaskRun.lockedToVersionId ?? undefined, + environmentId: existingTaskRun.runtimeEnvironmentId, + });apps/webapp/app/services/realtime/redisRealtimeStreams.server.ts (5)
187-194
: LGTM on downgrading ECONNRESET to info; consider a small type-guard + context tweakThis achieves the PR goal (reducing false errors) without changing control flow. To improve readability and add better diagnostics, consider:
- Using a narrow type-guard for ECONNRESET instead of the nested
"code" in error
checks.- Including streamKey/runId/streamId in the info log for correlation.
Apply this diff within the catch block:
- if (error instanceof Error) { - if ("code" in error && error.code === "ECONNRESET") { - logger.info("[RealtimeStreams][ingestData] Connection reset during ingestData:", { - error, - }); - return new Response(null, { status: 500 }); - } - } + if (isConnReset(error)) { + logger.info("[RealtimeStreams][ingestData] Connection reset during ingestData", { + error, + streamKey, + runId, + streamId, + }); + return new Response(null, { status: 500 }); + }Add this helper near the top of the file (outside the selected range):
type ErrorWithCode = Error & { code?: string | number }; function isConnReset(e: unknown): e is ErrorWithCode { return e instanceof Error && (e as ErrorWithCode).code === "ECONNRESET"; }
110-114
: Prefer logger over console.error in cleanup to keep logging consistentUsing the shared logger keeps formatting consistent and lets you attach metadata (e.g., ignoreError).
- await redis.quit().catch(console.error); + await redis.quit().catch((error) => + logger.info( + "[RealtimeStreams][streamResponse] Error during Redis quit in cleanup", + { error, ignoreError: true } + ) + );
136-140
: Demote cleanup failures to info and mark as ignored to reduce noiseCleanup failures are non-critical here. Logging them at error can inflate error budgets. Keep them, but at info with ignoreError for consistency with the PR’s goal.
- } catch (error) { - logger.error("[RedisRealtimeStreams][ingestData] Error in cleanup:", { error }); - } + } catch (error) { + logger.info("[RedisRealtimeStreams][ingestData] Error in cleanup:", { + error, + ignoreError: true, + }); + }
144-169
: Release the reader lock after the read loopAfter you finish reading, explicitly release the reader lock to help avoid holding resources longer than necessary.
while (true) { const { done, value } = await reader.read(); if (done || !value) { break; } logger.debug("[RedisRealtimeStreams][ingestData] Reading data", { streamKey, runId, value, }); await redis.xadd( streamKey, "MAXLEN", "~", String(env.REALTIME_STREAM_MAX_LENGTH), "*", "data", value ); } + + // Ensure the lock on the reader is released before proceeding. + reader.releaseLock();
187-194
: Consider returning 204/200 instead of 500 for ECONNRESET in ingestDataECONNRESET usually indicates the client hung up mid-upload. Returning a 5xx here may trigger upstream retries or alerting even though there’s nothing actionable. Switching to a 204 (or 200) can reduce noise if callers treat non-2xx as retryable.
Affected location:
- apps/webapp/app/services/realtime/redisRealtimeStreams.server.ts (ingestData, lines 187–194)
Proposed diff:
- if (error instanceof Error && "code" in error && error.code === "ECONNRESET") { - logger.info("[RealtimeStreams][ingestData] Connection reset during ingestData:", { error }); - return new Response(null, { status: 500 }); - } + if (error instanceof Error && "code" in error && error.code === "ECONNRESET") { + logger.info("[RealtimeStreams][ingestData] Connection reset during ingestData:", { error }); + // Client disconnected—return 204 to avoid unnecessary retries/alerts + return new Response(null, { status: 204 }); + }• Verify how your callers treat 2xx vs 5xx on this endpoint (e.g. retries, error reporting) before rolling out.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
💡 Knowledge Base configuration:
- MCP integration is disabled by default for public repositories
- Jira integration is disabled by default for public repositories
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (16)
apps/webapp/app/db.server.ts
(1 hunks)apps/webapp/app/models/runtimeEnvironment.server.ts
(1 hunks)apps/webapp/app/routes/engine.v1.worker-actions.runs.$runFriendlyId.snapshots.$snapshotFriendlyId.continue.ts
(1 hunks)apps/webapp/app/services/realtime/redisRealtimeStreams.server.ts
(1 hunks)apps/webapp/app/v3/alertsWorker.server.ts
(3 hunks)apps/webapp/app/v3/eventRepository.server.ts
(1 hunks)apps/webapp/app/v3/machinePresets.server.ts
(1 hunks)apps/webapp/app/v3/marqs/sharedQueueConsumer.server.ts
(1 hunks)apps/webapp/app/v3/runEngineHandlers.server.ts
(2 hunks)apps/webapp/app/v3/services/alerts/deliverAlert.server.ts
(1 hunks)apps/webapp/app/v3/services/cancelTaskRunV1.server.ts
(1 hunks)apps/webapp/app/v3/services/createCheckpoint.server.ts
(2 hunks)internal-packages/run-engine/src/engine/machinePresets.ts
(1 hunks)internal-packages/run-engine/src/engine/systems/dequeueSystem.ts
(1 hunks)packages/core/src/logger.ts
(1 hunks)packages/redis-worker/src/worker.ts
(6 hunks)
✅ Files skipped from review due to trivial changes (1)
- apps/webapp/app/v3/services/cancelTaskRunV1.server.ts
🚧 Files skipped from review as they are similar to previous changes (12)
- apps/webapp/app/v3/runEngineHandlers.server.ts
- internal-packages/run-engine/src/engine/systems/dequeueSystem.ts
- apps/webapp/app/v3/services/alerts/deliverAlert.server.ts
- apps/webapp/app/v3/machinePresets.server.ts
- apps/webapp/app/routes/engine.v1.worker-actions.runs.$runFriendlyId.snapshots.$snapshotFriendlyId.continue.ts
- apps/webapp/app/models/runtimeEnvironment.server.ts
- apps/webapp/app/db.server.ts
- packages/redis-worker/src/worker.ts
- apps/webapp/app/v3/alertsWorker.server.ts
- apps/webapp/app/v3/eventRepository.server.ts
- apps/webapp/app/v3/services/createCheckpoint.server.ts
- packages/core/src/logger.ts
🧰 Additional context used
📓 Path-based instructions (4)
**/*.{ts,tsx}
📄 CodeRabbit Inference Engine (.github/copilot-instructions.md)
**/*.{ts,tsx}
: Always prefer using isomorphic code like fetch, ReadableStream, etc. instead of Node.js specific code
For TypeScript, we usually use types over interfaces
Avoid enums
No default exports, use function declarations
Files:
internal-packages/run-engine/src/engine/machinePresets.ts
apps/webapp/app/services/realtime/redisRealtimeStreams.server.ts
apps/webapp/app/v3/marqs/sharedQueueConsumer.server.ts
{packages/core,apps/webapp}/**/*.{ts,tsx}
📄 CodeRabbit Inference Engine (.github/copilot-instructions.md)
We use zod a lot in packages/core and in the webapp
Files:
apps/webapp/app/services/realtime/redisRealtimeStreams.server.ts
apps/webapp/app/v3/marqs/sharedQueueConsumer.server.ts
apps/webapp/**/*.{ts,tsx}
📄 CodeRabbit Inference Engine (.cursor/rules/webapp.mdc)
apps/webapp/**/*.{ts,tsx}
: In the webapp, all environment variables must be accessed through theenv
export ofenv.server.ts
, instead of directly accessingprocess.env
.
When importing from@trigger.dev/core
in the webapp, never import from the root@trigger.dev/core
path; always use one of the subpath exports as defined in the package's package.json.
Files:
apps/webapp/app/services/realtime/redisRealtimeStreams.server.ts
apps/webapp/app/v3/marqs/sharedQueueConsumer.server.ts
apps/webapp/app/services/**/*.server.ts
📄 CodeRabbit Inference Engine (.cursor/rules/webapp.mdc)
For testable services, separate service logic and configuration, as exemplified by
realtimeClient.server.ts
(service) andrealtimeClientGlobal.server.ts
(configuration).
Files:
apps/webapp/app/services/realtime/redisRealtimeStreams.server.ts
🧬 Code Graph Analysis (1)
apps/webapp/app/v3/marqs/sharedQueueConsumer.server.ts (1)
apps/webapp/app/services/logger.server.ts (1)
logger
(49-59)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (23)
- GitHub Check: units / internal / 🧪 Unit Tests: Internal (7, 8)
- GitHub Check: units / internal / 🧪 Unit Tests: Internal (1, 8)
- GitHub Check: units / internal / 🧪 Unit Tests: Internal (4, 8)
- GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (5, 8)
- GitHub Check: units / internal / 🧪 Unit Tests: Internal (6, 8)
- GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (3, 8)
- GitHub Check: units / internal / 🧪 Unit Tests: Internal (8, 8)
- GitHub Check: units / internal / 🧪 Unit Tests: Internal (5, 8)
- GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (4, 8)
- GitHub Check: units / internal / 🧪 Unit Tests: Internal (3, 8)
- GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (6, 8)
- GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (8, 8)
- GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (7, 8)
- GitHub Check: units / internal / 🧪 Unit Tests: Internal (2, 8)
- GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (1, 8)
- GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (2, 8)
- GitHub Check: units / packages / 🧪 Unit Tests: Packages (1, 1)
- GitHub Check: e2e / 🧪 CLI v3 tests (windows-latest - pnpm)
- GitHub Check: e2e / 🧪 CLI v3 tests (ubuntu-latest - pnpm)
- GitHub Check: e2e / 🧪 CLI v3 tests (windows-latest - npm)
- GitHub Check: typecheck / typecheck
- GitHub Check: e2e / 🧪 CLI v3 tests (ubuntu-latest - npm)
- GitHub Check: Analyze (javascript-typescript)
🔇 Additional comments (4)
internal-packages/run-engine/src/engine/machinePresets.ts (2)
29-29
: Downgrading to info is appropriate and aligns with PR intent.Good call reducing noise for non-exceptional validation failures.
1-75
: Fallback safe: "small-1x" present in all machine maps
Verified that everymachines
map—including run-engine test fixtures, core schema defaults, and webapp presets—defines an entry for"small-1x"
. The hard-coded fallback will not cause missing-key errors.apps/webapp/app/v3/marqs/sharedQueueConsumer.server.ts (2)
624-642
: LGTM: Downgrade to info + WAITING_FOR_DEPLOY ack looks right for expected gapsThis aligns with the PR goal of reducing noisy errors when a deploy isn’t available yet, and setting the run to WAITING_FOR_DEPLOY matches the intended state transition.
630-641
: Re-enqueue Path Exists for WAITING_FOR_DEPLOYWe’ve confirmed that WAITING_FOR_DEPLOY runs are picked up and re-enqueued by a dedicated background service:
• apps/webapp/app/v3/services/executeTasksWaitingForDeploy.ts (around lines 37–43)
– Queries for runs where status = "WAITING_FOR_DEPLOY"
– Callsmarqs.enqueueMessage(...)
for each run to put it back on the shared queueBecause this service handles all ACKed WAITING_FOR_DEPLOY runs, the
ack_and_do_more_work
path in sharedQueueConsumer.server.ts will not starve tasks.
No description provided.