This document describes the architecture of the RunEngine class, the core orchestration engine for task execution in Trigger.dev v3/v4. The RunEngine coordinates task lifecycle management through a composition of 11+ specialized systems, integrating with PostgreSQL, Redis, and background workers to provide distributed task execution with checkpointing, retry logic, and real-time observability.
For information about the execution lifecycle and state machine transitions, see Run Lifecycle and State Machine. For queue management details, see Queue Management. For individual system behaviors (checkpoints, waitpoints, batches, etc.), see their respective sections (#4.4-#4.10).
The RunEngine class is the central orchestrator located at internal-packages/run-engine/src/engine/index.ts76-1393. It follows a composition-based architecture where specialized systems handle different aspects of task execution. The engine is instantiated as a singleton in the webapp at apps/webapp/app/v3/runEngine.server.ts11-192.
Sources: internal-packages/run-engine/src/engine/index.ts76-384 apps/webapp/app/v3/runEngine.server.ts11-192
Sources: internal-packages/run-engine/src/engine/index.ts76-384 internal-packages/run-engine/src/engine/types.ts23-107
The RunEngine constructor accepts a RunEngineOptions configuration object and performs the following initialization:
| Component | Purpose | Configuration Source |
|---|---|---|
| prisma | Primary database client | Required in options |
| readOnlyPrisma | Read replica for queries | Optional, falls back to prisma |
| runLockRedis | Redis client for distributed locks | options.runLock.redis with prefix runlock: |
| runLock | Redlock-based distributed locker | RunLocker with retry configuration |
| runQueue | Fair queue selection and management | RunQueue with FairQueueSelectionStrategy |
| worker | Background job processor | Worker with workerCatalog |
| batchQueue | DRR-based batch processing queue | BatchQueue with quantum and rate limiting |
| eventBus | Real-time event notifications | EventEmitter<EventBusEvents> |
Sources: internal-packages/run-engine/src/engine/index.ts106-384 internal-packages/run-engine/src/engine/types.ts23-107
All specialized systems receive a SystemResources object containing shared dependencies:
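The exact fields are defined in systems.ts and are not reproduced here. As a rough sketch, assuming the bundle carries some of the components from the table above (prisma, readOnlyPrisma, runLock, runQueue), it might look like the following. Every type and name ending in Sketch is a simplified placeholder, not the real definition.

```typescript
// Placeholder shapes standing in for the real clients (assumptions, not the actual types).
interface DbClientSketch { label: string }
interface LockerSketch { prefix: string }
interface QueueSketch { queueName: string }

// Hypothetical bundle of shared dependencies handed to every system.
interface SystemResourcesSketch {
  prisma: DbClientSketch;
  readOnlyPrisma: DbClientSketch;
  runLock: LockerSketch;
  runQueue: QueueSketch;
}

// Build the bundle once; the read replica falls back to the primary client.
function buildSystemResources(options: {
  prisma: DbClientSketch;
  readOnlyPrisma?: DbClientSketch;
  runLock: LockerSketch;
  runQueue: QueueSketch;
}): SystemResourcesSketch {
  return {
    prisma: options.prisma,
    readOnlyPrisma: options.readOnlyPrisma ?? options.prisma,
    runLock: options.runLock,
    runQueue: options.runQueue,
  };
}

// A system depends only on the bundle, not on how each client was constructed.
class ExampleSystemSketch {
  resources: SystemResourcesSketch;
  constructor(resources: SystemResourcesSketch) {
    this.resources = resources;
  }
  describeReadPath(): string {
    return `reads via ${this.resources.readOnlyPrisma.label}`;
  }
}

const sharedResources = buildSystemResources({
  prisma: { label: "primary" },
  runLock: { prefix: "runlock:" },
  runQueue: { queueName: "runs" },
});
const exampleSystem = new ExampleSystemSketch(sharedResources);
```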
This pattern ensures consistent access to core infrastructure across all systems without tight coupling.
Sources: internal-packages/run-engine/src/engine/systems/systems.ts internal-packages/run-engine/src/engine/index.ts269-279
Systems are initialized with explicit dependencies, ensuring proper composition:
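A minimal sketch of this style of composition, assuming (for illustration only) that a delayed-run system is handed an enqueue system so it can enqueue a run once its delay has elapsed. The class names carry a Sketch suffix because the constructor shapes and the dependency between these two systems are assumptions, not the engine's actual wiring.

```typescript
// Shared resources, reduced to an in-memory event log for illustration.
interface ResourcesSketch { log: string[] }

class EnqueueSystemSketch {
  resources: ResourcesSketch;
  constructor(options: { resources: ResourcesSketch }) {
    this.resources = options.resources;
  }
  enqueue(runId: string): void {
    this.resources.log.push(`enqueued:${runId}`);
  }
}

// The dependency on the enqueue system is explicit in the constructor options.
class DelayedRunSystemSketch {
  resources: ResourcesSketch;
  enqueueSystem: EnqueueSystemSketch;
  constructor(options: { resources: ResourcesSketch; enqueueSystem: EnqueueSystemSketch }) {
    this.resources = options.resources;
    this.enqueueSystem = options.enqueueSystem;
  }
  releaseDelayedRun(runId: string): void {
    this.resources.log.push(`delay-elapsed:${runId}`);
    this.enqueueSystem.enqueue(runId);
  }
}

// Construction order follows the dependency order: dependencies first.
const composedResources: ResourcesSketch = { log: [] };
const enqueueSystemSketch = new EnqueueSystemSketch({ resources: composedResources });
const delayedRunSystemSketch = new DelayedRunSystemSketch({
  resources: composedResources,
  enqueueSystem: enqueueSystemSketch,
});
delayedRunSystemSketch.releaseDelayedRun("run_123");
```

Passing collaborators through constructor options keeps each system testable in isolation, since a test can substitute a stub for any dependency.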
Sources: internal-packages/run-engine/src/engine/index.ts281-383
The trigger() method at internal-packages/run-engine/src/engine/index.ts389-727 handles:
- triggerAndWait scenarios
- Delayed runs, via DelayedRunSystem
- Enqueueing, via EnqueueSystem
- TTL expiration, via TtlSystem

Key Parameters:
| Parameter | Type | Purpose |
|---|---|---|
| friendlyId | string | Human-readable run identifier |
| environment | MinimalAuthenticatedEnvironment | Execution context |
| taskIdentifier | string | Task to execute |
| payload | string | Serialized task input |
| delayUntil | Date? | Delayed execution timestamp |
| debounce | object? | Debounce configuration |
| ttl | string? | Time-to-live expiration |
| maxAttempts | number? | Retry limit |
| batch | object? | Batch association |
Sources: internal-packages/run-engine/src/engine/index.ts389-727 internal-packages/run-engine/src/engine/types.ts117-187
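To make the parameter table concrete, here is a small sketch of how a subset of these parameters could steer a newly triggered run. The type TriggerParamsSketch and the function planTrigger are hypothetical; the real trigger() method does considerably more, and the environment, debounce, and batch parameters are left out here.

```typescript
// A subset of the parameters from the table above (simplified).
type TriggerParamsSketch = {
  friendlyId: string;
  taskIdentifier: string;
  payload: string;
  delayUntil?: Date;
  ttl?: string;
  maxAttempts?: number;
};

// Hypothetical summary of what happens to the new run.
type TriggerPlanSketch = {
  route: "delayed" | "enqueue";
  scheduleExpiry: boolean;
};

function planTrigger(params: TriggerParamsSketch, now: Date): TriggerPlanSketch {
  // A run with a future delayUntil is held back; otherwise it is enqueued right away.
  const isDelayed =
    params.delayUntil !== undefined && params.delayUntil.getTime() > now.getTime();
  return {
    route: isDelayed ? "delayed" : "enqueue",
    // A ttl implies that an expiration should be scheduled for the run.
    scheduleExpiry: params.ttl !== undefined,
  };
}
```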
The dequeue process at internal-packages/run-engine/src/engine/index.ts735-777 and internal-packages/run-engine/src/engine/systems/dequeueSystem.ts105-623 involves:
1. RunQueue.dequeueMessageFromWorkerQueue() uses fair selection strategy
2. RunLocker prevents concurrent processing
3. Resolves the BackgroundWorker and BackgroundWorkerTask
4. Sets lockedById, lockedToVersionId, lockedQueueId
5. Creates the PENDING_EXECUTING snapshot
6. Transitions to EXECUTING with new attempt number

Sources: internal-packages/run-engine/src/engine/index.ts735-777 internal-packages/run-engine/src/engine/systems/dequeueSystem.ts105-623 internal-packages/run-engine/src/engine/systems/runAttemptSystem.ts294-628
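The dequeue steps above can be sketched as a pair of pure state transitions. This is an in-memory illustration under stated assumptions: a Set stands in for the distributed lock, queue selection is omitted, and the record and function names (RunRecordSketch, dequeueRunSketch, startAttemptSketch) are hypothetical.

```typescript
type SnapshotStatusSketch = "QUEUED" | "PENDING_EXECUTING" | "EXECUTING";

interface RunRecordSketch {
  id: string;
  snapshotStatus: SnapshotStatusSketch;
  attemptNumber: number;
  lockedById?: string;
  lockedToVersionId?: string;
  lockedQueueId?: string;
}

// Result of resolving the worker and task for the run (placeholder shape).
interface ResolvedTaskSketch { taskId: string; versionId: string; queueId: string }

// Stand-in for the distributed run lock: one holder per run ID at a time.
const lockedRunIds = new Set<string>();

function withRunLockSketch<T>(runId: string, fn: () => T): T {
  if (lockedRunIds.has(runId)) throw new Error(`run ${runId} is already locked`);
  lockedRunIds.add(runId);
  try {
    return fn();
  } finally {
    lockedRunIds.delete(runId);
  }
}

// Pin the run to a resolved task, version, and queue, then mark it PENDING_EXECUTING.
function dequeueRunSketch(run: RunRecordSketch, task: ResolvedTaskSketch): RunRecordSketch {
  return withRunLockSketch<RunRecordSketch>(run.id, () => ({
    ...run,
    lockedById: task.taskId,
    lockedToVersionId: task.versionId,
    lockedQueueId: task.queueId,
    snapshotStatus: "PENDING_EXECUTING",
  }));
}

// Starting an attempt moves the run to EXECUTING with the next attempt number.
function startAttemptSketch(run: RunRecordSketch): RunRecordSketch {
  if (run.snapshotStatus !== "PENDING_EXECUTING") {
    throw new Error("run must be dequeued before an attempt can start");
  }
  return withRunLockSketch<RunRecordSketch>(run.id, () => ({
    ...run,
    snapshotStatus: "EXECUTING",
    attemptNumber: run.attemptNumber + 1,
  }));
}
```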
The completeRunAttempt() method at internal-packages/run-engine/src/engine/index.ts833-853 delegates to RunAttemptSystem.completeRunAttempt() at internal-packages/run-engine/src/engine/systems/runAttemptSystem.ts630-667 which routes to either:
- Success: sets COMPLETED_SUCCESSFULLY, completes associated waitpoint, resumes parent if applicable, triggers batch completion
- Failure: handled by RunAttemptSystem.attemptFailed()

Sources: internal-packages/run-engine/src/engine/index.ts833-853 internal-packages/run-engine/src/engine/systems/runAttemptSystem.ts630-1162
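A minimal sketch of this routing, assuming the completion payload carries an ok flag that distinguishes success from failure. Only COMPLETED_SUCCESSFULLY comes from this document; the RETRY_SCHEDULED and FAILED labels, the retry rule based on maxAttempts, and all function names are hypothetical stand-ins for the real failure handling.

```typescript
type AttemptCompletionSketch =
  | { ok: true; output: string }
  | { ok: false; error: string };

type AttemptOutcomeSketch =
  | { status: "COMPLETED_SUCCESSFULLY"; followUps: string[] }
  | { status: "RETRY_SCHEDULED"; nextAttemptNumber: number } // hypothetical label
  | { status: "FAILED"; error: string }; // hypothetical label

interface AttemptContextSketch {
  attemptNumber: number;
  maxAttempts: number;
  hasParent: boolean;
  inBatch: boolean;
}

// Success path: record the follow-up actions listed in the text above.
function attemptSucceededSketch(ctx: AttemptContextSketch): AttemptOutcomeSketch {
  const followUps = ["completeWaitpoint"];
  if (ctx.hasParent) followUps.push("resumeParent");
  if (ctx.inBatch) followUps.push("tryCompleteBatch");
  return { status: "COMPLETED_SUCCESSFULLY", followUps };
}

// Failure path (assumed rule): retry while attempts remain, otherwise fail for good.
function attemptFailedSketch(ctx: AttemptContextSketch, error: string): AttemptOutcomeSketch {
  if (ctx.attemptNumber < ctx.maxAttempts) {
    return { status: "RETRY_SCHEDULED", nextAttemptNumber: ctx.attemptNumber + 1 };
  }
  return { status: "FAILED", error };
}

// Router: pick the branch based on the completion's discriminant.
function completeRunAttemptSketch(
  ctx: AttemptContextSketch,
  completion: AttemptCompletionSketch
): AttemptOutcomeSketch {
  return completion.ok
    ? attemptSucceededSketch(ctx)
    : attemptFailedSketch(ctx, completion.error);
}
```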
The RunLocker class at internal-packages/run-engine/src/engine/locking.ts70-556 provides distributed locking using Redlock with the following features:
| Feature | Implementation | Purpose |
|---|---|---|
| Redlock algorithm | redlock npm package | Multi-node lock consensus |
| AsyncLocalStorage | Node.js async_hooks | Lock context tracking |
| Automatic extension | Background timer | Prevents timeout during long operations |
| Exponential backoff | Configurable retry policy | Lock acquisition under contention |
| Signal-based cancellation | RedlockAbortSignal | Graceful lock release |
| Nested lock detection | Context checking | Prevents deadlocks from re-locking |
Retry Configuration:
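The concrete retry values are not shown in this section. As an illustration of the exponential backoff listed in the feature table, the sketch below computes the wait before each lock acquisition retry. The field names (maxAttempts, baseDelayMs, maxDelayMs, backoffMultiplier) and the numbers used are assumptions, and jitter is left out to keep the output deterministic.

```typescript
// Hypothetical retry policy shape for lock acquisition.
interface LockRetryConfigSketch {
  maxAttempts: number;
  baseDelayMs: number;
  maxDelayMs: number;
  backoffMultiplier: number;
}

// Delay before retry number `attempt` (0-based): base * multiplier^attempt, capped.
function delayForAttempt(config: LockRetryConfigSketch, attempt: number): number {
  const uncapped = config.baseDelayMs * Math.pow(config.backoffMultiplier, attempt);
  return Math.min(uncapped, config.maxDelayMs);
}

// Full schedule of delays for every allowed retry.
function retrySchedule(config: LockRetryConfigSketch): number[] {
  const delays: number[] = [];
  for (let attempt = 0; attempt < config.maxAttempts; attempt++) {
    delays.push(delayForAttempt(config, attempt));
  }
  return delays;
}

const exampleRetryConfig: LockRetryConfigSketch = {
  maxAttempts: 5,
  baseDelayMs: 100,
  maxDelayMs: 1000,
  backoffMultiplier: 2,
};
const exampleDelays = retrySchedule(exampleRetryConfig); // [100, 200, 400, 800, 1000]
```

The cap matters under heavy contention: without it, the fifth retry in this example would wait 1600ms instead of 1000ms.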
Lock Key Patterns:
- engine:runlock:{lockType}:{resource1}:{resource2}:... for most operations
- Lock duration: 5000ms by default (configurable via RUN_ENGINE_RUN_LOCK_DURATION)

Sources: internal-packages/run-engine/src/engine/locking.ts70-556 apps/webapp/app/env.server.ts594-601
All critical state transitions acquire a distributed lock on the run ID to prevent race conditions across multiple webapp instances.
Sources: internal-packages/run-engine/src/engine/index.ts internal-packages/run-engine/src/engine/locking.ts196-368
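As a sketch of how a state transition might be wrapped in a run lock, the example below builds a key following the engine:runlock:{lockType}:{resource...} pattern and skips re-acquisition when the same key is already held. It is synchronous and in-memory: a module-level Set stands in for both Redlock and the AsyncLocalStorage context, and the names buildLockKey and withLockSketch are hypothetical.

```typescript
// Key layout follows the pattern described above.
function buildLockKey(lockType: string, resources: string[]): string {
  return ["engine", "runlock", lockType, ...resources].join(":");
}

// Keys held by the current call chain (stand-in for the async lock context).
const heldLockKeys = new Set<string>();
// Counts real acquisitions so nested calls can be told apart.
let lockAcquisitions = 0;

function withLockSketch<T>(lockType: string, resources: string[], fn: () => T): T {
  const key = buildLockKey(lockType, resources);
  // Nested lock detection: if this key is already held, run the callback directly.
  if (heldLockKeys.has(key)) {
    return fn();
  }
  heldLockKeys.add(key);
  lockAcquisitions += 1;
  try {
    return fn();
  } finally {
    // Always release, even if the callback throws.
    heldLockKeys.delete(key);
  }
}

// A transition that re-enters the same lock acquires it only once.
const nestedLockResult = withLockSketch("startAttempt", ["run_1"], () =>
  withLockSketch("startAttempt", ["run_1"], () => "transitioned")
);
```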
The workerCatalog at internal-packages/run-engine/src/engine/workerCatalog.ts defines background jobs processed by the EngineWorker:
| Job Name | Purpose | Scheduling |
|---|---|---|
| finishWaitpoint | Complete waitpoint at scheduled time | availableAt = waitpoint completion time |
| heartbeatSnapshot | Detect stalled snapshots | Periodic based on heartbeat timeouts |
| repairSnapshot | Recover from stalled states | Delayed after heartbeat detection |
| expireRun | TTL-based run expiration | availableAt = TTL expiration time |
| cancelRun | Asynchronous run cancellation | Immediate |
| queueRunsPendingVersion | Enqueue runs waiting for deployment | After deployment finalization |
| tryCompleteBatch | Check batch completion | After batch item completion |
| continueRunIfUnblocked | Resume run after waitpoint completion | 50ms delay (debounced) |
| enqueueDelayedRun | Enqueue delayed run at scheduled time | availableAt = delayUntil timestamp |
Job Registration Example:
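The original registration snippet is not reproduced in this section. The sketch below illustrates the general idea with an in-memory worker: handlers are registered by job name, and jobs become runnable once their availableAt time has passed. The WorkerSketch class and its methods are hypothetical and do not mirror the real Worker API.

```typescript
interface JobPayloadSketch { runId: string }
type JobHandlerSketch = (payload: JobPayloadSketch) => string;

interface QueuedJobSketch {
  id: string;
  job: string;
  payload: JobPayloadSketch;
  availableAt: Date;
}

class WorkerSketch {
  handlers = new Map<string, JobHandlerSketch>();
  pending: QueuedJobSketch[] = [];

  // Register the handler that runs for a given job name.
  register(job: string, handler: JobHandlerSketch): void {
    this.handlers.set(job, handler);
  }

  // Reject jobs nobody can handle, otherwise queue them for later.
  enqueue(item: QueuedJobSketch): void {
    if (!this.handlers.has(item.job)) throw new Error(`no handler for ${item.job}`);
    this.pending.push(item);
  }

  // Run every job whose availableAt is at or before `now`; keep the rest pending.
  runDue(now: Date): string[] {
    const results: string[] = [];
    const stillPending: QueuedJobSketch[] = [];
    for (const item of this.pending) {
      const handler = this.handlers.get(item.job);
      if (handler !== undefined && item.availableAt.getTime() <= now.getTime()) {
        results.push(handler(item.payload));
      } else {
        stillPending.push(item);
      }
    }
    this.pending = stillPending;
    return results;
  }
}

const workerSketch = new WorkerSketch();
workerSketch.register("expireRun", (payload) => `expired ${payload.runId}`);
```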
Sources: internal-packages/run-engine/src/engine/index.ts199-244 internal-packages/run-engine/src/engine/workerCatalog.ts
The EventBus at internal-packages/run-engine/src/engine/index.ts90 is a Node.js EventEmitter providing real-time notifications for:
| Event | Emitted By | Payload |
|---|---|---|
| runCreated | trigger() | { time, runId } |
| runLocked | DequeueSystem | { time, run, organization, project, environment } |
| runAttemptStarted | RunAttemptSystem.startRunAttempt() | { time, run, organization, project, environment } |
| runAttemptCompleted | RunAttemptSystem.attemptSucceeded() | { time, run, organization, project, environment } |
| runAttemptFailed | RunAttemptSystem.attemptFailed() | { time, run, organization, project, environment } |
| runStatusChanged | Various systems | { time, run, organization, project, environment } |
| runDelayRescheduled | DelayedRunSystem | { time, run, organization, project, environment } |
| incomingCheckpointDiscarded | CheckpointSystem | { time, run, checkpoint, snapshot } |
| cachedRunCompleted | WaitpointSystem | { time, span, blockedRunId, hasError, cachedRunId } |
These events drive:
Sources: internal-packages/run-engine/src/engine/eventBus.ts internal-packages/run-engine/src/engine/index.ts90
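To illustrate how a consumer might subscribe to these events with typed payloads, here is a small sketch. It uses a hand-written typed emitter as a stand-in for Node.js EventEmitter so the example has no imports; the class TypedEmitterSketch is hypothetical, and the event map covers only two rows of the table in simplified form (the span field of cachedRunCompleted is omitted).

```typescript
// Two payload shapes taken from the event table; the remaining events are omitted.
type EventBusEventsSketch = {
  runCreated: { time: Date; runId: string };
  cachedRunCompleted: {
    time: Date;
    blockedRunId: string;
    cachedRunId: string;
    hasError: boolean;
  };
};

// Minimal typed emitter: the event name determines the payload type.
class TypedEmitterSketch<Events> {
  listeners = new Map<keyof Events, Array<(payload: unknown) => void>>();

  on<K extends keyof Events>(eventName: K, listener: (payload: Events[K]) => void): void {
    const existing = this.listeners.get(eventName) ?? [];
    // Stored untyped internally; the public signatures keep callers type-safe.
    existing.push(listener as (payload: unknown) => void);
    this.listeners.set(eventName, existing);
  }

  emit<K extends keyof Events>(eventName: K, payload: Events[K]): void {
    for (const listener of this.listeners.get(eventName) ?? []) {
      listener(payload);
    }
  }
}

const eventBusSketch = new TypedEmitterSketch<EventBusEventsSketch>();
const createdRunIds: string[] = [];

// The payload parameter is inferred as { time: Date; runId: string }.
eventBusSketch.on("runCreated", (payload) => {
  createdRunIds.push(payload.runId);
});
eventBusSketch.emit("runCreated", { time: new Date(), runId: "run_1" });
```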
The RunEngine is instantiated as a singleton in the webapp at apps/webapp/app/v3/runEngine.server.ts11-192:
Key Configuration Sources:
| Configuration | Environment Variables | Default |
|---|---|---|
| Worker count | RUN_ENGINE_WORKER_COUNT | 4 |
| Tasks per worker | RUN_ENGINE_TASKS_PER_WORKER | 10 |
| Worker concurrency | RUN_ENGINE_WORKER_CONCURRENCY_LIMIT | 10 |
| Poll interval | RUN_ENGINE_WORKER_POLL_INTERVAL | 100ms |
| Lock duration | RUN_ENGINE_RUN_LOCK_DURATION | 5000ms |
| Heartbeat timeouts | RUN_ENGINE_TIMEOUT_* | Various (60s-600s) |
| Retry warm start | RUN_ENGINE_RETRY_WARM_START_THRESHOLD_MS | 30000ms |
Sources: apps/webapp/app/v3/runEngine.server.ts11-192 apps/webapp/app/env.server.ts559-619
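As a sketch of how the defaults in the table could be applied, the function below reads the listed environment variables from a plain object and falls back to the documented default when a value is missing or not a number. The function and field names are hypothetical, the heartbeat timeout variables are omitted because their individual names are not listed here, and the webapp's real parsing in env.server.ts may differ.

```typescript
type EnvSketch = Record<string, string | undefined>;

interface EngineWorkerConfigSketch {
  workerCount: number;
  tasksPerWorker: number;
  concurrencyLimit: number;
  pollIntervalMs: number;
  runLockDurationMs: number;
  retryWarmStartThresholdMs: number;
}

// Parse an integer variable, falling back when it is unset, empty, or invalid.
function intFromEnv(env: EnvSketch, key: string, fallback: number): number {
  const raw = env[key];
  if (raw === undefined || raw.trim() === "") return fallback;
  const parsed = Number.parseInt(raw, 10);
  return Number.isNaN(parsed) ? fallback : parsed;
}

// Defaults mirror the configuration table above.
function readEngineConfig(env: EnvSketch): EngineWorkerConfigSketch {
  return {
    workerCount: intFromEnv(env, "RUN_ENGINE_WORKER_COUNT", 4),
    tasksPerWorker: intFromEnv(env, "RUN_ENGINE_TASKS_PER_WORKER", 10),
    concurrencyLimit: intFromEnv(env, "RUN_ENGINE_WORKER_CONCURRENCY_LIMIT", 10),
    pollIntervalMs: intFromEnv(env, "RUN_ENGINE_WORKER_POLL_INTERVAL", 100),
    runLockDurationMs: intFromEnv(env, "RUN_ENGINE_RUN_LOCK_DURATION", 5000),
    retryWarmStartThresholdMs: intFromEnv(env, "RUN_ENGINE_RETRY_WARM_START_THRESHOLD_MS", 30000),
  };
}
```

Taking the environment as an argument rather than reading a global keeps the function easy to test with a literal object.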
The RunEngine uses three separate Redis connections to avoid key collisions and enable independent scaling:
Each Redis connection can be configured with separate:
Sources: apps/webapp/app/v3/runEngine.server.ts84-103 apps/webapp/app/env.server.ts620-684
The webapp's RunEngineTriggerTaskService at apps/webapp/app/runEngine/services/triggerTask.server.ts51-417 demonstrates how to integrate with the RunEngine:
This integration demonstrates the concern-based architecture where specialized components handle:
Sources: apps/webapp/app/runEngine/services/triggerTask.server.ts51-417 apps/webapp/app/runEngine/types.ts59-161