-
Notifications
You must be signed in to change notification settings - Fork 3.1k
improvement(async): improve error capturing for asynchronous workflow executions #1808
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
The latest updates on your projects. Learn more about Vercel for GitHub. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Greptile Overview
Greptile Summary
Improved error logging across all asynchronous workflow execution paths (webhooks, schedules, chat, manual executions) to ensure no errors go unlogged for users.
Key Changes:
- Added
LoggingSessionerror tracking for early failures (inactive chat, workflow not found, missing trigger blocks, file upload failures) - Moved webhook usage limit checks from background worker to HTTP handler (
processor.ts) for earlier rejection - Fixed HTTP response codes in
processor.ts(changed from200to proper402/429status codes) - Changed error response format from
messagetoerrorfield for consistency - Added usage limit checks with error logging to both
executeWorkflow()wrapper and POST handler in workflow execution route - Moved
loggingSession.safeStart()earlier in webhook execution to capture state before workflow loading
Architecture Improvement:
The webhook execution flow was refactored to check usage/rate limits at the HTTP layer before queueing the background job. This prevents unnecessary queue pollution and provides immediate feedback to webhook senders.
Confidence Score: 4/5
- Safe to merge with minor concerns about error handling consistency
- The PR successfully achieves comprehensive error logging across all async execution paths. Code follows established patterns using
LoggingSession.safeStart()andsafeCompleteWithError(). The webhook architecture change (moving usage checks upstream) is a solid improvement. Minor concern about the duplication of usage check logic in workflow execution route (both inexecuteWorkflow()wrapper and POST handler), though this appears intentional for different execution paths. - Pay close attention to
apps/sim/app/api/workflows/[id]/execute/route.tswhich has duplicate usage checks in different code paths, andapps/sim/lib/webhooks/processor.tswhich has important HTTP status code changes
Important Files Changed
File Analysis
| Filename | Score | Overview |
|---|---|---|
| apps/sim/app/api/chat/[identifier]/route.ts | 4/5 | Added comprehensive error logging for chat endpoint failures (inactive chat, workflow not deployed, workspace not found, file upload failures) |
| apps/sim/app/api/workflows/[id]/execute/route.ts | 4/5 | Added usage limit checks with error logging to both executeWorkflow() wrapper and POST handler; removed unused import |
| apps/sim/background/webhook-execution.ts | 5/5 | Removed usage check (now handled upstream in processor), moved safeStart() earlier, improved error message for missing workflow state |
| apps/sim/lib/webhooks/processor.ts | 4/5 | Added error logging for rate limit and usage limit failures; fixed HTTP status codes (200→402/429); changed response format from message to error |
Sequence Diagram
sequenceDiagram
participant Client
participant API as API Route
participant Processor as Webhook Processor
participant Queue as Background Queue
participant Worker as Background Worker
participant Logger as LoggingSession
participant DB as Database
Note over Client,DB: Webhook Execution Flow (with Error Handling)
Client->>API: POST /api/webhooks/trigger/[path]
API->>Processor: processWebhook()
Processor->>Processor: findWebhookAndWorkflow()
alt Webhook not found
Processor-->>API: 404 Not Found
API-->>Client: Error Response
end
Processor->>Processor: checkRateLimits()
alt Rate limit exceeded
Processor->>Logger: safeStart() + safeCompleteWithError()
Logger->>DB: Log execution with error
Processor-->>API: 429 Rate Limit Exceeded
API-->>Client: Error Response
end
Processor->>Processor: checkUsageLimits()
alt Usage limit exceeded
Processor->>Logger: safeStart() + safeCompleteWithError()
Logger->>DB: Log execution with error
Processor-->>API: 402 Payment Required
API-->>Client: Error Response
end
Processor->>Queue: queueWebhookExecution()
Queue-->>API: Job queued
API-->>Client: 200 Success
Note over Queue,Worker: Asynchronous Processing
Queue->>Worker: executeWebhookJob()
Worker->>Logger: new LoggingSession()
Worker->>Logger: safeStart()
Logger->>DB: Create execution record
Worker->>Worker: Load workflow state
alt Workflow not found
Worker->>Logger: safeCompleteWithError()
Logger->>DB: Log error
end
Worker->>Worker: executeWorkflowCore()
alt Execution succeeds
Worker->>Logger: safeComplete()
Logger->>DB: Log successful execution
else Execution fails
Worker->>Logger: safeCompleteWithError()
Logger->>DB: Log failed execution
end
Note over Client,DB: Chat Execution Flow (with Error Handling)
Client->>API: POST /api/chat/[identifier]
API->>API: Validate chat is active
alt Chat inactive
API->>Logger: safeStart() + safeCompleteWithError()
Logger->>DB: Log execution with error
API-->>Client: 403 Forbidden
end
API->>API: Validate workflow deployed
alt Workflow not deployed
API->>Logger: safeStart() + safeCompleteWithError()
Logger->>DB: Log execution with error
API-->>Client: 503 Service Unavailable
end
API->>API: processChatFiles()
alt File processing fails
API->>Logger: safeStart() + safeCompleteWithError()
Logger->>DB: Log execution with error
API-->>Client: Error Response
end
API->>API: executeWorkflow()
API->>API: checkServerSideUsageLimits()
alt Usage limit exceeded
API->>Logger: safeStart() + safeCompleteWithError()
Logger->>DB: Log execution with error
API-->>Client: 402 Payment Required
end
API->>API: executeWorkflowCore()
API->>Logger: Log execution results
Logger->>DB: Store execution data
API-->>Client: Stream/JSON Response
7 files reviewed, no comments
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Greptile Overview
Greptile Summary
Improves error logging for asynchronous workflow executions by adding database logging for early-exit failure cases in schedule execution (workflow not found, billing issues, rate/usage limits, missing trigger blocks).
Key changes:
- schedule-execution.ts: Added
LoggingSessioncalls withsafeCompleteWithError()for all early validation failures that previously returned without logging - webhook-execution.ts: Removed duplicate error logging from catch block since
executeWorkflowCorehandles it, movedsafeStart()earlier - route.test.ts: Added missing mocks and cleaned up comments
Critical Issue: webhook-execution.ts removed error logging from the catch block, but errors thrown before executeWorkflowCore is called (workflow state not found, webhook record not found, file processing errors) will no longer be logged to the database. The catch block needs to call safeCompleteWithError() to ensure all errors are logged.
Confidence Score: 2/5
- Not safe to merge - critical error logging regression in webhook execution
- Score reflects a critical bug where errors occurring before
executeWorkflowCoreis called in webhook-execution.ts will not be logged to database due to removed error handling in catch block. This breaks the PR's stated goal of ensuring "no errors go unlogged for users." - Pay close attention to apps/sim/background/webhook-execution.ts - the catch block at line 475 needs to restore
safeCompleteWithError()call
Important Files Changed
File Analysis
| Filename | Score | Overview |
|---|---|---|
| apps/sim/background/webhook-execution.ts | 4/5 | Removed duplicate error logging from catch block - executeWorkflowCore now handles error logging, moved safeStart earlier, improved error messages |
| apps/sim/background/schedule-execution.ts | 4/5 | Added logging for early failures (workflow not found, billing account issues, rate limits, usage limits, trigger block missing), moved loggingSession declaration earlier |
| apps/sim/app/api/webhooks/trigger/[path]/route.test.ts | 5/5 | Added missing mocks for @trigger.dev/sdk and logs-webhook-delivery, removed verbose comments to clean up test file |
Sequence Diagram
sequenceDiagram
participant Client
participant ScheduleJob as Schedule/Webhook Job
participant LoggingSession
participant WorkflowCore as executeWorkflowCore
participant Database
Client->>ScheduleJob: Trigger execution
alt Early validation failures
ScheduleJob->>Database: Check workflow/billing/limits
Database-->>ScheduleJob: Validation fails
ScheduleJob->>LoggingSession: safeStart()
ScheduleJob->>LoggingSession: safeCompleteWithError()
LoggingSession->>Database: Log error to execution table
ScheduleJob-->>Client: Return early (error logged)
else Normal execution
ScheduleJob->>LoggingSession: safeStart()
ScheduleJob->>WorkflowCore: executeWorkflowCore()
alt Workflow execution succeeds
WorkflowCore->>Database: Execute workflow
WorkflowCore->>LoggingSession: safeComplete()
LoggingSession->>Database: Log success
WorkflowCore-->>ScheduleJob: Success result
else Workflow execution fails
WorkflowCore->>Database: Execute workflow
WorkflowCore->>LoggingSession: safeCompleteWithError()
LoggingSession->>Database: Log error with trace spans
WorkflowCore-->>ScheduleJob: Throw error
end
ScheduleJob-->>Client: Return result
end
Additional Comments (1)
-
apps/sim/background/webhook-execution.ts, line 475-483 (link)logic: removed error logging breaks early failure paths. errors thrown before
executeWorkflowCoreis called (lines 148-150, 216, 431, and file processing errors) won't be logged to database since this catch block no longer callssafeCompleteWithError(). only errors from withinexecuteWorkflowCoreget logged.
3 files reviewed, 2 comments
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Greptile Overview
Greptile Summary
Improved error logging reliability by moving loggingSession.safeStart() earlier in the execution flow (webhook-execution.ts:135-143). Previously called after database lookups and decryption, safeStart now executes immediately, ensuring that any failures during workflow state loading, database queries, or environment variable decryption are properly logged.
Key changes:
- Removed duplicate usage limit check from
webhook-execution.ts(already enforced inprocessor.tsbefore job dispatch) safeStartnow called with emptyworkspaceIdandvariables, which are resolved later but aren't critical for initial logging session creation- Enhanced error message for missing workflow state to specify whether workflow is saved/deployed
- Simplified error handling flow by ensuring logging session starts before any potentially failing operations
The change addresses the PR goal: "ensure that we can log for any failure, so no errors go unlogged for users"
Confidence Score: 4/5
- This PR is safe to merge with minimal risk
- The changes improve error logging reliability by starting the logging session earlier. The removal of the usage check is valid since it's already performed upstream in
processor.ts. The earlysafeStartcall with emptyworkspaceIdand variables is acceptable becauseLoggingSession.safeStarthandles missing values gracefully and creates a minimal logging session when needed (see logging-session.ts:241-284). One point deducted because the early logging session won't have complete context (workspaceId, decrypted variables) until later in execution, which the existing comment acknowledges - No files require special attention
Important Files Changed
File Analysis
| Filename | Score | Overview |
|---|---|---|
| apps/sim/background/webhook-execution.ts | 4/5 | Moved safeStart earlier to ensure error logging for all failures; removed duplicate usage check; improved error messages |
Sequence Diagram
sequenceDiagram
participant Client as Webhook Client
participant Processor as WebhookProcessor
participant Execution as WebhookExecution
participant LogSession as LoggingSession
participant DB as Database
participant Core as ExecutionCore
Client->>Processor: Webhook Request
Processor->>Processor: checkServerSideUsageLimits()
alt Usage Exceeded
Processor->>LogSession: safeStart() + safeCompleteWithError()
Processor->>Client: 402 Payment Required
else Usage OK
Processor->>Execution: executeWebhookJob()
Execution->>LogSession: safeStart(userId, workspaceId='', variables={})
Note over LogSession: Early start ensures all errors are logged
Execution->>DB: loadWorkflowState()
alt Workflow Not Found
Execution->>LogSession: safeCompleteWithError()
Execution-->>Processor: throw Error
else Workflow Found
Execution->>DB: Get workspaceId & variables
Execution->>DB: Decrypt environment variables
Execution->>Execution: Process webhook input
Execution->>Core: executeWorkflowCore()
Core->>LogSession: Use existing session
Core-->>Execution: ExecutionResult
Execution-->>Processor: Success
end
end
1 file reviewed, no comments
… executions (#1808) * improvement(async): improve error capturing for asynchronous workflow executions * surface more erros * added more logs * fix failing tests * ack DB comments
Summary
Type of Change
Testing
Tested manually
Checklist