Skip to content

Conversation

@waleedlatif1
Copy link
Collaborator

Summary

  • improve error capturing for asynchronous workflow executions and various trigger methods
  • ensure that we can log for any failure, so no errors go unlogged for users

Type of Change

  • Bug fix

Testing

Tested manually

Checklist

  • Code follows project style guidelines
  • Self-reviewed my changes
  • Tests added/updated and passing
  • No new warnings introduced
  • I confirm that I have read and agree to the terms outlined in the Contributor License Agreement (CLA)

@vercel
Copy link

vercel bot commented Nov 5, 2025

The latest updates on your projects. Learn more about Vercel for GitHub.

1 Skipped Deployment
Project Deployment Preview Comments Updated (UTC)
docs Skipped Skipped Nov 5, 2025 2:04am

Copy link
Contributor

@greptile-apps greptile-apps bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Greptile Overview

Greptile Summary

Improved error logging across all asynchronous workflow execution paths (webhooks, schedules, chat, manual executions) to ensure no errors go unlogged for users.

Key Changes:

  • Added LoggingSession error tracking for early failures (inactive chat, workflow not found, missing trigger blocks, file upload failures)
  • Moved webhook usage limit checks from background worker to HTTP handler (processor.ts) for earlier rejection
  • Fixed HTTP response codes in processor.ts (changed from 200 to proper 402/429 status codes)
  • Changed error response format from message to error field for consistency
  • Added usage limit checks with error logging to both executeWorkflow() wrapper and POST handler in workflow execution route
  • Moved loggingSession.safeStart() earlier in webhook execution to capture state before workflow loading

Architecture Improvement:
The webhook execution flow was refactored to check usage/rate limits at the HTTP layer before queueing the background job. This prevents unnecessary queue pollution and provides immediate feedback to webhook senders.

Confidence Score: 4/5

  • Safe to merge with minor concerns about error handling consistency
  • The PR successfully achieves comprehensive error logging across all async execution paths. Code follows established patterns using LoggingSession.safeStart() and safeCompleteWithError(). The webhook architecture change (moving usage checks upstream) is a solid improvement. Minor concern about the duplication of usage check logic in workflow execution route (both in executeWorkflow() wrapper and POST handler), though this appears intentional for different execution paths.
  • Pay close attention to apps/sim/app/api/workflows/[id]/execute/route.ts which has duplicate usage checks in different code paths, and apps/sim/lib/webhooks/processor.ts which has important HTTP status code changes

Important Files Changed

File Analysis

Filename Score Overview
apps/sim/app/api/chat/[identifier]/route.ts 4/5 Added comprehensive error logging for chat endpoint failures (inactive chat, workflow not deployed, workspace not found, file upload failures)
apps/sim/app/api/workflows/[id]/execute/route.ts 4/5 Added usage limit checks with error logging to both executeWorkflow() wrapper and POST handler; removed unused import
apps/sim/background/webhook-execution.ts 5/5 Removed usage check (now handled upstream in processor), moved safeStart() earlier, improved error message for missing workflow state
apps/sim/lib/webhooks/processor.ts 4/5 Added error logging for rate limit and usage limit failures; fixed HTTP status codes (200→402/429); changed response format from message to error

Sequence Diagram

sequenceDiagram
    participant Client
    participant API as API Route
    participant Processor as Webhook Processor
    participant Queue as Background Queue
    participant Worker as Background Worker
    participant Logger as LoggingSession
    participant DB as Database

    Note over Client,DB: Webhook Execution Flow (with Error Handling)
    
    Client->>API: POST /api/webhooks/trigger/[path]
    API->>Processor: processWebhook()
    
    Processor->>Processor: findWebhookAndWorkflow()
    alt Webhook not found
        Processor-->>API: 404 Not Found
        API-->>Client: Error Response
    end
    
    Processor->>Processor: checkRateLimits()
    alt Rate limit exceeded
        Processor->>Logger: safeStart() + safeCompleteWithError()
        Logger->>DB: Log execution with error
        Processor-->>API: 429 Rate Limit Exceeded
        API-->>Client: Error Response
    end
    
    Processor->>Processor: checkUsageLimits()
    alt Usage limit exceeded
        Processor->>Logger: safeStart() + safeCompleteWithError()
        Logger->>DB: Log execution with error
        Processor-->>API: 402 Payment Required
        API-->>Client: Error Response
    end
    
    Processor->>Queue: queueWebhookExecution()
    Queue-->>API: Job queued
    API-->>Client: 200 Success
    
    Note over Queue,Worker: Asynchronous Processing
    
    Queue->>Worker: executeWebhookJob()
    Worker->>Logger: new LoggingSession()
    Worker->>Logger: safeStart()
    Logger->>DB: Create execution record
    
    Worker->>Worker: Load workflow state
    alt Workflow not found
        Worker->>Logger: safeCompleteWithError()
        Logger->>DB: Log error
    end
    
    Worker->>Worker: executeWorkflowCore()
    alt Execution succeeds
        Worker->>Logger: safeComplete()
        Logger->>DB: Log successful execution
    else Execution fails
        Worker->>Logger: safeCompleteWithError()
        Logger->>DB: Log failed execution
    end
    
    Note over Client,DB: Chat Execution Flow (with Error Handling)
    
    Client->>API: POST /api/chat/[identifier]
    API->>API: Validate chat is active
    alt Chat inactive
        API->>Logger: safeStart() + safeCompleteWithError()
        Logger->>DB: Log execution with error
        API-->>Client: 403 Forbidden
    end
    
    API->>API: Validate workflow deployed
    alt Workflow not deployed
        API->>Logger: safeStart() + safeCompleteWithError()
        Logger->>DB: Log execution with error
        API-->>Client: 503 Service Unavailable
    end
    
    API->>API: processChatFiles()
    alt File processing fails
        API->>Logger: safeStart() + safeCompleteWithError()
        Logger->>DB: Log execution with error
        API-->>Client: Error Response
    end
    
    API->>API: executeWorkflow()
    API->>API: checkServerSideUsageLimits()
    alt Usage limit exceeded
        API->>Logger: safeStart() + safeCompleteWithError()
        Logger->>DB: Log execution with error
        API-->>Client: 402 Payment Required
    end
    
    API->>API: executeWorkflowCore()
    API->>Logger: Log execution results
    Logger->>DB: Store execution data
    API-->>Client: Stream/JSON Response
Loading

7 files reviewed, no comments

Edit Code Review Agent Settings | Greptile

@waleedlatif1
Copy link
Collaborator Author

@greptile

Copy link
Contributor

@greptile-apps greptile-apps bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Greptile Overview

Greptile Summary

Improves error logging for asynchronous workflow executions by adding database logging for early-exit failure cases in schedule execution (workflow not found, billing issues, rate/usage limits, missing trigger blocks).

Key changes:

  • schedule-execution.ts: Added LoggingSession calls with safeCompleteWithError() for all early validation failures that previously returned without logging
  • webhook-execution.ts: Removed duplicate error logging from catch block since executeWorkflowCore handles it, moved safeStart() earlier
  • route.test.ts: Added missing mocks and cleaned up comments

Critical Issue: webhook-execution.ts removed error logging from the catch block, but errors thrown before executeWorkflowCore is called (workflow state not found, webhook record not found, file processing errors) will no longer be logged to the database. The catch block needs to call safeCompleteWithError() to ensure all errors are logged.

Confidence Score: 2/5

  • Not safe to merge - critical error logging regression in webhook execution
  • Score reflects a critical bug where errors occurring before executeWorkflowCore is called in webhook-execution.ts will not be logged to database due to removed error handling in catch block. This breaks the PR's stated goal of ensuring "no errors go unlogged for users."
  • Pay close attention to apps/sim/background/webhook-execution.ts - the catch block at line 475 needs to restore safeCompleteWithError() call

Important Files Changed

File Analysis

Filename Score Overview
apps/sim/background/webhook-execution.ts 4/5 Removed duplicate error logging from catch block - executeWorkflowCore now handles error logging, moved safeStart earlier, improved error messages
apps/sim/background/schedule-execution.ts 4/5 Added logging for early failures (workflow not found, billing account issues, rate limits, usage limits, trigger block missing), moved loggingSession declaration earlier
apps/sim/app/api/webhooks/trigger/[path]/route.test.ts 5/5 Added missing mocks for @trigger.dev/sdk and logs-webhook-delivery, removed verbose comments to clean up test file

Sequence Diagram

sequenceDiagram
    participant Client
    participant ScheduleJob as Schedule/Webhook Job
    participant LoggingSession
    participant WorkflowCore as executeWorkflowCore
    participant Database

    Client->>ScheduleJob: Trigger execution
    
    alt Early validation failures
        ScheduleJob->>Database: Check workflow/billing/limits
        Database-->>ScheduleJob: Validation fails
        ScheduleJob->>LoggingSession: safeStart()
        ScheduleJob->>LoggingSession: safeCompleteWithError()
        LoggingSession->>Database: Log error to execution table
        ScheduleJob-->>Client: Return early (error logged)
    else Normal execution
        ScheduleJob->>LoggingSession: safeStart()
        ScheduleJob->>WorkflowCore: executeWorkflowCore()
        
        alt Workflow execution succeeds
            WorkflowCore->>Database: Execute workflow
            WorkflowCore->>LoggingSession: safeComplete()
            LoggingSession->>Database: Log success
            WorkflowCore-->>ScheduleJob: Success result
        else Workflow execution fails
            WorkflowCore->>Database: Execute workflow
            WorkflowCore->>LoggingSession: safeCompleteWithError()
            LoggingSession->>Database: Log error with trace spans
            WorkflowCore-->>ScheduleJob: Throw error
        end
        
        ScheduleJob-->>Client: Return result
    end
Loading

Additional Comments (1)

  1. apps/sim/background/webhook-execution.ts, line 475-483 (link)

    logic: removed error logging breaks early failure paths. errors thrown before executeWorkflowCore is called (lines 148-150, 216, 431, and file processing errors) won't be logged to database since this catch block no longer calls safeCompleteWithError(). only errors from within executeWorkflowCore get logged.

3 files reviewed, 2 comments

Edit Code Review Agent Settings | Greptile

@waleedlatif1
Copy link
Collaborator Author

@greptile

Copy link
Contributor

@greptile-apps greptile-apps bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Greptile Overview

Greptile Summary

Improved error logging reliability by moving loggingSession.safeStart() earlier in the execution flow (webhook-execution.ts:135-143). Previously called after database lookups and decryption, safeStart now executes immediately, ensuring that any failures during workflow state loading, database queries, or environment variable decryption are properly logged.

Key changes:

  • Removed duplicate usage limit check from webhook-execution.ts (already enforced in processor.ts before job dispatch)
  • safeStart now called with empty workspaceId and variables, which are resolved later but aren't critical for initial logging session creation
  • Enhanced error message for missing workflow state to specify whether workflow is saved/deployed
  • Simplified error handling flow by ensuring logging session starts before any potentially failing operations

The change addresses the PR goal: "ensure that we can log for any failure, so no errors go unlogged for users"

Confidence Score: 4/5

  • This PR is safe to merge with minimal risk
  • The changes improve error logging reliability by starting the logging session earlier. The removal of the usage check is valid since it's already performed upstream in processor.ts. The early safeStart call with empty workspaceId and variables is acceptable because LoggingSession.safeStart handles missing values gracefully and creates a minimal logging session when needed (see logging-session.ts:241-284). One point deducted because the early logging session won't have complete context (workspaceId, decrypted variables) until later in execution, which the existing comment acknowledges
  • No files require special attention

Important Files Changed

File Analysis

Filename Score Overview
apps/sim/background/webhook-execution.ts 4/5 Moved safeStart earlier to ensure error logging for all failures; removed duplicate usage check; improved error messages

Sequence Diagram

sequenceDiagram
    participant Client as Webhook Client
    participant Processor as WebhookProcessor
    participant Execution as WebhookExecution
    participant LogSession as LoggingSession
    participant DB as Database
    participant Core as ExecutionCore

    Client->>Processor: Webhook Request
    Processor->>Processor: checkServerSideUsageLimits()
    alt Usage Exceeded
        Processor->>LogSession: safeStart() + safeCompleteWithError()
        Processor->>Client: 402 Payment Required
    else Usage OK
        Processor->>Execution: executeWebhookJob()
        Execution->>LogSession: safeStart(userId, workspaceId='', variables={})
        Note over LogSession: Early start ensures all errors are logged
        Execution->>DB: loadWorkflowState()
        alt Workflow Not Found
            Execution->>LogSession: safeCompleteWithError()
            Execution-->>Processor: throw Error
        else Workflow Found
            Execution->>DB: Get workspaceId & variables
            Execution->>DB: Decrypt environment variables
            Execution->>Execution: Process webhook input
            Execution->>Core: executeWorkflowCore()
            Core->>LogSession: Use existing session
            Core-->>Execution: ExecutionResult
            Execution-->>Processor: Success
        end
    end
Loading

1 file reviewed, no comments

Edit Code Review Agent Settings | Greptile

@waleedlatif1 waleedlatif1 merged commit f65d62e into staging Nov 5, 2025
9 checks passed
@waleedlatif1 waleedlatif1 deleted the sim-315 branch November 5, 2025 02:16
waleedlatif1 added a commit that referenced this pull request Nov 12, 2025
… executions (#1808)

* improvement(async): improve error capturing for asynchronous workflow executions

* surface more erros

* added more logs

* fix failing tests

* ack DB comments
@waleedlatif1 waleedlatif1 mentioned this pull request Nov 12, 2025
10 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants