fix: prevent memory exhaustion in loops with bounded iteration outputs #2527

aadamsx · 2025-12-22T19:05:34Z

Summary

This PR addresses memory exhaustion issues that occur when running workflows with loops containing agent blocks that make many tool calls (e.g., MCP file operations).

Fixes #2525

Problem

Memory accumulated without bound in two key areas during workflow execution:

allIterationOutputs in LoopScope - every loop iteration pushed results to this array with no limit
blockLogs in ExecutionContext - every block execution added logs with no pruning

This caused OOM crashes even on systems with 64GB+ RAM during long-running workflow executions with loops.

Solution

Added memory management with configurable limits in two places:

Loop Orchestrator (`apps/sim/executor/orchestrators/loop.ts`)

New addIterationOutputsWithMemoryLimit() method
Limits stored iterations to MAX_STORED_ITERATION_OUTPUTS (default: 100)
Monitors memory size with MAX_ITERATION_OUTPUTS_SIZE_BYTES (default: 50MB)
Discards oldest iterations when limits exceeded
Logs warning when truncation occurs
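
The behavior listed above can be sketched roughly as follows. This is a minimal illustration, not the PR's actual code: `LoopScopeSketch`, the `limits` parameter, and the returned discard count are assumptions made for the example, and the size check runs on every call here for brevity rather than periodically.

```typescript
// Hypothetical stand-in for the PR's LoopScope; the real type has more fields.
interface LoopScopeSketch {
  allIterationOutputs: unknown[][]
}

// Defaults taken from the PR description; 50MB is assumed to mean 50 * 1024 * 1024 bytes.
const DEFAULT_LIMITS = {
  maxStored: 100,
  maxSizeBytes: 50 * 1024 * 1024,
}

// Appends one iteration's results, then discards the oldest entries if a limit is exceeded.
// Returns how many entries were discarded so the caller can log a warning.
function addIterationOutputsWithMemoryLimit(
  scope: LoopScopeSketch,
  results: unknown[],
  limits = DEFAULT_LIMITS
): number {
  scope.allIterationOutputs.push(results)
  let discarded = 0

  // Count limit: index 0 is the oldest entry, so slice off the front.
  const overflow = scope.allIterationOutputs.length - limits.maxStored
  if (overflow > 0) {
    scope.allIterationOutputs = scope.allIterationOutputs.slice(overflow)
    discarded += overflow
  }

  // Size limit: approximate bytes as serialized length * 2, then drop the oldest half.
  const estimatedSize = JSON.stringify(scope.allIterationOutputs).length * 2
  if (estimatedSize > limits.maxSizeBytes) {
    const half = Math.ceil(scope.allIterationOutputs.length / 2)
    scope.allIterationOutputs = scope.allIterationOutputs.slice(half)
    discarded += half
  }

  return discarded
}
```

Returning the discard count instead of logging inside the function is a choice made for this sketch; the PR logs a warning at the point of truncation.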

Block Executor (`apps/sim/executor/execution/block-executor.ts`)

New addBlockLogWithMemoryLimit() method
Limits stored logs to MAX_BLOCK_LOGS (default: 500)
Monitors memory size with MAX_BLOCK_LOGS_SIZE_BYTES (default: 100MB)
Periodic size checks every 50 logs to avoid frequent JSON serialization
Logs warning when truncation occurs
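
As an illustration of the periodic check described above, the sketch below serializes the logs only when the array length is a multiple of 50. The type names, `SIZE_CHECK_INTERVAL`, and the `sizeChecksRun` counter are hypothetical and exist only to make the example observable; the real `BlockLog` and `ExecutionContext` shapes are not shown in this PR description.

```typescript
// Hypothetical shapes; the real BlockLog and ExecutionContext carry more fields.
interface BlockLogSketch {
  blockId: string
  output: unknown
}
interface ContextSketch {
  blockLogs: BlockLogSketch[]
}

const MAX_BLOCK_LOGS = 500
const MAX_BLOCK_LOGS_SIZE_BYTES = 100 * 1024 * 1024 // assumes 1MB = 1024 * 1024 bytes
const SIZE_CHECK_INTERVAL = 50 // hypothetical name for the "every 50 logs" interval

let sizeChecksRun = 0 // instrumentation for this sketch only

function estimateBlockLogsSize(logs: BlockLogSketch[]): number {
  sizeChecksRun++
  return JSON.stringify(logs).length * 2
}

function addBlockLogWithMemoryLimit(ctx: ContextSketch, log: BlockLogSketch): void {
  ctx.blockLogs.push(log)

  // Count limit: keep only the newest MAX_BLOCK_LOGS entries.
  if (ctx.blockLogs.length > MAX_BLOCK_LOGS) {
    ctx.blockLogs = ctx.blockLogs.slice(ctx.blockLogs.length - MAX_BLOCK_LOGS)
  }

  // Size limit: serialize only when the length is a multiple of the interval.
  if (ctx.blockLogs.length % SIZE_CHECK_INTERVAL === 0) {
    if (estimateBlockLogsSize(ctx.blockLogs) > MAX_BLOCK_LOGS_SIZE_BYTES) {
      ctx.blockLogs = ctx.blockLogs.slice(Math.ceil(ctx.blockLogs.length / 2))
    }
  }
}

const ctx: ContextSketch = { blockLogs: [] }
for (let i = 0; i < 120; i++) {
  addBlockLogWithMemoryLimit(ctx, { blockId: `block-${i}`, output: { ok: true } })
}
console.log(sizeChecksRun) // serialization ran at lengths 50 and 100 only
```

In this sketch, once the array sits at its cap the length no longer changes between calls, so a length-based trigger stops behaving like a 1-in-50 sample. How the real code behaves at the cap depends on where the check sits relative to truncation.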

New Constants (`apps/sim/executor/constants.ts`)

MAX_STORED_ITERATION_OUTPUTS: 100
MAX_ITERATION_OUTPUTS_SIZE_BYTES: 50MB
MAX_BLOCK_LOGS: 500
MAX_BLOCK_LOGS_SIZE_BYTES: 100MB
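
For reference, the four limits might be declared along these lines. The object name `DEFAULTS` appears in the review snippet further down, but the surrounding structure and the byte arithmetic (binary megabytes) are assumptions, not a copy of `constants.ts`.

```typescript
// Sketch only: assumes "MB" in the PR description means 1024 * 1024 bytes.
const MB = 1024 * 1024

const DEFAULTS = {
  // Loop orchestrator: cap on stored iterations and on their estimated size.
  MAX_STORED_ITERATION_OUTPUTS: 100,
  MAX_ITERATION_OUTPUTS_SIZE_BYTES: 50 * MB,
  // Block executor: cap on stored logs and on their estimated size.
  MAX_BLOCK_LOGS: 500,
  MAX_BLOCK_LOGS_SIZE_BYTES: 100 * MB,
} as const
```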

Trade-offs

Final aggregated loop.results will contain only the most recent iterations (up to 100)
Block logs in execution data will contain only the most recent logs (up to 500)
Warnings are logged when truncation occurs, allowing users to see if limits were hit

Testing

For loop: Execute a workflow with 200+ loop iterations, verify memory doesn't grow unbounded
Agent in loop: Run a loop with agent blocks making 50+ tool calls per iteration, verify no OOM

Files Changed

apps/sim/executor/constants.ts - Added new configurable limits
apps/sim/executor/orchestrators/loop.ts - Added memory-bounded iteration storage
apps/sim/executor/execution/block-executor.ts - Added memory-bounded log storage

…dioai#2481) The realtime service network policy was missing the custom egress rules section that allows configuration of additional egress rules via values.yaml. This caused the realtime pods to be unable to connect to external databases (e.g., PostgreSQL on port 5432) when using external database configurations. The app network policy already had this section, but the realtime network policy was missing it, creating an inconsistency and preventing the realtime service from accessing external databases configured via networkPolicy.egress values. This fix adds the same custom egress rules template section to the realtime network policy, matching the app network policy behavior and allowing users to configure database connectivity via values.yaml.

vercel · 2025-12-22T19:05:40Z

@aadamsx is attempting to deploy a commit to the Sim Team on Vercel.

A member of the Team first needs to authorize it.

greptile-apps · 2025-12-22T19:09:23Z

Greptile Summary

This PR implements memory management for long-running workflow loops by adding bounded storage for iteration outputs and block logs.

Key Changes:

Added configurable memory limits in constants.ts (100 iterations/50MB for loops, 500 logs/100MB for blocks)
Implemented addIterationOutputsWithMemoryLimit() in loop orchestrator that truncates oldest iterations when limits exceeded
Implemented addBlockLogWithMemoryLimit() in block executor that discards older logs when limits exceeded
Both use periodic memory size checks (every 10 iterations / every 50 logs) to avoid frequent serialization overhead
Added custom egress rules support to Helm network policy (unrelated infrastructure change)

Considerations:

Memory size checks use JSON.stringify().length * 2 as approximation, which may not reflect actual V8 heap usage
Periodic checking (modulo arithmetic) means memory can grow beyond limits between checks
Users lose access to older iteration results/logs after truncation (trade-off documented in PR)
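
To make the first consideration concrete, here is a small sketch of the `JSON.stringify().length * 2` approximation. The function name mirrors the `estimateObjectSize()` call in the sequence diagram, but the body is an assumption based on the description above.

```typescript
// Approximates in-memory size as 2 bytes per UTF-16 code unit of the JSON text.
function estimateObjectSize(value: unknown): number {
  return JSON.stringify(value).length * 2
}

const sample = { a: "hello" }
const json = JSON.stringify(sample) // {"a":"hello"} is 13 characters
console.log(json.length, estimateObjectSize(sample)) // 13 26
```

The result is a proxy rather than a measurement. The real heap footprint may be smaller (V8 can store Latin-1-only strings at one byte per character) or larger (per-object overhead is not part of the JSON text).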

Confidence Score: 4/5

safe to merge with minor timing edge case in memory checks
implementation correctly addresses the OOM issue with reasonable defaults, but periodic memory checking has a timing gap where memory can exceed limits between checks (especially after count-based truncation resets the modulo counter)
pay attention to apps/sim/executor/orchestrators/loop.ts due to memory check timing issue

Important Files Changed

| Filename | Overview |
| --- | --- |
| apps/sim/executor/constants.ts | added memory limit constants for loop iterations and block logs with clear documentation |
| apps/sim/executor/orchestrators/loop.ts | implemented memory-bounded iteration storage with count and size limits; memory check timing may cause issues |
| apps/sim/executor/execution/block-executor.ts | implemented memory-bounded log storage with count and size limits, properly handles serialization failures |

Sequence Diagram

sequenceDiagram
    participant WF as Workflow Executor
    participant BE as BlockExecutor
    participant LO as LoopOrchestrator
    participant LS as LoopScope
    participant CTX as ExecutionContext
    
    Note over WF,CTX: Loop Execution with Memory Management
    
    WF->>LO: initializeLoopScope(ctx, loopId)
    LO->>LS: create LoopScope
    LS-->>LO: scope with empty allIterationOutputs[]
    
    loop Each Iteration
        WF->>BE: execute(ctx, node, block)
        BE->>BE: createBlockLog()
        BE->>BE: addBlockLogWithMemoryLimit(ctx, blockLog)
        
        alt blockLogs.length > MAX_BLOCK_LOGS (500)
            BE->>CTX: slice and discard oldest logs
            BE->>BE: log warning
        end
        
        alt blockLogs.length % 50 === 0
            BE->>BE: estimateBlockLogsSize()
            alt size > MAX_BLOCK_LOGS_SIZE_BYTES (100MB)
                BE->>CTX: discard oldest half of logs
                BE->>BE: log warning
            end
        end
        
        BE->>CTX: blockLogs.push(blockLog)
        BE-->>WF: NormalizedBlockOutput
        
        WF->>LO: storeLoopNodeOutput(ctx, loopId, nodeId, output)
        LO->>LS: currentIterationOutputs.set(nodeId, output)
        
        WF->>LO: evaluateLoopContinuation(ctx, loopId)
        LO->>LO: collect iterationResults from currentIterationOutputs
        LO->>LO: addIterationOutputsWithMemoryLimit(scope, results)
        
        alt allIterationOutputs.length > MAX_STORED_ITERATION_OUTPUTS (100)
            LO->>LS: slice and discard oldest iterations
            LO->>LO: log warning
        end
        
        alt allIterationOutputs.length % 10 === 0
            LO->>LO: estimateObjectSize()
            alt size > MAX_ITERATION_OUTPUTS_SIZE_BYTES (50MB)
                LO->>LS: discard oldest half of iterations
                LO->>LO: log warning
            end
        end
        
        LO->>LS: allIterationOutputs.push(results)
        LO->>LS: currentIterationOutputs.clear()
        
        LO->>LO: evaluateCondition()
        alt condition false
            LO->>LO: createExitResult()
            LO-->>WF: shouldExit=true
        else condition true
            LO->>LS: increment iteration
            LO-->>WF: shouldContinue=true
        end
    end
    
    Note over WF,CTX: Memory bounded to ~150MB total

greptile-apps

Additional Comments (5)

apps/sim/executor/orchestrators/loop.ts, line 519-529

logic: JSON.stringify() runs on EVERY iteration before size check, defeating the purpose of memory optimization. With 1000 iterations, this serializes potentially GBs of data 1000 times.

Move size check to only run periodically (e.g., every 10 iterations) like block-executor.ts:

Then in addIterationOutputsWithMemoryLimit at line 497:
```
// Check the memory size limit periodically (every 10 iterations) to avoid frequent serialization
if (scope.allIterationOutputs.length % 10 === 0) {
  const estimatedSize = this.estimateObjectSize(scope.allIterationOutputs)
  if (estimatedSize > DEFAULTS.MAX_ITERATION_OUTPUTS_SIZE_BYTES) {
    // ...existing truncation logic (discard oldest iterations, log a warning)
  }
}
```

apps/sim/executor/execution/block-executor.ts, line 704-710

logic: returns max limit on error, which prevents cleanup and allows unbounded growth if serialization consistently fails

apps/sim/executor/orchestrators/loop.ts, line 519-529

logic: same issue - returning max limit prevents cleanup on serialization errors

apps/sim/executor/orchestrators/loop.ts, line 483-494

logic: slicing from discardCount removes NEWEST iterations instead of oldest. The intent is to keep the most recent 100 iterations.

apps/sim/executor/execution/block-executor.ts, line 673-682

logic: slicing from discardCount removes NEWEST logs instead of oldest

3 files reviewed, 5 comments

This change addresses memory exhaustion issues that occur when running workflows with loops containing agent blocks that make many tool calls.

Problem: Memory accumulated unbounded in two key areas:
1. allIterationOutputs in LoopScope - every iteration pushed results
2. blockLogs in ExecutionContext - every block execution added logs

Solution: Added memory management with configurable limits in constants.ts:
- MAX_STORED_ITERATION_OUTPUTS (100) and MAX_ITERATION_OUTPUTS_SIZE_BYTES (50MB)
- MAX_BLOCK_LOGS (500) and MAX_BLOCK_LOGS_SIZE_BYTES (100MB)

Loop orchestrator (loop.ts):
- New addIterationOutputsWithMemoryLimit() method
- Periodic size checks (every 10 iterations) to avoid serialization overhead
- Discards oldest iterations when limits exceeded

Block executor (block-executor.ts):
- New addBlockLogWithMemoryLimit() method
- Periodic size checks (every 50 logs)
- Discards oldest logs when limits exceeded

Trade-offs:
- Final aggregated results contain only recent iterations
- Logs show warning when truncation occurs for debugging

Fixes simstudioai#2525

aadamsx · 2025-12-27T02:36:33Z

Response to Greptile Review

The issues identified in the review on commit aae7c40 have been addressed in the current HEAD (387efb6):

✅ Performance Issue (Fixed)

Added periodic size checking - JSON.stringify() now only runs every 10 iterations for loop outputs and every 50 logs for block logs, rather than on every iteration.

✅ Error Handling (Fixed)

Changed error return to MAX + 1 instead of MAX so that serialization failures properly trigger cleanup.
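
A short sketch of why the fallback value matters, assuming the cleanup condition is a strict `>` comparison against the limit (as in the review snippet above). `MAX_SIZE_BYTES` and `estimateSizeOrOverLimit` are hypothetical names used only for this example.

```typescript
const MAX_SIZE_BYTES = 50 * 1024 * 1024 // hypothetical stand-in for the configured limit

// Returns an estimate, or a value just over the limit when serialization fails.
function estimateSizeOrOverLimit(value: unknown): number {
  try {
    return JSON.stringify(value).length * 2
  } catch {
    return MAX_SIZE_BYTES + 1
  }
}

// A circular reference makes JSON.stringify throw a TypeError.
const circular: Record<string, unknown> = {}
circular["self"] = circular

const estimate = estimateSizeOrOverLimit(circular)
// With a strict ">" check, MAX_SIZE_BYTES + 1 triggers cleanup; MAX_SIZE_BYTES would not.
const triggersCleanup = estimate > MAX_SIZE_BYTES
console.log(triggersCleanup) // true
```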

ℹ️ Array Slicing Logic (Already Correct)

The Greptile comment about array slicing is incorrect. The current logic is working as intended:

scope.allIterationOutputs = scope.allIterationOutputs.slice(discardCount)

When we have 110 items and MAX is 100:

discardCount = 110 - 100 = 10
slice(10) returns items at indices 10-109 (newest 100 items)
Items at indices 0-9 (oldest 10) are discarded

This correctly implements "discard oldest, keep newest" as stated in the docstring. The array is built by pushing new items, so index 0 is oldest and the last index is newest.
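
The arithmetic above can be checked with a standalone snippet that uses plain numbers in place of iteration results:

```typescript
const MAX = 100

// Build 110 items by pushing, so index 0 is the oldest and index 109 the newest.
const items: number[] = []
for (let i = 0; i < 110; i++) items.push(i)

const discardCount = items.length - MAX // 10
const kept = items.slice(discardCount) // copies indices 10 through 109

console.log(kept.length, kept[0], kept[kept.length - 1]) // 100 10 109
```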

greptile-apps

Additional Comments (1)

apps/sim/executor/orchestrators/loop.ts, line 496-497

logic: the memory check happens only on exact multiples of 10, so iterations 101-109 are not checked after the first truncation at iteration 100

If iteration 100 hits the count limit and truncates to 100 items, iterations 101-109 won't check memory size until iteration 110, potentially allowing another 50MB+ of growth.
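
One possible way to decouple the periodic check from truncation, shown purely as an illustration and not something this PR implements, is to key the check on a counter that only ever increases. All names below (`BoundedScope`, `totalAdded`, `addWithCounter`) are hypothetical.

```typescript
// Hypothetical alternative: key the periodic check on a counter that never resets,
// instead of on the array length, which stops changing once the count cap is reached.
interface BoundedScope {
  outputs: unknown[]
  totalAdded: number // hypothetical field, not part of the PR
}

const MAX_STORED = 100
const CHECK_INTERVAL = 10

function addWithCounter(scope: BoundedScope, item: unknown, onSizeCheck: () => void): void {
  scope.outputs.push(item)
  scope.totalAdded++

  // Count limit: keep only the newest MAX_STORED entries.
  if (scope.outputs.length > MAX_STORED) {
    scope.outputs = scope.outputs.slice(scope.outputs.length - MAX_STORED)
  }

  // Fires on additions 10, 20, 30, ... regardless of how many items were truncated.
  if (scope.totalAdded % CHECK_INTERVAL === 0) {
    onSizeCheck()
  }
}

const scope: BoundedScope = { outputs: [], totalAdded: 0 }
let checks = 0
for (let i = 0; i < 130; i++) {
  addWithCounter(scope, i, () => {
    checks++
  })
}
console.log(checks, scope.outputs.length) // 13 100
```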

4 files reviewed, 1 comment

icecrasher321 and others added 7 commits December 18, 2025 16:23

v0.5.35: helm updates, copilot improvements, 404 for docs, salesforce…

eb07a08

… fixes, subflow resize clamping

v0.5.36: hitl improvements, opengraph, slack fixes, one-click unsubsc…

4d1a9a3

…ribe, auth checks, new db indexes

v0.5.37: redaction utils consolidation, logs updates, autoconnect imp…

3e697d9

…rovements, additional kb tag types

v0.5.38: snap to grid, copilot ux improvements, billing line items

4827866

v0.5.39: notion, workflow variables fixes

0f4ec96

v0.5.40: supabase ops to allow non-public schemas, jira uuid

3d9d9cb

greptile-apps bot reviewed Dec 22, 2025

aadamsx force-pushed the fix/memory-accumulation-in-loops branch from 36bbe99 to 387efb6 Compare December 22, 2025 19:19

aadamsx changed the base branch from main to staging December 27, 2025 03:07

waleedlatif1 deleted the branch simstudioai:staging December 27, 2025 05:25

waleedlatif1 closed this Dec 27, 2025

waleedlatif1 reopened this Dec 27, 2025

greptile-apps bot reviewed Dec 27, 2025
