Add readonly async pool settings for pre-warming and optimized pool recycling#7211
Add readonly async pool settings for pre-warming and optimized pool recycling#7211
Conversation
|
The latest updates on your projects. Learn more about Vercel for GitHub. 2 Skipped Deployments
|
Greptile OverviewGreptile SummaryOverviewThis PR adds configuration options to optimize the async read-only database connection pool for higher throughput. It introduces 6 new settings: pool size/overflow, pre-ping, skip rollback, autocommit, and pre-warming. Key Changes
Critical Issues Found1. Prewarm Configuration Never Used (CRITICAL)The 2. Incorrect Type AnnotationLine 149-151 assigns Design ConcernsAggressive Performance DefaultsThe PR enables both
Consider whether these should be opt-in rather than default, or add explicit documentation about the required database configuration. Prewarm Default ValueThe prewarm feature defaults to Confidence Score: 2/5
Important Files ChangedFile Analysis
|
src/fides/api/db/ctl_session.py
Outdated
| session = readonly_async_session_factory() | ||
|
|
||
| try: | ||
| yield session | ||
| await session.commit() |
There was a problem hiding this comment.
There's an inconsistency: the session is created after acquiring and releasing the lock, but the session commit/rollback operations happen outside the lock. This means if warming fails or is still in progress when multiple requests arrive, some sessions could be created before the pool is fully warmed.
Additionally, for a read-only database with AUTOCOMMIT isolation level and pool_reset_on_return=None, calling await session.commit() on line 121 is redundant. The commit is effectively a no-op since autocommit is enabled and there's no transaction to commit.
Consider whether the commit/rollback logic should be simplified for read-only sessions, especially when autocommit is enabled.
There was a problem hiding this comment.
I believe the lock should be acquired at the beginning of the generator by whomever calls it and given to whoever requests the lock first; if anyone else is currently warming the pool, block until the pool is warm, at which point it will discover that the pool is warmed and then the lock will be held only briefly by anyone from then onwards; the session should not be inside the lock because then the others will block until the transaction is done.
| async_readonly_database_pool_skip_rollback: bool = Field( | ||
| default=True, | ||
| description="If true, the async engine will skip rolling back connections when they are returned to the pool.", | ||
| ) | ||
| async_readonly_database_autocommit: bool = Field( | ||
| default=True, | ||
| description="If true, the async engine will autocommit transactions. This should effectively be a no-op because it's a readonly database.", | ||
| ) |
There was a problem hiding this comment.
The combination of async_readonly_database_pool_skip_rollback=True (line 202-205) and async_readonly_database_autocommit=True (line 206-209) as defaults is aggressive and could cause issues if the database isn't truly read-only or if application code assumes standard transaction semantics.
Concerns:
-
AUTOCOMMIT with read-only: While the comment says "this should effectively be a no-op because it's a readonly database," this assumes database-level enforcement. If the readonly restriction is only enforced at the user/role level and there's a misconfiguration, this could cause unintended writes to auto-commit.
-
Skipping rollback: Combined with AUTOCOMMIT, this means sessions won't be cleaned up properly if they encounter errors. While this improves performance, it could leave connections in unexpected states.
Consider adding more documentation about the assumptions these settings make (e.g., database must enforce read-only at the user/role level) and potentially making these opt-in rather than default.
Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!
There was a problem hiding this comment.
@johnewart do we want set defaults as false for now and just change the settings to true in nightly?
There was a problem hiding this comment.
Updated defaults to false
| async_readonly_database_prewarm: bool = Field( | ||
| default=False, | ||
| description="Whether to warm the asynchronous read-only database pool on startup.", | ||
| exclude=True, | ||
| ) |
There was a problem hiding this comment.
The PR description states that pool pre-warming is one of the features being added "to achieve higher read throughput on async endpoints." However, this setting defaults to False, which means users won't get this benefit unless they explicitly enable it.
Additionally, due to the bug in ctl_session.py lines 143-147, this setting is currently not being used at all - the pool is always warmed when a readonly URI is configured.
Consider whether:
- The default should be
Trueto match the PR's intent of improving throughput - The setting should be better documented about when to enable/disable it
- There should be a warning logged when warming is disabled but a readonly URI is configured
Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!
There was a problem hiding this comment.
Defaults are false to avoid introducing default behavior unintentionally
…ed a logic issue with choosing the prewarmed connection pool, fixed type annotations, put connection yield inside of the lock
erosselli
left a comment
There was a problem hiding this comment.
Approving with my open comments
This change adds some new configuration flags and changes the way the readonly async connection pool behaves to achieve higher read throughput on async endpoints. These include: