fix connection config list endpoint performance regression#7501
fix connection config list endpoint performance regression#7501
Conversation
The selectinload(ConnectionConfig.system) triggered all of System's default lazy="selectin" relationships (privacy_declarations, data_stewards, connection_configs, assets, system_groups), creating a cascade of unnecessary queries. Restrict to load_only the two columns needed (fides_key, name) and noload all relationships. Made-with: Cursor
Made-with: Cursor
|
The latest updates on your projects. Learn more about Vercel for GitHub. 2 Skipped Deployments
|
Move the selectinload + load_only + noload pattern into a reusable
linked_system_load_options() helper in system_integration_link/repository.py.
Apply the same optimization to the GET /connection/{key} detail endpoint
which was still triggering the cascade via lazy loading.
Made-with: Cursor
Greptile SummaryThis PR addresses a critical performance regression introduced in #7458. The issue was that eager-loading the The fix introduces a shared
The detail endpoint was refactored from using The implementation is clean, well-documented, and follows SQLAlchemy best practices. No functional changes to the API responses. Confidence Score: 5/5
Important Files Changed
Last reviewed commit: 3ed330b |
Ticket ENG-2590
Description Of Changes
Fixes a performance regression introduced in #7458 on the
GET /api/v1/connection(list),GET /api/v1/connection/{key}(detail), andGET /system/{fides_key}/connectionendpoints. Theselectinload(ConnectionConfig.system)added to populate the newlinked_systemsresponse field was loading the fullSystemORM model, which triggered a cascade of 5 additionallazy="selectin"relationship queries on every request:privacy_declarations— all privacy declarations for all loaded systemsdata_stewards— all FidesUser records via the systemmanager join tableconnection_configs— the ConnectionConfig back again (circular)assets— all Asset records for all loaded systemssystem_groups— all SystemGroup records via system_group_member join tableNone of this data is needed — the
LinkedSystemInfoschema only usesfides_keyandname. On an environment with 73 integrations and 181 systems, this caused the list endpoint to take ~13s to respond.The fix introduces a shared
linked_system_load_options()helper insystem_integration_link/repository.pythat appliesload_only(System.fides_key, System.name)andnoload("*")to restrict loaded columns and suppress the cascade. This helper is applied to all three endpoints, including the detail endpoint which was triggering the same cascade via lazy loading.Code Changes
src/fides/system_integration_link/repository.py— Addedlinked_system_load_options()helper that returns a properly scopedselectinloadwithload_only+noload("*")src/fides/api/api/v1/endpoints/connection_endpoints.py— List endpoint uses the shared helper; detail endpoint rewritten to use an explicit query with the same helper instead ofget_connection_config_or_error(which loaded via plain ORM with no eager-load options)src/fides/api/api/v1/endpoints/system.py— System connections endpoint uses the shared helperLocal Benchmarks
Seeded local DB to production-like scale (74 connection configs, 61 system links, 7,371 privacy declarations, 6,625 assets) and ran an A/B comparison on the list endpoint:
~2x improvement locally. The improvement in production environments with remote DB latency
willmay be significantly larger, since each eliminated cascade query adds a full network round-trip.Note: we are seeing ~13s latency for this endpoint on plus-rc.
Steps to Confirm
GET /api/v1/connection?size=100— each item should includelinked_systems: []orlinked_systems: [{fides_key: "...", name: "..."}], same as beforeGET /api/v1/connection/{key}— should includelinked_systemsin the detail responseGET /system/{fides_key}/connection— should includelinked_systemsfor each connectionPre-Merge Checklist
CHANGELOG.mdupdatedAdd a db-migration labelAdd a high-risk label