
fix: snapshot SaasTemplateDataset baseline before batch template update loop#7578

Merged
Linker44 merged 14 commits into main from fix/saas-template-dataset-merge-baseline-snapshot
Mar 10, 2026
Conversation

Contributor

@Linker44 Linker44 commented Mar 5, 2026

Ticket ENG-2947

Summary

  • SaasTemplateDataset stores one row per connector type as the merge baseline for the 3-way dataset merge (customer edits vs old template vs new template)
  • update_existing_connection_configs_for_connector_type loops over all connection configs of a given type and calls upsert_dataset_config_from_template for each — which both reads and writes SaasTemplateDataset on every iteration
  • The first iteration's write bumps the stored baseline to the new template, so all subsequent iterations use the new template as their baseline instead of the old one
  • Result: the 3-way merge incorrectly treats unchanged customer fields as customer modifications and silently skips applying template updates on those configs
  • Affects any deployment with multiple connection configs of the same SaaS connector type, whether on the same system or different systems
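
The baseline drift described above can be shown with a minimal sketch. `three_way_merge` is a simplified stand-in for the real `merge_datasets` helper, and the `page_size` field is illustrative only, not taken from the actual fides code:

```python
def three_way_merge(customer: dict, baseline: dict, new_template: dict) -> dict:
    """Toy 3-way merge: keep customer edits (fields differing from the
    baseline); take the new template value for everything else."""
    merged = {}
    for key in new_template:
        if key in customer and customer.get(key) != baseline.get(key):
            merged[key] = customer[key]  # looks like a customer modification
        else:
            merged[key] = new_template[key]  # template update applies
    return merged

old_template = {"page_size": 50}
new_template = {"page_size": 100}
customer = {"page_size": 50}  # customer never touched this field

# Correct baseline (old template): the template update lands.
assert three_way_merge(customer, old_template, new_template) == {"page_size": 100}

# Drifted baseline (new template, after the first iteration's write-back):
# the unchanged field now differs from the baseline, so it is wrongly
# treated as a customer edit and the update is silently skipped.
assert three_way_merge(customer, new_template, new_template) == {"page_size": 50}
```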

Fix: read SaasTemplateDataset once before the loop and pass it as an optional stored_dataset_json snapshot through update_saas_instance → upsert_dataset_config_from_template. All iterations merge against the same correct pre-update baseline. Single-integration callers (instantiate_connection) are unchanged — they pass nothing and the existing read-from-DB path applies.

Changes

  • upsert_dataset_config_from_template: new optional stored_dataset_json param; uses it as merge baseline when provided, otherwise reads SaasTemplateDataset as before
  • update_saas_instance: threads the optional stored_dataset_json param through
  • update_existing_connection_configs_for_connector_type: snapshots SaasTemplateDataset once before the loop, passes snapshot to each iteration
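
The snapshot-before-loop pattern in the changes above can be sketched as follows. These names mirror the real methods but the classes and bodies are simplified stand-ins, not the actual fides implementation:

```python
import copy
from typing import Optional

class FakeTemplateStore:
    """Stands in for the SaasTemplateDataset row for one connector type."""
    def __init__(self, dataset_json: dict):
        self.dataset_json = dataset_json

def upsert_dataset_config_from_template(
    store: FakeTemplateStore,
    new_template: dict,
    stored_dataset_json: Optional[dict] = None,
) -> dict:
    # Use the caller-provided snapshot when given; otherwise fall back to
    # the stored row (the single-integration path).
    baseline = stored_dataset_json if stored_dataset_json is not None else store.dataset_json
    # ... the 3-way merge against `baseline` would happen here ...
    store.dataset_json = new_template  # write-back (idempotent across the loop)
    return baseline

def update_existing_connection_configs_for_connector_type(
    store: FakeTemplateStore, new_template: dict, configs: list
) -> list:
    # Snapshot once, before the loop; deepcopy detaches it from the row
    # so later write-backs cannot mutate it.
    snapshot = copy.deepcopy(store.dataset_json)
    return [
        upsert_dataset_config_from_template(store, new_template, stored_dataset_json=snapshot)
        for _ in configs
    ]

store = FakeTemplateStore({"version": "v1"})
baselines = update_existing_connection_configs_for_connector_type(
    store, {"version": "v2"}, configs=["alpha", "beta"]
)
# Every iteration merged against the same pre-update baseline,
# even though the write-back ran on each pass.
assert baselines == [{"version": "v1"}, {"version": "v1"}]
assert store.dataset_json == {"version": "v2"}
```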

Test plan

  • New test test_update_existing_connection_configs_preserves_merge_baseline_across_multiple_configs — creates two connection configs of the same type, runs the batch update, and asserts both received the template update (fails without the fix on whichever config is processed second)
  • Existing test_upsert_dataset_with_merge and test_update_existing_connection_configs_case_insensitive continue to pass

Summary by CodeRabbit

  • Bug Fixes

    • Ensure batch updates for SaaS connectors merge each config against a consistent pre-update baseline, preventing earlier updates from corrupting later ones.
    • Made connector-type matching case-insensitive to ensure correct batch selection and updates.
  • Tests

    • Added a test verifying batch updates preserve the same merge baseline across multiple connector configurations.

…te loop

When update_existing_connection_configs_for_connector_type processes multiple
connection configs of the same type, each call to upsert_dataset_config_from_template
was reading and then writing SaasTemplateDataset mid-loop. The first iteration's write
would set the stored baseline to the new template, so subsequent iterations used the
new template as their merge baseline instead of the old one — causing the 3-way merge
to misidentify unchanged customer fields as customer modifications and silently skip
applying template updates on those configs.

Fix reads the stored baseline once before the loop and passes it as an optional
snapshot parameter through update_saas_instance → upsert_dataset_config_from_template.
All iterations now merge against the same correct pre-update baseline. The write-back
on each iteration is idempotent (all write the same value) so it is left in place.

Affects any deployment with multiple connection configs of the same SaaS connector type,
regardless of whether they belong to the same system or different systems.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@vercel
Contributor

vercel bot commented Mar 5, 2026

The latest updates on your projects.

2 Skipped Deployments (updated Mar 10, 2026 4:37pm UTC)
  • fides-plus-nightly: Ignored (Preview)
  • fides-privacy-center: Ignored


@coderabbitai

coderabbitai bot commented Mar 5, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Walkthrough

Capture a pre-update snapshot of SaasTemplateDataset.dataset_json and thread it as stored_dataset_json through ConnectionService update flows so batch updates use a consistent baseline for 3‑way merges, preventing first-iteration overwrites from corrupting subsequent iterations.

Changes

  • Baseline snapshot extraction (src/fides/api/util/saas_config_updater.py): Retrieve the second return value from SaasTemplateDataset.get_or_create, compute stored_dataset_json_snapshot when dataset_json is a dict, lowercase the connector type in the DB query, and pass the snapshot into subsequent update calls.
  • Snapshot threading & merge logic (src/fides/service/connection/connection_service.py): Add an optional stored_dataset_json parameter to update_saas_instance and upsert_dataset_config_from_template; compute effective_stored_dataset_json (parameter or DB snapshot) and use it for merge_datasets; snapshot the stored dataset once before batch loops and forward it to each per-config update.
  • Tests for baseline preservation (tests/service/test_connection_service.py): Add test_update_existing_connection_configs_preserves_merge_baseline_across_multiple_configs and import Dataset as CtlDataset to verify that multiple configs of the same SaaS type merge against the same pre-update baseline and avoid mid-loop baseline drift.

Sequence Diagram(s)

sequenceDiagram
    participant Scheduler
    participant ConnSvc as ConnectionService
    participant Updater as saas_config_updater.py
    participant DB as Database/SaasTemplateDataset
    participant Merger as merge_datasets

    Scheduler->>ConnSvc: update_existing_connection_configs_for_connector_type(connector_type, template)
    ConnSvc->>DB: read SaasTemplateDataset (snapshot)
    DB-->>ConnSvc: stored_dataset_json
    ConnSvc->>ConnSvc: iterate configs -> call update_saas_instance(..., stored_dataset_json)
    ConnSvc->>Updater: update_saas_instance(..., stored_dataset_json)
    Updater->>Merger: upsert_dataset_config_from_template(..., stored_dataset_json)
    Merger-->>Updater: merged dataset (uses provided stored_dataset_json)
    Updater->>DB: persist updated DatasetConfig
    DB-->>ConnSvc: ack

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Suggested reviewers

  • johnewart

Poem

🐰 I grabbed the snapshot, soft and neat,
Held it close before the loop could meet.
Each merge hops on that steady ground,
Two configs, one baseline, not confound.
Hooray—no drift, just tidy ground. 🥕

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 inconclusive)

  • Description check: ❓ Inconclusive. The description is comprehensive, clearly explaining the problem, the fix, code changes, and test plan; however, the Pre-Merge Checklist section is incomplete, with unchecked boxes and no resolution of required items. Resolution: complete the Pre-Merge Checklist by checking applicable items (e.g., mark 'No migrations', 'No documentation updates required', or 'No followup issues' as appropriate) or provide a specific status for each required item.
✅ Passed checks (2 passed)
  • Title check: ✅ Passed. The title clearly and specifically describes the main fix: snapshotting SaasTemplateDataset before the batch template update loop to preserve the merge baseline.
  • Docstring Coverage: ✅ Passed. Docstring coverage is 100.00%, which exceeds the required 80.00% threshold.



@Linker44 Linker44 requested a review from galvana March 5, 2026 19:32
@Linker44 Linker44 self-assigned this Mar 5, 2026
…loop

update_saas_configs (run on every server startup) had the same mid-loop
SaasTemplateDataset write corruption as
update_existing_connection_configs_for_connector_type. Snapshot the stored
baseline before the per-connector-type loop and pass it through to
update_saas_instance.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@Linker44 Linker44 marked this pull request as ready for review March 5, 2026 19:39
@Linker44 Linker44 requested a review from a team as a code owner March 5, 2026 19:39
@greptile-apps
Contributor

greptile-apps bot commented Mar 5, 2026

Greptile Summary

This PR fixes a real correctness bug in the SaaS template batch-update path: when multiple connection configs of the same connector type are updated together, upsert_dataset_config_from_template was writing the new template back to SaasTemplateDataset on every iteration, causing every iteration after the first to use the new template as the 3-way merge baseline instead of the old one. The fix — snapshot the old baseline once before the loop and thread it through update_saas_instance → upsert_dataset_config_from_template — is logically sound, and the docstrings explain the reason clearly.

Key issues found:

  • Weak regression test (tests/service/test_connection_service.py, lines 1697–1794): The new test uses stored_dataset (which already contains a "products" collection) as both the v1 baseline and the customer data, and _get_template_yaml() (also containing "products") as the v2 template. Because the customer data already has "products", the assertion assert "products" in collection_names is trivially satisfied on both configs even without the snapshot fix. The test should introduce a new collection (e.g., "orders") in v2 that is absent from v1 to actually detect the baseline corruption.

  • Redundant DB query on every batch iteration (connection_service.py, lines 738–744): upsert_dataset_config_from_template unconditionally reads SaasTemplateDataset from the database even when a non-None stored_dataset_json snapshot is already provided by the caller. On large batches this adds a pointless SELECT per connection config.
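
The weak-test concern above can be illustrated with a toy collection-level merge. `merged_collections` and the collection names are hypothetical simplifications for this sketch, not the real fides merge logic:

```python
def merged_collections(customer: set, baseline: set, new_template: set) -> set:
    """Toy collection-level merge: the customer keeps its own collections,
    plus any collection added in the new template relative to the baseline."""
    return customer | (new_template - baseline)

v1 = {"products"}
customer = {"products"}  # customer data already contains "products"

# Weak test: v2 also only has "products", so the assertion passes even
# against a corrupted (v2) baseline and the bug goes undetected.
v2_weak = {"products"}
assert "products" in merged_collections(customer, v2_weak, v2_weak)

# Discriminating test: v2 introduces "orders", absent from v1. With the
# correct v1 baseline the new collection lands; with a drifted v2 baseline
# it does not, so the second assertion below fails without the fix.
v2 = {"products", "orders"}
assert "orders" in merged_collections(customer, v1, v2)       # correct baseline
assert "orders" not in merged_collections(customer, v2, v2)   # drifted baseline
```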

Confidence Score: 3/5

  • The production fix is logically correct, but the new test does not actually detect the bug it was written to guard against, leaving no regression coverage.
  • The snapshot mechanism in both saas_config_updater.py and connection_service.py is correct: merge_datasets deep-copies its stored_dataset argument so the shared snapshot cannot be mutated across iterations, and the write-back reassigns the attribute rather than mutating in place. The critical weakness is the test — since both v1 and v2 templates have "products" and the customer dataset also already has "products", the assertion trivially passes without the fix, providing false confidence. A proper test should demonstrate a collection (or field) that appears in v2 but not in v1, verifying that the second config in the batch receives the new template content.
  • tests/service/test_connection_service.py — the new test needs to be redesigned so that the assertion actually fails without the snapshot fix applied.

Important Files Changed

  • src/fides/api/util/saas_config_updater.py: Correctly snapshots the SaasTemplateDataset baseline before the loop and passes it to each update_saas_instance call. The get_or_create return value is properly unpacked to retrieve the stored template for snapshotting.
  • src/fides/service/connection/connection_service.py: Core fix is sound: snapshots the baseline before the loop in update_existing_connection_configs_for_connector_type and threads it through update_saas_instance → upsert_dataset_config_from_template. Minor issue: upsert_dataset_config_from_template still performs a DB read of SaasTemplateDataset even when the caller provides a non-None snapshot, which is an unnecessary query on every batch iteration.
  • tests/service/test_connection_service.py: New test has a design flaw: the stored_dataset fixture already contains a "products" collection identical to what _get_template_yaml() produces at the collection level. The assertion "products" in collection_names passes trivially (customer data already has "products") and does not actually detect the baseline corruption bug being fixed.

Comments Outside Diff (1)

  1. src/fides/service/connection/connection_service.py, line 738-744 (link)

    Unnecessary DB read when snapshot is already provided

    stored_dataset_template is always fetched from the database here, even when stored_dataset_json is a non-None pre-read snapshot and stored_dataset_template will not be used for the merge. On every iteration of a batch update this issues an extra SELECT that provides no benefit.

    Consider guarding the read behind the stored_dataset_json is None check, or only reading when connector_type is available and the snapshot was not provided:

    stored_dataset_template = (
        SaasTemplateDataset.get_by(
            self.db, field="connection_type", value=connector_type
        )
        if connector_type and stored_dataset_json is None
        else None
    )

    Or more simply, since stored_dataset_template is only used to (a) supply a fallback merge baseline and (b) write back the new template at the end, the read can be skipped for the merge path but is still needed for the write-back at line 812. A cleaner refactor would separate those two concerns.

Last reviewed commit: 0f552ab

Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment


🧹 Nitpick comments (2)
tests/service/test_connection_service.py (1)

1680-1794: Good test coverage for the baseline snapshot fix.

The test correctly validates that both connection configs receive the template update when processed in a batch. This directly exercises the fix for the baseline corruption issue.

Consider adding cleanup for created test resources.

Other tests in this file (e.g., test_upsert_dataset_with_merge at line 926) explicitly clean up created resources. This test creates two ConnectionConfig and two DatasetConfig objects but doesn't delete them, which could leave test data in the database and potentially affect other tests.

🧹 Suggested cleanup addition
             assert "products" in collection_names, (
                 f"Template update did not land on {instance_key}: "
                 f"collections={collection_names}"
             )
+
+        # Clean up
+        for suffix in ("alpha", "beta"):
+            instance_key = f"multi_config_{suffix}"
+            connection_config = ConnectionConfig.get_by(
+                db, field="key", value=f"multi_config_connection_{suffix}"
+            )
+            if connection_config:
+                connection_config.delete(db)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/service/test_connection_service.py` around lines 1680 - 1794, This test
creates two ConnectionConfig and two DatasetConfig records (via
ConnectionConfig.create and DatasetConfig.create_or_update) and a
SaasTemplateDataset (via SaasTemplateDataset.get_or_create) but doesn't remove
them; add cleanup at the end of
test_update_existing_connection_configs_preserves_merge_baseline_across_multiple_configs
to delete the created resources by filtering for the instance fides_key values
("multi_config_alpha", "multi_config_beta") and the connector_type and removing
the matching DatasetConfig and ConnectionConfig records and the
SaasTemplateDataset so subsequent tests are not affected.
src/fides/service/connection/connection_service.py (1)

847-859: Add .lower() to snapshot lookup for consistency with case-insensitive connection config query.

The connection config query uses connector_type.lower() (line 837), but the snapshot retrieval uses connector_type directly (line 852). Although SaaSConfig enforces lowercase normalization on the type field, adding .lower() here improves consistency with the defensive case-insensitive pattern already established for the connection config query.

🔧 Suggested fix
         stored_dataset_template = SaasTemplateDataset.get_by(
-            self.db, field="connection_type", value=connector_type
+            self.db, field="connection_type", value=connector_type.lower()
         )
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/fides/service/connection/connection_service.py` around lines 847 - 859,
The snapshot lookup should use a lowercased connector_type to match the
case-insensitive query earlier: change the SaasTemplateDataset.get_by call that
assigns stored_dataset_template to pass connector_type.lower() instead of
connector_type, and keep the subsequent stored_dataset_json_snapshot logic
unchanged so the snapshot remains a dict when present; update references to
connector_type in that get_by call only (SaasTemplateDataset.get_by,
stored_dataset_template, stored_dataset_json_snapshot).

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: bdd5a19d-0fc1-417a-868e-d288488c4113

📥 Commits

Reviewing files that changed from the base of the PR and between e7fa470 and 0f552ab.

📒 Files selected for processing (3)
  • src/fides/api/util/saas_config_updater.py
  • src/fides/service/connection/connection_service.py
  • tests/service/test_connection_service.py

- Make test actually exercise the bug by introducing a new "orders"
  collection in v2 that is absent from v1, so the assertion is not
  trivially satisfied
- Add test cleanup for created ConnectionConfig, DatasetConfig, and
  SaasTemplateDataset resources
- Use connector_type.lower() in snapshot lookup for consistency with
  case-insensitive connection config query

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

@coderabbitai coderabbitai bot left a comment


Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
src/fides/service/connection/connection_service.py (1)

701-824: ⚠️ Potential issue | 🟠 Major

Baseline can still advance on partial batch failures.

update_existing_connection_configs_for_connector_type continues after exceptions (Line 877), but upsert_dataset_config_from_template still persists SaasTemplateDataset on each successful iteration (Line 812). If one config fails mid-batch, the shared baseline is already moved forward, and retrying failed configs can merge against the wrong baseline.

🛠 Suggested fix direction
diff --git a/src/fides/service/connection/connection_service.py b/src/fides/service/connection/connection_service.py
@@
-    def upsert_dataset_config_from_template(
+    def upsert_dataset_config_from_template(
         self,
         connection_config: ConnectionConfig,
         template: ConnectorTemplate,
         template_values: SaasConnectionTemplateValues,
         stored_dataset_json: Optional[dict] = None,
+        persist_template_baseline: bool = True,
     ) -> DatasetConfig:
@@
-        if stored_dataset_template:
+        if persist_template_baseline and stored_dataset_template:
             stored_dataset_template.dataset_json = yaml.safe_load(template.dataset)[
                 "dataset"
             ][0]
             stored_dataset_template.save(db=self.db)
@@
-    def update_saas_instance(
+    def update_saas_instance(
         self,
         connection_config: ConnectionConfig,
         template: ConnectorTemplate,
         saas_config_instance: SaaSConfig,
         stored_dataset_json: Optional[dict] = None,
+        persist_template_baseline: bool = True,
     ) -> None:
@@
         self.upsert_dataset_config_from_template(
             connection_config,
             template,
             template_vals,
             stored_dataset_json=stored_dataset_json,
+            persist_template_baseline=persist_template_baseline,
         )
@@
-        for connection_config in connection_configs:
+        all_updates_succeeded = True
+        for connection_config in connection_configs:
             ...
             try:
                 self.update_saas_instance(
                     connection_config,
                     template,
                     saas_config_instance,
                     stored_dataset_json=stored_dataset_json_snapshot,
+                    persist_template_baseline=False,
                 )
             except Exception:
+                all_updates_succeeded = False
                 logger.exception(...)
+
+        if all_updates_succeeded and stored_dataset_template:
+            stored_dataset_template.dataset_json = yaml.safe_load(template.dataset)["dataset"][0]
+            stored_dataset_template.save(db=self.db)

Also applies to: 847-881

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/fides/service/connection/connection_service.py` around lines 701 - 824,
The bug is that upsert_dataset_config_from_template persists SaasTemplateDataset
(stored_dataset_template.dataset_json and save) on each successful iteration,
advancing the shared baseline during batch updates in
update_existing_connection_configs_for_connector_type; to fix, avoid
mutating/persisting stored_dataset_template inside
upsert_dataset_config_from_template when a pre-read snapshot is supplied: if
stored_dataset_json is not None (the batch caller provided the baseline), skip
the block that assigns stored_dataset_template.dataset_json and calls
stored_dataset_template.save(db=self.db), so the shared baseline only changes
when callers intentionally allow a DB write (or add an explicit flag to control
persistence) — update the logic around stored_dataset_template and its save call
in upsert_dataset_config_from_template to respect stored_dataset_json.


📥 Commits

Reviewing files that changed from the base of the PR and between 0f552ab and 61f70c6.

📒 Files selected for processing (2)
  • src/fides/service/connection/connection_service.py
  • tests/service/test_connection_service.py


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/fides/service/connection/connection_service.py`:
- Around line 847-859: The snapshot lookup uses connector_type.lower() but the
earlier lookup in upsert_dataset_config_from_template uses connector_type
without normalization, causing inconsistent get_by exact-match behavior on
SaasTemplateDataset; update the lookup in upsert_dataset_config_from_template
(and any other SaasTemplateDataset.get_by calls like in
update_existing_connection_configs_for_connector_type) to normalize
connector_type with .lower() so both paths use the same lowercase key when
calling SaasTemplateDataset.get_by and producing stored_dataset_json_snapshot.


📥 Commits

Reviewing files that changed from the base of the PR and between 61f70c6 and ab6b412.

📒 Files selected for processing (1)
  • src/fides/service/connection/connection_service.py

Contributor

@Vagoasdf Vagoasdf left a comment


Overall the logic is sound. Just a few nitpicks for consistency, and some suggestions to avoid possible unintended effects.

- Normalize connector_type with .lower() in upsert_dataset_config_from_template
  and saas_config_updater for defensive consistency
- Add explicit CtlDataset cleanup in test (not cascade-deleted with ConnectionConfig)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

🧹 Nitpick comments (1)
tests/service/test_connection_service.py (1)

1836-1862: Consider using a pytest fixture for cleanup resilience.

The inline cleanup won't execute if the test fails before reaching it, potentially leaving test resources in the database. A fixture with yield or try/finally would ensure cleanup regardless of test outcome.

That said, this is consistent with the cleanup pattern used in other tests in this file.

♻️ Optional: Convert to fixture-based cleanup
@pytest.fixture
def multi_config_test_resources(self, db: Session, stored_dataset: Dict[str, Any]):
    """Creates test resources and ensures cleanup."""
    connector_type = "multi_config_connector"
    created_configs = []
    
    # ... setup code ...
    
    yield {
        "connector_type": connector_type,
        "configs": created_configs,
        # ... other resources ...
    }
    
    # Cleanup runs even on test failure
    for suffix in ("alpha", "beta"):
        # ... cleanup code ...
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/service/test_connection_service.py` around lines 1836 - 1862, The
inline cleanup loop in the test (removing DatasetConfig, CtlDataset,
ConnectionConfig, and SaasTemplateDataset for keys like "multi_config_{suffix}"
and "multi_config_connection_{suffix}") can be skipped if the test fails early;
refactor by extracting the resource setup/teardown into a pytest fixture (or
wrap the test body in try/finally) that yields control and always performs
cleanup: create a fixture (e.g., multi_config_test_resources) that sets up the
configs, yields the created resource identifiers (connector_type, created config
keys), and in its teardown iterates suffix in ("alpha","beta") to find and
delete DatasetConfig (DatasetConfig.filter / .first), delete associated
CtlDataset (query CtlDataset by id and db.delete/db.commit), delete
ConnectionConfig (ConnectionConfig.get_by / .delete), and remove
SaasTemplateDataset (SaasTemplateDataset.get_by / .delete) so cleanup runs even
on test failures.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/fides/api/util/saas_config_updater.py`:
- Around line 60-62: The comparison in the query uses
ConnectionConfig.saas_config["type"].astext == connector_type.lower(), which
only lowercases the literal side and breaks case-insensitive matching; update
the left side to use sa_func.lower(ConnectionConfig.saas_config["type"].astext)
so both sides are compared in lowercase (matching the approach in
connection_service.py), and ensure sa_func is imported/available in
saas_config_updater.py if not already.



📥 Commits

Reviewing files that changed from the base of the PR and between d6ca0cf and 50ebe20.

📒 Files selected for processing (2)
  • src/fides/api/util/saas_config_updater.py
  • tests/service/test_connection_service.py

Linker44 and others added 5 commits March 10, 2026 11:10
The baseline snapshot held a reference to the ORM object's dict, which
SQLAlchemy could refresh in-place during a session flush. deepcopy
detaches the snapshot so it stays stable across loop iterations.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Aligns saas_config_updater.py with connection_service.py by lowercasing
the DB column value too, not just the Python side. Prevents misses if a
stored saas_config["type"] has mixed case.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The test cleanup was deleting CtlDataset first, which caused SQLAlchemy
to NULL out datasetconfig.ctl_dataset_id — violating its NOT NULL
constraint. Reorder to delete ConnectionConfig (cascading to
DatasetConfig) first, then the orphaned CtlDataset.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@Linker44 Linker44 force-pushed the fix/saas-template-dataset-merge-baseline-snapshot branch from 83643c4 to 9cab58b Compare March 10, 2026 16:36
@Linker44 Linker44 added this pull request to the merge queue Mar 10, 2026
Merged via the queue into main with commit 0166600 Mar 10, 2026
58 checks passed
@Linker44 Linker44 deleted the fix/saas-template-dataset-merge-baseline-snapshot branch March 10, 2026 19:07
mfbrown pushed a commit that referenced this pull request Mar 12, 2026
…te loop (#7578)

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
