feat: Migrate POST /setup/tag endpoint to FastAPI #257
0xmaster7 wants to merge 5 commits into openml:main from
Conversation
Walkthrough: Adds a new database module.
🚥 Pre-merge checks: ❌ 1 failed check (1 warning) | ✅ 4 passed
Hey - I've left some high-level feedback:
- The POST /setup/tag endpoint currently takes `setup_id` from the body but `tag` will default to a query parameter (no `Body()` annotation), which may not match the legacy API expectations; consider explicitly declaring `tag: Annotated[str, Body(...)]` or introducing a request model for clarity and consistency.
- In `get_tags_for`, you only need the tag values but are doing `SELECT *`; narrowing this to `SELECT tag` will reduce unnecessary data transfer and make the intent clearer.
- The `create_tag_exists_error` helper returns HTTP 500 for an already-tagged entity, which semantically represents a client conflict rather than a server error; consider using a 4xx status (e.g., 409) unless you must preserve legacy behavior tied to code 473.
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- The POST /setup/tag endpoint currently takes `setup_id` from the body but `tag` will default to a query parameter (no `Body()` annotation), which may not match the legacy API expectations; consider explicitly declaring `tag: Annotated[str, Body(...)]` or introducing a request model for clarity and consistency.
- In `get_tags_for`, you only need the tag values but are doing `SELECT *`; narrowing this to `SELECT tag` will reduce unnecessary data transfer and make the intent clearer.
- The `create_tag_exists_error` helper returns HTTP 500 for an already-tagged entity, which semantically represents a client conflict rather than a server error; consider using a 4xx status (e.g., 409) unless you must preserve legacy behavior tied to code 473.
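The first bullet's `Body()` point can be sketched without a FastAPI install by using a stand-in marker class (the `Body` dataclass below is hypothetical; the real router would import `Body` from fastapi):

```python
from dataclasses import dataclass
from typing import Annotated, get_args, get_type_hints

# Hypothetical stand-in for fastapi.Body(); the real router would import
# Body from fastapi instead of defining it here.
@dataclass(frozen=True)
class Body:
    embed: bool = True

# Declaring the parameter as Annotated[str, Body(...)] makes the intent
# explicit: `tag` is read from the request body, not the query string.
def tag_setup(setup_id: int, tag: Annotated[str, Body(embed=True)]) -> None:
    ...

hints = get_type_hints(tag_setup, include_extras=True)
base, marker = get_args(hints["tag"])
print(base is str, marker.embed)  # True True
```

The same annotation shape applies verbatim once the marker is the real fastapi `Body`.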
Actionable comments posted: 2
🧹 Nitpick comments (1)
src/routers/openml/setups.py (1)
15-20: Add a Python unit test for the `103 Authentication failed` path. Given migration-suite limitations, this path should be validated in standard unit tests for this router module.
If useful, I can draft the exact pytest cases for the unauthenticated path and expected payload shape.
Based on learnings: "Migration tests for OpenML API endpoints that rely on php_api for data staging cannot test authentication failure paths (Code 103) when the legacy PHP API returns XML instead of JSON. These paths should be tested in standard Python unit tests after dependent endpoints are migrated."
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/routers/openml/setups.py` around lines 15 - 20, Add a unit test that invokes create_authentication_failed_error() (or the router function that returns the authentication failure) and asserts it raises/returns an HTTPException with status_code HTTPStatus.PRECONDITION_FAILED and detail matching {"code":"103","message":"Authentication failed"}; implement the test in pytest by importing create_authentication_failed_error from src.routers.openml.setups, calling it (or catching the raised HTTPException if the router endpoint raises one), and asserting both the status_code and exact detail payload shape and types to ensure the 103 authentication-failed path is validated independent of migration-suite integration tests.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/database/setups.py`:
- Around line 20-32: The get_tags_for function currently selects all columns and
lacks deterministic ordering; update the SQL in connection.execute to select
only the tag column (e.g., SELECT tag FROM setup_tag) and add an explicit ORDER
BY (for example ORDER BY tag ASC) while keeping the parameterized filter (WHERE
setup_id = :setup_id) so you return a stable, minimal list of tags for the given
setup_id.
In `@src/routers/openml/setups.py`:
- Around line 58-64: The current check-then-insert in the route using
database.setups.get_tags_for and database.setups.tag is race-prone; add a UNIQUE
constraint on (setup_id, tag) at the DB schema level and update the tag write
path to handle duplicate-key errors by translating them into the existing
create_tag_exists_error(setup_id, tag) response. Concretely, add the unique
index in migrations, then catch the DB-specific unique-violation exception
either inside database.setups.tag or around its call in this router and on that
exception raise/create the same "already tagged" error instead of propagating
the DB error.
---
Nitpick comments:
In `@src/routers/openml/setups.py`:
- Around line 15-20: Add a unit test that invokes
create_authentication_failed_error() (or the router function that returns the
authentication failure) and asserts it raises/returns an HTTPException with
status_code HTTPStatus.PRECONDITION_FAILED and detail matching
{"code":"103","message":"Authentication failed"}; implement the test in pytest
by importing create_authentication_failed_error from src.routers.openml.setups,
calling it (or catching the raised HTTPException if the router endpoint raises
one), and asserting both the status_code and exact detail payload shape and
types to ensure the 103 authentication-failed path is validated independent of
migration-suite integration tests.
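A self-contained sketch of the test described above, assuming the payload shape from the prompt; the `HTTPException` class here is a minimal stand-in for fastapi's, and the helper is reimplemented locally rather than imported from `src.routers.openml.setups`:

```python
from http import HTTPStatus

# Minimal stand-in for fastapi.HTTPException so the sketch runs standalone.
class HTTPException(Exception):
    def __init__(self, status_code: int, detail: dict) -> None:
        super().__init__(detail)
        self.status_code = status_code
        self.detail = detail

# Assumed shape of the helper, per the review prompt above.
def create_authentication_failed_error() -> HTTPException:
    return HTTPException(
        status_code=HTTPStatus.PRECONDITION_FAILED,
        detail={"code": "103", "message": "Authentication failed"},
    )

def test_authentication_failed_error() -> None:
    # Pin down both the status code and the exact detail payload.
    error = create_authentication_failed_error()
    assert error.status_code == HTTPStatus.PRECONDITION_FAILED
    assert error.detail == {"code": "103", "message": "Authentication failed"}

test_authentication_failed_error()
```

In the real test module the two stand-ins would be replaced by the actual imports and the function collected by pytest.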
```python
def get_tags_for(id_: int, connection: Connection) -> list[str]:
    """Get all tags for a specific setup."""
    rows = connection.execute(
        text(
            """
            SELECT *
            FROM setup_tag
            WHERE id = :setup_id
            """,
        ),
        parameters={"setup_id": id_},
    )
    return [row.tag for row in rows]
```
Return tags in a deterministic order and avoid over-fetching.
`SELECT *` without ordering can yield unstable tag ordering in responses and tests. Fetch only `tag` and add an explicit `ORDER BY`.
💡 Suggested query refinement
```diff
 def get_tags_for(id_: int, connection: Connection) -> list[str]:
     """Get all tags for a specific setup."""
     rows = connection.execute(
         text(
             """
-            SELECT *
+            SELECT tag
             FROM setup_tag
             WHERE id = :setup_id
+            ORDER BY tag
             """,
         ),
         parameters={"setup_id": id_},
     )
     return [row.tag for row in rows]
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/database/setups.py` around lines 20 - 32, The get_tags_for function
currently selects all columns and lacks deterministic ordering; update the SQL
in connection.execute to select only the tag column (e.g., SELECT tag FROM
setup_tag) and add an explicit ORDER BY (for example ORDER BY tag ASC) while
keeping the parameterized filter (WHERE setup_id = :setup_id) so you return a
stable, minimal list of tags for the given setup_id.
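The narrowed, ordered query can be exercised against a toy sqlite3 table (the column names mirror `setup_tag`, but this is an illustration only, not the project's actual database schema):

```python
import sqlite3

# Toy table mirroring setup_tag; an illustration, not the real schema.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE setup_tag (id INTEGER, tag TEXT)")
conn.executemany(
    "INSERT INTO setup_tag (id, tag) VALUES (?, ?)",
    [(1, "weka"), (1, "baseline"), (2, "other"), (1, "study_14")],
)

def get_tags_for(id_: int) -> list[str]:
    # Fetch only the tag column and order it so results are deterministic.
    rows = conn.execute(
        "SELECT tag FROM setup_tag WHERE id = ? ORDER BY tag", (id_,)
    )
    return [row[0] for row in rows]

print(get_tags_for(1))  # ['baseline', 'study_14', 'weka']
```

Insertion order no longer leaks into the response, which also makes test assertions on the tag list stable.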
```python
    tags = database.setups.get_tags_for(setup_id, expdb_db)
    if tag.casefold() in [t.casefold() for t in tags]:
        raise create_tag_exists_error(setup_id, tag)

    # 5. APPLY THE TAG
    database.setups.tag(setup_id, tag, user_id=user.user_id, connection=expdb_db)
```
Duplicate prevention is race-prone under concurrent requests.
The current check-then-insert flow can still insert duplicates when two requests run at the same time. Enforce uniqueness in the DB (e.g., unique key on setup/tag) and map duplicate-write failure to the existing “already tagged” error.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/routers/openml/setups.py` around lines 58 - 64, The current
check-then-insert in the route using database.setups.get_tags_for and
database.setups.tag is race-prone; add a UNIQUE constraint on (setup_id, tag) at
the DB schema level and update the tag write path to handle duplicate-key errors
by translating them into the existing create_tag_exists_error(setup_id, tag)
response. Concretely, add the unique index in migrations, then catch the
DB-specific unique-violation exception either inside database.setups.tag or
around its call in this router and on that exception raise/create the same
"already tagged" error instead of propagating the DB error.
…ster7/server-api into feature/setup-tag-migration
♻️ Duplicate comments (1)
src/routers/openml/setups.py (1)
59-65: ⚠️ Potential issue | 🟠 Major: Duplicate check is still race-prone under concurrency.
The check-then-insert flow can still admit duplicates when requests race. Enforce uniqueness at the DB level (`setup_tag(id, tag)`) and translate duplicate-key failures to `create_tag_exists_error(...)` in the write path.
💡 Suggested router-side handling (plus DB unique constraint in migrations)

```diff
+from sqlalchemy.exc import IntegrityError
@@
-    tags = database.setups.get_tags_for(setup_id, expdb_db)
-    if tag.casefold() in [t.casefold() for t in tags]:
+    tags = database.setups.get_tags_for(setup_id, expdb_db)
+    if tag.casefold() in {t.casefold() for t in tags}:
         raise create_tag_exists_error(setup_id, tag)
-    # 5. APPLY THE TAG
-    database.setups.tag(setup_id, tag, user_id=user.user_id, connection=expdb_db)
+    # 5. APPLY THE TAG
+    try:
+        database.setups.tag(setup_id, tag, user_id=user.user_id, connection=expdb_db)
+    except IntegrityError as exc:
+        # Requires UNIQUE(id, tag) at DB level
+        raise create_tag_exists_error(setup_id, tag) from exc
-    return {
-        "setup_tag": {"id": str(setup_id), "tag": [*tags, tag]},
-    }
+    updated_tags = database.setups.get_tags_for(setup_id, expdb_db)
+    return {"setup_tag": {"id": str(setup_id), "tag": updated_tags}}
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/routers/openml/setups.py` around lines 59 - 65, The current check-then-insert (using database.setups.get_tags_for then database.setups.tag) is race-prone; add a DB-level unique constraint on the setup_tag(id, tag) column and modify the write path that calls database.setups.tag to catch the database unique-violation/duplicate-key exception (e.g., IntegrityError/UniqueViolation from your DB driver) and translate it into create_tag_exists_error(setup_id, tag). You can keep the case-insensitive pre-check for fast-fail using database.setups.get_tags_for, but rely on the DB constraint as the source of truth and handle the duplicate-key error in the same scope where database.setups.tag is invoked so concurrent inserts return create_tag_exists_error instead of creating duplicates.
🧹 Nitpick comments (1)
src/routers/openml/setups.py (1)
37-37: Consolidate the `tag` parameter to use only `SystemString64` as the validation constraint. The current annotation has two conflicting `Body()` instances in the metadata, which creates ambiguity during constraint processing. This pattern is inconsistent with similar parameters elsewhere (e.g., `datasets.py`). Use `Annotated[str, SystemString64]` to align with the codebase convention, or explicitly merge the constraints if the `embed=False` behavior is required: `tag: Annotated[str, SystemString64]`
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/routers/openml/setups.py` at line 37, The tag parameter currently includes conflicting metadata (Body(..., embed=False) and SystemString64); remove the Body() metadata and consolidate to use only SystemString64 for validation by changing the annotation for tag to use Annotated[str, SystemString64] (or if embed=False is actually required, move that constraint into a single explicit validator instead of leaving two metadata entries) so the parameter in the function/signature that declares tag uses only SystemString64 for constraint processing.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Duplicate comments:
In `@src/routers/openml/setups.py`:
- Around line 59-65: The current check-then-insert (using
database.setups.get_tags_for then database.setups.tag) is race-prone; add a
DB-level unique constraint on the setup_tag(id, tag) column and modify the write
path that calls database.setups.tag to catch the database
unique-violation/duplicate-key exception (e.g., IntegrityError/UniqueViolation
from your DB driver) and translate it into create_tag_exists_error(setup_id,
tag). You can keep the case-insensitive pre-check for fast-fail using
database.setups.get_tags_for, but rely on the DB constraint as the source of
truth and handle the duplicate-key error in the same scope where
database.setups.tag is invoked so concurrent inserts return
create_tag_exists_error instead of creating duplicates.
---
Nitpick comments:
In `@src/routers/openml/setups.py`:
- Line 37: The tag parameter currently includes conflicting metadata (Body(...,
embed=False) and SystemString64); remove the Body() metadata and consolidate to
use only SystemString64 for validation by changing the annotation for tag to use
Annotated[str, SystemString64] (or if embed=False is actually required, move
that constraint into a single explicit validator instead of leaving two metadata
entries) so the parameter in the function/signature that declares tag uses only
SystemString64 for constraint processing.
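To illustrate the "single metadata entry" point, a stdlib-only sketch where `MaxLength` is a hypothetical marker standing in for the project's `SystemString64` constraint type:

```python
from typing import Annotated, get_args

# Hypothetical marker standing in for the project's SystemString64
# constraint type; the real one lives in the codebase's type definitions.
class MaxLength:
    def __init__(self, limit: int) -> None:
        self.limit = limit

SystemString64 = Annotated[str, MaxLength(64)]

# With a single metadata entry there is nothing for the framework to
# reconcile: the parameter carries exactly one validation constraint.
def tag_setup(setup_id: int, tag: SystemString64) -> None:
    ...

base, *metadata = get_args(SystemString64)
print(base is str, len(metadata))  # True 1
```

Stacking a second `Body()` entry into the same `Annotated` is what produces the conflicting-metadata ambiguity the nitpick describes.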
ℹ️ Review info
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
- src/database/setups.py
- src/routers/openml/setups.py
🚧 Files skipped from review as they are similar to previous changes (1)
- src/database/setups.py
fixes #64
Migrated the POST /setup/tag endpoint to the new FastAPI backend. Added auth and ownership checks so only admins or the original uploader can tag a setup, which fixes the issue mentioned in #19. Also hooked up the DB queries for the `algorithm_setup` and `setup_tag` tables.