
fix: handle malformed api_constraints without crashing endpoint (#273)#285

Open
Gokul-social wants to merge 2 commits into openml:main from Gokul-social:fix/273-graceful-api-constraints

Conversation

@Gokul-social

Summary

Hardens the /tasktype/{task_type_id} endpoint against inconsistent api_constraints data. Fixes #273.

The Problem

The existing implementation was fragile, relying on strict json.loads() and direct key access. This led to unnecessary 500 errors when encountering:

  • Malformed JSON strings or unexpected dict types in the persistence layer.
  • Missing data_type keys, triggering unhandled KeyErrors.
  • Type mismatches that caused the parser to bail instead of degrading gracefully.

The Solution

I've introduced a defensive parse_api_constraints helper to act as a buffer between the DB and the response.

  • Type Agnostic Parsing: Gracefully handles both JSON strings and pre-parsed dictionaries.
  • Non-Breaking Validation: Switched to .get() with strict type-checking for data_type. If a value is invalid, it is omitted from the output rather than crashing the request.
  • Observability: Added structured logging (WARNING for structural failures, DEBUG for missing data) prefixed with api_constraints: to help us audit bad data in production.

Testing

Comprehensive unit tests were added to cover:

  • Valid/Malformed JSON and raw dict inputs.
  • Boundary cases: Empty values, nulls, and unsupported schema types.
  • Verification that response structure remains backward compatible.

Copilot AI review requested due to automatic review settings March 21, 2026 15:11

coderabbitai bot commented Mar 21, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 0e9e6ecc-f175-4e86-9593-d74da9333067

📥 Commits

Reviewing files that changed from the base of the PR and between f08c487 and ea00485.

📒 Files selected for processing (1)
  • src/routers/openml/tasktype.py
🚧 Files skipped from review as they are similar to previous changes (1)
  • src/routers/openml/tasktype.py

Walkthrough

Adds module-level logging and a new helper parse_api_constraints(api_constraints: Any, *, task_type_id: int, input_name: str) -> str | None in src/routers/openml/tasktype.py that defensively handles None, dict, and string (JSON) inputs and returns the data_type string or None. get_task_type now includes input[].data_type only when the helper returns a non-None value and its docstring documents this optional field. New tests tests/routers/openml/test_parse_api_constraints.py cover parsing edge cases and expected logging.

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

  • Docstring Coverage: ⚠️ Warning. Docstring coverage is 25.00%, which is below the required threshold of 80.00%. Resolution: write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (4 passed)
  • Title check: ✅ Passed. The title accurately and concisely describes the main change: adding defensive handling for malformed api_constraints in the tasktype endpoint to prevent crashes.
  • Description check: ✅ Passed. The description is well-related to the changeset, clearly explaining the problem, solution, and testing approach for handling malformed api_constraints.
  • Linked Issues check: ✅ Passed. The PR fully addresses issue #273 requirements: it implements a defensive parse_api_constraints helper handling JSON strings/dicts, uses safe lookups for data_type, catches malformed JSON gracefully, adds observability logging, and includes comprehensive unit tests.
  • Out of Scope Changes check: ✅ Passed. All changes are directly in scope: the parse_api_constraints helper function, modifications to get_task_type endpoint behavior, and comprehensive test coverage for the new functionality align with issue #273 objectives.


@sourcery-ai sourcery-ai bot left a comment

Hey - I've left some high level feedback:

  • In parse_api_constraints, you currently only accept plain dict instances; if upstream ever passes a Mapping (e.g. dict-like) object it will be treated as unsupported, so consider loosening the type check to collections.abc.Mapping to make the helper more robust to different ORM or serialization layers.
  • The test_valid_constraint_no_warning assertion len(caplog.records) == 0 can be brittle if any other logger emits during the test run; it would be safer to assert that there are no records from the specific logger or containing the api_constraints: prefix instead of enforcing an absolutely empty log.
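For the second point, the brittle empty-log assertion can be replaced by filtering on the helper's own message prefix, sketched here under the assumption that every message the helper emits starts with api_constraints:

```python
import logging


def api_constraint_records(records, prefix="api_constraints:"):
    """Keep only log records whose message carries the helper's prefix,
    so unrelated loggers firing during the test run cannot fail the test."""
    return [r for r in records if r.getMessage().startswith(prefix)]
```

In the test, `assert api_constraint_records(caplog.records) == []` then stands in for the stricter `len(caplog.records) == 0`.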


Copilot AI left a comment

Pull request overview

Hardens the /tasktype/{task_type_id} endpoint so malformed or inconsistent api_constraints data no longer causes 500s, aligning behavior with issue #273.

Changes:

  • Added parse_api_constraints helper to defensively parse api_constraints and safely extract data_type.
  • Updated task type response construction to include input[].data_type only when valid.
  • Added unit tests covering valid/malformed inputs and logging behavior.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.

  • src/routers/openml/tasktype.py: Introduces defensive parsing + logging and uses it when building the /tasktype/{id} response.
  • tests/routers/openml/test_parse_api_constraints.py: Adds unit tests for parsing outcomes and emitted log levels/messages.


Comment on lines +50 to +52

```python
if isinstance(api_constraints, dict):
    constraint = api_constraints
elif isinstance(api_constraints, str):
```
Copilot AI Mar 21, 2026
parse_api_constraints only treats pre-parsed constraints as valid when they are a concrete dict. If the persistence layer returns a dict-like object (e.g., collections.abc.Mapping / SQLAlchemy row mapping), this will be logged as unsupported_type and data_type will be dropped even though it could be read safely. Consider accepting Mapping (and casting to dict if needed) instead of only dict to match the “dict-like” requirement and existing patterns (e.g., core/conversions.py uses Mapping).

Comment on lines +89 to +96

```python
        logger.debug(
            "api_constraints: missing_data_type for task_type_id=%d, input=%s",
            task_type_id,
            input_name,
        )
        return None

    return data_type
```
Copilot AI Mar 21, 2026
The log label missing_data_type is emitted for all invalid data_type values, including cases where the key exists but is empty or not a string. That makes the log message misleading and reduces the usefulness of the new observability. Consider distinguishing between “missing” (key absent/None) vs “invalid” (wrong type/empty) and logging the latter with a different label (and possibly WARNING) including the received type/value shape.

Comment on lines +8 to +22

```python
@pytest.mark.parametrize(
    ("api_constraints", "expected_data_type"),
    [
        # 1. Valid JSON string with data_type
        ('{"data_type": "matrix"}', "matrix"),
        # 2. Valid dict with data_type
        ({"data_type": "matrix"}, "matrix"),
        # 3. Malformed JSON string → None
        ("{bad json", None),
        # 4. Empty string → None
        ("", None),
        # 5. Valid JSON/dict without data_type → None
        ('{"other_key": "val"}', None),
        ({"other_key": "val"}, None),
        # 6. data_type present but empty string → None
```
Copilot AI Mar 21, 2026
If parse_api_constraints is updated to accept dict-like inputs (e.g., collections.abc.Mapping), add a parametrized test case for a non-dict mapping instance to ensure this scenario remains supported (this is one of the failure modes described in #273 / PR description).
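If the helper is loosened to accept Mapping, the extra parametrize entries could look like this sketch, where types.MappingProxyType stands in for any dict-like object that is not a plain dict:

```python
from collections.abc import Mapping
from types import MappingProxyType

# Hypothetical extra cases for the parametrized test above; a read-only
# mapping proxy stands in for ORM row mappings that are dict-like but not dict.
EXTRA_CASES = [
    (MappingProxyType({"data_type": "matrix"}), "matrix"),
    (MappingProxyType({"other_key": "val"}), None),
]

# Sanity check: the stand-ins really are Mappings without being plain dicts.
assert all(
    isinstance(value, Mapping) and not isinstance(value, dict)
    for value, _expected in EXTRA_CASES
)
```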


Successfully merging this pull request may close these issues.

Tasktype endpoint should gracefully handle malformed api_constraints values
