fix: handle malformed api_constraints without crashing endpoint (#273)#285
fix: handle malformed api_constraints without crashing endpoint (#273)#285Gokul-social wants to merge 2 commits intoopenml:mainfrom
Conversation
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: Repository UI Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (1)
🚧 Files skipped from review as they are similar to previous changes (1)
WalkthroughAdds module-level logging and a new helper 🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Hey - I've left some high level feedback:
- In
parse_api_constraints, you currently only accept plaindictinstances; if upstream ever passes aMapping(e.g.dict-like) object it will be treated as unsupported, so consider loosening the type check tocollections.abc.Mappingto make the helper more robust to different ORM or serialization layers. - The
test_valid_constraint_no_warningassertionlen(caplog.records) == 0can be brittle if any other logger emits during the test run; it would be safer to assert that there are no records from the specific logger or containing theapi_constraints:prefix instead of enforcing an absolutely empty log.
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- In `parse_api_constraints`, you currently only accept plain `dict` instances; if upstream ever passes a `Mapping` (e.g. `dict`-like) object it will be treated as unsupported, so consider loosening the type check to `collections.abc.Mapping` to make the helper more robust to different ORM or serialization layers.
- The `test_valid_constraint_no_warning` assertion `len(caplog.records) == 0` can be brittle if any other logger emits during the test run; it would be safer to assert that there are no records from the specific logger or containing the `api_constraints:` prefix instead of enforcing an absolutely empty log.Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
for more information, see https://pre-commit.ci
There was a problem hiding this comment.
Pull request overview
Hardens the /tasktype/{task_type_id} endpoint so malformed or inconsistent api_constraints data no longer causes 500s, aligning behavior with issue #273.
Changes:
- Added
parse_api_constraintshelper to defensively parseapi_constraintsand safely extractdata_type. - Updated task type response construction to include
input[].data_typeonly when valid. - Added unit tests covering valid/malformed inputs and logging behavior.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
src/routers/openml/tasktype.py |
Introduces defensive parsing + logging and uses it when building the /tasktype/{id} response. |
tests/routers/openml/test_parse_api_constraints.py |
Adds unit tests for parsing outcomes and emitted log levels/messages. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| if isinstance(api_constraints, dict): | ||
| constraint = api_constraints | ||
| elif isinstance(api_constraints, str): |
There was a problem hiding this comment.
parse_api_constraints only treats pre-parsed constraints as valid when they are a concrete dict. If the persistence layer returns a dict-like object (e.g., collections.abc.Mapping / SQLAlchemy row mapping), this will be logged as unsupported_type and data_type will be dropped even though it could be read safely. Consider accepting Mapping (and casting to dict if needed) instead of only dict to match the “dict-like” requirement and existing patterns (e.g., core/conversions.py uses Mapping).
| logger.debug( | ||
| "api_constraints: missing_data_type for task_type_id=%d, input=%s", | ||
| task_type_id, | ||
| input_name, | ||
| ) | ||
| return None | ||
|
|
||
| return data_type |
There was a problem hiding this comment.
The log label missing_data_type is emitted for all invalid data_type values, including cases where the key exists but is empty or not a string. That makes the log message misleading and reduces the usefulness of the new observability. Consider distinguishing between “missing” (key absent/None) vs “invalid” (wrong type/empty) and logging the latter with a different label (and possibly WARNING) including the received type/value shape.
| @pytest.mark.parametrize( | ||
| ("api_constraints", "expected_data_type"), | ||
| [ | ||
| # 1. Valid JSON string with data_type | ||
| ('{"data_type": "matrix"}', "matrix"), | ||
| # 2. Valid dict with data_type | ||
| ({"data_type": "matrix"}, "matrix"), | ||
| # 3. Malformed JSON string → None | ||
| ("{bad json", None), | ||
| # 4. Empty string → None | ||
| ("", None), | ||
| # 5. Valid JSON/dict without data_type → None | ||
| ('{"other_key": "val"}', None), | ||
| ({"other_key": "val"}, None), | ||
| # 6. data_type present but empty string → None |
There was a problem hiding this comment.
If parse_api_constraints is updated to accept dict-like inputs (e.g., collections.abc.Mapping), add a parametrized test case for a non-dict mapping instance to ensure this scenario remains supported (this is one of the failure modes described in #273 / PR description).
Summary
Hardens the
/tasktype/{task_type_id}endpoint against inconsistentapi_constraintsdata. Fixes #273.The Problem
The existing implementation was fragile, relying on strict
json.loads()and direct key access. This led to unnecessary 500 errors when encountering:dicttypes in the persistence layer.data_typekeys, triggering unhandledKeyErrors.The Solution
I've introduced a defensive
parse_api_constraintshelper to act as a buffer between the DB and the response..get()with strict type-checking fordata_type. If a value is invalid, it is omitted from the output rather than crashing the request.WARNINGfor structural failures,DEBUGfor missing data) prefixed withapi_constraints:to help us audit bad data in production.Testing
Comprehensive unit tests added to cover: