
Fix _codecs.escape_decode return type #15531

Open
ledvinap wants to merge 2 commits into python:main from ledvinap:fix-codecs-escape-decode-bytes
Conversation

@ledvinap

Summary

Fix the stdlib stub for _codecs.escape_decode so it returns tuple[bytes, int] instead of tuple[str, int].

Also add regression coverage in stdlib/@tests/test_cases/check_codecs.py to verify the inferred return type for codecs.escape_decode.

Details

Python's docs describe codecs.escape_decode as returning bytes, and CPython runtime behavior matches that. The old stub returned str, which then propagated
to codecs.escape_decode because codecs.pyi re-exports _codecs.

I kept the input type as str | ReadableBuffer rather than narrowing it to bytes-like only, because current CPython still accepts str input at runtime and
returns (bytes, int).
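
For reference, a minimal runtime check of the behavior described above, assuming a current CPython 3 interpreter. The variable names are illustrative only and are not part of the PR:

```python
import codecs

# Bytes input: the two-character escape sequence backslash + "n" is decoded
# into a single newline byte, and the result is a bytes object.
decoded, consumed = codecs.escape_decode(b"a\\nb")
assert isinstance(decoded, bytes)
assert isinstance(consumed, int)

# str input is also accepted at runtime and still yields bytes,
# which is why the stub keeps str in the parameter type.
decoded_from_str, _ = codecs.escape_decode("ab")
assert isinstance(decoded_from_str, bytes)
```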

AI assistance

This PR was prepared by OpenAI Codex/GPT-5 and is human-validated.

@github-actions
Contributor

Diff from mypy_primer, showing the effect of this PR on open source code:

mitmproxy (https://github.com/mitmproxy/mitmproxy)
+ mitmproxy/utils/strutils.py:125: error: Unused "type: ignore" comment  [unused-ignore]

Collaborator

@brianschubert left a comment

LGTM. This seems to have been an edge case since the Python 2 days, when this function returned a str while the other *_decode functions returned a unicode object. See the remark below about adding a test.

Comment on lines +14 to +20

assert_type(codecs.escape_decode("ab"), tuple[bytes, int])
assert_type(codecs.escape_decode(b"ab"), tuple[bytes, int])

decoded, consumed = codecs.escape_decode(b"ab")
assert_type(decoded, bytes)
assert_type(consumed, int)

Test cases are only needed for weird or complicated cases that might regress in the future. These tests basically repeat the signature, so I don't think they're necessary.

Suggested change
assert_type(codecs.escape_decode("ab"), tuple[bytes, int])
assert_type(codecs.escape_decode(b"ab"), tuple[bytes, int])
decoded, consumed = codecs.escape_decode(b"ab")
assert_type(decoded, bytes)
assert_type(consumed, int)
