Skip to content

Conversation

@whym
Copy link
Contributor

@whym whym commented Dec 11, 2025

Long log lines can be split internally when it is very long. The default decoder (str.decode) raises UnicodeDecodeError against partial fragments of multi-byte characters. Incremental decoder doesn't have the same problem.

Fixes #1308

About tests:

I have done manual testing with the yaml file below:

services:
  s1:
    image: docker.io/python:3.11-slim
    command: env LANG=ja_JP.utf8 python -c'print("あ" * 10000000 + " end")'

I realize I'm expected to add unit tests, but I don't know what test to add here. Any advice?

  • I think it should interact with the decoder's internal buffer size for reproducibility. I don't see that part exposed in the decoder API.
  • I think it should handle potential freezes (maybe with a hard time limit of N seconds), because the potential failure mode is not wrong result, but freeze here. I don't know how to imprement that in a unit test in a clean way.

Long log lines can be split internally when it is very long. The default decoder (str.decode) raises
UnicodeDecodeError against partial fragments of multi-byte characters.  Incremental decoder doesn't
have the same problem.

Fixes containers#1308

Signed-off-by: Yusuke Matsubara <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Logging freeze with multi-byte characters in long output lines

1 participant