Prompt resynchronizes too late after invalid UTF-8 #4801

@egmontkob

Description

Is there an existing issue for this?

  • I have searched the existing issues

Midnight Commander version and build configuration

master

Operating system

all

Is this issue reproducible using the latest version of Midnight Commander?

  • I confirm the issue is still reproducible with the latest version of Midnight Commander

How to reproduce

UTF-8 locale, UTF-8 terminal.

Set up prompt in .bashrc to print the directory (if you don't have it already), e.g.

PS1='\w$ '

Create a directory whose name contains an invalid UTF-8 byte sequence, such as the string abcdéfghi encoded in latin1:

mkdir $'abcd\351fghi'

Enter this directory.

Outside of mc, the directory in the prompt should appear as abcd�fghi or similar.

Inside mc, the command line prompt displays it as abcdhi: there is no replacement symbol, and the next two letters, fg, are swallowed.
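The two swallowed letters are what one would see if the byte 0xE9 (\351) were taken as the lead byte of a three-byte sequence and the next two bytes were consumed without checking that they are continuation bytes. That is a guess at the cause, not a description of mc's actual code. A minimal Python sketch of it, where `announced_length` and `naive_decode` are hypothetical helpers:

```python
def announced_length(lead: int) -> int:
    """Sequence length a UTF-8 lead byte announces (1 for anything else)."""
    if 0xC0 <= lead <= 0xDF:
        return 2
    if 0xE0 <= lead <= 0xEF:
        return 3
    if 0xF0 <= lead <= 0xF7:
        return 4
    return 1


def naive_decode(data: bytes) -> str:
    """Hypothetical decoder that trusts the lead byte and skips that many bytes."""
    out = []
    i = 0
    while i < len(data):
        n = announced_length(data[i])
        chunk = data[i:i + n]
        try:
            out.append(chunk.decode("utf-8"))
        except UnicodeDecodeError:
            # The whole chunk is dropped, including bytes that were never
            # validated as continuation bytes.
            pass
        i += n
    return "".join(out)


name = b"abcd\xe9fghi"  # same bytes as $'abcd\351fghi'
print(naive_decode(name))                      # abcdhi
print(name.decode("utf-8", errors="replace"))  # abcd\ufffdfghi, shown with the replacement symbol
```

Python's own decoder with `errors="replace"` gives the result seen outside of mc: one replacement character, then decoding resumes at `f`.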


To spice it up, let's use some color changing escape sequences, and put the invalid UTF-8 at the end of the directory name:

PS1='\[\e[31m\]\u@\w\[\e[0m\]$ '
mkdir $'abcd\351'  # abcdé encoded in latin1

The prompt in mc appears as abcd0m$ . The two swallowed bytes are now ESC and [, so the rest of the escape sequence (0m) is printed verbatim instead of being interpreted.
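The byte arithmetic can be checked in a few lines of Python. This assumes the tail of the prompt, as it reaches mc, is the directory name followed by ESC [ 0 m, a dollar sign and a space; the variable names are only for illustration.

```python
# Assumed tail of the prompt bytes: directory name, reset sequence, "$ ".
tail = b"abcd\xe9\x1b[0m$ "

bad = tail.index(0xE9)                    # position of the invalid byte
swallowed = tail[bad + 1:bad + 3]         # the two bytes that follow it
remaining = tail[:bad] + tail[bad + 3:]   # invalid byte and the two next bytes dropped

print(swallowed)   # b'\x1b['
print(remaining)   # b'abcd0m$ '
```

With ESC and [ gone, `0m` is no longer part of an escape sequence and ends up in the visible prompt text.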

Expected behavior

Resynchronize immediately after an invalid UTF-8 byte; preserve and interpret every valid UTF-8 character that follows (including the escape byte that starts an escape sequence).
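One way to get this behavior is to drop only the offending byte and retry decoding at the very next one. The sketch below is in Python for brevity and is not mc's implementation (mc is written in C); `decode_resync` is a hypothetical name, and emitting U+FFFD for each bad byte is one possible policy among several.

```python
def decode_resync(data: bytes) -> str:
    """Decode UTF-8, replacing each undecodable byte and resuming right after it."""
    out = []
    i = 0
    while i < len(data):
        # Try the shortest valid sequence starting at position i (1 to 4 bytes).
        for n in (1, 2, 3, 4):
            try:
                out.append(data[i:i + n].decode("utf-8"))
                i += n
                break
            except UnicodeDecodeError:
                continue
        else:
            # No valid sequence starts here: emit one replacement character
            # and skip exactly one byte, so nothing after it is lost.
            out.append("\ufffd")
            i += 1
    return "".join(out)


print(repr(decode_resync(b"abcd\xe9fghi")))        # 'abcd\ufffdfghi'
print(repr(decode_resync(b"abcd\xe9\x1b[0m$ ")))   # 'abcd\ufffd\x1b[0m$ '
```

In both reproduction cases the bytes after 0xE9 survive, including the ESC byte that starts the color reset sequence.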

Actual behavior

The two bytes following an invalid UTF-8 byte are lost, whether they are ordinary characters or the start of an escape sequence.

Additional context

No response

    Labels

    area: core (Issues not related to a specific subsystem)
    prio: medium (Has the potential to affect progress)
