doesAnyCharRequireEscaping() conflicts with UTF-8 #1551

Open
janhec opened this issue Aug 22, 2024 · 2 comments

janhec commented Aug 22, 2024

toStyledString() produces \u escapes for non-ASCII UTF-8 characters such as ï.
This happens because doesAnyCharRequireEscaping() treats any byte with c > 0x7F as requiring escaping. That function is called from valueToQuotedStringN(), which in turn is called from BuiltStyledStreamWriter::writeValue(). Imo the last part of the condition should be removed.
After removing c > 0x7F, I get plain UTF-8, which is what I need and is imo in line with the specs.
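
For readers less familiar with the encoding, here is a small standalone sketch of why ï trips the c > 0x7F test: in UTF-8 it is the two bytes 0xC3 0xAF, both above 0x7F, and together they decode to code point U+00EF. The helper names decodeTwoByteUtf8 and toUnicodeEscape are hypothetical and written only for this illustration; they are not part of jsoncpp, and the decoder handles two-byte sequences only and does not reject overlong encodings.

```cpp
#include <cstdio>
#include <string>

// Hypothetical helper (not jsoncpp): decodes exactly one two-byte UTF-8
// sequence (lead byte 110xxxxx, continuation byte 10xxxxxx).
// Returns 0 for any other input.
static unsigned decodeTwoByteUtf8(const std::string& s) {
  if (s.size() != 2) return 0;
  const unsigned char b0 = static_cast<unsigned char>(s[0]);
  const unsigned char b1 = static_cast<unsigned char>(s[1]);
  if ((b0 & 0xE0) != 0xC0 || (b1 & 0xC0) != 0x80) return 0;
  return ((b0 & 0x1Fu) << 6) | (b1 & 0x3Fu);
}

// Hypothetical helper (not jsoncpp): formats a code point up to 0xFFFF
// as a JSON-style \uXXXX escape.
static std::string toUnicodeEscape(unsigned codepoint) {
  char buf[8];
  std::snprintf(buf, sizeof buf, "\\u%04x", codepoint & 0xFFFFu);
  return buf;
}
```

With these helpers, decodeTwoByteUtf8("\xC3\xAF") yields 0xEF, and toUnicodeEscape(0xEF) yields the six characters \u00ef. Both the raw bytes and the escape denote the same character in JSON; the issue is about which of the two the writer emits.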

BillyDonahue (Contributor) commented Aug 22, 2024

Can you clarify what code you're referring to, with some links (or pasted code, or both)?
It's difficult to follow the description as-is. Thx.

janhec (Author) commented Aug 26, 2024

The code is at json_writer.cpp:180.
I commented out the original body of the lambda and replaced it with a shorter one, because UTF-8 was getting escaped.
The change only helps when the JSON text contains UTF-8 and you want to keep it unescaped,
so this should probably be handled with more nuance than I did here.
Anyway, it helped in my specific case.

static bool doesAnyCharRequireEscaping(char const* s, size_t n) {
  assert(s || !n);

  return std::any_of(s, s + n, [](unsigned char c) {
    // Original condition, disabled: c > 0x7F matches every byte of a multi-byte UTF-8 sequence.
    // return c == '\\' || c == '"' || c < 0x20 || c > 0x7F;
    return c == '\\' || c == '"' || c < 0x20;
  });
}
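
To make the effect of the change concrete, here is a minimal standalone sketch that puts the two versions of the predicate side by side. The names requiresEscapingOriginal and requiresEscapingRelaxed are hypothetical, chosen for this illustration; only the lambda bodies are taken from the snippet above. The sketch shows the predicate alone and assumes nothing about what the writer does once the predicate returns true.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical name: the condition from the commented-out line, where any
// byte above 0x7F counts as requiring escaping.
static bool requiresEscapingOriginal(char const* s, std::size_t n) {
  assert(s || !n);
  return std::any_of(s, s + n, [](unsigned char c) {
    return c == '\\' || c == '"' || c < 0x20 || c > 0x7F;
  });
}

// Hypothetical name: the shortened condition, which flags only backslash,
// double quote, and control characters below 0x20.
static bool requiresEscapingRelaxed(char const* s, std::size_t n) {
  assert(s || !n);
  return std::any_of(s, s + n, [](unsigned char c) {
    return c == '\\' || c == '"' || c < 0x20;
  });
}

// Convenience overloads for std::string, also hypothetical.
static bool requiresEscapingOriginal(const std::string& s) {
  return requiresEscapingOriginal(s.data(), s.size());
}
static bool requiresEscapingRelaxed(const std::string& s) {
  return requiresEscapingRelaxed(s.data(), s.size());
}
```

For the UTF-8 encoding of ï, std::string("\xC3\xAF"), the original predicate returns true and the relaxed one returns false, while both still return true for a string containing a double quote, a backslash, or a control character. Note that the relaxed predicate does not check whether the bytes above 0x7F form valid UTF-8, which may be part of the nuance mentioned above.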
