feat: Transliterate non-ASCII characters in MCP server headers #11472

Open
kenzaelk98 wants to merge 2 commits into danny-avila:main from leondape:feat/transliterate-non-ascii-mcp-headers

Conversation

@kenzaelk98
Contributor

This PR improves on the fix in #11432 by replacing Base64 encoding with character transliteration for non-ASCII characters in MCP server headers.

Why this approach is better:

  • More readable: Đorđe → Dorde (vs b64:xJBvcsSRZQ==)
  • Universal compatibility: Works with all MCP servers immediately without requiring decoding logic
  • Simpler architecture: MCP servers receive plain ASCII text - no special handling needed
  • Broader support: Handles Latin, Cyrillic, Arabic, CJK, and more via the transliteration package

Background:
The original issue solved in #11432 was that characters with code points above 255 in user names/emails caused ByteString errors when passed as HTTP headers to MCP servers. The initial fix used Base64 encoding, but this required MCP servers to implement custom decoding logic to convert b64:... strings back to readable text. This added complexity and wasn't practical for third-party MCP servers, which shouldn't need special handling for LibreChat headers.
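To make the decoding burden concrete, here is a minimal sketch of the kind of logic an MCP server would have needed for the Base64 approach. It is not code from #11432 or from any MCP server: the function name `decodeB64HeaderSketch` is hypothetical, and the only detail taken from this PR is the `b64:` prefix. The decoder is written by hand so the block has no runtime dependencies.

```typescript
// Hypothetical server-side helper: turns "b64:<base64 of UTF-8>" back into text.
const B64_ALPHABET =
  "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

function decodeB64HeaderSketch(value: string): string {
  // Values without the prefix are assumed to be plain ASCII already.
  if (value.indexOf("b64:") !== 0) {
    return value;
  }
  const payload = value.slice(4).replace(/=+$/, ""); // drop prefix and padding
  let bits = 0;
  let bitCount = 0;
  let percentEncoded = "";
  for (let i = 0; i < payload.length; i++) {
    const idx = B64_ALPHABET.indexOf(payload.charAt(i));
    if (idx < 0) {
      throw new Error("invalid Base64 character at index " + i);
    }
    bits = (bits << 6) | idx; // append 6 new bits
    bitCount += 6;
    if (bitCount >= 8) {
      bitCount -= 8;
      const byte = (bits >> bitCount) & 0xff; // take the oldest complete byte
      bits &= (1 << bitCount) - 1; // keep only the leftover bits
      percentEncoded += "%" + (byte < 16 ? "0" : "") + byte.toString(16);
    }
  }
  // Interpret the collected bytes as UTF-8.
  return decodeURIComponent(percentEncoded);
}

// "\u0110or\u0111e" is "Đorđe" written with escapes.
console.log(decodeB64HeaderSketch("b64:xJBvcsSRZQ==") === "\u0110or\u0111e");
```

Every server consuming these headers would need an equivalent step, which is the overhead transliteration avoids.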

This PR:

  • Replaces encodeHeaderValue() (Base64) with sanitizeHeaderValue() (transliteration)
  • Uses the well-maintained transliteration npm package
  • Updates all tests to reflect the new approach
  • No breaking changes - headers remain ASCII-safe
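As a rough illustration of what transliterating a header value means, the sketch below is a dependency-free stand-in, not the PR's `sanitizeHeaderValue()`. The PR delegates to the transliteration package, which covers far more scripts. Here the name `sanitizeHeaderValueSketch` and the small `FALLBACK_MAP` are assumptions made for the example: accents are stripped via Unicode NFD normalization, a few letters that do not decompose are mapped by hand, and anything else becomes `?`.

```typescript
// Illustrative stand-in only; the real implementation uses the
// 'transliteration' package instead of this hand-written table.
const FALLBACK_MAP: Record<string, string> = {
  "\u0110": "D", // Đ
  "\u0111": "d", // đ
  "\u0141": "L", // Ł
  "\u0142": "l", // ł
  "\u00d8": "O", // Ø
  "\u00f8": "o", // ø
  "\u00df": "ss", // ß
};

function sanitizeHeaderValueSketch(value: string): string {
  // NFD splits letters such as "ć" into a base letter plus a combining mark.
  const decomposed = value.normalize("NFD");
  let out = "";
  for (let i = 0; i < decomposed.length; i++) {
    const ch = decomposed.charAt(i);
    const code = decomposed.charCodeAt(i);
    if (code >= 0x0300 && code <= 0x036f) {
      continue; // drop combining diacritical marks
    }
    if (code < 0x80) {
      out += ch; // already ASCII
      continue;
    }
    const mapped = FALLBACK_MAP[ch];
    out += mapped !== undefined ? mapped : "?"; // unknown characters become "?"
  }
  return out;
}

// "\u0110or\u0111e Mari\u0107" is "Đorđe Marić" written with escapes.
console.log(sanitizeHeaderValueSketch("\u0110or\u0111e Mari\u0107")); // Dorde Maric
```

The result is plain ASCII, so it is safe to place in a header, but the mapping is lossy: the original spelling cannot be recovered from `Dorde Maric`.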

Testing

Tested locally with an MCP server that uses X-User-Name and X-User-Email headers containing non-ASCII characters.

Test Configuration:

  • Created test user with name: Đorđe Marić (contains Đ=272, đ=273, ć=263 - all > 255)
  • Configured MCP server to receive headers with user placeholders
  • Verified MCP server received transliterated values: Dorde Maric
  • Confirmed no ByteString errors in LibreChat logs
  • All unit tests passing (14 new transliteration tests added)
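The code points quoted for the test user can be checked with a short script. This is a sketch for verification only; `findNonByteStringChars` is a hypothetical helper, not part of LibreChat. It lists the characters whose UTF-16 code unit exceeds 255, which is the condition behind the ByteString errors described above.

```typescript
interface Offender {
  index: number;
  char: string;
  code: number;
}

// Returns every character that cannot be represented in a ByteString
// (a string whose code units must all be <= 255).
function findNonByteStringChars(value: string): Offender[] {
  const offenders: Offender[] = [];
  for (let i = 0; i < value.length; i++) {
    const code = value.charCodeAt(i);
    if (code > 255) {
      offenders.push({ index: i, char: value.charAt(i), code: code });
    }
  }
  return offenders;
}

// "\u0110or\u0111e Mari\u0107" is "Đorđe Marić" written with escapes.
const offenders = findNonByteStringChars("\u0110or\u0111e Mari\u0107");
console.log(offenders.map((o) => o.code)); // [ 272, 273, 263 ]
console.log(findNonByteStringChars("Dorde Maric").length); // 0
```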

Test Results:

  • ✅ Headers successfully transliterated before sending
  • ✅ MCP server received plain ASCII text
  • ✅ No ByteString errors
  • ✅ All 2012 tests passing (including 14 new transliteration tests)

kenzaelk98 and others added 2 commits January 22, 2026 12:46
- Use transliteration to convert special/accented characters to ASCII equivalents
- Prevents ByteString errors (characters > 255) in HTTP headers
- More readable than Base64 encoding (Đorđe → Dorde vs b64:...)
- Works with all MCP servers immediately (no decoding needed)
- Supports Latin, Cyrillic, Arabic, CJK, and more via 'transliteration' package

Changes:
- Add sanitizeHeaderValue() function using transliteration library
- Apply to user name, username, and email fields in MCP headers
- Add 14 comprehensive tests for transliteration edge cases
- Install 'transliteration' npm package

Tested locally with MCP server integration