Description
The constant evaluator for the ascii() built-in function incorrectly interprets strings with \xHH escapes as UTF-8, causing it to count certain bytes (e.g., \xFF) as multiple bytes instead of a single byte. This misinterpretation leads to erroneous "ascii string is too long" compile-time errors for valid ASCII strings within the 32-byte limit.
Minimal Example:

```tact
const SeventeenFFs: String = "\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF"; // 17 bytes
const AsciiValue17: Int = ascii(SeventeenFFs); // Should compile successfully

contract TestAsciiLengthBug {
    val: Int;

    init() {
        self.val = AsciiValue17;
    }
}
```

Compiler Output:

```
Error: Cannot evaluate expression to a constant: ascii string is too long, expected up to 32 bytes, got 34
```
Expected Behavior:
Each \xFF escape sequence should be counted as exactly one byte, giving a total length of 17 bytes, which is well within the 32-byte limit. The code above should therefore compile successfully.
Explanation:
The current implementation calculates byte length by UTF-8-encoding the string, which turns each non-ASCII codepoint (\xFF, i.e. U+00FF) into two bytes. Seventeen such escapes thus become 17 × 2 = 34 bytes, exceeding the 32-byte limit and matching the erroneous error message. The correct behavior is to treat each \xHH escape as exactly one raw byte, adhering to ASCII/byte-string semantics.
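The length discrepancy can be reproduced outside the compiler. The following Python sketch (an illustration of the byte-counting mismatch, not the actual Tact implementation) models seventeen \xFF escapes as seventeen U+00FF codepoints and compares the two ways of measuring their length:

```python
# Model of the string produced by seventeen \xFF escapes: 17 codepoints U+00FF.
s = "\xFF" * 17

# Buggy measurement: UTF-8 encoding. U+00FF encodes as the two-byte
# sequence 0xC3 0xBF, so the measured length doubles.
utf8_len = len(s.encode("utf-8"))
print(utf8_len)  # 34 -- matches the "got 34" in the compiler error

# Expected measurement: each \xHH escape is exactly one raw byte
# (Latin-1 maps U+00..U+FF to single bytes), staying within the limit.
byte_len = len(s.encode("latin-1"))
print(byte_len)  # 17
```

The `latin-1` encoding is used here purely as a convenient way to express "one byte per \xHH escape"; the fix in the compiler amounts to counting escape sequences as single bytes rather than UTF-8-encoding the decoded string.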
LLM Fuzzing discovery (see #2490)