ARM64Tokenizer mishandles a single input string. __call__() forwards straight to tokenize(), and tokenize() assumes texts is an iterable of complete samples. Given a bare str, it iterates character by character, so json.loads() receives single characters instead of whole samples and fails. This is a public API footgun, since tokenizer callers normally expect both str and list[str] to work.
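One possible fix is to normalize the input at the top of tokenize(). The sketch below is not the real ARM64Tokenizer: SketchTokenizer is a hypothetical stand-in, and it assumes each sample is a JSON-encoded string, which is all the finding above tells us. Only the input handling is meant to carry over.

```python
import json
from typing import Any, List, Union


class SketchTokenizer:
    """Hypothetical stand-in for ARM64Tokenizer; only input handling is shown."""

    def __call__(self, texts: Union[str, List[str]]) -> List[Any]:
        # Mirrors the reported behavior: forward straight to tokenize().
        return self.tokenize(texts)

    def tokenize(self, texts: Union[str, List[str]]) -> List[Any]:
        # A bare str is itself iterable, so wrap it in a list first.
        # Without this guard the loop would visit one character at a time.
        if isinstance(texts, str):
            texts = [texts]
        # Assumption: each sample is a JSON document, parsed as a whole.
        return [json.loads(sample) for sample in texts]
```

With the guard in place, tok('{"op": "mov"}') and tok(['{"op": "mov"}']) take the same path. This sketch always returns a list, even for a single string; whether a single input should instead return a single result is a design choice for the maintainers.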