This proposal is in stage 3 of the TC39 process.
In regular expression patterns, the dot . matches a single character, regardless of what character it is. In ECMAScript, there are two exceptions to this:
- . doesn’t match astral characters. Setting the u (unicode) flag fixes that.
- . doesn’t match line terminator characters.
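The first exception can be illustrated with a short sketch. The variable names are only for illustration; the example assumes an engine that supports the u flag and code point escapes.

```javascript
// U+1F4A9 is an astral symbol, stored as two UTF-16 code units.
const astral = '\u{1F4A9}';

// Without the u flag, . matches a single code unit, so it cannot
// span the whole symbol between ^ and $.
const withoutUnicode = /^.$/.test(astral);
// → false

// With the u flag, . matches the whole code point.
const withUnicode = /^.$/u.test(astral);
// → true
```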
ECMAScript recognizes the following line terminator characters:
- U+000A LINE FEED (LF) (\n)
- U+000D CARRIAGE RETURN (CR) (\r)
- U+2028 LINE SEPARATOR
- U+2029 PARAGRAPH SEPARATOR
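As a quick check, the following sketch tests . against each of the four characters listed above. The array and variable names are illustrative only.

```javascript
// The four line terminator characters recognized by ECMAScript.
const lineTerminators = ['\n', '\r', '\u2028', '\u2029'];

// Without any flags, . matches none of them.
const dotMatches = lineTerminators.map((ch) => /^.$/.test(ch));
// → [false, false, false, false]
```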
However, there are more characters that, depending on the use case, could be considered as newline characters:
- U+000B VERTICAL TAB (\v)
- U+000C FORM FEED (\f)
- U+0085 NEXT LINE
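By contrast, . does match these three characters, because ECMAScript does not treat them as line terminators. A minimal sketch, with illustrative names:

```javascript
// Characters that some use cases treat as newlines, but that
// ECMAScript does not classify as line terminators.
const otherNewlines = ['\v', '\f', '\u0085'];

// Without any flags, . matches each of them.
const dotMatchesOther = otherNewlines.map((ch) => /^.$/.test(ch));
// → [true, true, true]
```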
This makes the current behavior of . problematic:
- By design, it excludes some newline characters, but not all of them, which often does not match the developer’s use case.
- It’s commonly used to match any character, which it doesn’t do.
The proposal you’re looking at right now addresses the latter issue.
Developers wishing to truly match any character, including these line terminator characters, cannot use .:
/foo.bar/.test('foo\nbar');
// → false
Instead, developers have to resort to cryptic workarounds like [\s\S] or [^]:
/foo[^]bar/.test('foo\nbar');
// → true
Since the need to match any character is quite common, other regular expression engines support a mode in which . matches any character, including line terminators.
- Engines that support constants to enable regular expression flags implement DOTALL or SINGLELINE/s modifiers.
- Engines that support embedded flag expressions implement (?s).
- Engines that support regular expression flags implement the flag s.
Note the established tradition of naming these modifiers s (short for singleline) and dotAll.
One exception is Ruby, where the m flag (Regexp::MULTILINE) also enables dotAll mode. Unfortunately, we cannot do the same thing for the m flag in JavaScript without breaking backwards compatibility.
We propose the addition of a new s flag for ECMAScript regular expressions that makes . match any character, including line terminators.
/foo.bar/s.test('foo\nbar');
// → true
const re = /foo.bar/s; // Or, `const re = new RegExp('foo.bar', 's');`.
re.test('foo\nbar');
// → true
re.dotAll
// → true
re.flags
// → 's'
The meaning of existing regular expression patterns isn’t affected by this proposal since the new s flag is required to opt in to the new behavior.
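Because the flag is opt-in, code that has to run on engines without it can detect support at runtime and fall back to one of the workarounds. The sketch below is one possible approach, not part of the proposal; supportsDotAll is a hypothetical helper name, and it relies on the RegExp constructor throwing a SyntaxError for unknown flags.

```javascript
// Hypothetical helper: reports whether the engine accepts the s flag.
function supportsDotAll() {
  try {
    // Engines without the s flag throw a SyntaxError here.
    return new RegExp('.', 's').dotAll === true;
  } catch (error) {
    return false;
  }
}

// Use the s flag where available, otherwise fall back to [^].
const anyChar = supportsDotAll()
  ? new RegExp('foo.bar', 's')
  : /foo[^]bar/;
const matchesAcrossNewline = anyChar.test('foo\nbar');
// → true

// Patterns without the s flag behave exactly as before.
const unchanged = /foo.bar/.test('foo\nbar');
// → false
```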
A common question is how dotAll mode relates to multiline mode, since the s flag stands for singleline, which seems to contradict m / multiline. It doesn’t. The naming is a bit unfortunate, but we’re just following the established naming tradition in other regular expression engines. Picking any other flag name would only cause more confusion. The accessor name dotAll gives a much better description of the flag’s effect. For this reason, we recommend referring to this mode as dotAll mode rather than singleline mode.
Both modes are independent and can be combined. multiline mode only affects anchors, and dotAll mode only affects ..
When both the s (dotAll) and m (multiline) flags are set, . matches any character while still allowing ^ and $ to match, respectively, just after and just before line terminators within the string.
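The independence of the two flags can be seen in a small sketch. The sample string and variable names are illustrative only.

```javascript
const text = 'foo\nbar';

// s alone: . matches \n, but ^ and $ only match at the string boundaries.
const dotAllOnly = /^bar$/s.test(text);
// → false

// m alone: ^ and $ match around line terminators, but . still excludes \n.
const multilineOnly = /foo.bar/m.test(text);
// → false

// Both flags: $ matches before \n, . matches \n itself, ^ matches after it.
const combined = /^foo$.^bar$/sm.test(text);
// → true

// The same difference shows up when matching whole lines.
const perLine = text.match(/^.+$/gm);
// → ['foo', 'bar']
const acrossLines = text.match(/^.+$/gms);
// → ['foo\nbar']
```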
Implementations:
- V8
- regexpu (transpiler) with the { dotAllFlag: true } option enabled
- Compat-transpiler of RegExp Tree