I recently created the `regex` package, which provides a template tag for writing regexes as raw strings. Among other features, it always implicitly enables flag `x`, based on the details in this proposal (which are fairly limited right now, so I also based it on flag `xx` from Perl and PCRE).
It always implicitly uses flag `v` (if available) or `u`, so it can't be used to test `x` in Unicode-unaware mode. And since it uses template strings, it doesn't need to worry about `()[]/` in comments (comments are stripped before the pattern is passed to the `RegExp` constructor). But with those caveats, you can use it to test `x` behavior, even for edge cases.
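To make the "comments are stripped before reaching the `RegExp` constructor" point concrete, here is a minimal sketch of the general technique. The tag name `xSketch` is hypothetical and this is not the `regex` package's implementation: it only drops whitespace and `#` comments outside character classes, always uses flag `u`, leaves whitespace inside classes untouched, and does not insert `(?:)` separators, so a case like `\0 1` is not handled.

```javascript
// Hypothetical, simplified x-mode template tag (an illustration only).
function xSketch(strings, ...values) {
  // Work on the raw text so backslashes reach the RegExp constructor as written.
  const raw = String.raw(strings, ...values);
  let out = '';
  let inClass = false;
  for (let i = 0; i < raw.length; i++) {
    const ch = raw[i];
    if (ch === '\\') {
      // Copy an escape verbatim: the backslash plus the character after it.
      out += ch + (raw[i + 1] ?? '');
      i++;
    } else if (inClass) {
      // Inside a character class, keep everything (including whitespace).
      if (ch === ']') inClass = false;
      out += ch;
    } else if (ch === '[') {
      inClass = true;
      out += ch;
    } else if (ch === '#') {
      // Skip a line comment up to, but not including, the newline.
      while (i + 1 < raw.length && raw[i + 1] !== '\n') i++;
    } else if (/\s/.test(ch)) {
      // Drop insignificant whitespace outside character classes.
    } else {
      out += ch;
    }
  }
  return new RegExp(out, 'u');
}

// The comment may contain ( [ / freely because it never reaches the constructor.
const date = xSketch`
  (?<year> \d{4} ) -  # comment with ( [ / characters
  (?<month> \d{2} )
`;
console.log(date.source);
console.log(date.exec('2024-05').groups.year);
```

Because the comment text is removed before compilation, unbalanced brackets or slashes inside it cannot affect parsing, which is the advantage over a regex literal that the paragraph above describes.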
Just for example:
- ``regex`[a- -b]` `` is an error (range of `a` to unescaped/invalid `-`).
- ``regex`[a& &b]` `` is equivalent to `/[a&b]/`.
- ``regex`[a && b]` `` is equivalent to `/[a&&b]/v`.
- ``regex`\0 1` `` is equivalent to `/\0(?:)1/`.
- `\c A` and `(? :)` are errors, and `( ?:)` is an error because you can't quantify `(`.
- Quantifiers following whitespace and/or comments apply to the preceding token, so `x +` is equivalent to `x+`.
- Whitespace and/or comments are allowed to separate a quantifier and the `?` that makes it lazy.
- Only space and tab are insignificant within character classes and `[\q{…}]`, not `#` or other whitespace.
- Outside of character classes, the insignificant whitespace characters are those matched natively by `\s`.
- Excluding `[\q{…}]`, whitespace is significant in enclosed tokens.
  - Outside of character classes: `\u{…}`, `\p{…}`, `\P{…}`, `(?<…>)`, `\k<…>`, and `{…}`.
  - Within character classes: `\u{…}`, `\p{…}`, and `\P{…}`.
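A few of the equivalences listed above can be checked with native `RegExp` alone, without the `regex` package. The following is a small sketch; the variable names are illustrative only, and the flag `v` check is guarded because that flag may not be available in every engine.

```javascript
// 1. `\0 1`: the empty group keeps NUL and the literal "1" as separate tokens.
const nulThenOne = /\0(?:)1/;
const nulOneInput = '\0' + '1';
console.log(nulThenOne.test(nulOneInput)); // true

// Without a separator, Unicode-unaware mode reads `\01` as a legacy octal
// escape for U+0001, and Unicode-aware mode rejects the pattern.
const legacyOctal = new RegExp('\\01');
console.log(legacyOctal.test('\u0001')); // true
console.log(legacyOctal.test(nulOneInput)); // false

let unicodeModeRejects = false;
try {
  new RegExp('\\01', 'u');
} catch (err) {
  unicodeModeRejects = err instanceof SyntaxError;
}
console.log(unicodeModeRejects); // true

// 2. `[a& &b]` vs. `[a && b]`: a single `&` is a literal inside a class.
console.log(/[a&b]/.test('&')); // true

// With flag v, `&&` is set intersection, and the intersection of a and b is empty.
let intersectionMatches = null;
try {
  intersectionMatches = new RegExp('[a&&b]', 'v').test('a');
} catch {
  // Flag v is not supported here; leave the result as null.
}
console.log(intersectionMatches); // false where flag v is supported, otherwise null

// 3. Native \s covers more than ASCII whitespace, e.g. NBSP, U+2028, and U+FEFF.
const nativeWhitespace = ['\t', '\n', '\u00A0', '\u2028', '\uFEFF']
  .every((ch) => /\s/.test(ch));
console.log(nativeWhitespace); // true
```

The first check shows why a plain "strip the whitespace" approach is not enough for `\0 1`: simply joining the tokens would change the meaning of the pattern or make it invalid.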
If additional details are clarified in this proposal and they don't match regex's handling, I will update it to stay in line.