We are currently manually negating the input set and encoding the full negated set, rather than checking that a set does NOT include a character (for which we already have logic, except for the built-in character sets).
This has a minor to moderate unnecessary memory cost. We should be using whichever set (negated or natural) which contains the fewest characters, which minimizes memory cost -- which also happens to include selecting an ASCII-only set, if possible, which saves lots of memory by avoiding the creation of the non-ASCII external data structure for a character set.
This means pushing the set negation check into the bytecode (for which we already have opcodes).
See #5592 (comment)