I think there may be something there.
Perhaps we could include a syntax for regular expression parsing as part of syntactical analysis.
E.g. for regular use, it could be written as:
OK, so it's a terribly contrived example and syntactically ambiguous (the regexp isn't great), but it does illustrate the point, and it is possibly the best way to define a pair in XML, for example. (I wouldn't claim it was the best way, but it is better than having empty tags everywhere.)
I am coming round - slowly, mind you - to the argument of XML being involved.
By also adding regexp handling to syntax files, alongside our quality checking, we could create some quite wonderful syntaxes and handle them well.
As for handling of keywords, think about it this way: (extract from PHP4/5)
The benefit here should hopefully be obvious: instead of matching a phrase against 3,000+ keywords (in PHP 5), you match it against a few hundred prefixes first, and only if a prefix matches do you try matching the whole thing.
The equivalent RE would be image(2wbmp|_type_to_mime_type|alphablending|etc....). If the phrase doesn't match against image, the entire expression fails, because that prefix is required.
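To make the prefix idea concrete, here is a rough sketch in Python. It is only an illustration, not how any existing editor does it: the fixed prefix length of 5, the helper names build_index and is_keyword, and the short sample keyword list are all my own assumptions. A real list would come from the syntax file.

```python
import re
from collections import defaultdict

# Hypothetical sample list; a real one would be loaded from the syntax file.
SAMPLE_KEYWORDS = [
    "image2wbmp",
    "image_type_to_mime_type",
    "imagealphablending",
    "array_map",
    "strlen",
]

PREFIX_LEN = 5  # assumed fixed prefix length ("image" is 5 characters)


def build_index(keywords, prefix_len=PREFIX_LEN):
    """Group keywords into buckets keyed by their first prefix_len characters."""
    index = defaultdict(set)
    for kw in keywords:
        index[kw[:prefix_len]].add(kw)
    return dict(index)


def is_keyword(word, index, prefix_len=PREFIX_LEN):
    """Check the prefix first; only compare whole words inside a matching bucket."""
    bucket = index.get(word[:prefix_len])
    if bucket is None:
        return False  # no keyword shares this prefix, so stop here
    return word in bucket


INDEX = build_index(SAMPLE_KEYWORDS)

# The same idea as a single regular expression for the "image" bucket:
# if the word does not start with "image", the whole match fails.
IMAGE_RE = re.compile(r"image(?:2wbmp|_type_to_mime_type|alphablending)")

print(is_keyword("imagealphablending", INDEX))
print(is_keyword("imagination", INDEX))
print(IMAGE_RE.fullmatch("image2wbmp") is not None)
```

In Python a plain set lookup would already be fast, so the bucketing is here only to show the two-stage check. The saving matters more when each keyword comparison is a separate regexp or string match.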
As for nesting comments, this is always going to be an issue for any editor, but if you have regexp handling, you could say:
To match /* /* /* comment */
I know it looks messy, but it essentially looks for any number of /* sequences, each followed by zero or more whitespace characters, as a single unit. So it would match /* and a tab, followed by /* and two spaces, and even /* followed by no space, and so on.
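For anyone who wants to try this, below is a small Python sketch of that kind of pattern. The pattern is my own guess at what is described above, not a quote of the original expression, and I have used "one or more" rather than "zero or more" repetitions so that an empty string does not count as a match. The name count_openers is made up for the example.

```python
import re

# Assumed pattern: one or more "/*" openers, each followed by optional
# whitespace, with the group repeated as a single unit.
NESTED_OPENERS = re.compile(r"(?:/\*\s*)+")


def count_openers(text):
    """Return how many consecutive "/*" openers start the text (0 if none)."""
    match = NESTED_OPENERS.match(text)
    if match is None:
        return 0
    return match.group(0).count("/*")


print(count_openers("/* /* /* comment */"))
print(count_openers("/*\t/*  /*comment */"))
print(count_openers("no comment here"))
```

Knowing how many openers were seen is what would let the highlighter decide how many closing */ sequences to expect, although that bookkeeping is outside what a single regexp can do.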
We also know that wxWidgets includes a regexp parser (the same one CE uses, although it may be a different version), so we should hopefully be able to add this functionality.