Advanced Regular Expressions
Back to Basics
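Quantifiers are GREEDY by default: they first match as much as possible and then give characters back only if the rest of the pattern forces them to.
Adding ? after a quantifier makes it LAZY (non-greedy): it matches as little as possible and expands only if forced.
Examples: .*? \d+? _??(\w+)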
Difference between greedy and lazy: at each optional step, a greedy quantifier first tries to make the attempt, while a lazy quantifier first tries to skip it.
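The make-or-skip difference can be sketched with Python's standard `re` module. The sample strings and variable names are illustrative assumptions, not part of the slides.

```python
import re

text = "<em>Hello</em> world"

# Greedy: .+ runs to the end of the string, then backs off to the last ">".
greedy = re.search(r"<.+>", text).group()

# Lazy: .+? takes one character, then grows only until the first ">".
lazy = re.search(r"<.+?>", text).group()

# On its own, \d+? is satisfied with a single digit.
one_digit = re.match(r"\d+?", "12345").group()

# _? tries to make the match first, so "_" stays outside the group.
# _?? tries to skip first; \w also accepts "_", so the skip succeeds.
made = re.match(r"_?(\w+)", "_abc").group(1)
skipped = re.match(r"_??(\w+)", "_abc").group(1)

print(greedy, lazy, one_digit, made, skipped)
```

Both `_?` and `_??` lead to a successful overall match here; only the order in which the engine tries the two options changes what ends up in the group.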
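Are alternations greedy? A traditional NFA engine uses ordered alternation: alternatives are tried from left to right, and the first one that lets the overall match succeed wins, even if a later alternative would match more.
Example: (tourn|to|tournament)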
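A small sketch of ordered alternation, assuming Python's `re` module (a backtracking engine) and the subject string "tournament", which is chosen here only for illustration:

```python
import re

subject = "tournament"

# The first alternative that succeeds wins, not the longest one.
first_wins = re.search(r"tourn|to|tournament", subject).group()

# Reordering the alternatives changes the result on the same subject.
longest_first = re.search(r"tournament|tourn|to", subject).group()
shortest_first = re.search(r"to|tourn|tournament", subject).group()

print(first_wins, longest_first, shortest_first)
```

When the longest match is wanted, one option is to list the longer alternatives first.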
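Atomic groups: (?> )
After the engine exits an atomic group, all backtracking states saved inside it are thrown away.
Atomic groups can eliminate many permutations / paths and make a failing match fail faster.
Examples: (?>.*?) (?>.+?) (?>\w+)
\b(?>int|integer)\b applied on "integer" fails: once "int" has matched, the "integer" alternative is discarded.
Possessive quantifiers are greedy and never give back what they have matched.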
Examples: .*+
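The behaviour of atomic groups and possessive quantifiers can be sketched as follows. This assumes Python's standard `re` module, which accepts both syntaxes only from version 3.11; the `found` helper, the `SUPPORTED` flag and the quoted test string are hypothetical names and data introduced for this example.

```python
import re
import sys

# Assumption: (?>...) and possessive quantifiers need Python 3.11+ in `re`.
# On older versions the three results below simply stay None.
SUPPORTED = sys.version_info >= (3, 11)


def found(pattern, text):
    """Return True if the pattern matches anywhere in text."""
    return re.search(pattern, text) is not None


# Ordinary group: after "int" fails at \b, the engine backtracks to "integer".
plain_group = found(r"\b(?:int|integer)\b", "integer")

# Ordinary greedy star: .* gives back the closing quote when forced.
greedy_star = found(r'".*"', '"abc"')

atomic_group = possessive_star = possessive_class = None
if SUPPORTED:
    # Atomic group: the "integer" alternative is thrown away once "int" matched.
    atomic_group = found(r"\b(?>int|integer)\b", "integer")
    # Possessive star: .*+ swallows the closing quote and never returns it.
    possessive_star = found(r'".*+"', '"abc"')
    # A possessive quantifier is safe when it cannot eat what follows it.
    possessive_class = found(r'"[^"]*+"', '"abc"')

print(plain_group, greedy_star, atomic_group, possessive_star, possessive_class)
```

In engines without these constructs, the patterns have to be rewritten or the effect emulated by other means.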
Non-capturing groups: (?: )
Lookarounds are zero-width assertions and are atomic.
Lookarounds can be positive or negative.
Positive lookahead (?= ) checks that the upcoming characters match: q(?=u)i applied on "quit" fails, because the lookahead consumes nothing and i is then tried against "u".
Negative lookahead (?! ) checks that the upcoming characters do not match: q(?!i)u applied on "quit" matches "qu".
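The two "quit" examples can be checked with a short sketch, assuming Python's `re` module; the variable names are illustrative.

```python
import re

# The lookahead sees "u" but consumes nothing, so "i" is then tried
# against "u" and the whole pattern fails.
positive_then_i = re.search(r"q(?=u)i", "quit")

# On its own the positive lookahead succeeds and the match is just "q".
zero_width = re.search(r"q(?=u)", "quit")

# The character after "q" is not "i", so the negative lookahead passes
# and the following "u" is matched normally.
negative = re.search(r"q(?!i)u", "quit")

# A negative lookahead for "u" rejects the only "q" in the string.
rejected = re.search(r"q(?!u)", "quit")

print(positive_then_i, zero_width.group(), negative.group(), rejected)
```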
Lookahead can contain a full regular expression.
Lookbehind can only contain fixed-length patterns in many regex flavors.
Positive lookbehind (?<= ): (?<=a)b applied on "thingamabob" matches only the first "b", the one preceded by "a".
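A minimal sketch of the lookbehind example, assuming Python's `re` module. The fixed-width restriction shown here is that module's behaviour; other engines may allow more or less inside a lookbehind.

```python
import re

# Only the "b" directly preceded by "a" matches; the final "b" follows "o".
starts = [m.start() for m in re.finditer(r"(?<=a)b", "thingamabob")]

# Python's `re` rejects a variable-width lookbehind when compiling.
try:
    re.compile(r"(?<=a+)b")
    variable_width_ok = True
except re.error:
    variable_width_ok = False

print(starts, variable_width_ok)
```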
References
http://www.regular-expressions.info/