Advanced | Regular Expressions In Python

Lookbehinds and Lookaheads

A lookbehind lets you qualify an expression A by requiring that some other expression B does or does not come immediately before A.

Similarly, a lookahead lets you qualify an expression A by requiring that some other expression B does or does not come immediately after A.

Pattern      Name                  Description                     Example (matches in brackets)
(?<=2)dog    positive lookbehind   Match dog with 2 before it      1dog2[dog]3dog
(?<!2)dog    negative lookbehind   Match dog without 2 before it   1[dog]2dog3[dog]
dog(?=2)     positive lookahead    Match dog with 2 after it       1[dog]2dog3dog
dog(?!2)     negative lookahead    Match dog without 2 after it    1dog2[dog]3[dog]
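
As a quick check of the table, the sketch below runs each of the four patterns against 1dog2dog3dog and prints where dog matches. The helper name match_spans is made up for this illustration; re.finditer and Match.span come from Python's standard re module.

```python
import re

text = "1dog2dog3dog"

def match_spans(pattern, string):
    """Return the (start, end) span of every match of pattern in string."""
    return [m.span() for m in re.finditer(pattern, string)]

print(match_spans(r"(?<=2)dog", text))  # [(5, 8)]
print(match_spans(r"(?<!2)dog", text))  # [(1, 4), (9, 12)]
print(match_spans(r"dog(?=2)", text))   # [(1, 4)]
print(match_spans(r"dog(?!2)", text))   # [(5, 8), (9, 12)]
```

Every span is exactly three characters long. Lookbehinds and lookaheads only check the neighboring text, so the 2 is never part of the match itself.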

Lazy Search Operator

By default, regular expression quantifiers are greedy, meaning they match as many characters as possible while still allowing the overall pattern to match. To make a quantifier non-greedy (lazy), append the non-greedy qualifier, ?, to it. For example, given

"dogcatmouserat"

the pattern dog.*a matches:

  • dog: the literal text dog
  • .*: followed by any character, repeated zero or more times
  • a: followed by a literal a

There are multiple substrings that meet these criteria!

[dogcatmousera]t  <- greedy
[dogca]tmouserat  <- non-greedy

By default, Python's regex engine returns the greedy result.

import re

re.search(pattern="dog.*a", string="dogcatmouserat")
# <re.Match object; span=(0, 13), match='dogcatmousera'>

If you want the non-greedy result, use *? instead of *.

re.search(pattern="dog.*?a", string="dogcatmouserat")
# <re.Match object; span=(0, 5), match='dogca'>

The non-greedy qualifier can be applied to these quantifiers:

Greedy   Non-Greedy
*        *?
+        +?
?        ??
{m,n}    {m,n}?
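
To see each pair from the table in action, here is a minimal sketch that matches the greedy and non-greedy form of each quantifier against the start of the string aaaa. The sample string and the names pairs and results are chosen only for this illustration.

```python
import re

text = "aaaa"

# (greedy, non-greedy) versions of each quantifier, applied to the letter a
pairs = [
    ("a*", "a*?"),
    ("a+", "a+?"),
    ("a?", "a??"),
    ("a{2,3}", "a{2,3}?"),
]

results = {}
for greedy, lazy in pairs:
    # re.match anchors at the start of the string, so both forms start at index 0
    results[greedy] = re.match(greedy, text).group()
    results[lazy] = re.match(lazy, text).group()
    print(f"{greedy!r} -> {results[greedy]!r}   {lazy!r} -> {results[lazy]!r}")

# 'a*' -> 'aaaa'   'a*?' -> ''
# 'a+' -> 'aaaa'   'a+?' -> 'a'
# 'a?' -> 'a'   'a??' -> ''
# 'a{2,3}' -> 'aaa'   'a{2,3}?' -> 'aa'
```

With nothing after the quantifier to force a longer match, each non-greedy form stops at the fewest repetitions it is allowed: zero for *? and ??, one for +?, and m for {m,n}?.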