Regex Cheat Sheet for Developers

Uploaded by

Nanda

Introduction

A regular expression (regex for short) is a pattern that the regular expression engine tries to
match against input text. A pattern is made up of one or more character literals, operators, or
constructs. Regex is particularly good at searching for and manipulating text strings, as well as
processing text files. A single regex can often replace dozens of lines of programming
code. In this blog, you will come across basic to advanced regex concepts that will help
you in your development tasks.

Why Regex?

Regular expressions make search-and-replace operations easy. Finding a
substring that matches a pattern and replacing it with something else is a common use case.
Regex is a widely valued skill because it speeds up many everyday text-processing
tasks.

Basic Regex Commands

1. Characters

The character class is the most basic regex concept. It lets a short sequence of characters
match a larger set of characters.

Character Description
\ Escape character
. Any character (except a newline, by default)
\s Whitespace
\S Not whitespace
\d Digit
\D Not digit
\w Word character
\W Not word character
\b Word boundary
\B Not word boundary
^ Beginning of string
$ End of string

For example:

1. [xyz] matches x or y or z
2. [^pqr] matches any character _except_ p, q, or r (negation)
3. [a-zA-Z] matches a through z or A through Z, inclusive (range)
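
As a minimal sketch of the shorthands and bracket classes above, the following Python snippet uses the standard re module. The sample text and variable names are made up for illustration.

```python
import re

# Illustrative sample text (not from the original cheat sheet).
text = "Order 66: ship 3 boxes to Dock_7!"

# \d matches one digit at a time.
digits = re.findall(r"\d", text)

# \w+ matches runs of word characters (letters, digits, underscore).
words = re.findall(r"\w+", text)

# A negated class with ranges: anything that is neither a letter nor whitespace.
non_letters = re.findall(r"[^a-zA-Z\s]", text)

print(digits)
print(words)
print(non_letters)
```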

2. Groups

Use a group expression to operate on several items as a unit. For example, a group lets you
apply an operator to the whole group or find specified text before or after each member of the
group. Parentheses are the grouping operator, and "|" separates the
alternatives.

For example:

• My (red|pink|blue) dress
o Here "My red dress", "My blue dress", and "My pink dress" match the expression.
"My yellow dress" and "My dress" do not match.

Let’s look into some of these operators.

Group Description
[ ] Characters in brackets are matched.
[^ ] Characters not in brackets are matched.
| Either, or
( ) Capturing group

More examples:

1. x(yz): the parentheses create a capturing group with the value yz
2. p(?:qr)*: using ?: makes the group non-capturing
3. a(?<ok>bc): using ?<ok> gives the group the name ok
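
The group constructs above can be tried out with a short Python sketch. Note one assumption about syntax: Python's re module writes a named group as (?P<ok>...) rather than (?<ok>...). The sample strings are illustrative.

```python
import re

# Alternation inside a group: only the three listed colours are accepted.
dress = re.compile(r"My (red|pink|blue) dress")
samples = ["My red dress", "My blue dress", "My yellow dress", "My dress"]
matches = [s for s in samples if dress.fullmatch(s)]

# Capturing group: the text matched by (yz) is available as group 1.
captured = re.fullmatch(r"x(yz)", "xyz").group(1)

# Non-capturing group: (?:...) groups without creating a numbered group,
# so the compiled pattern reports zero capturing groups.
non_capturing_groups = re.compile(r"p(?:qr)*").groups

# Named group, using Python's (?P<name>...) spelling.
named = re.fullmatch(r"a(?P<ok>bc)", "abc").group("ok")

print(matches, captured, non_capturing_groups, named)
```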

3. Quantifiers

Quantifiers specify how many characters or expressions should be matched.

Quantifiers Description
* 0 or more (Kleene star)
+ 1 or more (Kleene plus)
? 0 or 1
{n} Exactly n repetitions
{min,max} Between min and max repetitions

The quantifiers * + and {} are greedy by default: they extend the match as far as
they can through the input text.

For example:

1. <[^<>]+> matches one or more characters other than < or > enclosed
between < and >
2. \w+?\d\d\w+ matches abcdef42ghijklmnfhaeij

More examples for quantifiers

x{3} Exactly 3 of x
x{3,} 3 or more of x
x{3,6} Between 3 and 6 of x
x* Greedy quantifier
x*? Lazy quantifier
x*+ Possessive quantifier
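
To make the difference between greedy and lazy quantifiers concrete, here is a small Python sketch using the standard re module. The HTML-like sample string is invented for illustration, and possessive quantifiers are left out because not every engine version supports them.

```python
import re

# Illustrative input with several tags on one line.
html = "<b>bold</b> and <i>italic</i>"

# Greedy: .+ runs to the last ">" it can find, so the whole line is one match.
greedy = re.findall(r"<.+>", html)

# Lazy: .+? stops at the first ">" that lets the match succeed.
lazy = re.findall(r"<.+?>", html)

# A negated character class gives the same tags without relying on laziness.
negated = re.findall(r"<[^<>]+>", html)

# Bounded repetition: x{3,6} accepts between 3 and 6 copies of x.
four_ok = re.fullmatch(r"x{3,6}", "xxxx") is not None
two_ok = re.fullmatch(r"x{3,6}", "xx") is not None

print(greedy, lazy, negated, four_ok, two_ok)
```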

4. Anchors

An anchor matches a position rather than a character: based on the current position in the string, it determines whether the match will succeed or fail.

Anchors Description
^ Beginning of the string
$ End of the string
\A Beginning of the string
\G Position where the previous match ended
\Z End of the string, or before a final \n at the end of the string
\z Absolute end of the string
\B Not a word boundary
\b Word boundary

For example:

1. ^\d{3} matches 444 in 444-888-999-..
2. -\d{3}\Z matches -220 in 110-220
3. \babc\b performs a "whole words only" search
4. \Bend\w*\b matches "ends" and "ender" in "end sends endure lender"
5. ^Hello me$ matches the string Hello me
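
The anchor examples can be checked with a brief Python sketch. It sticks to ^, $, \b and \B, because Python's re module treats \Z as the absolute end of the string and does not provide \G. The phone-like string is illustrative.

```python
import re

phone = "444-888-999"

# ^ anchors the match at the start of the string, $ at the end.
start = re.search(r"^\d{3}", phone).group()
end = re.search(r"-\d{3}$", phone).group()

sentence = "end sends endure lender"

# \B requires that "end" does NOT start at a word boundary.
inner = re.findall(r"\Bend\w*\b", sentence)

# \b on both sides gives a "whole words only" search.
whole = re.findall(r"\bend\b", sentence)

print(start, end, inner, whole)
```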

Now that you have a brief idea of characters, groups, quantifiers, and anchors, let's look
at some combined examples.

1. ^(\d*)[.,](\d+)$ matches numbers like 12,3 or 12.3
2. ^[a-zA-Z0-9 ]*$ matches any string of alphanumeric characters and spaces
3. ^[\s]*(.*?)[\s]*$ captures the text without leading or trailing whitespace
4. ^([a-zA-Z0-9._%-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,6})*$ could be used for matching email addresses (the trailing * also lets the empty string match)
5. (https?)://(www)?\.?(\w+)\.(\w+)/?(\w+)? can be used for matching simple URLs
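
A few of the combined patterns can be exercised with the following Python sketch. The decimal pattern is written without a space between the first group and the [.,] class, and the input strings are made up for illustration.

```python
import re

# Decimal number with "." or "," as the separator; both parts are captured.
decimal = re.compile(r"^(\d*)[.,](\d+)$")
comma_parts = decimal.match("12,3").groups()
dot_parts = decimal.match("12.3").groups()

# Alphanumeric characters and spaces only.
alnum = re.compile(r"^[a-zA-Z0-9 ]*$")
plain_ok = alnum.match("abc 123") is not None
dash_ok = alnum.match("abc-123") is not None

# Capture the text between leading and trailing whitespace.
trim = re.compile(r"^[\s]*(.*?)[\s]*$")
trimmed = trim.match("   hello world  ").group(1)

print(comma_parts, dot_parts, plain_ok, dash_ok, repr(trimmed))
```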

Advanced Regex Commands

1. Lookarounds

When the regex engine processes a lookaround expression, it takes the substring from
the current position to the beginning (lookbehind) or end (lookahead) of the original string, and
then runs Regex.IsMatch on that substring using the lookaround pattern. Whether the
lookaround succeeds depends on whether it is a positive or a negative assertion. Lookarounds
are zero-width: they check the surrounding text without consuming it.

Lookaround Description
(?=check) Positive Lookahead
(?!check) Negative Lookahead
(?<=check) Positive Lookbehind
(?<!check) Negative Lookbehind

Examples:

1. (?=\d{10})\d{4} matches 2348 in 2348856787
2. (?<=\d)rat matches rat in 2rat
3. (?!theatre)the\w+ matches theme
4. \w{3}(?<!mon)ster matches munster
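
The four lookaround forms can be sketched in Python as follows. The inputs mirror the examples in this section, with "2rat" and "monster" chosen here as illustrative test strings.

```python
import re

# Positive lookahead: require 10 digits ahead, but consume only 4 of them.
ahead = re.search(r"(?=\d{10})\d{4}", "2348856787").group()

# Positive lookbehind: "rat" must be preceded by a digit (not part of the match).
behind = re.search(r"(?<=\d)rat", "2rat").group()

# Negative lookahead: words starting with "the", except "theatre".
not_theatre = re.findall(r"(?!theatre)the\w+", "theme theatre")

# Negative lookbehind: three word characters, not "mon", followed by "ster".
munster = re.search(r"\w{3}(?<!mon)ster", "munster")
monster = re.search(r"\w{3}(?<!mon)ster", "monster")

print(ahead, behind, not_theatre, munster is not None, monster is not None)
```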


2. POSIX commands

A POSIX character class is a short name that stands for a larger set of
characters. POSIX character classes can be used only within bracket expressions. The
table below lists some of the character classes defined by the POSIX standard.

POSIX Description
[:alpha:] PCRE (C, PHP, R…): ASCII letters A-Z and a-z
[:alpha:] Ruby 2: Unicode letter or ideogram
[:alnum:] PCRE (C, PHP, R…): ASCII digits and letters A-Z and a-z
[:alnum:] Ruby 2: Unicode digit, letter or ideogram
[:punct:] PCRE (C, PHP, R…): ASCII punctuation mark
[:punct:] Ruby: Unicode punctuation mark

Examples:

1. [8[:alpha:]]+ sample match could be WellDone88
2. [[:alpha:]\d]+ sample match could be кошка99
3. [[:alnum:]]{10} sample match could be ABC1275851
4. [[:alnum:]]{10} sample match could be кошка67810
5. [[:punct:]]+ sample match could be ?!.,:;
6. [[:punct:]]+ sample match could be ‽,:〽⁆

3. Unicode property escapes

The following are the forms of Unicode property escapes:

1. \p{prop=value}: matches all characters whose property prop has the value value.
2. \P{prop=value}: matches all characters whose property prop does not have the value value.
3. \p{bin_prop}: matches all characters whose binary property bin_prop is True.
4. \P{bin_prop}: matches all characters whose binary property bin_prop is False.

4. Flags

A flag is a parameter that can be added to a regex to change its default searching
behaviour. Each flag is represented by a single lowercase letter. The table below lists six
JavaScript regex flags, each serving a different purpose.

Flag Description
i Makes the expression search case-insensitively.
g Makes the expression search for all occurrences.
s Makes the wildcard character . match newlines as well.
m Makes the boundary characters ^ and $ match the beginning and end of each individual line instead of the beginning and end of the entire string.
y Starts the expression's search from the index specified by the lastIndex property.
u Treats the pattern as a sequence of code points rather than code units, so characters encoded as two 16-bit code units are matched as single characters.

When an expression is written between forward slashes /, the flags come after the second
slash. In general notation: /pattern/flags

For example, if the flag i was added to the regex /a/, the result would be /a/i.

To give a regex multiple flags, write them one after another (without any spaces or other
delimiters).

For example, to give the flags i and g to the regex /a/, we'd write /a/ig (or equivalently /a/gi,
as the order doesn't matter).

Note: The sequence in which flags appear is irrelevant; flags simply change the behaviour of
searching, thus placing one before the other makes no difference.
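
The flags above are JavaScript's. As a rough sketch of the same ideas in Python, the re module offers re.IGNORECASE, re.MULTILINE and re.DOTALL, which correspond approximately to i, m and s. There is no g flag in Python, and re.findall plays that role here. The sample text is illustrative.

```python
import re

text = "Apple\napple\nAPPLE pie"

# Like the i flag: ignore case; findall returns all occurrences (like g).
any_case = re.findall(r"apple", text, flags=re.IGNORECASE)

# Without MULTILINE, ^ only matches at the very start of the string.
start_only = re.findall(r"^apple", text)

# Like the m flag: ^ also matches at the start of each line.
each_line = re.findall(r"^apple", text, flags=re.MULTILINE)

# Flags combine with |, and their order does not matter.
combined = re.findall(r"^apple", text, flags=re.MULTILINE | re.IGNORECASE)

# Like the s flag: DOTALL lets . match a newline as well.
dot_default = re.search(r"Apple.apple", text)
dot_all = re.search(r"Apple.apple", text, flags=re.DOTALL)

print(any_case, start_only, each_line, combined)
```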

5. Recurse

The following engines support recursion: PCRE (C, PHP, R...), Perl, Ruby 2+, Python via the
alternate regex package, and JGSoft (not available in a programming language).

The most common application of recursion is to match balanced constructions.

Command Description
(?R) Recurse entire pattern
(?1) Recurse the first subpattern
(?+1) Recurse first relative subpattern

(?&name) Recurse subpattern name
(?P=name) Match subpattern name
(?P>name) Recurse subpattern name

a(?R)?z, a(?0)?z, and a\g<0>?z are all regexes that match one or more letters a followed by the
same number of letters z.

Examples:

• (\((?R)?\)) matches parentheses like ((()))
• (\((?R)*\)) matches parentheses like (()()())
• \w{3}\d{4}(?R)? matches patterns like ccc8888ggg9999

Possible Performance pitfalls

1. Regular Expression Performance Pitfalls

Because two "equivalent" regexes can differ substantially in processing performance,
you should understand how your regex engine works.

1. It is possible to construct regexes that take exponential time to match, but you essentially
have to TRY to do so.
2. Regexes that run in quadratic time are more commonly created by accident.
3. Other common problems:
• Recompilation (from forgetting to compile regexes that are used multiple times)
• The middle dot-star, i.e. a .* in the middle of a pattern (which causes backtracking). Two ways to avoid it:
o First option: use a negated character class.
o Second option: use reluctant (lazy) quantifiers.
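
The two dot-star remedies, together with compiling a reused pattern once, can be sketched in Python like this. The pattern names and the key="value" sample line are hypothetical, chosen only to show how the greedy form overshoots.

```python
import re

# Compile once at module level and reuse, instead of rebuilding on every call.
QUOTED_GREEDY = re.compile(r'"(.*)"')       # dot-star between delimiters, for comparison
QUOTED_NEGATED = re.compile(r'"([^"]*)"')   # first option: negated character class
QUOTED_LAZY = re.compile(r'"(.*?)"')        # second option: reluctant (lazy) quantifier

line = 'key="one" other="two"'

# The greedy version runs to the last quote and then has to backtrack.
greedy = QUOTED_GREEDY.findall(line)

# Both alternatives stop at the first closing quote.
negated = QUOTED_NEGATED.findall(line)
lazy = QUOTED_LAZY.findall(line)

print(greedy, negated, lazy)
```

The negated class cannot run past a closing quote at all, whereas the lazy form still advances one character at a time and re-checks after each step.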

Tips to increase the performance

1. Improving the performance of regular expressions

• When you require parentheses but not capture, use non-capturing groups.

• If the regex is very complex, do a quick check before attempting a match, e.g.
o Does an email address contain '@'?
• Present the most likely option(s) first, e.g.
o light green|dark green|brown|yellow|green|pink leaf
• Minimize the amount of looping (the gain depends on the engine), e.g.
o \d\d\d\d\d\d can be faster than \d{6}
o aaaaaa+ can be faster than a{6,}
• Avoid obvious backtracking, e.g.
o Mr|Ms|Mrs should be M(?:rs?|s)
o Good night|Good morning should be Good (?:night|morning)
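
Two of these tips are easy to illustrate in Python: factoring a common prefix out of an alternation, and doing a cheap check before running a regex. The helper name looks_like_email and the sample inputs are hypothetical, and the email pattern is the simplified one from the combined examples, not a full validator.

```python
import re

# Factored alternation: the shared "M" is matched once, using a non-capturing group.
titles_plain = re.compile(r"Mr|Ms|Mrs")
titles_factored = re.compile(r"M(?:rs?|s)")

samples = ["Mr", "Ms", "Mrs", "Miss", "M"]
plain_results = [titles_plain.fullmatch(s) is not None for s in samples]
factored_results = [titles_factored.fullmatch(s) is not None for s in samples]

# Quick check before the match: skip the regex when there is no '@' at all.
EMAIL = re.compile(r"[a-zA-Z0-9._%-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,6}")

def looks_like_email(candidate: str) -> bool:
    """Hypothetical helper: cheap substring test first, regex second."""
    if "@" not in candidate:
        return False
    return EMAIL.fullmatch(candidate) is not None

print(plain_results, factored_results)
print(looks_like_email("nanda@example.com"), looks_like_email("no-at-sign"))
```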

Conclusion

In this blog, you came across some interesting regex concepts that are
helpful for matching different kinds of strings. You also learned about some of the
pitfalls of using regex that can affect the performance of your
application, along with some common tips that can help you avoid them.
