Regular Expression

Regular expressions are patterns used to match character combinations in strings. They originated in the 1950s and became popular with Unix text-processing utilities. Some key advantages of regular expressions include being more concise than equivalent code and being easier for non-programmers to use than procedural code. They allow operations like matching alternatives with |, grouping with (), and quantification of elements with ?, *, +, {n}, etc. Regular expressions are now widely supported in programming languages and text editors.

Uploaded by

Fawzi Gharib

Ministry of Higher Education & Scientific Research
Salahaddin University - College of Science
Computer Department - 2nd Stage

Regular Expression
Prepared By : Bilal Kamaran Saed
Contents
Introduction – What is Regular Expression?
History
Advantages of Regular Expression
Basic Concept
Writing a regular expression pattern
    Using simple patterns
    Using special characters
Escaping
Introduction – What is Regular Expression?
A regular expression (shortened as regex or regexp, and also referred to as a
rational expression) is a sequence of characters that specifies a search pattern.
Usually such patterns are used by string-searching algorithms for "find" or "find
and replace" operations on strings, or for input validation. It is a technique
developed in theoretical computer science and formal language theory.
The concept of regular expressions began in the 1950s, when the American
mathematician Stephen Cole Kleene formalized the description of a regular
language. They came into common use with Unix text-processing utilities.
Different syntaxes for writing regular expressions have existed since the 1980s,
one being the POSIX standard and another, widely used, being the Perl syntax.
Regular expressions are used in search engines, search and replace dialogs of word
processors and text editors, in text processing utilities such as sed and AWK and
in lexical analysis. Many programming languages provide regex capabilities, either
built in or via libraries, because regexes are useful in many situations.
History
Regular expressions originated in 1951, when mathematician Stephen Cole
Kleene described regular languages using his mathematical notation
called regular events. These arose in theoretical computer science, in the
subfields of automata theory (models of computation) and the description and
classification of formal languages. Other early implementations of pattern
matching include the SNOBOL language, which did not use regular expressions,
but instead its own pattern matching constructs.
Regular expressions came into popular use from 1968 in two areas: pattern matching
in a text editor and lexical analysis in a compiler. Among the first appearances of
regular expressions in program form was when Ken Thompson built Kleene's
notation into the editor QED as a means to match patterns in text files. For speed,
Thompson implemented regular expression matching by just-in-time
compilation (JIT) to IBM 7094 code on the Compatible Time-Sharing System, an
important early example of JIT compilation. He later added this capability to the
Unix editor ed, which eventually led to the popular search tool grep's use of
regular expressions ("grep" is a word derived from the command for regular
expression searching in the ed editor: g/re/p meaning "Global search for Regular
Expression and Print matching lines"). Around the same time that Thompson
developed QED, a group of researchers including Douglas T. Ross implemented a
tool based on regular expressions that was used for lexical analysis
in compiler design.
Many variations of these original forms of regular expressions were used
in Unix programs at Bell Labs in the 1970s, including vi, lex, sed, AWK, and expr,
and in other programs such as Emacs. Regexes were subsequently adopted by a
wide range of programs, with these early forms standardized in
the POSIX.2 standard in 1992.
In the 1980s, more complicated regexes arose in Perl, which originally derived
from a regex library written by Henry Spencer (1986), who later wrote an
implementation of Advanced Regular Expressions for Tcl. The Tcl library is a
hybrid NFA/DFA implementation with improved performance characteristics.
Software projects that have adopted Spencer's Tcl regular expression
implementation include PostgreSQL. Perl later expanded on Spencer's original
library to add many new features. Part of the effort in the design
of Raku (formerly named Perl 6) is to improve Perl's regex integration, and to
increase their scope and capabilities to allow the definition of parsing expression
grammars. The result is a mini-language called Raku rules, which are used to
define Raku grammar as well as provide a tool to programmers in the language.
These rules maintain existing features of Perl 5.x regexes, but also allow BNF-style
definition of a recursive descent parser via sub-rules.
The use of regexes in structured information standards for document and
database modeling started in the 1960s and expanded in the 1980s when industry
standards like ISO SGML (preceded by ANSI "GCA 101-1983") consolidated. The
kernel of the structure specification language standards consists of regexes. Their
use is evident in the DTD element group syntax.
Starting in 1997, Philip Hazel developed PCRE (Perl Compatible Regular
Expressions), which attempts to closely mimic Perl's regex functionality and is
used by many modern tools including PHP and Apache HTTP Server.
Today, regexes are widely supported in programming languages, text processing
programs (particularly lexers), advanced text editors, and some other programs.
Regex support is part of the standard library of many programming languages,
including Java and Python, and is built into the syntax of others, including Perl
and ECMAScript. An implementation of regex functionality is often called a regex
engine, and a number of libraries are available for reuse. In the late 2010s, several
companies started to offer hardware, FPGA, and GPU implementations
of PCRE-compatible regex engines that are faster than
CPU implementations.

Advantages of Regular Expression

• More concise than equivalent procedural code
• One line of regex can replace 100 lines of procedural code
• Easier to cut and paste than code
• Easy to create by trial and error
• Easier for non-programmers than code
• Less error prone than code
Basic Concept
A regular expression, often called a pattern, specifies a set of strings required for a
particular purpose. A simple way to specify a finite set of strings is to list
its elements or members. However, there are often more concise ways: for
example, the set containing the three strings "Handel", "Händel", and "Haendel"
can be specified by the pattern H(ä|ae?)ndel; we say that this
pattern matches each of the three strings. In most formalisms, if there exists at
least one regular expression that matches a particular set then there exists an
infinite number of other regular expressions that also match it—the specification
is not unique. Most formalisms provide the following operations to construct
regular expressions.

Boolean "or"
A vertical bar separates alternatives. For example, gray|grey can match
"gray" or "grey".
Grouping
Parentheses are used to define the scope and precedence of
the operators (among other uses). For example, gray|grey and gr(a|e)y
are equivalent patterns which both describe the set of "gray" or "grey".
Quantification
A quantifier after a token (such as a character) or group specifies how often
the preceding element is allowed to occur. The most common quantifiers
are the question mark ? , the asterisk * (derived from the Kleene star), and
the plus sign + (Kleene plus).
?           The question mark indicates zero or one occurrences of the
            preceding element. For example, colou?r matches both "color"
            and "colour".
*           The asterisk indicates zero or more occurrences of the
            preceding element. For example, ab*c matches "ac", "abc",
            "abbc", "abbbc", and so on.
+           The plus sign indicates one or more occurrences of the
            preceding element. For example, ab+c matches "abc", "abbc",
            "abbbc", and so on, but not "ac".
{n}         The preceding item is matched exactly n times.
{min,}      The preceding item is matched min or more times.
{,max}      The preceding item is matched up to max times.
{min,max}   The preceding item is matched at least min times, but not more
            than max times.
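The operations above can be tried out directly. The following is a minimal sketch in JavaScript; the helper name matchesWhole is hypothetical (not part of any library) and anchors a pattern so that it must match the entire string. JavaScript does not treat {,max} as a quantifier, so the sketch writes that case as {0,max}.

```javascript
// Hypothetical helper: true only if the pattern matches the whole string.
function matchesWhole(pattern, text) {
  return new RegExp(`^(?:${pattern})$`).test(text);
}

// Boolean "or" and grouping: both patterns describe the same set of words.
console.log(matchesWhole("gray|grey", "grey")); // true
console.log(matchesWhole("gr(a|e)y", "gray"));  // true
console.log(matchesWhole("gr(a|e)y", "griy"));  // false

// One pattern covering the three spellings from the text.
const spellings = ["Handel", "Händel", "Haendel"];
console.log(spellings.every((s) => matchesWhole("H(ä|ae?)ndel", s))); // true

// Quantifiers.
console.log(matchesWhole("colou?r", "color")); // true  (? = zero or one)
console.log(matchesWhole("ab*c", "ac"));       // true  (* = zero or more)
console.log(matchesWhole("ab+c", "ac"));       // false (+ = one or more)
console.log(matchesWhole("a{2}", "aaa"));      // false (exactly two)
console.log(matchesWhole("a{2,}", "aaaa"));    // true  (two or more)
console.log(matchesWhole("a{0,3}", "aaaa"));   // false (at most three)
console.log(matchesWhole("a{1,3}", "aa"));     // true  (between one and three)
```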
Writing a regular expression pattern
A regular expression pattern is composed of simple characters, such as /abc/, or a
combination of simple and special characters, such as /ab*c/ or
/Chapter (\d+)\.\d*/. The last example includes parentheses, which are used as a
memory device: the match made with this part of the pattern is remembered for
later use.
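As a small illustration of parentheses acting as a memory device, the sketch below applies the pattern to a sample sentence and reads back the remembered part. The sample sentence and variable names are assumptions chosen for the example.

```javascript
// The parentheses around \d+ remember the chapter number.
const chapterPattern = /Chapter (\d+)\.\d*/;
const chapterMatch = "Open Chapter 4.3, paragraph 6".match(chapterPattern);

console.log(chapterMatch[0]); // "Chapter 4.3" (the whole match)
console.log(chapterMatch[1]); // "4"           (the remembered group)
```

Index 0 of the result holds the full match, and each pair of parentheses adds one more entry after it, in the order the groups open.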

Using simple patterns

Simple patterns are constructed of characters for which you want to find a direct
match. For example, the pattern /abc/ matches character combinations in strings
only when the exact sequence "abc" occurs (all characters together and in that
order). Such a match would succeed in the strings "Hi, do you know your
abc's?" and "The latest airplane designs evolved from slabcraft.". In both cases the
match is with the substring "abc". There is no match in the string "Grab
crab" because while it contains the substring "ab c", it does not contain the exact
substring "abc".
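The three strings discussed above can be checked with RegExp.prototype.test(), which returns true when the pattern occurs anywhere in the string. This is only a sketch; the variable names are illustrative.

```javascript
const simplePattern = /abc/;
const sampleStrings = [
  "Hi, do you know your abc's?",
  "The latest airplane designs evolved from slabcraft.",
  "Grab crab",
];

// test() succeeds only where the exact sequence "abc" occurs.
const simpleResults = sampleStrings.map((s) => simplePattern.test(s));
console.log(simpleResults); // true, true, false
```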

Using special characters

When the search for a match requires something more than a direct match, such
as finding one or more b's, or finding white space, you can include special
characters in the pattern. For example, to match a single "a" followed by zero or
more "b"s followed by "c", you'd use the pattern /ab*c/: the * after "b" means "0
or more occurrences of the preceding item." In the string "cbbabbbbcdebc", this
pattern will match the substring "abbbbc".
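The match described above can be reproduced with String.prototype.match(), which, for a pattern without the "g" flag, returns the first match together with its position. A minimal sketch:

```javascript
const starPattern = /ab*c/;
const starMatch = "cbbabbbbcdebc".match(starPattern);

console.log(starMatch[0]);    // "abbbbc"
console.log(starMatch.index); // 3 (position of the "a" in the string)
```

The leading "cbb" is skipped because a match has to start with "a".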

The following list gives the categories of special characters, with a short
description of each.

• Assertions
    o Assertions include boundaries, which indicate the beginnings and
      endings of lines and words, and other patterns indicating in some
      way that a match is possible (including look-ahead, look-behind,
      and conditional expressions).
• Character classes
    o Distinguish different types of characters. For example,
      distinguishing between letters and digits.
• Groups and ranges
    o Indicate groups and ranges of expression characters.
• Quantifiers
    o Indicate numbers of characters or expressions to match.
• Unicode property escapes
    o Distinguish based on Unicode character properties, for example,
      upper- and lower-case letters, math symbols, and punctuation.

If you want to look at all the special characters that can be used in regular
expressions in a single table, see the following:

Special characters in regular expressions

Characters / constructs                                   Category
\, ., \cX, \d, \D, \f, \n, \r, \s, \S, \t, \v, \w, \W,    Character classes
\0, \xhh, \uhhhh, \u{hhhh}, [\b]
^, $, x(?=y), x(?!y), (?<=y)x, (?<!y)x, \b, \B            Assertions
(x), (?:x), (?<Name>x), x|y, [xyz], [^xyz], \Number       Groups and ranges
*, +, ?, x{n}, x{n,}, x{n,m}                              Quantifiers
\p{UnicodeProperty}, \P{UnicodeProperty}                  Unicode property escapes
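To make the categories above more concrete, here is a minimal JavaScript sketch with one or two constructs from each. The sample strings and variable names are assumptions made for illustration, and the Unicode property escape requires the "u" flag.

```javascript
// Assertions: \b marks a word boundary, x(?=y) is a look-ahead.
const wholeWord = /\bcat\b/;
console.log(wholeWord.test("a cat sat"));   // true
console.log(wholeWord.test("concatenate")); // false
const amount = "costs 25 dollars".match(/\d+(?= dollars)/);
console.log(amount[0]); // "25"

// Character classes: \w word character, \s white space, \d digit.
const roomPattern = /^\w+\s\d\d$/;
console.log(roomPattern.test("room 42")); // true

// Groups and ranges: named groups and the range [0-9].
const dateMatch = "Released on 2024-05-17".match(
  /(?<year>[0-9]{4})-(?<month>[0-9]{2})/
);
console.log(dateMatch.groups.year);  // "2024"
console.log(dateMatch.groups.month); // "05"

// Unicode property escapes: \p{Lu} is an upper-case letter.
const upperCase = /\p{Lu}/u;
console.log(upperCase.test("Ä")); // true
console.log(upperCase.test("ä")); // false
```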
Escaping
If you need to use any of the special characters literally (actually searching for
a "*", for instance), you must escape it by putting a backslash in front of it. For
instance, to search for "a" followed by "*" followed by "b", you'd use /a\*b/ — the
backslash "escapes" the "*", making it literal instead of special.

Similarly, if you're writing a regular expression literal and need to match a slash
("/"), you need to escape that (otherwise, it terminates the pattern). For instance,
to search for the string "/example/" followed by one or more alphabetic
characters, you'd use /\/example\/[a-z]+/i—the backslashes before each slash
make them literal.

To match a literal backslash, you need to escape the backslash. For instance, to
match the string "C:\" where "C" can be any letter, you'd use /[A-Z]:\\/ — the first
backslash escapes the one after it, so the expression searches for a single literal
backslash.
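The three escaping cases above can be checked as follows. This is a small sketch; note that the test strings themselves are JavaScript string literals, so a single backslash in the text is written as "\\".

```javascript
// Literal asterisk: a, then "*", then b.
const literalStar = /a\*b/;
console.log(literalStar.test("a*b")); // true
console.log(literalStar.test("aab")); // false

// Literal slashes inside a regular expression literal, case-insensitive.
const pathPattern = /\/example\/[a-z]+/i;
console.log(pathPattern.test("/Example/Docs")); // true
console.log(pathPattern.test("/example/123"));  // false

// Literal backslash: a drive letter, a colon, then one backslash.
const drivePattern = /[A-Z]:\\/;
console.log(drivePattern.test("C:\\")); // true  (the string is C:\)
console.log(drivePattern.test("C:/"));  // false
```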

If using the RegExp constructor with a string literal, remember that the backslash
is an escape in string literals, so to use it in the regular expression, you need to
escape it at the string literal level. /a\*b/ and new RegExp("a\\*b") create the
same expression, which searches for "a" followed by a literal "*" followed by "b".
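The equivalence can be verified by comparing the source property of the two objects. The sketch below also shows what happens when the backslash is not doubled; the variable names are illustrative only.

```javascript
const fromLiteral = /a\*b/;
const fromString = new RegExp("a\\*b");

// Both objects hold the same pattern text: a\*b
console.log(fromLiteral.source === fromString.source); // true
console.log(fromString.test("a*b")); // true
console.log(fromString.test("aab")); // false

// With a single backslash the string literal drops it, so the pattern
// becomes a*b: zero or more "a" followed by "b".
const notEscaped = new RegExp("a\*b");
console.log(notEscaped.source);      // "a*b"
console.log(notEscaped.test("aab")); // true
```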

If the escapes are not already part of your pattern, you can add them
using String.prototype.replace():

function escapeRegExp(string) {
  // $& in the replacement means the whole matched string
  return string.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

The "g" after the regular expression is an option or flag that performs a global
search, looking in the whole string and returning all matches.
