Network Security - 4.2 Reg Ex Primer


A Primer on Regex. What is Regex? Regex is a language for expressing patterns. The name “Regex”
is from a combination of the words “Regular Expressions.” Regex allows you to express patterns with a
minimum of typing. These expressions can look very cryptic and scary, but are not particularly difficult
to understand if you start with some basics. This primer is geared towards individuals with little or no
programming experience and aims to break down the language of Regex into its fundamental parts so
that anyone (hypothetically) could understand it.

Why Regex? Regex is useful in situations where you need to detect complicated patterns in text,
especially when those patterns can appear multiple times. Regular expressions are portable because
they can be used in just about any of the modern programming languages (like JavaScript, C#, Java,
Ruby, Python, PHP, etc.). There are regular expressions for finding email addresses, URLs, HTML tags
or just about any other kind of a pattern that can be expressed in text.

How can we use Regex? Usually, we ask Regex to tell us if it finds a certain pattern in text, and if so,
where. Regex can also tell us where the next place is that it sees the pattern. So if we take the
statement, “there is a frog on a log” and ask Regex where the word “frog” is located, then it will tell us
“position 11.” This is the numbered position in the sentence where the word frog begins:

0  1  2  3  4  5  6  7  8  9  10 11 12 13 14 15 16 17 18 19 20 21 22 23
t  h  e  r  e     i  s     a     f  r  o  g     o  n     a     l  o  g
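
As a quick check of the position above, here is a minimal sketch in Python. Python and its built-in re module are our choice for illustration only; the primer does not depend on any one language. It asks for the first place the literal pattern “frog” appears.

```python
import re

sentence = "there is a frog on a log"

# Search for the literal pattern "frog" and note where the match begins.
match = re.search(r"frog", sentence)
frog_position = match.start()  # positions are counted from 0

print(frog_position)  # 11
```

Note that counting starts at 0, which is why the “f” in “frog” sits at position 11 rather than 12.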

The expression in this case is literally “frog” but we can ask Regex to do more complicated things for
us. For example, we can ask Regex to find any word that ends in “og” and tell us where it is. In order to
say this to Regex, we break it down as follows: we want a word that has any amount of letters (we don't
really care what they are) followed by “og”. In Regex, the term “\w” signifies a “word character” (meaning
any letter, digit, or underscore). The “\” tells Regex “this is a special character” and the “w” says “this is a
word character (A-Z, a-z, 0-9, or _).” We can follow the “\w” with an “*” meaning “zero to many” or a “+” meaning
“one to many.” So to make an expression that says “one to many letters followed by 'og'” we construct
the following regular expression: \w+og

Now, when we ask Regex to tell us all of the places where it sees the pattern “\w+og”, it tells us
“positions 11 and 21.”

0  1  2  3  4  5  6  7  8  9  10 11 12 13 14 15 16 17 18 19 20 21 22 23
t  h  e  r  e     i  s     a     f  r  o  g     o  n     a     l  o  g
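
The same question can be put to a regex engine directly. The sketch below again assumes Python's built-in re module (our choice, not something the primer prescribes) and lists every match of “\w+og” along with where it starts.

```python
import re

sentence = "there is a frog on a log"

# finditer yields every non-overlapping match, scanning left to right.
og_matches = [(m.group(), m.start()) for m in re.finditer(r"\w+og", sentence)]

print(og_matches)  # [('frog', 11), ('log', 21)]
```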

This is the tip of the iceberg when it comes to the power of regular expressions to perform pattern
matching on text. Here, we tell Regex that we want one to many word characters. What if we want
exactly one? Just omit the “+” so that we have the following expression: \wog

Or, we might want exactly two. We could do this: \w\wog
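
To see what these two patterns actually find, here is a small Python sketch (Python's re module is assumed for illustration; find_all is a helper name we made up). One thing worth noticing: a pattern matches anywhere in the text, not only at the start of a word, so “\wog” also finds the “rog” inside “frog”.

```python
import re

sentence = "there is a frog on a log"

def find_all(pattern, text):
    """Return (matched text, start position) for every match of pattern."""
    return [(m.group(), m.start()) for m in re.finditer(pattern, text)]

one_before = find_all(r"\wog", sentence)    # exactly one word character, then "og"
two_before = find_all(r"\w\wog", sentence)  # exactly two word characters, then "og"

print(one_before)  # [('rog', 12), ('log', 21)]
print(two_before)  # [('frog', 11)]
```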

But what if we wanted exactly thirteen characters followed by “og”? This situation probably does not
come along often, but you never know. We could write “\w\w\w\w\w\w\w\w\w\w\w\w\wog” but that is
getting a bit ridiculous. Apparently, the creators of regular expressions thought so too, so they created a
special notation that you can use to tell Regex exactly how many times you want it to do something.
Let's say we want to find all words in some text starting with an “s” and ending with an “n” with
exactly 7 letters in between. Here is how we write it: s\w{7}n
This says “s” followed by exactly 7 word characters followed by “n.” Pretty cool, huh? But, what if we
wanted, say, anywhere between 3 and 7 characters between? We just need to modify our last expression
a bit to accommodate this: s\w{3,7}n

When we wrote our expression “s\w{7}n” we were actually telling Regex “s\w{7,7}n” but we can just
put “{7}” for short.
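
Here is a short Python sketch of the curly-brace notation, assuming the built-in re module. The sample words are our own, picked only to land inside and outside the ranges. We use fullmatch so that the whole word has to fit the pattern.

```python
import re

# Hypothetical sample words chosen for this illustration.
words = ["sun", "spoon", "suspicion", "satisfaction"]

# fullmatch succeeds only if the entire word fits the pattern.
exactly_seven = [w for w in words if re.fullmatch(r"s\w{7}n", w)]
three_to_seven = [w for w in words if re.fullmatch(r"s\w{3,7}n", w)]
seven_comma_seven = [w for w in words if re.fullmatch(r"s\w{7,7}n", w)]

print(exactly_seven)      # ['suspicion']
print(three_to_seven)     # ['spoon', 'suspicion']
print(seven_comma_seven)  # ['suspicion']
```

The first and last lists agree, which is the “{7}” is short for “{7,7}” point in action.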

Now on to escaped characters. What are escape characters? And why were they escaping in the first
place? Worry not, because we actually want some of our characters to escape. In regular expressions,
we use the backslash, \, to “escape” certain characters from expressions. This means that when a “w” or
an “s” follows a backslash, it carries special meaning of some kind for Regex. So far we have only
used the “\w” in our expressions to detect word characters. Here is a larger list of commonly used
escaped characters:

Character   Meaning
\w          Word character
\s          Whitespace (space, tab, newline, etc)
\S          Non-whitespace (everything but)
\W          Non-word characters
.           ANY character (in most engines, any character except a newline)
\d          Numeric digits (0-9 and other international numeric digits)
\D          Non-numeric characters
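
A few of the entries above can be tried out with a small Python sketch. Python's re module and the sample text are our assumptions for illustration; other regex engines behave much the same for these basic cases.

```python
import re

# Hypothetical sample text for this illustration.
text = "Room 42, Floor 7"

digits = re.findall(r"\d", text)     # each numeric digit on its own
numbers = re.findall(r"\d+", text)   # runs of one to many digits
words = re.findall(r"\w+", text)     # runs of one to many word characters
spaces = re.findall(r"\s", text)     # each whitespace character
non_words = re.findall(r"\W", text)  # each character that is not a word character

print(digits)     # ['4', '2', '7']
print(numbers)    # ['42', '7']
print(words)      # ['Room', '42', 'Floor', '7']
print(spaces)     # [' ', ' ', ' ']
print(non_words)  # [' ', ',', ' ', ' ']
```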

So now, with our newly acquired expert knowledge of Regex escaped characters, we can track down
things like phone numbers and email addresses. There is no escape! Well, you might think that anyway.
Before we move on, we have to point out something here: the “.” character. Alone, in an expression, it
has special meaning: any character. If we were to say “s..p” then it would match with ship, soap, slip,
etc. But, if we say “s\.\.p” then literally “s..p” is what will be matched. Why would this be backwards
from everything else? We have no idea. But it is worth knowing when you make your expressions. A
“.” means any character, and a “\.” means “.” literally.

Now, moving on. Let's take a good example of something that we might search for in text: a phone
number. Phone numbers can take a variety of forms:

(555)555-5555
(555) 555-5555
555-555-5555
1-555-555-5555
555.555.5555
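
The difference between “.” and “\.” matters for the dotted form in the list above. The Python sketch below (re module assumed; the string “555x555x5555” is made up purely for contrast) first repeats the “s..p” check, then shows how an unescaped dot accepts any separator while an escaped dot accepts only a literal “.”.

```python
import re

words = ["ship", "soap", "slip", "s..p"]

any_two_chars = [w for w in words if re.fullmatch(r"s..p", w)]     # "." is any character
literal_dots = [w for w in words if re.fullmatch(r"s\.\.p", w)]    # "\." is a real dot

print(any_two_chars)  # ['ship', 'soap', 'slip', 's..p']
print(literal_dots)   # ['s..p']

# The last candidate is a made-up string, included only for contrast.
candidates = ["555.555.5555", "555-555-5555", "555x555x5555"]

loose = [c for c in candidates if re.fullmatch(r"\d{3}.\d{3}.\d{4}", c)]
strict = [c for c in candidates if re.fullmatch(r"\d{3}\.\d{3}\.\d{4}", c)]

print(loose)   # ['555.555.5555', '555-555-5555', '555x555x5555']
print(strict)  # ['555.555.5555']
```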

Let's try to match the “555-555-5555” example. There are 3 digits, followed by a dash, followed by
three more digits, followed by a dash, followed by four more digits. So, three digits, \d{3}, followed by
a dash, \d{3}-, followed by three more digits, \d{3}-\d{3}, and another dash, \d{3}-\d{3}-, followed by
four more digits, \d{3}-\d{3}-\d{4}, and there you have it: \d{3}-\d{3}-\d{4}

This pattern will match phone numbers with the pattern “XXX-XXX-XXXX”. But if you look closely,
we are actually repeating information again (like in the “\w\w\w\w...” example). Right at the beginning
of the pattern, you see “\d{3}-” twice. Is there a way to compact this any more? There is, with logical
grouping. In Regex you can logically group different parts of a pattern together with parentheses, and
then treat the group as an individual element. So we could rewrite the expression, \d{3}-\d{3}-, as the
expression, (\d{3}-){2}, and that would make our final expression as follows: (\d{3}-){2}\d{4}
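
To check that the long form and the grouped form behave the same way, here is a small Python sketch (re module assumed; the two non-matching samples are our own).

```python
import re

long_form = r"\d{3}-\d{3}-\d{4}"
grouped_form = r"(\d{3}-){2}\d{4}"  # the group "\d{3}-" repeated exactly twice

# The first sample comes from the list of forms; the other two are made up.
samples = ["555-555-5555", "555-5555", "5555-555-555"]

long_results = [bool(re.fullmatch(long_form, s)) for s in samples]
grouped_results = [bool(re.fullmatch(grouped_form, s)) for s in samples]

print(long_results)     # [True, False, False]
print(grouped_results)  # [True, False, False]
```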

So parentheses, curly braces (the “{” and “}” characters) and backslashes have special meaning in these
patterns: curly braces indicate multiplicity, parentheses logically group things, and backslashes indicate
that the next character after it should be treated specially. This is all fine and dandy, until we want to
find curly braces, parentheses, or backslashes in text. Imagine a situation where we wanted to find, say,
a phone number like “(555) 555-5555.” If you tried to use the pattern, (\d{3})\s\d{3}-\d{4}, where the
first three digits are surrounded by parentheses, then this would actually match “XXX XXX-XXXX”
because Regex thinks that you are trying to logically group the three digits. In order to disillusion our
faithful search algorithm, we need to escape our parentheses. Run away! You escape parentheses just
like anything else: with a backslash. So in order to get this expression right, we have to write it like
this: \(\d{3}\)\s\d{3}-\d{4}
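
The effect of escaping the parentheses, writing them as “\(” and “\)”, can be seen in this Python sketch (re module assumed for illustration).

```python
import re

unescaped = r"(\d{3})\s\d{3}-\d{4}"   # parentheses only group; they match nothing themselves
escaped = r"\(\d{3}\)\s\d{3}-\d{4}"   # \( and \) match literal parentheses in the text

unescaped_plain = bool(re.fullmatch(unescaped, "555 555-5555"))
unescaped_parens = bool(re.fullmatch(unescaped, "(555) 555-5555"))
escaped_plain = bool(re.fullmatch(escaped, "555 555-5555"))
escaped_parens = bool(re.fullmatch(escaped, "(555) 555-5555"))

print(unescaped_plain, unescaped_parens)  # True False
print(escaped_plain, escaped_parens)      # False True
```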

Now that is what we meant by cryptic looking. If we had shown you something like this in the
beginning of this discussion, you probably would have run screaming in the opposite direction.
Fortunately for you, we're taking it one step at a time, which should hopefully make it easier to
understand. This new pattern is great, but it could be a little better. There is a difference between a
“(XXX) XXX-XXXX” phone number and a “(XXX)XXX-XXXX” number, although they look almost
exactly the same. One has a space after the parentheses, and one does not. It might not seem like much
of a difference, but to a computer the difference is significant. Is there a way to make a pattern that can
handle either scenario? Well, we could put a multiplicity constraint after the whitespace character
to make it zero to one, like this: \(\d{3}\)\s{0,1}\d{3}-\d{4}

That works, but there is a quicker way to say the same thing. Remember the “*” and “+” operators for
“zero to many” and “one to many”? There is also an operator for zero to one, the “?” character. So if we
rewrite our pattern as, \(\d{3}\)\s?\d{3}-\d{4}, then we have an equivalent expression that says,
“exactly three digits surrounded with parentheses followed by an optional space, followed by exactly
three digits, one dash, then exactly four digits.”
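
Here is a Python sketch (re module assumed) comparing the “{0,1}” spelling with the “?” spelling. The sample with two spaces is our own addition, included to show that “zero to one” really does stop at one.

```python
import re

with_braces = r"\(\d{3}\)\s{0,1}\d{3}-\d{4}"
with_question = r"\(\d{3}\)\s?\d{3}-\d{4}"

# One space, no space, and (made up for contrast) two spaces.
samples = ["(555) 555-5555", "(555)555-5555", "(555)  555-5555"]

braces_results = [bool(re.fullmatch(with_braces, s)) for s in samples]
question_results = [bool(re.fullmatch(with_question, s)) for s in samples]

print(braces_results)    # [True, True, False]
print(question_results)  # [True, True, False]
```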

Hopefully, this has helped you get somewhat of a basic understanding of what Regex can do. There are
entire books written on the subject, and millions of programs and web sites all over the world that use
them.

By Wolfgang Meyers
