DOC4
Palani Karthikeyan
[email protected]
What are regular expressions?
● A regular expression is a pattern that describes a
set of strings.
● Regular expressions are used to search and
manipulate text based on patterns.
● "Regular expression" is often shortened to “regex” or
“regexp”.
● Regexes enhance the ability to meaningfully
process text content, especially when combined
with other commands.
grep, sed, awk
● Regular expressions are usually passed to
grep, sed and awk in the following formats:
● In grep: grep [options] 'regexp' [inputfile]
● In sed: sed [options] '/regexp/action' [inputfile]
● In awk: awk [options] '/regexp/ {action}' [inputfile]
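As a minimal sketch of the three formats above, the following runs one regular expression through each tool. The temporary file and its contents are hypothetical sample data, not part of the original slides.

```shell
# Hypothetical sample input; the contents are illustrative only.
input=$(mktemp)
printf '%s\n' 'apple 10' 'banana 20' 'cherry 30' > "$input"

# grep: print the lines that match the regexp
grep_out=$(grep 'an' "$input")

# sed: -n suppresses the default output; the p action prints matching lines
sed_out=$(sed -n '/^b/p' "$input")

# awk: run the action on matching lines; here, print the second field
awk_out=$(awk '/rr/ { print $2 }' "$input")

echo "grep: $grep_out"
echo "sed:  $sed_out"
echo "awk:  $awk_out"
rm -f "$input"
```

In each case the regexp selects the lines and the tool decides what to do with them: grep prints them, sed applies an editing action, and awk runs a small program.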
BRE & ERE
● Unix/Linux tools support two types of regular
expression:
● Basic Regular Expression – BRE
● Extended Regular Expression – ERE
BRE
● BRE uses the following metacharacters:
● . (dot) = match any single character
● ^ = match the expression at the start of a line, as in ^PATTERN
● $ = match the expression at the end of a line, as in PATTERN$
● \ (backslash) = turn off the special meaning of the next character, as in \^
● [ ] (brackets) = match any one of the enclosed characters
● [^ ] = match any one character except those enclosed in [ ]
● * (asterisk) = match zero or more of the preceding character or expression
● ^PATTERN$ = match only lines that consist entirely of PATTERN
● [ - ] = character ranges, as in [A-Z] [0-9] [a-z] [A-Za-z0-9]
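The BRE metacharacters listed above can be tried out with plain grep, which uses BRE by default. This is a small sketch. The sample lines and the match helper are hypothetical and exist only for illustration.

```shell
# Hypothetical sample lines, chosen only to exercise each metacharacter.
lines=$(printf '%s\n' 'cat' 'cot' 'coat' 'ct' 'cat5' 'a^b')

# Illustrative helper: print the lines matching a BRE, joined by spaces.
match() { printf '%s\n' "$lines" | grep "$1" | paste -sd ' ' -; }

dot=$(match '^c.t$')        # any single character between c and t
bracket=$(match '^c[ao]t')  # a or o after the leading c
negated=$(match '^c[^a]t')  # any character except a after the leading c
star=$(match '^co*t$')      # zero or more o between c and t
range=$(match '[0-9]$')     # a digit at the end of the line
literal=$(match 'a\^b')     # backslash makes ^ a literal character

echo "dot:     $dot"        # cat cot
echo "bracket: $bracket"    # cat cot cat5
echo "negated: $negated"    # cot
echo "star:    $star"       # cot ct
echo "range:   $range"      # cat5
echo "literal: $literal"    # a^b
```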
ERE
● ERE also uses the following metacharacters:
● ? means the preceding item is optional and is matched at most once.
● + means the preceding item is matched one or more times.
● {n} means the preceding item is matched exactly n times.
● {n,} means the preceding item is matched n or more times.
● {n,m} means the preceding item is matched at least n times, but not more
than m times.
● {,m} means the preceding item is matched at most m times (a GNU extension,
not defined by POSIX).
● | (alternation) matches the pattern on either side of it; a line containing
either one is a match.
● ( ) (grouping) groups several patterns so that they behave as a single item.
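The ERE metacharacters above can be exercised with grep -E. The sketch below uses hypothetical sample lines and an illustrative match helper. It leaves out {,m} because that form is not portable across grep implementations.

```shell
# Hypothetical sample lines, chosen only to exercise each metacharacter.
lines=$(printf '%s\n' 'color' 'colour' 'ab' 'abab' 'aab' 'dog' 'cat')

# Illustrative helper: print the lines matching an ERE, joined by spaces.
match() { printf '%s\n' "$lines" | grep -E "$1" | paste -sd ' ' -; }

optional=$(match 'colou?r')      # u is optional
plus=$(match '^a+b$')            # one or more a, then b
exact=$(match '^(ab){2}$')       # the group ab exactly twice
atleast=$(match '^a{2,}b')       # two or more a, then b
between=$(match '^(ab){1,2}$')   # the group ab once or twice
either=$(match 'cat|dog')        # alternation

echo "optional: $optional"       # color colour
echo "plus:     $plus"           # ab aab
echo "exact:    $exact"          # abab
echo "atleast:  $atleast"        # aab
echo "between:  $between"        # ab abab
echo "either:   $either"         # dog cat
```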
ERE
● In general, ERE supports the following operations:
– Alternative Match Patterns
– Grouping Alternatives
– Quantifiers
Alternative Match Patterns
Example:
$var =~ /st+/ # matches strings such as
“st”, “sttr”, “sts” and “star”, but not “son”.
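The Perl-style match above can be checked from the shell with grep -E, since st+ means the same thing in ERE. This is a sketch, not the original author's code, and the variable names are hypothetical.

```shell
# The five strings from the example, one per line.
words=$(printf '%s\n' 'st' 'sttr' 'sts' 'star' 'son')

# Keep only the lines containing s followed by one or more t.
matched=$(printf '%s\n' "$words" | grep -E 'st+' | paste -sd ' ' -)

echo "$matched"   # st sttr sts star
```

“son” is rejected because its s is never followed by a t.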
Quantifiers (Contd.)
Quantifier   Description
{n}          matches exactly n times
{n,}         matches at least n times
{n,m}        matches at least n times, but
             not more than m times
Example:
$var =~ /mn{2,4}p/ # matches “mnnp”,
“mnnnp” and “mnnnnp”.
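The {2,4} bounds can be confirmed with grep -E by testing strings with one to five n characters. This is a sketch with hypothetical sample strings. The mnp and mnnnnnp cases are added here to show both limits.

```shell
# Strings with one to five n characters between m and p.
words=$(printf '%s\n' 'mnp' 'mnnp' 'mnnnp' 'mnnnnp' 'mnnnnnp')

# m, then between two and four n, then p.
matched=$(printf '%s\n' "$words" | grep -E 'mn{2,4}p' | paste -sd ' ' -)

echo "$matched"   # mnnp mnnnp mnnnnp
```

“mnp” has too few n characters. “mnnnnnp” fails because after the m and at most four n characters, the next character is another n rather than p.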
Making Quantifiers Less Greedy