How To Write Regular Expressions?: What Is A Regular Expression and What Makes It So Important?
Repeaters: *, + and { }
These symbols act as repeaters: they tell the computer how many times the
preceding character (or set of characters) may occur in the match.
The asterisk symbol ( * ):
It tells the computer to match the preceding character (or set of characters) 0 or
more times (with no upper limit).
Example : The regular expression ab*c will match ac, abc, abbc,
abbbc, and so on.
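As a quick check of the asterisk, here is a minimal sketch using Python's standard re module. Python is only our choice for illustration, since the article does not name a language, and the variable names are hypothetical.

```python
import re

# b* means "zero or more b", so the b between a and c may be absent or repeated.
pattern = re.compile(r"ab*c")

candidates = ["ac", "abc", "abbbc", "adc"]
# fullmatch succeeds only when the whole string fits the pattern.
star_matches = [s for s in candidates if pattern.fullmatch(s)]
print(star_matches)  # ['ac', 'abc', 'abbbc']
```

Note that "adc" is rejected: the asterisk repeats only b, it does not allow other characters.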
The plus symbol ( + ):
It tells the computer to repeat the preceding character (or set of characters)
one or more times (with no upper limit).
Example : The regular expression ab+c will match abc, abbc,
abbbc, and so on.
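The sketch below, again in Python's re module as an assumed illustration language, shows that the plus sign needs at least one occurrence, and how parentheses let a set of characters repeat as one unit. The sample strings are made up for the example.

```python
import re

# b+ needs at least one b, so "ac" does not qualify.
plus_matches = [s for s in ["ac", "abc", "abbc"] if re.fullmatch(r"ab+c", s)]
print(plus_matches)  # ['abc', 'abbc']

# Parentheses group characters so that the repeater applies to the whole group.
group_match = re.fullmatch(r"(ab)+", "ababab")
print(group_match is not None)  # True
```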
The curly braces {…}:
It tells the computer to repeat the preceding character (or set of characters) as
many times as the value inside the braces.
Example : {2} means that the preceding character is repeated exactly 2
times, {min,} means that the preceding character is matched min or more
times, and {min,max} means that the preceding character is repeated at
least min and at most max times.
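The three brace forms can be compared side by side. This is a small sketch in Python's re module (our assumed language), with the letter a and the bounds 2 and 4 chosen arbitrarily for illustration.

```python
import re

samples = ["a", "aa", "aaa", "aaaa", "aaaaa"]

# {2}: exactly two repetitions
exactly_two = [s for s in samples if re.fullmatch(r"a{2}", s)]
# {2,}: two or more repetitions
two_or_more = [s for s in samples if re.fullmatch(r"a{2,}", s)]
# {2,4}: between two and four repetitions, inclusive
two_to_four = [s for s in samples if re.fullmatch(r"a{2,4}", s)]

print(exactly_two)  # ['aa']
print(two_or_more)  # ['aa', 'aaa', 'aaaa', 'aaaaa']
print(two_to_four)  # ['aa', 'aaa', 'aaaa']
```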
Wildcard – ( . )
The dot symbol can take the place of any other single character (in most
regex engines, any character except a line break by default), which is why it
is called the wildcard character.
Example :
The regular expression .* will tell the computer that any character
can be used any number of times.
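A short sketch of the wildcard in Python's re module (an assumed choice of language). It also shows one detail of Python's default behaviour: the dot does not match a line break unless the re.DOTALL flag is given.

```python
import re

# A single dot stands for exactly one character.
dot_hits = [s for s in ["cat", "cut", "c9t", "ct"] if re.fullmatch(r"c.t", s)]
print(dot_hits)  # ['cat', 'cut', 'c9t']

# .* takes as many characters as it can, but by default stops at a line break.
first_line = re.match(r".*", "first line\nsecond line").group()
print(first_line)  # first line

# With re.DOTALL the dot also matches the line break.
everything = re.match(r".*", "first line\nsecond line", re.DOTALL).group()
```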
Optional character – ( ? )
This symbol tells the computer that the preceding character may
or may not be present in the string to be matched.
Example :
We may write the file format of a document as “docx?”.
The ‘?’ tells the computer that x may or may not be
present in the name of the file format, so both doc and docx match.
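The docx? example can be tried directly. This is a minimal sketch in Python's re module (our assumed language); the list of extensions is invented for the test.

```python
import re

# x? makes the trailing x optional: zero or one occurrence.
pattern = re.compile(r"docx?")

extensions = ["doc", "docx", "docxx", "do"]
optional_matches = [e for e in extensions if pattern.fullmatch(e)]
print(optional_matches)  # ['doc', 'docx']
```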
The caret ( ^ ) symbol: setting the position for a match
It tells the computer that the match must start at the beginning of the string or line.
Example : ^\d{3} will match with patterns like "901" in "901-333-".
The dollar ( $ ) symbol
It tells the computer that the match must occur at the end of the string, or just
before a \n at the end of the line or string.
Example : -\d{3}$ will match with patterns like "-333" in "-901-333".
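Both anchors can be checked with the article's own patterns. The sketch below assumes Python's re module, where ^ and $ apply to the whole string by default and to each line only when the re.MULTILINE flag is passed. The two-line sample string is hypothetical.

```python
import re

# ^ pins the match to the start of the string.
start = re.search(r"^\d{3}", "901-333-")
print(start.group())  # 901

# $ pins the match to the end of the string.
end = re.search(r"-\d{3}$", "-901-333")
print(end.group())  # -333

# $ also matches just before a newline that ends the string.
trailing = re.search(r"-\d{3}$", "-901-333\n")
print(trailing.group())  # -333

# With re.MULTILINE, ^ matches at the start of every line.
line_starts = re.findall(r"^\d{3}", "901-333\n555-222", re.MULTILINE)
print(line_starts)  # ['901', '555']
```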
Character Classes
A character class matches any one of a set of characters. It is used to match the
most basic elements of a language, such as a letter, a digit, a space or a symbol.
\s : matches any whitespace character, such as a space or a tab
\S : matches any non-whitespace character
\d : matches any digit character
\D : matches any non-digit character
\w : matches any word character (letters, digits and the underscore)
\W : matches any non-word character
\b : matches a word boundary, that is, the position between a word character and a
non-word character (such as a space, dash, comma or semicolon); it does not
consume any character itself
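The shorthand classes are written with a backslash in the pattern. The following sketch uses Python's re module (an assumed choice) on made-up sample strings to show each of them at work.

```python
import re

text = "Order 66 shipped, ETA 3 days"

digits = re.findall(r"\d", text)    # every single digit
words = re.findall(r"\w+", text)    # runs of word characters
spaces = re.findall(r"\s", text)    # every whitespace character

print(digits)       # ['6', '6', '3']
print(words)        # ['Order', '66', 'shipped', 'ETA', '3', 'days']
print(len(spaces))  # 5

# \D is the opposite of \d: strip everything that is not a digit.
only_digits = re.sub(r"\D", "", "901-333")
print(only_digits)  # 901333

# \b matches a position, not a character: "cat" inside "concat" is skipped.
whole_words = re.findall(r"\bcat\b", "cat concat cat.")
print(whole_words)  # ['cat', 'cat']
```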
[set_of_characters] – Matches any single character in set_of_characters. By
default, the match is case-sensitive.
Example : [abc] will match any single a, b or c in a string.
[^set_of_characters] – Negation: Matches any single character that is not in
set_of_characters. By default, the match is case-sensitive.
Example : [^abc] will match any single character other than a, b and c.
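A set and its negation can be compared on the same input. This is a minimal sketch in Python's re module (our assumed language); the sample string and the use of the re.IGNORECASE flag to switch off case sensitivity are illustrative choices.

```python
import re

text = "Cab-bag"

# [abc] picks out every a, b or c; the capital C is skipped (case-sensitive).
in_set = re.findall(r"[abc]", text)
print(in_set)  # ['a', 'b', 'b', 'a']

# [^abc] picks out every character that is NOT a, b or c.
not_in_set = re.findall(r"[^abc]", text)
print(not_in_set)  # ['C', '-', 'g']

# re.IGNORECASE makes the set match upper-case letters as well.
any_case = re.findall(r"[abc]", text, re.IGNORECASE)
print(any_case)  # ['C', 'a', 'b', 'b', 'a']
```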