Regular Expressions
The select tool searches the data for lines that contain, or do not contain, a match to the given pattern.
The pattern is written as a regular expression. A regular expression is a pattern that describes the
text to be matched.
Example
· ^chr([0-9A-Za-z])+ would match lines that begin with a chromosome name (for example, chr1 or chrX),
such as lines in a BED format file.
· (ACGT){1,5} would match between 1 and 5 consecutive copies of "ACGT".
· ([^,][0-9]{1,3})(,[0-9]{3})* would match a large integer that is properly separated with commas
such as 23,078,651.
· (abc)|(def) would match either "abc" or "def".
· ^\W+# would match a comment line in which one or more non-word characters (such as spaces) at the
start of the line are followed by "#".
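The examples above can be tried outside the tool. The following is a minimal sketch in Python, assuming Python's re module interprets these patterns the same way the select tool does (the tool's own regex engine may differ in details). The helper names matching_lines and non_matching_lines and the sample data are hypothetical and only illustrate the "containing or not containing" behaviour described above.

```python
import re

# Patterns copied from the examples above.
patterns = {
    "chrom": r"^chr([0-9A-Za-z])+",
    "acgt": r"(ACGT){1,5}",
    "integer": r"([^,][0-9]{1,3})(,[0-9]{3})*",
    "either": r"(abc)|(def)",
    "comment": r"^\W+#",
}


def matching_lines(pattern, lines):
    """Return the lines that contain a match to the pattern (hypothetical helper)."""
    regex = re.compile(pattern)
    return [line for line in lines if regex.search(line)]


def non_matching_lines(pattern, lines):
    """Return the lines that do NOT contain a match to the pattern (hypothetical helper)."""
    regex = re.compile(pattern)
    return [line for line in lines if not regex.search(line)]


# Hypothetical sample data, loosely modelled on a BED file with extra lines.
sample = [
    "chr1\t100\t200",
    "chrX\t5\t50",
    "  # indented comment",
    "track name=example",
]

if __name__ == "__main__":
    print(matching_lines(patterns["chrom"], sample))
    print(non_matching_lines(patterns["chrom"], sample))
    print(re.search(patterns["integer"], "23,078,651").group(0))
```

Using search rather than a full-line match mirrors the idea of a line "containing" a match; the ^ anchor in a pattern is what restricts the match to the start of the line.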
To learn the basics of regular expressions, we need to start with some special
characters known as metacharacters. They let us build more complex regex
search terms. The basic metacharacters are listed below:
[^ ] matches any single character except those listed inside the brackets.
\ is an escape character, used when we need to include one of the
metacharacters literally in our search.
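The two metacharacters above can be illustrated with a short sketch, again assuming Python's re module; the sample strings are made up for the example.

```python
import re

# [^...] matches any single character that is NOT listed inside the brackets.
non_vowels = re.findall(r"[^aeiou]", "regex")

# An unescaped dot is a metacharacter that matches any character,
# so this pattern also matches "3x14".
unescaped_dot = re.findall(r"3.14", "3.14 and 3x14")

# A backslash removes the special meaning, so only a literal dot matches.
literal_dot = re.findall(r"3\.14", "3.14 and 3x14")

# re.escape inserts the backslashes needed to match a string literally.
escaped = re.escape("a+b")

if __name__ == "__main__":
    print(non_vowels)
    print(unescaped_dot)
    print(literal_dot)
    print(escaped)
```

Without escaping, the pattern a+b would mean "one or more a followed by b" and would match "aab"; the escaped form matches only the literal text "a+b".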