Short Perl Tutorial: Instructor: Rada Mihalcea University of Antwerp

This document provides a short tutorial on the Perl programming language. It covers Perl's history and design, basic syntax like variables and data types, operations, conditional and iterative structures, strings, regular expressions, lists and arrays. The tutorial is presented over 13 slides that progressively introduce Perl concepts like variables, arithmetic, input/output, conditionals, loops, strings, regexes, and lists. Examples are provided throughout to demonstrate each concept.

Short Perl tutorial

Instructor: Rada Mihalcea



Note: some of the material in this slide set was adapted from a Perl course taught at
University of Antwerp
Slide 1
About Perl
1987
Larry Wall develops Perl

1989
October 18: Perl 3.0 is released
under the GNU General Public
License

1991
March 21 Perl 4.0 is released
under the GPL and the new
Perl Artistic License

Now
Perl 5.14
Perl started out as a scripting tool rather than a programming language per se.

Wall's original intent was to develop a scripting language more powerful than Unix shell scripting, but not as tedious as C.

Perl is an interpreted language. That means that there is no explicitly separate compilation step.

Rather, the interpreter reads the whole file, converts it to an internal form and executes it immediately.

P.E.R.L. = Practical Extraction and
Report Language
Slide 2
Variables

A variable is a name of a place where some information is stored. For
example:

$yearOfBirth = 1976;
$currentYear = 2011;
$age = $currentYear-$yearOfBirth;
print $age;

The variables in the example program can be identified as such because their names
start with a dollar ($). Perl uses different prefix characters for structure names in
programs. Here is an overview:

$: variable containing scalar values such as a number or a string
@: variable containing a list with numeric keys
%: variable containing a list with strings as keys
&: subroutine
*: typeglob, stands for all structures with the associated name

Slide 3
Operations on numbers

Perl contains the following arithmetic operators:
+: sum
-: subtraction
*: product
/: division
%: modulo division
**: exponent

Apart from these operators, Perl contains some built-in arithmetic functions.
Some of these are mentioned in the following list:

abs($x): absolute value
int($x): integer part
rand(): random number from 0 up to (but not including) 1
sqrt($x): square root


Slide 4
Input and output

# age calculator
print "Please enter your birth year ";
$yearOfBirth = <>;
chomp($yearOfBirth);
print "Your age is ",2012-$yearOfBirth,".\n";



# count the number of lines in a file
open(INPUTFILE, "<$myfile") || die "Could not open the file $myfile\n";

$count = 0;

while($line = <INPUTFILE>) {
$count++;
}

print "$count lines in file $myfile\n";


# open for writing
open OUTPUTFILE, ">$myfile";
Slide 5
Conditional structures

# determine whether number is odd or even

print "Enter number: ";
$number = <>;
chomp($number);
if ($number-2*int($number/2) == 0) {
print "$number is even\n";
}
elsif (abs($number-2*int($number/2)) == 1) {
print "$number is odd\n";
}
else {
print "Something strange has happened!\n";
}
Slide 6
Numeric test operators
An overview of the numeric test operators:
==: equal
!=: not equal
<: less than
<=: less than or equal to
>: greater than
>=: greater than or equal to

All these operators can be used for comparing two numeric values in an if
condition.


Truth expressions

three logical operators:
and: and (alternative: &&)
or: or (alternative: ||)
not: not (alternative: !)


Slide 7
Iterative structures

#print numbers 1-10 in three different ways
$i = 1;
while ($i<=10) {
print "$i\n";
$i++;
}

for ($i=1;$i<=10;$i++) {
print "$i\n";
}

foreach $i (1,2,3,4,5,6,7,8,9,10) {
print "$i\n";
}


Stop a loop, or force continuation:
last; # C break

next; # C continue;


Exercise:
Read ten numbers and print the largest, the smallest and a count representing how
many of them are divisible by three.

if (not(defined($largest)) or $number > $largest) { $largest = $number; }
if ($number-3*int($number/3) == 0) { $count3++; }
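
One possible solution, as a sketch built around the two hints above. To keep it self-contained the ten numbers come from a hypothetical list instead of being read with <>, and the modulo operator % replaces $number-3*int($number/3).

```perl
use strict;
use warnings;

# Hypothetical input; in the exercise the numbers are read with <>.
my @numbers = (12, 7, -3, 45, 9, 0, 28, 33, 5, 18);

my ($largest, $smallest);
my $count3 = 0;

foreach my $number (@numbers) {
    if (not(defined($largest)) or $number > $largest)   { $largest  = $number; }
    if (not(defined($smallest)) or $number < $smallest) { $smallest = $number; }
    if ($number % 3 == 0) { $count3++; }
}

print "largest: $largest, smallest: $smallest, divisible by three: $count3\n";
```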
Slide 8
A parenthesis
Perl philosophy(ies)


There is more than one way to do it



If you want to shoot yourself in the foot,
who am I to stop you?


And a comment: DO write comments in your Perl
programs!

Slide 9
Basic string operations

- strings are stored in the same type of variables we used for storing numbers

- string values can be specified between double quotes or between single quotes

- !!! between double quotes variables will be evaluated (interpolated), between single quotes they will not.

Comparison operators for strings

- eq: equal
- ne: not equal
- lt: less than
- le: less than or equal to
- gt: greater than
- ge: greater than or equal to

Examples:

if ($a eq $b) {
...
}
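
A short sketch of the difference between the two kinds of quotes, and between string and numeric comparison. The variable names and values are made up for illustration.

```perl
use strict;
use warnings;

my $name = "Piet";

my $double = "Hello $name\n";   # $name is interpolated, \n becomes a newline
my $single = 'Hello $name\n';   # taken literally: the dollar sign and \n stay as typed

print $double;
print $single, "\n";

# Strings are compared character by character, numbers by value.
my $string_order = ("10" lt "9") ? "10 before 9" : "9 before 10";
my $number_order = (10 < 9)      ? "10 before 9" : "9 before 10";

print "as strings: $string_order, as numbers: $number_order\n";
```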
Slide 10
String substitution and string
matching

The power of Perl!

The s/// operator modifies sequences of characters (substitute)
The tr/// operator changes individual characters. (translate)
The m// operator checks for matching (or in short //) (match)

- the first part between the first two slashes contains a search pattern
- the second part between the final two slashes contains the replacement.
- behind the final slash we can put characters to modify the behavior of the
commands.

By default s/// only replaces the first occurrence of the search pattern
- append a g to the operator to replace every occurrence.
- append an i to the operator, to have the search case insensitive

tr translates the characters in the first set of characters into the characters of
the second set
- if the second set is shorter, its last character is repeated
- if the second set is longer, the excess characters are ignored

The tr/// operator accepts the following modifier characters
- c (replace the complement of the search class)
- d (delete characters of the search class that are not replaced)
- s (squeeze sequences of identical replaced characters to one character)
Slide 11
Examples
# replace first occurrence of "bug"
$text =~ s/bug/feature/;

# replace all occurrences of "bug"
$text =~ s/bug/feature/g;

# convert to lower case
$text =~ tr/A-Z/a-z/;

# delete vowels
$text =~ tr/AEIOUaeiou//d;

# replace each sequence of non-digit characters with a single x
$text =~ tr/0-9/x/cs;

# replace all capital characters by CAPS
$text =~ s/[A-Z]/CAPS/g;

Simple example:
Print all lines from a file that include a given sequence of characters
[emulate grep behavior]
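
A minimal sketch of such a grep emulation, using the m// operator. To keep it self-contained the "file" is a hypothetical string opened as an in-memory file; a real script would open a file name instead. The lexical filehandle $in, the three-argument open and \Q...\E (which makes the pattern match literally) are choices made for this sketch and do not come from the slides.

```perl
use strict;
use warnings;

# Hypothetical sample text standing in for the contents of a file.
my $text = "first bug report\nall fine here\nanother BUG\nbugs everywhere\n";
my $pattern = "bug";

open(my $in, "<", \$text) or die "cannot open in-memory file\n";

my @matches;
while (my $line = <$in>) {
    # keep the line if it contains the given sequence of characters
    if ($line =~ m/\Q$pattern\E/) {
        push @matches, $line;
    }
}
close($in);

print @matches;
```

The match is case sensitive, so the line with "BUG" is not printed; adding the i modifier would include it.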

Slide 12
Regular expressions

\b: word boundaries
\d: digits
\n: newline
\r: carriage return
\s: white space characters
\t: tab
\w: word characters (alphanumeric characters and underscore)
^: beginning of string
$: end of string
.: any character except newline
[bdkp]: characters b, d, k and p
[a-f]: characters a to f
[^a-f]: all characters except a to f
abc|def: string abc or string def
[:alpha:], [:punct:], [:digit:], ...: POSIX character classes; use inside a character class, e.g., [[:alpha:]]


*: zero or more times
+: one or more times
?: zero or one time
{p,q}: at least p times and at most q times
{p,}: at least p times
{p}: exactly p times


Examples:
1. Clean an HTML formatted text


2. Grab URLs from a Web page


3. Transform all lines from a file into
lower case
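
A rough sketch of the three examples. The sample HTML is made up, and the patterns are deliberately simple: real HTML has many cases (comments, scripts, single-quoted or unquoted attributes) that these regular expressions do not handle. Parentheses in a pattern capture the text they match, and a match with the g modifier in list context returns all captured pieces.

```perl
use strict;
use warnings;

# Hypothetical sample input.
my $html = '<p>Visit <a href="http://www.perl.org/">Perl</a> or <A HREF="http://www.cpan.org/">CPAN</A>.</p>';

# 1. Clean an HTML formatted text: drop everything between < and >.
my $clean = $html;
$clean =~ s/<[^>]*>//g;

# 2. Grab URLs: collect the quoted value of every href attribute.
my @urls = ($html =~ m/href\s*=\s*"([^"]*)"/gi);

# 3. Transform a line into lower case.
my $lower = $clean;
$lower =~ tr/A-Z/a-z/;

print "$clean\n";
foreach my $url (@urls) { print "$url\n"; }
print "$lower\n";
```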


Slide 13
Lists and arrays

@a = (); # empty list
@b = (1,2,3); # three numbers
@c = ("Jan","Piet","Marie"); # three strings
@d = ("Dirk",1.92,46,"20-03-1977"); # a mixed list

Variables and sublists are interpolated in a list
@b = ($a,$a+1,$a+2); # variable interpolation
@c = ("Jan",("Piet","Marie")); # list interpolation
@d = ("Dirk",1.92,46,(),"20-03-1977"); # empty list interpolation
@e = ( @b, @c ); # same as (1,2,3,"Jan","Piet","Marie")

Practical construction operators
($x..$y)
@x = (1..6); # same as (1, 2, 3, 4, 5, 6)
@y = (1.2..4.2); # same as (1, 2, 3, 4): the range bounds are truncated to integers
@z = (2..5,8,11..13); # same as (2,3,4,5,8,11,12,13)

qw() ("quote word") function
qw(Jan Piet Marie) is a shorter notation for ("Jan","Piet","Marie").

split function


Slide 14
Split function
$string = "Jan Piet\nMarie \tDirk";
@list = split /\s+/, $string; # yields ( "Jan","Piet","Marie","Dirk" )

$string = " Jan Piet\nMarie \tDirk\n"; # watch out, empty string at the beginning!!!
@list = split /\s+/, $string; # yields ( "", "Jan","Piet","Marie","Dirk" ); trailing empty strings are dropped

$string = "Jan:Piet;Marie---Dirk"; # use any regular expression...
@list = split /[:;]|---/, $string; # yields ( "Jan","Piet","Marie","Dirk" )

$string = "Jan Piet"; # use an empty regular expression to split into single characters
@letters= split //, $string; # yields ( "J","a","n"," ","P","i","e","t")




Example:

1. Tokenize a text: separate simple punctuation (, . ; ! ? ( ) )

2. Add all the digits in a number
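
A possible sketch of both examples. The sample text and number are made up; $1 refers to the text captured by the parentheses in the search pattern, which the slides do not introduce.

```perl
use strict;
use warnings;

# 1. Tokenize: put spaces around simple punctuation, then split on whitespace.
my $text = "Hello, world! (Is this Perl?) Yes; it is.";
my $spaced = $text;
$spaced =~ s/([,.;!?()])/ $1 /g;
$spaced =~ s/^\s+//;                 # avoid an empty first token
my @tokens = split /\s+/, $spaced;

# 2. Add all the digits in a number: split into characters and sum them.
my $number = 2011;
my $sum = 0;
foreach my $digit (split //, $number) {
    $sum += $digit;
}

print join("|", @tokens), "\n";
print "$sum\n";
```

This treats every period as punctuation, so a number such as 3.14 would also be split into three tokens.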

Slide 15
More about arrays
@array = ("an","bert","cindy","dirk");
$length = @array; # $length now has the value 4

@array = ("an","bert","cindy","dirk");
$length = @array;
print $length; # prints 4
print $#array; # prints 3
print $array[$#array]; # prints "dirk"
print scalar(@array); # prints 4


($a, $b) = ("one","two");
($onething, @manythings) = (1,2,3,4,5,6); # now $onething equals 1
# and @manythings = (2,3,4,5,6)
($array[0],$array[1]) = ($array[1],$array[0]); # swap the first two


Pay attention to the fact that assignment to a variable first evaluates the right-hand
side of the expression, and then makes a copy of the result

@array = ("an","bert","cindy","dirk");
@copyarray = @array; # makes a copy
$copyarray[2] = "XXXXX"; # changes only the copy; $array[2] is still "cindy"
Slide 16
Manipulating lists and their elements

push ARRAY LIST
appends the list to the end of the array.
if the second argument is a scalar rather than a list, it appends it as the last
item of the array.
@array = ("an","bert","cindy","dirk");
@brray = ("evelien","frank");
push @array, @brray; # @array is ("an","bert","cindy","dirk","evelien","frank")
push @brray, "gerben"; # @brray is ("evelien","frank","gerben")

pop ARRAY does the opposite of push. it removes the last item of its
argument list and returns it. if the list is empty it returns undef.
@array = ("an","bert","cindy","dirk");
$item = pop @array; # $item is "dirk" and @array is ( "an","bert","cindy")

shift ARRAY works on the left end of the list, but is otherwise the same as
pop.

unshift ARRAY LIST puts stuff on the left side of the list, just as push does
for the right side.
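
The slide gives no example for shift and unshift; a minimal sketch, with made-up names added to the list:

```perl
use strict;
use warnings;

my @array = ("an", "bert", "cindy", "dirk");

my $first = shift @array;        # $first is "an", @array is ("bert","cindy","dirk")
unshift @array, "zoe", "yves";   # @array is ("zoe","yves","bert","cindy","dirk")

print "$first: @array\n";
```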

Slide 17
Working with lists

Convert lists to strings
@array = ("an","bert","cindy","dirk");
print "The array contains $array[0] $array[1] $array[2] $array[3]";

# interpolate
print "The array contains @array";

function join STRING LIST
$string = join ":", @array; # $string now has the value "an:bert:cindy:dirk"
$string = join "+", "", @array; # $string now has the value "+an+bert+cindy+dirk"

Iteration over lists
for( $i=0 ; $i<=$#array; $i++){
$item = $array[$i];
$item =~ tr/a-z/A-Z/;
print "$item ";
}

foreach $item (@array){
$item =~ tr/a-z/A-Z/;
print "$item "; # prints an upper-case version of each item (and changes @array itself)
}
Slide 18
Grep and map
grep CONDITION LIST
returns a list of all items from list that satisfy some condition.

For example:
@large = grep $_ > 10, (1,2,4,8,16,25); # returns (16,25)
@i_names = grep /i/, @array; # returns ("cindy","dirk")

Example:
Print all lines from a file that include a given sequence of characters
[emulate grep behavior]
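
With the grep function this becomes a one-line selection once the lines are in a list. A sketch with hypothetical sample lines; a real script would fill @lines from a file handle. \Q...\E makes the pattern match literally.

```perl
use strict;
use warnings;

# Hypothetical sample lines standing in for the contents of a file.
my @lines = ("first bug report\n", "all fine here\n", "bugs everywhere\n");
my $pattern = "bug";

my @hits = grep /\Q$pattern\E/, @lines;
print @hits;
```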


map OPERATION LIST
works like grep, but instead of selecting elements it performs an arbitrary
operation on each element of a list and returns the list of results.

For example:
@more = map $_ + 3, (1,2,4,8,16,25); # returns (4,5,7,11,19,28)
@initials = map substr($_,0,1), @array; # returns ("a","b","c","d")

Slide 19
Hashes (Associative Arrays)
- associate keys with values
- allows for almost instantaneous lookup of a value that is associated
with some particular key

Existing, Defined and true.

- If the value for a key does not exist in the hash, the access to it returns
the undef value.

- special test function exists(HASHENTRY), which returns true if the
hash key exists in the hash

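- if($hash{$key}){...} and if(defined($hash{$key})){...} are false if the
key $key has no associated value; the first test is also false when the value
is 0 or the empty string, the second one only when the value is undef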
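
The three tests can disagree. A small sketch with a made-up hash that holds a zero, an undefined value and an ordinary string:

```perl
use strict;
use warnings;

my %hash = (zero => 0, nothing => undef, word => "perl");

foreach my $key ("zero", "nothing", "word", "missing") {
    my $exists  = exists($hash{$key})  ? "yes" : "no";
    my $defined = defined($hash{$key}) ? "yes" : "no";
    my $true    = $hash{$key}          ? "yes" : "no";
    print "$key: exists=$exists defined=$defined true=$true\n";
}
```

Only "word" passes all three tests; "zero" exists and is defined but is not true, "nothing" exists but is not defined, and "missing" fails all three.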

Slide 20
Hashes (contd)
Examples
$wordfrequency{"the"} = 12731; # creates key "the", value 12731
$phonenumber{"An De Wilde"} = "+31-20-6777871";
$index{$word} = $nwords;
$occurrences{$a}++; # if this is the first reference,
# the value associated with $a will
# be increased from 0 to 1


%birthdays = ("An","25-02-1975","Bert","12-10-1953","Cindy","23-05-1969",
"Dirk","01-04-1961"); # fill the hash

%birthdays = (An => "25-02-1975", Bert => "12-10-1953", Cindy => "23-05-1969",
Dirk => "01-04-1961" ); # fill the hash; the same as above, but more explicit

@list = %birthdays; # make a list of the key/value pairs

%copy_of_bdays = %birthdays; # copy a hash

Slide 21
Operations on Hashes

- keys HASH returns a list with only the keys in the hash. As with any list,
using it in a scalar context returns the number of keys in that list.

- values HASH returns a list with only the values in the hash, in the same
order as the keys returned by keys.

foreach $key (sort keys %hash ){
push @sortedlist, ($key , $hash{$key} );
print "Key $key has value $hash{$key}\n";
}


Slide 22
Operations on Hashes
reverse the direction of the mapping, i.e. construct a hash with keys and
values swapped:
%backwards = reverse %forward;
(if %forward has two identical values associated with different keys, those will end up
as only a single element in %backwards)

- hash slice
@birthdays{"An","Bert","Cindy","Dirk"} = ("25-02-1975","12-10-1953","23-05-1969",
"01-04-1961");

- each( HASH ) traverse a hash
while (($name,$date) = each(%birthdays)) {
print "$name's birthday is $date\n";
}
# alternative: foreach $key (keys %birthdays)

Slide 23
Multidimensional data structures

- Perl does not really have multi-dimensional data structures, but offers a nice
way of emulating them, using references

$matrix[$i][$j] = $x;
$lexicon1{"word"}[1] = $partofspeech;
$lexicon2{"word"}{"noun"} = $frequency;

Array of arrays
@matrix = ( # an array of references to anonymous arrays
[1, 2, 3], [4, 5, 6], [7, 8, 9]
);

Slide 24
Multidimensional structures
Hash of arrays
%lexicon1 = ( # a hash from strings to anonymous arrays
the => [ "Det", 12731 ],
man => [ "Noun", 658 ],
with => [ "Prep", 3482 ]
);

Hash of hashes
%lexicon2 = ( # a hash from strings to anonymous hashes of strings to numbers
the => { Det => 12731 },
man => { Noun => 658 , Verb => 12 },
with => { Prep => 3482 }
);
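
A sketch of how elements of such structures can be read back. The data repeats (in shortened form) the structures from these slides; @{...} and %{...} turn a reference back into a list or a hash.

```perl
use strict;
use warnings;

my @matrix   = ([1, 2, 3], [4, 5, 6], [7, 8, 9]);
my %lexicon1 = (the => ["Det", 12731], man => ["Noun", 658]);
my %lexicon2 = (man => {Noun => 658, Verb => 12});

print $matrix[1][2], "\n";              # row 1, column 2 (counting from 0): 6
print $lexicon1{"man"}[0], "\n";        # Noun
print $lexicon2{"man"}{"Verb"}, "\n";   # 12

# Each row is a reference to an array.
foreach my $row (@matrix) {
    print join(" ", @{$row}), "\n";
}

# Walk the inner hash of one word.
foreach my $pos (sort keys %{ $lexicon2{"man"} }) {
    print "man as $pos: $lexicon2{man}{$pos}\n";
}
```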

Slide 25
Programming Example
A program that reads lines of text, gives a unique index number to each word and
counts the word frequencies

#!/usr/local/bin/perl
# read all lines in the input

$nwords = 0;
while(defined($line = <>)){
# cut off leading and trailing whitespace
$line =~ s/^\s*//;
$line =~ s/\s*$//;
# and put the words in an array
@words = split /\s+/, $line;
if(!@words){
# there are no words?
next;
}
# process each word in reading order (a loop on pop would stop at the word "0")
foreach $word (@words){
# if it's unknown assign a new index
if(!exists($index{$word})){
$index{$word} = $nwords++;
}
# always update the frequency
$frequency{$word}++;
}
}
# now we print the words sorted
foreach $word ( sort keys %index ){
print "$word has frequency $frequency{$word} and index $index{$word}\n";
}

Slide 26
A note on sorting
If we would like to have the words sorted by their frequency instead of by
alphabet, we need a construct that imposes a different sort order.

sort function can use any sort order that is provided as an expression.

- the usual alphabetical sort order:
sort { $a cmp $b } @list;

!! $a and $b are placeholders for the two items from the list that are to be
compared. Do not attempt to replace them with other variable names. Using
$x and $y instead will not provide the same effect

- a numerical sort order
sort { $a <=> $b } @list;

- for a reverse sort, change the order of the arguments:
sort { $b <=> $a } @list;

- sort the keys of a hash by their value instead of by their own identity,
substitute the values for the arguments of sort:
sort { $hash{$b} <=> $hash{$a} } ( keys %hash )
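
Applied to word frequencies, this could look as follows. A sketch with a made-up %frequency hash; the extra "or $a cmp $b" is an addition that orders words with the same frequency alphabetically.

```perl
use strict;
use warnings;

my %frequency = (the => 12731, man => 658, with => 3482, perl => 658);

# Most frequent first; words with equal frequency in alphabetical order.
my @by_frequency = sort { $frequency{$b} <=> $frequency{$a} or $a cmp $b } keys %frequency;

foreach my $word (@by_frequency) {
    print "$word\t$frequency{$word}\n";
}
```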
Slide 27
Basics about Subroutines
Calls to subroutines can be recognized because subroutine names can be prefixed
with the special character & (the prefix is optional in Perl 5).

sub askForInput {
print "Please enter something: ";
}
# function call
&askForInput();

Tip: put related subroutines in a file (usually with the extension .pm = perl
module) and include the file with the command require:

# files with subroutines are stored here
# (single quotes keep the backslashes in the path)
use lib 'C:\PERL\MYLIBS';

# we will use this file
require "nlp.pm";


Slide 28
Variables Scope
A variable $a is used both in the subroutine and in the main part of the program.

$a = 0;
print "$a\n";

sub changeA {
$a = 1;
}

print "$a\n";
&changeA();
print "$a\n";

The value of $a is printed three times. Can you guess what values are printed?
- $a is a global variable.

Slide 29
Variables Scope
Hide variables from the rest of the program using my.

my $a = 0;
print "$a\n";
sub changeA {
my $a = 1;
}
print "$a\n";
&changeA();
print "$a\n";

What values are printed now?
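
A sketch that puts a global and a my variable side by side. The names $global, $hidden and changeBoth are made up; our declares a global variable in a way that is accepted under use strict.

```perl
use strict;
use warnings;

our $global = 0;     # global variable, visible everywhere in the program
my  $hidden = 0;     # variable declared with my

sub changeBoth {
    $global = 1;         # changes the global variable
    my $hidden = 1;      # a new variable that exists only inside the subroutine
    return $hidden;
}

my $inside = changeBoth();
print "global: $global, hidden: $hidden, inside the subroutine: $inside\n";
```

After the call $global is 1 while the outer $hidden is still 0. By the same reasoning, the program with the global $a prints 0, 0, 1 and the version with my prints 0, 0, 0.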

Slide 30
Communication between subroutines
and programs

Provide the arguments of the subroutine call:

&doSomething(2,"a",$abc);
- Perl converts all arguments to a flat list. This means that
&doSomething((2,"a"),$abc) will result in the same list of arguments as the
earlier example.


Access the argument values inside the procedure with the special list @_.

E.g. my($number, $letter, $string) = @_; # reads the parameters from @_

- A tricky problem is passing two or more lists as arguments of a subroutine.
With &sub(@a,@b) the subroutine receives the two lists as one big list and it
is unable to determine where the first ends and where the second starts.

- pass the lists as reference arguments:
&sub(\@a,\@b);
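
Inside the subroutine each reference has to be turned back into a list with @{...}. A minimal sketch; the subroutine names lengths and countArgs are hypothetical.

```perl
use strict;
use warnings;

# Receives two array references and reports the length of each list.
sub lengths {
    my ($first, $second) = @_;          # two scalars, each a reference
    my @a = @{$first};                  # dereference to get the lists back
    my @b = @{$second};
    return (scalar(@a), scalar(@b));
}

# Without references the lists arrive flattened into one.
sub countArgs { return scalar(@_); }

my @a = (1, 2, 3);
my @b = ("x", "y");

my ($na, $nb) = lengths(\@a, \@b);      # the leading & is optional
my $flat = countArgs(@a, @b);

print "first list: $na items, second list: $nb items, flattened: $flat items\n";
```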

Slide 31

- Subroutines also use a list as output.
# the return statement from a subroutine
return(1,2); # or simply (1,2)

# read the return values from the subroutine
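($a,$b) = &subr();

- Read the main program arguments from @ARGV. Unlike in C there is no $ARGC:
the number of arguments is scalar(@ARGV), and $ARGV[0] is the first argument,
not the program name (which is in $0)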

Slide 32
More about file management
open(INFILE,"myfile"): reading
open(OUTFILE,">myfile"): writing
open(OUTFILE,">>myfile"): appending
open(INFILE,"someprogram |"): reading from program
open(OUTFILE,"| someprogram"): writing to program
opendir(DIR,"mydirectory"): open directory

Operations on an open file handle
$a = <INFILE>: read a line from INFILE into $a
@a = <INFILE>: read all lines from INFILE into @a
$a = readdir(DIR): read a filename from DIR into $a
@a = readdir(DIR): read all filenames from DIR into @a
read(INFILE,$a,$length): read $length characters from INFILE into $a
print OUTFILE "text": write some text in OUTFILE

Close files / directories
close(FILE): close a file
closedir(DIR): close a directory
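
A sketch that combines writing, appending and reading. The file name is hypothetical and the file is created in the current directory; unlink, which deletes the file again at the end, is listed among the other file management commands.

```perl
use strict;
use warnings;

# Hypothetical file name, created in the current directory.
my $myfile = "perl_tutorial_example.txt";

open(OUTFILE, ">$myfile") or die("cannot open $myfile for writing!");
print OUTFILE "first line\n";
print OUTFILE "second line\n";
close(OUTFILE);

open(OUTFILE, ">>$myfile") or die("cannot open $myfile for appending!");
print OUTFILE "third line\n";
close(OUTFILE);

open(INFILE, $myfile) or die("cannot open $myfile for reading!");
my @lines = <INFILE>;    # read all lines at once
close(INFILE);

print "read ", scalar(@lines), " lines, the last one is: $lines[$#lines]";

unlink($myfile);         # clean up
```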


Slide 33
Other file management commands
binmode(HANDLE): change file mode from text to binary
unlink("myfile"): delete file myfile
rename("file1","file2"): change name of file file1 to file2
mkdir("mydir"): create directory mydir
rmdir("mydir"): delete directory mydir
chdir("mydir"): change the current directory to mydir
system("command"): execute command command
die("message"): exit program with message message
warn("message"): warn user about problem message

Example
open(INFILE,"myfile") or die("cannot open myfile!");

Other

About $_

The default variable: holds the current item when no other variable is named
Examples:
while(<INFILE>) # $_ contains the current line read
foreach (@array) # $_ contains the current element in @array
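
Many operations fall back on $_ when no variable is given. A small sketch using a made-up list of names:

```perl
use strict;
use warnings;

my @array = ("an", "bert", "cindy", "dirk");
my $count = 0;

foreach (@array) {
    # no loop variable is named, so each element is available as $_
    if (/i/) {          # same as: if ($_ =~ /i/)
        print;          # same as: print $_;
        print "\n";
        $count++;
    }
}
```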
