_________________________________________________________________

PERL

Practical Extraction and Report Language
or
Pathologically Eclectic Rubbish Lister

CS/EE Systems Group Workshop Lecture

Trent A. Fisher

Based on a lecture by Tom Christiansen
CONVEX Computer Corporation

Version 1.3, 92/11/20.
_________________________________________________________________

Overview

o What is Perl: features, advantages, disadvantages, preview
o Data Types: scalars and various sorts of arrays and their
  associated builtin functions.
o Regular Expressions
o Operators
o Flow Control
o I/O: regular I/O, system functions, directory access, formatted
  I/O
o Functions and Subroutines
o Advanced topics: eval, dbm file access, dynamic scoping, suid
  scripts, debugging, packages, library packages and command line
  options
o Examples
_________________________________________________________________

What is Perl?

o An interpreted language that looks a lot like C with built-in
  sed, awk, and sh, as well as bits of csh, Pascal, FORTRAN and
  BASIC-PLUS thrown in.
o Highly optimized for manipulating printable text, but also able
  to handle binary data.
o Especially suitable for system management tasks due to
  interfaces to most common system calls.
o Rich enough for most general programming tasks.
o A shell for C programmers. [Larry Wall]
_________________________________________________________________

Features

o Easy to learn because much derives from existing tools.
o More rapid program development because it's an interpreter.
o Faster execution than shell script equivalents.
o More powerful than sed, awk or sh.
o Translators available for old find, sed and awk scripts.
o Comes with a symbolic debugger (itself written in Perl).
o Portable across many different architectures.
o Absence of arbitrary limits like string length.
o It's free!
_________________________________________________________________

Disadvantages

o Difficult to learn because much derives from existing tools.
o No complex data types.
o Slower than C.
o Tends to use large amounts of memory.
o Powerful rather than elegant.
o Feeping Creaturism.
_________________________________________________________________

Where to get it

o Any comp.sources.unix archive

Famous archive servers
o uunet.uu.net              192.48.96.2
o tut.cis.ohio-state.edu    128.146.8.60

Its author, Larry Wall
o jpl-devvax.jpl.nasa.gov   128.149.1.143
_________________________________________________________________

Resources

o The man page.
o Programming Perl by Wall and Schwartz.
o Perl reference guide (in postscript form) also available from
  Ohio State, along with some sample scripts.
o A Texinfo Reference Guide (available via info)
o Also, convex.com has many sample scripts plus comp.lang.perl
  archives.
o USENET newsgroup comp.lang.perl good source for questions,
  comments, examples.
o Locally, a collection of scripts can be found in
  /home/rigel/diner/banzai/perl, and also in ~trent/src/perl.
_________________________________________________________________

Preview

It's not for nothing that perl is sometimes called the
pathologically eclectic rubbish lister.  Before you drown in a
deluge of features, here's a simple example to whet your appetites
that demonstrates the principal features of the language, all of
which have been present since version 1.

    while (<>) {
        next if /^#/;
        ($x, $y, $z) = /(\S+)\s+(\d\d\d)\s+(foo|bar)/;
        $x =~ tr/a-z/A-Z/;
        $seen{$x}++;
        $z =~ s/foo/fear/ && $scared++;
        printf "%s %08x %-10s\n", $z, $y, $x
            if $seen{$x} > $y;
    }
_________________________________________________________________

General Coding Info

o Executable scripts start with #!/usr/local/bin/perl.
o Use standard # for comments.
o No need to use backslash for continuation lines.
o White space almost never matters.
o Strings may contain newlines.
o Statements end in semi-colons.
o Blocks use braces like C.
_________________________________________________________________

Data Types

Perl has one basic data type: scalars.

o Scalars are either string, numeric, or boolean, depending on
  context.
o Boolean is much like C: values of 0 (zero) and '' (null string)
  are false; all else is true.
o Numbers are stored, internally, as double precision.
o Variables are not declared, except for locals.
o Variables are automagically initialized to the null string or 0
  depending on the context.
o Precede the identifier with $ to indicate a scalar value.

    $foo = 3.14159;
    $foo = $foo + 1;
_________________________________________________________________

Quoting

o Quoting is (almost) like the shell: single, double, backtick
  and here documents.

    $foo = 'red';
    $foo = 'I paid $4275000.32';
    $foo = "was $foo before";
    $foo = "was ${foo}ish before";
    $host = `hostname`;
    $foo = <

o $>  the effective uid of the process
o $)  the effective gid of the process

Many others (one for every symbol on the keyboard, plus one).
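
The rule from the Data Types slide, that one scalar is a string, a number, or a boolean depending on context, can be sketched roughly as follows. The variable names and values are invented for illustration only.

```perl
# Sketch only: names and values below are made up for illustration.
$count = "42";                  # a scalar holding the characters 4 and 2
$sum   = $count + 1;            # numeric context
$text  = $count . 1;            # string context

# Boolean context: 0 and the null string are false, all else is true.
@truths = ();
foreach $value (0, '', 'a', 1) {
    push(@truths, $value ? 'true' : 'false');
}

# A never-assigned variable acts as 0 or as the null string as needed.
$unset_num = $never_set + 5;
$unset_str = "[" . $never_set . "]";

print "$sum $text $unset_num $unset_str\n";   # 43 421 5 []
print "@truths\n";                            # false false true true
```

Running this with -w would be expected to warn about the use of the unset variable, which is one reason -w is worth having on while learning.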
_________________________________________________________________

Data Types

o Scalars can be aggregated into indexed arrays (lists) or
  associative arrays.
o Type of variable determined by leading special character.

    $   scalar
    @   indexed array (lists)
    %   associative array
    &   function (call)
    *   all of the above

o All data types have their own separate namespaces, as do
  labels, functions, and file and directory handles.

Gotcha
o No nested data structures or C style structs.  However,
  associative arrays often can fake it well enough.
_________________________________________________________________

Data types (lists)

o Indexed arrays use $ for one scalar element, @ for all or a
  portion

    $foo[$i+2] = 3;           # set one element to 3
    @foo = ( 1, 3, 5 );       # init whole array
    @foo = ( );               # initialize empty array
    @foo = @bar;              # copy whole @array
    @foo = @bar[$i..$i+5];    # copy slice of @array
    @foo[3..5] = @bar;        # first 3 elements of bar

o Lists can be represented by a series of values enclosed in
  parentheses.

    ($foo, $bar, $glarch) = ('eggs', 'sausage', 'spam');
    ($foo, $bar) = ($bar, $foo);    # exchange
    $fullname = (getpwnam("trent"))[6];
    print (('a'..'z')[15, 4, 17, 11]);

o $#ARRAY is the index of the highest subscript, and evaluating
  @ARRAY in a scalar context will return the number of elements.
o The builtin variable $[ defines the starting index of arrays.
  0 by default.
_________________________________________________________________

List Weirdness

o Interpolating a list in double quotes will produce elements
  separated by spaces.  Without quotes, all elements are
  concatenated.
o If a backquote evaluation is done into an array, each line of
  output is put into a separate element (w/ a newline).

    @yppasswd = `ypcat passwd`;
    chop(@yppasswd);

o There are two contexts: scalar and array.
  You can use the scalar function to force an array returning
  function to supply a scalar.  For example:

    $foo = <STDIN>;           # get next line
    @foo = <STDIN>;           # get whole file
    @foo = scalar(<STDIN>);   # get next line as $foo[0]

o Attempts to combine lists will result in one flat list.

    @foo = ( 1, 3, 5 );
    @bar = ( 2, 4, 6 );
    @new = ( @foo, 10, @bar );
    print "@new\n";
    => 1 3 5 10 2 4 6
_________________________________________________________________

Special Array Variables

@ARGV  Command line arguments.  The script's name is in $0, so
       the arguments run from $ARGV[0] through $ARGV[$#ARGV],
       inclusive.  For example, a simple-minded echo could be
       written as:

           print "@ARGV\n";

@INC   search path for files called with do
@_     default for split and subroutine parameters.
_________________________________________________________________

Built-in Array Functions

o Indexed arrays function as lists; you can add items to or
  remove them from either end using these functions:

    pop       remove last value from end of array
    push      add values to end of array
    shift     remove first value from front of array
    unshift   add values to front of array

  For example:

    push(@list, $bar);
    push(@list, @rest);
    $tos = pop(@list);
    while ( $arg = shift(@ARGV) ) { }
    unshift( @ARGV, 'zeroth arg', 'first arg');
_________________________________________________________________

Built-in Array Functions (split and join)

o split breaks up a string into an array of new strings.  You can
  split on arbitrary regular expressions, limit the number of
  fields you split into, and save the delimiters if you want.
o By default, uses $_ for input, @_ for output, and /\s+/ for the
  delimiter.
o In a scalar context, it returns the number of fields found.

    @list = split(/[, \t]+/, $expr);
    while (<PASSWD>) {
        ($login, $passwd, $uid, $gid, $gcos, $home, $shell)
            = split(/:/);
    }

o The inverse of split is join.

    $line = join(':', $login, $passwd, $uid, $gid,
                      $gcos, $home, $shell);
_________________________________________________________________

Built-in Array Functions (sort, grep, reverse)

o reverse inverts a list.  (In a scalar context it reverses the
  characters of a string instead.)

    foreach $tick (reverse 0..10) { }
    $is_palindrome = reverse($foo) eq $foo;

o sort returns a new array with the elements ordered according to
  their ASCII values.  Use your own routine for different
  collating.

    print sort @list;
    sub numerically { $a - $b; }
    print sort numerically @list;

o grep returns a new list consisting of all the elements for
  which a given expression is true.  For example, this will
  delete all lines with leading pound signs:

    @lines = grep(!/^#/, @lines);

o In a scalar context, it returns the number of lines in the
  result.
_________________________________________________________________

Associative Arrays

o Instead of being indexed by just integers, these may be indexed
  by any scalar.
o Use $ for one scalar element, % for all.  Use curly braces for
  indexing.

    $frogs{'green'} += 23;            # 23 more green frogs
    %bar = ( 1, 'one', 2, 'two' );
    %foo = %bar;                      # copy whole %array
    @frogs{'green', 'blue', 'yellow'} = (3, 6, 9);

o Multi-dimensional arrays can be faked.  The real index will be
  the subscripts joined with $; as a separator.

    $location{$x, $y, $z} = 'troll';  # multi-dim array

o Use the defined operator to check for existence of an element.

    if ($foo{'bar'})            # might be 0
    if (defined $foo{'bar'})

o An element can be removed with delete.

    delete $table{$killed};
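
The difference between testing an element's truth and testing its existence, and the way a multi-dimensional subscript is stored, can be sketched as below. The %stock inventory is hypothetical, and the sketch assumes $; still holds its default value of "\034".

```perl
# Sketch only: %stock and its contents are invented for illustration.
%stock = ('eggs', 0, 'spam', 12);

# A stored 0 tests false, yet the element is there.
$truth   = $stock{'eggs'} ? 1 : 0;
$present = defined($stock{'eggs'}) ? 1 : 0;
$missing = defined($stock{'ham'}) ? 1 : 0;

delete $stock{'spam'};
$left = join(',', sort keys %stock);

# A "multi-dimensional" subscript is one key: the values joined by $;
# (assumed here to be its default, "\034").
$location{1, 2, 3} = 'troll';
($key) = keys %location;
@coords = split(/\034/, $key);

print "$truth $present $missing $left @coords\n";   # 0 1 0 eggs 1 2 3
```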
_________________________________________________________________

Special Associative Array Variables

%ENV   the current environment; e.g. "$ENV{'HOME'}"
%SIG   The list of signal handling subroutines.

    sub trapped {
        print STDERR "Interrupted\007\n";
        exit 1;
    }
    $SIG{'INT'} = 'trapped';
_________________________________________________________________

Built-in Array Functions (%arrays)

For manipulating associative arrays, the keys and values
functions return indexed arrays of the indices and data values
respectively.  each is used to iterate through an associative
array to retrieve one ($key,$value) pair at a time.

    while (($key,$value) = each %array) {
        printf "%s is %s\n", $key, $value;
    }
    foreach $key (keys %array) {
        printf "%s is %s\n", $key, $array{$key};
    }
    print reverse sort values %array;
_________________________________________________________________

Regular Expressions

o Understands egrep regexps, plus
o Meta-characters
    o \w, \W   alphanumerics plus _ (and negation)
    o \d, \D   digits (and negation)
    o \s, \S   white space (and negation)
    o \b, \B   word boundaries (and negation)
o C-style escapes recognized, like \t, \n, \034
o Don't escape these characters for their special meaning:
  ( ) | { } +
o Character classes may contain metas, e.g. [\w.$]
o Special variables: $& means all text matched, $` is text before
  match, $' is text after match.
_________________________________________________________________

Regular Expressions (continued)

o Matches are done against $_ by enclosing the regular expression
  in slashes.  It returns 1 if it matches, 0 otherwise.

    print if /^Subject: /;

o A trailing i means the matching will be case-insensitive.
o Matches within parentheses are put into \1 ..
  \9 within regular expressions; $1 .. $9 outside.

    if (/^this (red|blue|green) (bat|ball) is \1/) {
        ($color, $object) = ($1, $2);
    }
    if (/(\d\d):(\d\d):(\d\d)/) {
        $hours = $1;
        $minutes = $2;
        $seconds = $3;
    }

o In an array context, the match operator returns the matches
  within parens.

    ($color, $object) =
        /^this (red|blue|green) (bat|ball) is \1/;
_________________________________________________________________

Regular Expressions (continued)

o Substitute and translation operators are like sed's s and y.

    s/alpha/beta/;
    s/(.)\1/$1/g;
    y/A-Z/a-z/;

o Use =~ and !~ to match against variables

    if ($foo !~ /^\w+$/) { exit 1; }
    $foo =~ s/\boregon\b/OR/i;

o The /e modifier means evaluate the RHS as an expression.

    $_ = "7abc123xyz\n";
    s/\d+/3 * $&/eg;          # "21abc369xyz\n";
_________________________________________________________________

Operators

Perl has a rich set of operators, many of which are stolen from
other languages (like C).

o Perl has an operator precedence scheme as complex as C's.
o Standard operators: +, -, *, /, %
o Bitwise operators: |, &, ^, << and >>
o Assignment: All of the above operators can be combined with =.

    $foo += 42;

o Auto-increment.  Both pre- and post- increment and decrement.

    $foo++;
    $foo = --$bar;

o Exponentiation: **, **=
o Range operator: ..  In a scalar context, it indicates a line
  range; in an array context it generates a list of the given
  range.

    $inheader = 1 if /^From / .. /^$/;
    if (1..10) { do foo(); }
    for $i (60..75) { do foo($i); }
    @new = @old[30..50];

o String concatenation: ., .=

    $x = $y . &frob(@list) . $z;
    $x .= "\n";
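
The two personalities of the range operator can be sketched as follows: in a scalar context it switches on at the first pattern and off after the second, and in an array context it simply builds a list. The sample message is invented, and a fixed array stands in for real input so the sketch runs on its own.

```perl
# Sketch only: @message is an invented stand-in for lines read from a file.
@message = (
    "From trent Fri Nov 20 1992\n",
    "Subject: perl lecture\n",
    "\n",
    "First line of the body.\n",
    "Second line of the body.\n"
);

@header = ();
@body   = ();
foreach (@message) {
    # Scalar context: true from the "From " line through the first
    # empty line, false afterwards.
    if (/^From / .. /^$/) { push(@header, $_); }
    else                  { push(@body, $_); }
}

# Array context: a plain list of the values in the range.
@ticks = (3..6);

print scalar(@header), " header lines, ", scalar(@body), " body lines\n";
print "@ticks\n";
```

The empty line that ends the header is still counted as part of the range, so it lands in @header rather than @body.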
_________________________________________________________________

Operators (continued)

o String repetition: x, x=

    $bar = '-' x 72;          # row of 72 dashes

o Numeric comparison: ==, !=, <, >, <=, >= and <=>
o String tests: eq, ne, lt, gt, le, ge and cmp

    if ($x eq 'foo') { }
    if ($x ge 'red' ) { }

o Ternary operator: ?:
o Comma operator.
o Search or substitution operators: =~, !~
o File test operators, like augmented /bin/test tests, work on
  strings or filehandles

    if (-e $file) { }         # file exists
    if (-z $file) { }         # zero length
    if (-O LOG) { }           # LOG owned by real uid
    die "$file not a text file" unless -T $file;
_________________________________________________________________

Scalar Built-ins

o Besides the powerful regular expression features, several
  well-known C string manipulation functions are provided,
  including crypt, index, rindex, length, substr, and sprintf.
o The chop function efficiently removes the last character from a
  string.  It's usually used to delete the trailing newline on
  input lines.  Like many perl operators, it works on $_ if no
  operand is given.

    chop($line);
    chop ($host = `hostname`);
    while (<STDIN>) {
        chop;
        ...
    }

o The substr function can be assigned to:

    $phrase = "in big words";
    substr($phrase, 3, 3) = "other";
    print $phrase, "\n";
    => in other words
_________________________________________________________________

More Scalar Built-ins

o Perl has the usual set of mathematical functions: atan2, cos,
  exp, int, log, sin, sqrt, rand, srand.
o Conversion functions: hex, oct and ord.
o Time functions: time, times, localtime and gmtime.
o Bitwise manipulation: vec.
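
A few of the scalar built-ins named on the last two slides, used together in a rough sketch. All values are invented for illustration.

```perl
# Sketch only: the values are made up for illustration.
$mode  = oct("0644");         # octal string to number
$mask  = hex("ff");           # hex string to number
$code  = ord("A");            # character to its ASCII code
$whole = int(7.9);            # truncate toward zero

# sprintf converts back the other way.
$label = sprintf("%03o %02x %c", $mode, $mask, $code);

# index finds the position that the substr assignment then replaces.
$phrase = "in big words";
$at = index($phrase, "big");
substr($phrase, $at, length("big")) = "other";

print "$mode $mask $code $whole\n";   # 420 255 65 7
print "$label\n";                     # 644 ff A
print "$phrase\n";                    # in other words
```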
_________________________________________________________________

Flow Control

o Unlike C, blocks always require enclosing braces {}
o unless and until are just if and while negated
    o if (EXPR) BLOCK else BLOCK
    o if (EXPR) BLOCK elsif (EXPR) BLOCK else BLOCK
    o while (EXPR) BLOCK
    o do BLOCK while EXPR
    o for (EXPR; EXPR; EXPR) BLOCK
    o foreach $VAR (LIST) BLOCK
o For readability, if, unless, while, and until may be used as
  trailing statement modifiers as in BASIC-PLUS

    return -1 unless $x > 0;

o do takes 3 forms
    o execute a block

        do { $x += $a[$i++] } until $i > $j;

    o execute a subroutine

        do foo($x, $y);

    o execute a file in current context

        do 'subroutines.pl';
_________________________________________________________________

Flow Control (continued)

o Use next and last rather than C's continue and break
o redo restarts the current iteration, ignoring the loop test
o A block for a loop can have a continue part, which will be
  executed at the end of each iteration, including when next is
  used.

    foreach $i (@list) {
        next if $i =~ /spam/;
        ...
    } continue {
        $last = $i;
    }

o Blocks (and next, last, and redo) take optional labels for
  clearer loop control, avoiding the use of goto to exit nested
  loops.
o || and && work as in the shell.  They evaluate left to right,
  using short-circuit logic, and return the last expression
  evaluated.

    open(FILE, $file) || die "$file: $!";
    unlink($bar) && print "ok\n";
_________________________________________________________________

I/O

o Filehandles have their own distinct namespaces, but are
  typically all upper case for clarity.  Pre-defined filehandles
  are STDIN, STDOUT, STDERR and DATA.
o The DATA filehandle refers to everything after the __END__ in
  the script.
o Mentioning a filehandle in angle brackets reads the next line
  in a scalar context, all lines in an array context; newlines
  are left intact.

    $line = <STDIN>;
    @lines = <STDIN>;

o <> means all files supplied on command line (or STDIN if none).
  When used this way, $ARGV is the current filename.
o When used in a while construct, input lines are automatically
  assigned to the $_ variable.
o It is common to iterate over a file a line at a time, assigning
  to $_ (implicitly) each time and using that as the default
  operand.

    while ( <> ) {
        chop;
        next if /^#/;         # skip comments
        s/left/right/g;       # global substitute
        print;                # print $_
    }
_________________________________________________________________

I/O (continued)

o If not using the pseudo-file <>, use open to create new ones.

    open (PWD, "/etc/passwd");
    open (TMP, ">/tmp/foobar.$$");
    open (LOG, ">>logfile");
    open (TOPIPE, "| lpr");
    open (FROMPIPE, "/usr/etc/netstat -a |");

o open returns 1 on success (or the child's pid for a pipe), 0
  otherwise.

    open (PWD, "/etc/passwd") || die "ACK!";

o Both print and printf take a filehandle as an optional first
  argument.

    printf LOG "%-8s %s: weird bits: %08x\n",
        $program, &ctime, $bits;

o May also use getc for character I/O and read for raw I/O.
o Access to eof, seek, close, flock, ioctl, fcntl, and select
  calls for use with filehandles.
o Access to mkdir, rmdir, chmod, chown, link, symlink (if
  supported), stat, rename, unlink calls for use with filenames.

Gotcha
o Don't forget to close pipes and files.
_________________________________________________________________

Formatted I/O

o Besides printf, formatted I/O can be done with format and write
  statements.
o Automatic pagination and printing of headers.
o Picture description facilitates lining up multi-line output
o Fields in picture may be left or right-justified or centered
o Multi-line text-block filling is provided (something like
  having a %s format string with a built-in pipe to fmt)
o These special scalar variables are useful:

    $%   current page number
    $=   current page length (default 60)
    $-   lines left on page
_________________________________________________________________

Formatted I/O (example)

    # a report from a bug report form; taken from perl man page
    format top =
                            Bug Reports
    @<<<<<<<<<<<<<<<<<<<<<<<     @|||         @>>>>>>>>>>>>>>>>>>>>>>>
    $system,                      $%,         $date
    ------------------------------------------------------------------
    .
    format STDOUT =
    Subject: @<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
             $subject
    Index: @<<<<<<<<<<<<<<<<<<<<<<<<<<<< ^<<<<<<<<<<<<<<<<<<<<<<<<<<<<
           $index,                       $description
    Priority: @<<<<<<<<<< Date: @<<<<<<< ^<<<<<<<<<<<<<<<<<<<<<<<<<<<<
              $priority,        $date,   $description
    From: @<<<<<<<<<<<<<<<<<<<<<<<<<<<<< ^<<<<<<<<<<<<<<<<<<<<<<<<<<<<
          $from,                         $description
    Assigned to: @<<<<<<<<<<<<<<<<<<<<<< ^<<<<<<<<<<<<<<<<<<<<<<<<<<<<
                 $programmer,            $description
    ~                                    ^<<<<<<<<<<<<<<<<<<<<<<<<<<<<
                                         $description
    ~                                    ^<<<<<<<<<<<<<<<<<<<<<<<<<<<<
                                         $description
    ~                                    ^<<<<<<<<<<<<<<<<<<<<<<<...
                                         $description
    .
_________________________________________________________________

Directory Access

Three methods of accessing directories are provided.

o You may open a pipe from /bin/ls like this:

    open(FILES, "/bin/ls *.c |");
    while ($file = <FILES>) {
        chop($file);
        ...
    }

o The directory-reading routines are provided as built-ins and
  operate on directory handles.  Supported routines are opendir,
  readdir, closedir, seekdir, telldir, and rewinddir.
  For example:

    opendir(DIR, ".");
    @files = readdir(DIR);
    print "@files";

o The easiest way is to use perl's file globbing notation.  A
  string enclosed in angle brackets containing shell
  meta-characters evaluates to a list of matching filenames.

    foreach $x ( <*.[ch]> ) { rename($x, "$x.old"); }
    chmod 0644, <*.c>;
_________________________________________________________________

Database Access

o Associative arrays may be bound to dbm files with dbmopen().

    dbmopen(%ALIAS, '/usr/lib/aliases', undef)
        || die "can't dbmopen aliases: $!";

o Use each, not keys, for big dbm files.

    $alias = $ALIAS{"postmaster\0"};
    chop $alias;
    print "postmaster -> $alias\n";

Gotcha
o Many dbm files have null bytes on the end of the keys and
  values.
_________________________________________________________________

System Functions

A plethora of functions from the C library are provided as
built-ins, including most system calls.  These include:

o chdir, chroot, exec, exit, fork, getlogin, getpgrp, getppid,
  kill, setpgrp, setpriority, sleep, syscall, system, times,
  umask, wait.
o If your system has Berkeley-style networking, bind, connect,
  send, getsockname, getsockopt, getpeername, recv, listen,
  socket, socketpair.
o getpw*, getgr*, gethost*, getnet*, getserv*, and getproto*.
_________________________________________________________________

Handling Binary Data

o Perl strings can hold arbitrary binary data.
o The functions pack and unpack convert between lists of values
  and binary strings.
o Each takes a template describing the conversion.

    $var = pack(TEMPLATE, LIST);
    @ary = unpack(TEMPLATE, EXPR);

o A template is a string which specifies a type (e.g. byte,
  short, float) and a quantity, or * for the rest of the data.
o Spaces are ignored in a template (for readability).
o Examples:

    S4         Four unsigned shorts
    i f c8     An int, a float, and 8 chars
    A20 A*     A 20 byte string and the rest as a string
    c3 x i     3 chars, skip one byte, and an int

o For example, to read /etc/utmp:

    while ( read(utmp, $utmp, 36) ) {
        ($line, $name, $host, $time) =
            unpack('A8 A8 A16 l', $utmp);
        ...
    }

o unpack is more efficient than substr at breaking up static data
_________________________________________________________________

Bit Strings

o To help in manipulating binary data, and to support the BSD
  select call, there is a special kind of string called a
  bit-vector.
o The bitwise operators |, &, ^ and ~ assume bit-vector mode if
  given strings.
o Like substr, vec may be used as an lvalue.

    $rin = '';
    vec($rin, fileno(SOCKET), 1) = 1;
    print unpack("b*", $rin);

Gotcha
If you evaluate a bit-vector in a numeric context, it loses its
magic.
_________________________________________________________________

Subroutines

o Order of definition doesn't matter.
o Functions may be called recursively or indirectly.
o Subroutines are called either with the `do' operator or with
  `&'.
o Scalars, lists and arrays may be passed as parameters.
  However, they all become one list.

    do foo(1.43);
    do foo(@list);
    $x = &foo('red', 3, @others);
    @list = &foo(@olist);
    %foo = &foo($foo, @foo);
_________________________________________________________________

Subroutines (continued)

o Parameters are received by the subroutine in the special array
  @_.
o Parameters are passed by reference, so they should be copied to
  local variables.

    $result = &simple($alpha, $beta, @tutti);
    sub simple {
        local($x, $y, @rest) = @_;
        local($sum, %seen);
        return $sum;
    }

o Subroutines may also be called indirectly

    $foo = 'some_routine';
    do $foo(@list);
    ($x, $y, $z) = do $foo(%maps);

o Since @_ supplies references to the variables given as
  parameters, these variables may be modified.  For example:

    sub invflag {
        if (($_[0] != 0) && ($_[0] != 1)) {
            $_[0] = 0;
        } else {
            $_[0] ^= 1;
        }
    }
_________________________________________________________________

Dynamic scoping

o The scope of local() is through the end of its block and is
  visible in subroutines called in that block.

    sub wisconsin {
        local($bar) = 18;
        &drink();
    }
    sub illinois {
        local($bar) = 21;
        &drink();
    }
    sub drink {
        if ($AGE < $bar) { &bouncer(); }
        else             { &imbibe(); }
    }
    $state = "wisconsin";
    &$state();
_________________________________________________________________

Eval

o The eval operator lets you execute dynamically generated code.
o Any syntax or runtime errors (including die messages) will be
  put in $@ and the eval will return the undefined value.
o An eval counts as a block, so locals declared within it last
  only until it's done, but other variables and subroutines will
  persist.
o For example, to process any command line arguments of the form
  variable=value, place this at the top of your script:

    eval '$'.$1."'$2';"
        while $ARGV[0] =~ /^([A-Za-z_]+=)(.*)/ && shift;
_________________________________________________________________

Eval Applications

o The eval operator is also useful for run-time testing of
  system-dependent features which would otherwise trigger fatal
  errors.  For example, not all systems support symlink or
  dbmopen.

    eval "symlink(\$old, \$new)" || warn ("oops: " .
        ($@ ? "no symlinks" : "symlink ($old $new): $!"));

o Another use of eval and die is to return control from a deeply
  nested subroutine, much as setjmp() and longjmp() do in C.

    eval '&func1';
    die $@ unless $@ =~ /exception:/i;
    sub func1 { &func2; }
    sub func2 { &func3; }
    sub func3 { die "EXCEPTION: returning"; }

o While you can't make an array of arrays, you can make an array
  of array names.
o For example, you can use @name to contain the names of arrays
  (which actually contain the data).

    $ary = $name[$i];
    $val = eval "\$${ary}[$j]";
_________________________________________________________________

Packages

o Using packages you can write modules with separate namespaces
  to avoid naming conflicts.
o Variables are accessed by the package'name notation.

    $packname'variable++;
    open (APACK'FILE, "<$somefile") || die;
    for (keys %network'listeners) { }

o Identifiers that are not fully-qualified are considered to be
  in the current package.
o The null package is the same as the main package: $'foo is the
  same as $main'foo.
o Perl programs begin in the main package.
o Use the package statement to switch to another package.
o The scope of a package statement is lexical, not dynamic, and
  lasts until the end of the current block.
_________________________________________________________________

Suid Scripts

o Perl programs can be made to run setuid, and can actually be
  more secure than the corresponding C program.
o Because interpreters have no guarantee that the filename they
  get as the first argument is the same file that was exec'ed,
  perl won't let you run a setuid script on a system where
  setuid scripts are not disabled.
o Using a dataflow tracing mechanism triggered by setuid
  execution, perl can tell what data is safe to use and what data
  comes from an external source and thus is tainted.
o Tainted data may not be used directly or indirectly in any
  command that modifies files, directories or processes, or else
  a fatal run-time error will result.
_________________________________________________________________

Perl Libraries

Perl comes with many useful library routines.

shellwords        Breaks a line up like the shell (quotes).
termcap           Accesses the termcap database
complete          Interactive completion of words
getopt, getopts   For parsing command line arguments, similar to
                  its C counterpart.
importenv         Turn the environment into perl variables, e.g.
                  $HOME.
pwd               Sets up current directory tracking, i.e. $CWD.
ctime             For printing dates.
bigint, bigrat, bigfloat
                  Arbitrary precision math.
_________________________________________________________________

Require and Search Paths

o To use a library, use require:

    require 'getopts.pl';

o Perl's library search path is in @INC.
o The PERLLIB environment variable is a colon-delimited list of
  pathnames which will be prepended to @INC if set.
o Perl will only include a file once under the same name.  It
  maintains a table in %INC of what's been included.
_________________________________________________________________

Debugging

o When invoked with the -d switch, perl runs your program under a
  symbolic debugger (written in perl).
  It can:
    o Set breakpoints, conditional or unconditional
    o Print variables, list source code
    o Produce stack traces
    o Single step through code
    o Set arbitrary actions at certain lines
  Because it uses eval on your code, you can execute any
  arbitrary perl code you want from the debugger.
o You can enter the debugger without a script via:

    perl -de 0
_________________________________________________________________

Command Line Options

The following are the more important command line switches
recognized by perl:

o -v   print out version string
o -w   issue warnings about error-prone constructs
o -d   run script under the debugger
o -e   like sed: used to enter single command lines
o -n   loop around input like sed -n
o -p   as with -n but print out each line
o -i   edit files in place
o -a   turn on autosplit mode (like awk) into @F array
o -P   call C pre-processor on script
_________________________________________________________________

Common mistakes

o Forgetting the semi-colon at the end of a statement.
o Leaving off the $ in the left-hand side of an assignment, or in
  a foreach loop.
o Using == instead of eq.
o Putting a comma after a filehandle in a print statement.

    print STDOUT, "hello, world\n";

o Forgetting the & in a function call.
o Using else if or elif instead of elsif.
o Forgetting to save $1 .. $9 before doing other matches or
  substitutions.
o Forgetting to enclose a BLOCK in curly braces.
o Not chopping input lines, or results from a backtick'ed
  command.
o Saying @foo[1] instead of $foo[1].
o Different behaviours in scalar and array context.
o Not using -w to find typos
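
Three of the mistakes above are easy to demonstrate. The following is a rough sketch with invented values: it shows == quietly comparing two different strings as equal, an array giving different answers in scalar and array context, and a second match overwriting $1.

```perl
# Sketch only: names and values are made up for illustration.

# == compares numbers; both strings are 0 as numbers, so they "match".
$left  = "abc";
$right = "xyz";
$numeric_equal = ($left == $right) ? 1 : 0;
$string_equal  = ($left eq $right) ? 1 : 0;

# The same array gives a count in scalar context, elements in array context.
@list = (10, 20, 30);
$count   = @list;
($first) = @list;

# $1 must be saved before the next match overwrites it.
$_ = "12:34:56";
if (/(\d\d):(\d\d):(\d\d)/) {
    $hours = $1;              # saved in time
    "ab" =~ /(a)/;            # a later match...
    $clobbered = $1;          # ...has replaced the old $1
}

print "$numeric_equal $string_equal $count $first $hours $clobbered\n";
# prints: 1 0 3 10 12 a
```

With -w the numeric comparison of "abc" and "xyz" would be expected to draw a warning, which is the point of the last item on the slide.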
_________________________________________________________________

Examples: Command Line

    # output current version
    perl -v

    # simplest perl program
    perl -e 'print "hello, world.\n";'

    # useful at end of "find foo -print"
    perl -n -e 'chop;unlink;'

    # add first and last columns (filter)
    perl -a -n -e 'print $F[0] + $F[$#F], "\n";'

    # in-place edit of *.c files changing all foo to bar
    perl -p -i -e 's/\bfoo\b/bar/g;' *.c

    # run a script under the debugger
    perl -d myscript
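
To see what the -n and -a switches do for the "add first and last columns" filter, it may help to spell the work out by hand. With -n perl wraps the program in a while (<>) { ... } loop, and with -a it splits each line into @F first. The sketch below does the same per-line work inside a hypothetical helper, add_ends, which takes its lines as arguments rather than from <> so that it runs without any input.

```perl
# Sketch only: add_ends is a hypothetical helper, not part of perl.
sub add_ends {
    local(@sums);
    foreach (@_) {                        # -n would loop over <> instead
        @F = split(' ');                  # what -a does for each line
        push(@sums, $F[0] + $F[$#F]);     # first column plus last column
    }
    @sums;
}

@out = &add_ends("1 2 3\n", "10 x 5\n");
print "@out\n";                           # 4 15
```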