Shell Programming

CS/EE Systems Group Workshop Lecture
Trent A. Fisher

Advantages

o Portability (no need to recompile)
o Speed of writing/testing
o Ease of pipelining information
o No data types

Disadvantages

o No complex data types. (Arrays can be faked.)
o Some operations are difficult or impossible.
o Can be much slower than a C program.
o No data types

o Lowest common denominator

      When aiming for the common denominator, be prepared
      for the occasional division by zero.
                                    --Jim des Rivieres

o Size

        text    data   bss     dec  program
       28672    4096     0   32768  /bin/sh
      106496    8192  7632  122320  /bin/csh
      126976    8192  4188  139356  /bin/ksh
      188416   53248  5168  246832  /usr/local/bin/bash

o Portability
o More flexibility for programming
o Fewer bugs

o Execute commands

  This means the shell has to create a new process and have it run
  the desired binary.

        shell
          |
        fork()
          |  \
          |   \
          |   exec()
          |    |
          |   new
          |   program
          |    |
          |   /
          |  /
        wait()

o If a command is run in the background (i.e. with a trailing &),
  the wait call is not done.

o Break command line into arguments

  For example, the command

      gcc -g -Wall slartibartfast.c

  would be broken up like this:

      argv ---> +---+
                | *-+---> gcc
                | *-+---> -g
                | *-+---> -Wall
                | *-+---> slartibartfast.c
                | 0 |
                +---+

o Find binary on the system
o Execute the command
o Supply the return status

  A status of 0 means the program worked normally; anything else
  means something went wrong.

o Redirection
o Piping
o Filename globbing (meta-characters)
o Builtin operations

o I/O streams

  Every process has three default I/O streams: stdin, stdout and
  stderr. These streams are supplied by the parent process. This
  means that a shell can change where they actually go for any
  program it runs, without the program's knowledge.

Forms:

o Redirection
o Append
o By descriptor
o Merging streams
o Closing streams

      <&-   close stdin
      >&-   close stdout

o Here documents

  Read from the script (here) until a keyword is encountered.

      cat << SPAM > x
      How to identify different trees from quite a long way away.
      Number one...
      The Larch.
      The... Larch.
      SPAM

  An excellent example of the use of here documents is

o Connect stdout of the command on the left to the stdin of the
  command on the right. Stderr is unaffected.

  The command line

      cat /etc/passwd | sort -t: +2 | less

  builds a pipeline like this:

    stdin -> cat /etc/passwd -> sort -t: +2 -> less -> stdout
                    |                |          |
                 stderr           stderr     stderr

*      Matches any string of characters (except a leading `.').
?      Matches any single character (except a leading `.').
[...]  Matches any one of the characters enclosed. A pair of
       characters separated by `-' matches any character lexically
       between the pair.

Examples:

    $ ls -A
    .bashrc .cshrc README x xxx y z
    $ echo *
    README x xxx y z
    $ echo ?
    x y z
    $ echo .*
    . .. .bashrc .cshrc
    $ echo [xyz]
    x y z
    $ echo [xyz]*
    x xxx y z
    $ echo .[a-z]*
    .bashrc .cshrc

o Flow control

  If-then-else, loops, and case statements.

o Convenience builtins

  Some utilities are built into the shell for efficiency. Commonly
  these are test, echo, time, and others.

o One utility must be built into the shell: cd

o Another utility which is built into the shell is exec: it runs a
  process by overlaying it on top of the current one (effectively
  destroying the current process).

o The shell can read commands interactively from a terminal, or it
  can read a sequence of commands from a file. There are several
  ways to do this (assuming that njorl is a file that contains some
  shell commands):

      sh njorl

o The other way is to make the file executable:

      chmod +x njorl

The first line of the script indicates what type of script it is.
The best way to specify what shell to use is the `#!' notation. The
first line of an executable file can contain a `#!' followed by the
absolute pathname of the shell, e.g.

      #!/bin/sh

On some systems the first character may also specify what type of
script it is.

o A colon indicates a Bourne shell script.
o A pound sign (`#') without a trailing `!' indicates a csh script.
o Yet other systems (mainly old System V machines) recognize none
  of these and blindly run your script under the same type of shell
  as your login shell.

Identifiers for shell variables are similar to those of other
languages: an alphabetic character followed by alphabetic, numeric
or underscore characters. Shell variables are all strings, although
they can be used as numbers.

Builtin Variables

The shell maintains a number of builtin variables whose values are
determined by the shell itself.

?      The value returned by the last executed command, in decimal.
$      The process number of this shell.
!      The process number of the last background command invoked.
-      Options supplied to the shell on invocation or by the set
       command.
0      Name of the current shell script, or the name of the shell.
#      The number of command line arguments, in decimal.
[1-9]  The command line arguments to the current shell script.
*      All the command line arguments.
@      Similar to, but subtly different from, `*'.

Other shell variables have special meaning to the shell. These
variables are used mainly for configuration; a few are set when the
shell is invoked.

PATH   The search path for commands. It usually contains at least:
       :/bin:/usr/bin:/usr/ucb:/usr/local/bin:/usr/games:
PS1
PS2    Prompt strings. They default to `$ ' and `> ', respectively.
USER   Your login name.
HOME   Your home directory. The default argument for cd.
MAIL   Name of your mail file. Usually,
IFS    Internal field separators, normally space, tab, and newline.

Assignment

Assigning a value to a variable is much like other programming
languages, using an `=' to indicate assignment. For example:

    tree=larch
    meal="spam eggs sausage and spam"

Gotchas

Due to the nature of the shell you have to be careful about spaces.
For example:

    $ tree = larch
    tree: not found
    $ tree= larch
    larch: not found
    $ meal=spam eggs sausage spam
    eggs: not found

Interpolation

o To interpret a variable, precede the identifier with `$'. For
  example:

      $ echo $PATH
      /usr/local/gnu/bin:/home/rigel/rye/trent/bin:
      /usr/ucb:/bin:/usr/bin:/usr/local:
      /usr/local/bin:/usr/games:/usr/etc:
      /usr/local/etc:/etc:/usr/bin/X11:
      $ echo $$
      21167
      $ echo $-
      s

o You may also surround the identifier with braces, as in
  ${identifier}. This is useful if you want to follow the variable
  immediately with other characters. For example:

      $ a1=quotes
      $ a2=monty
      $ echo ${a1}_$a2
      quotes_monty
      $ echo $a1_$a2
      monty

  In the last command the shell looks for a variable named a1_,
  which is not set, so only $a2 is printed.

Parameter substitution

Determine if a shell variable is set, and perform various actions
accordingly.

o ${identifier:-word}
  If the variable is set and is nonnull, substitute its value;
  otherwise substitute word.

o ${identifier:=word}
  If the variable is not set or is null, set it to word; the value
  of the variable is then substituted.

o ${identifier:?word}
  If the variable is set and is nonnull, substitute its value;
  otherwise, print word and exit from the shell. If word is
  omitted, a default message is printed.

o ${identifier:+word}
  If the variable is set and is nonnull, substitute word; otherwise
  substitute nothing.

If the colon is omitted, the shell tests only whether the variable
is unset, as opposed to unset or null.

Gotchas

Many shells have more of these, and some don't have all of them.

Environment Variables

Normally shell variables are not passed to children. However, they
can be put into the environment via the export command, which gives
child processes access to them.

    $ a=var-a
    $ b=var-b
    $ export b
    $ echo $a $b
    var-a var-b
    $ sh
    $ echo $a $b
    var-b

Many builtin shell variables must be exported, such as and

The shell provides a number of control structures. However, most of
them differ in ways that do not exist in other languages.
The structures are:

o For loops
o Case statements
o While loops
o If then else statements
o Logical or
o Logical and

Unlike for loops in most other languages (though possibly like
those in Pascal or Ada), the shell's for loop iterates over a list
of words.

    for i in *
    do
        compress $i &
    done

Anything may be given after the `in', for example:

    for i in spam eggs bacon sausage
    do
        echo $i
    done

If the `in' list is omitted, the for loop iterates over the command
line arguments.

    for i
    do
        echo $i
    done

The case statement is similar to case statements in other languages,
except that its patterns allow globbing and alternatives separated
by `|' (logical or). Its syntax is:

    case var in
    pattern) statements ;;
    ...
    esac

Here is a more complex example:

    for i
    do
        case $i in
        -[ocs])          ... ;;
        -verbose|-debug) debug=1 ;;
        -*)              echo unknown flag $i ;;
        *.c)             /lib/c0 $i ... ;;
        \?)              echo what\? ;;
        *)               echo unexpected argument $i ;;
        esac
    done

An understanding of the test command is vital for what follows. The
test command takes its arguments as a logical expression and returns
true or false via its exit status. test can also be called via the
[ command; the only difference is that in this form it must have a
trailing ].

o File Tests
  -r, -w, -f, -d, -s
o String Tests
  -z, -n, =, !=
o Numeric operations
  -eq, -ne, -gt, -ge, -lt, -le
o Logical operators
  !, -a, -o, ( )

Gotchas

o Notice that all the operators and flags are separate arguments to
  test. This means that is not the same as
o Notice also that parentheses are meaningful to the shell and must
  be escaped.

The if statement is similar to that of other programming languages,
except that instead of a logical expression, it takes a command and
evaluates its exit status.

    if grep -s root /etc/passwd
    then
        echo ok
    else
        echo we got trouble
    fi

Or using the [ command:

    if [ -w / ]
    then
        echo bad news
    fi

You can also chain tests via the elif construct:

    if [ $x -gt 0 ]
    then
        echo positive
    elif [ $x -lt 0 ]
    then
        echo negative
    else
        echo zero
    fi

Gotcha

Note that the shell concept of true and false is the reverse of C
and other languages: 0 is true, anything else is false.
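The two gotchas above (exit status 0 counts as true, and every
operator must be a separate argument to test) can be sketched as
follows. This is only an illustration under the assumption of a
POSIX-style /bin/sh; the variable names true_branch, false_branch,
sign, fused and spaced are made up for the example.

```shell
#!/bin/sh
# "if" runs a command and looks only at its exit status:
# status 0 selects the then-branch, anything else the else-branch.
if true;  then true_branch=taken;  else true_branch=skipped;  fi
if false; then false_branch=taken; else false_branch=skipped; fi

x=3
if [ "$x" -gt 0 ]; then sign=positive; else sign=other; fi

# Without spaces around "=", test receives the single argument "3=4".
# A lone non-empty string is true, so this does NOT compare anything.
if [ "$x"=4 ]; then fused=true; else fused=false; fi

# With spaces, "=" is its own argument and the strings are compared.
if [ "$x" = 4 ]; then spaced=true; else spaced=false; fi

echo "$true_branch $false_branch $sign $fused $spaced"
# prints: taken skipped positive true false
```

Quoting "$x" is a defensive choice made for the sketch: if x were
empty, an unquoted $x would vanish and test would see too few
arguments.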
Much like the if statement, the while loop takes a command as its
condition and evaluates the exit status to determine whether to
continue to loop.

    x=10
    while [ $x -ge 0 ]
    do
        echo -n "$x, "
        x=`expr $x - 1`    # decrement $x
    done
    echo boom

The shell will allow you to chain commands together with logical
operators. For example:

    if [ -w .rhosts ]
    then
        chmod go-w .rhosts
    fi

can be rewritten as

    test -w .rhosts && chmod go-w .rhosts

The second command is executed only if the first command succeeds.
The || operator works similarly:

    test ! -w .rhosts || chmod go-w .rhosts

The second command is executed only if the first one fails.

There are two ways to group commands:

    { commands ; }

or

    ( commands )

The former merely executes the commands in sequence; the latter
executes the commands in a subshell. A simple example is:

    (cd monty; rm chapman)

In this case the cd will not affect the current shell. A more
useful example is:

    (cd /src/gnu/groff; tar cvf - .) | tar xvf -

o In both cases the commands should be separated by semicolons.

Quotes are taken care of before the command is broken into
arguments.

    +---+------------------------+
    |   |  \   $   *   `   "   ' |
    +---+------------------------+
    | ' |  n   n   n   n   n   t |
    | ` |  y   n   n   t   n   n |
    | " |  y   y   n   y   t   n |
    +---+------------------------+

    t  terminator
    y  interpreted
    n  not interpreted

Examples

    grep "s.*p.*a.*m" /usr/dict/words

    awk ' BEGIN { FS = ":" }
    {if ($1 == "'$USER'") print $3}' /etc/passwd

The back-quotes (or back-ticks) indicate that the contents of the
quotes should be executed as a command, and the output from the
command is substituted.

    $ echo `date`
    Tue Aug 27 14:06:07 PDT 1991

A more useful (and more dangerous) example is:

    files=`find . -name '*.Z' -print`
    for i in $files
    do
        uncompress $i&
    done

The shell supports signal handlers. This way, a script can respond
to an interrupt and perform cleanup. The trap command allows
certain operations to be performed in response to certain signals.

    trap 'rm /tmp/$$; exit' 2

If the exit is not done, the shell will resume execution where the
signal was received. Signals can also be ignored using

    trap '' 1 2 3 15

Shell programming depends on many peripheral programs to do various
tasks.

o echo
o test
o true, false
o basename
o expr
o whoami
o who am i
o pwd
o yes
o comm
o colrm
o sort
o uniq
o head
o tail
o sed
o awk

echo

test

basename
    Removes a specified suffix from a file name, for example:

        cc $1
        mv a.out `basename $1 .c`

expr
    Evaluates mathematical expressions. Also does some string
    operations.

pwd
    Prints the current working directory.

whoami
    Prints the login name of the current effective UID.

who am i
    Prints the login name of the person logged in on the current
    tty.

tee
    Pipe fitting.

yes
    Prints a given word (the letter `y' by default) infinitely.
    This is useful for commands that ask stupid questions.

true
false
    Return a status of 0 or 1, respectively. This is useful for
    infinite while loops.

awk
    Pattern scanning, manipulating tabular data.

sed
    Editing.

What is a UNIX filter?

Reads from the standard input. After processing, output goes to the
standard output.

General Purpose Filters

Line Based

o awk
o cat
o col
o comm
o grep
o head
o nl (SysV)
o pr
o rev
o sed
o sort
o tail
o tee
o tr
o uniq

Column Based

o cut (SysV)
o paste (SysV)
o colrm (BSD)

expr

Arguments are taken as an expression and evaluated. The result is
written to the standard output.

expr operators

    expr1 | expr2     (expr1 if it is not null or zero, expr2
                      otherwise)
    expr1 & expr2     (expr1 if neither is null or zero, zero
                      otherwise)
    expr relop expr   (where relop may be <, <=, =, !=, >= and >)
    expr + expr
    expr - expr
    expr * expr
    expr / expr
    expr % expr
    expr : expr       (string match)

    #!/bin/sh
    # Sum the command line arguments using expr.
    sum=0
    while test "$#" != 0
    do
        sum=`expr $sum + $1`
        shift
    done
    echo $sum

bc

Infix calculator. Takes commands from the standard input and writes
results to the standard output.

    #!/bin/sh
    # Sum the command line arguments using bc.
    sum=0
    while test "$#" != 0
    do
        sum=`echo "$sum + $1" | bc`
        shift
    done
    echo $sum

Invocation

    join [options] file1 file2

Options

o -jn m
  Join on the mth field of file n.
o -o list
  Each output line comprises the fields specified in list. Each
  element is of the form n.m, where n is the file number and m is
  the field number.
o -tc
  Use the character c as the separator.

Example

Line from the passwd file:

    janaka:M4iME2l/FA57g:1004:34:Born Again Hacker:/home/rigel/rye/janaka:/bin/csh

Line from the group file:

    ee-fac:*:34:

    sort -t: +3 -4 /etc/passwd > passwd
    sort -t: +2 -3 /etc/group > group
    join -t: -j1 4 -j2 3 -o 1.1 2.1 passwd group

o basename
o pwd
o true
o false
o yes
o read

o Bourne shell is more generous about newlines in () and quotes.
  The C-shell in particular does not allow newlines, which makes
  programming more difficult.

o Tilde expansion.

  Most modern shells use the tilde character for several things:
  `~' expands to your home directory, and `~user' expands to that
  user's home directory.

o More parameter substitutions

  In fact the POSIX shell standard specifies many more, which
  effectively eliminates the need for

o Aliases

  Some shells allow command aliases; some allow argument
  interpolation (csh), others don't (bash).

o Functions

  Many modern shells have shell functions, which operate like
  functions in most other programming languages. Usually, the
  ability to create local variables is also given.
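
As a rough illustration of the last two items, here is a shell
function that uses two of the POSIX parameter substitutions
(${var##pattern} and ${var%pattern}) to approximate the basename
usage shown earlier without starting another process. It assumes a
POSIX-conforming /bin/sh; strip_suffix is a hypothetical name, not
a standard utility, and it does not handle every case basename does
(trailing slashes, for instance).

```shell
#!/bin/sh
# strip_suffix FILE SUFFIX
# Hypothetical helper: prints FILE without its directory part and
# without SUFFIX. Note that "name" is an ordinary (global) shell
# variable here; portable sh has no standard way to declare locals.
strip_suffix() {
    name=${1##*/}        # remove the longest leading match of "*/"
    name=${name%"$2"}    # remove SUFFIX from the end, if it is there
    echo "$name"
}

strip_suffix /src/gnu/groff/main.c .c    # prints: main
strip_suffix README .c                   # prints: README
```

The suffix is quoted inside the substitution so that characters
such as `*' in it are taken literally rather than as a pattern.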