- Introduction
- Shell Basics
- Startup Scripts
- File/Directory Manipulation
- Viewing File Contents
- man
- tar
- grep
- find
- asciinema
- Other Useful Tools
Introduction
OK, so there is a lot here if you do everything asked of you. It is not intended that you go through this all in a single sitting. I’d suggest not doing more than a section or two at a time. The main point of this is to ensure we all are on the same baseline. This page describes tools that will be necessary or will make your life better when using Linux. It isn’t intended that this all be new, or all be review. It is intended to be a repository of useful information when dealing with Linux. Some tools have their own pages and are not here. Some information is on the Software Configuration page.
So, work your way through this page at your own pace over the next week or two. It is intended to be interactive – see a tool, look at the manual, try it out. Don’t just passively read your way through this, as that won’t be very helpful. Some tools I explain in detail and some are left as an exercise for the reader. If you run into something confusing, ask the question, and we’ll try to explain.
Some of you have used Linux before and are comfortable on the command line. Some of you may not be. This page is intended as both a refresher and an introduction to some tools that you may find very useful for this class. If you have taken other classes from me, you’ve likely seen this page before.
The idea here is that this isn’t just a list of commands or functionality. It is intended to be interactive, and you should try many of the listed commands on an actual Linux install. There are some Mac-specific flags to certain tools, due to its BSD heritage. If you find something that doesn’t work for you, and you’re on a Mac, please just look at the man page for the tool on your Mac!
Shell Basics
The shell is the basic unit of interaction on a Linux machine – at least remotely or without a GUI. Even if you do have access to a Linux GUI, you will be self-restricting if you don’t know how to use the command line interface.
You will see in my examples that I make use of the zsh shell. zsh is the modern default (or soon to be default) shell of Ubuntu, macOS, and several other Linux distributions. There are many advantages of zsh over older shells such as bash or tcsh, but ultimately you need to decide what shell you like, and roll with that.
For me, zsh brings the right combination of functionality, security, and community activity. If you like what you see of my shell look and feel, you can make use of oh-my-zsh (see below on startup script) with the jonathan theme. I also make use of antigen to manage zsh plugins. tmux also has a series of plugins I’d recommend (see Software Configuration for details).
There are three books worth downloading:
- The Linux Command Line by William Shotts (Creative Commons License)
- Adventures with the Linux Command Line by William Shotts (Creative Commons License)
- The Linux Development Environment by Rafeeq Ur Rehman and Christopher Paul (Internet Archive Link) <- This is an older book, but still contains a lot of useful information
These books cover in great detail all the tools you are likely to need! What you see below is a very brief overview of common tools.
Startup Scripts
What are often called RC files or startup scripts are used to configure most tools in a Linux environment. These are plain text files which enable user-specified options, and usually are of the form .toolrc or .tool.rc or .config/tool.rc – hence the name RC files.
One of the most important is .$(basename $SHELL)rc. This controls your shell – options, look and feel, any extensions, whether ls uses colors, etc. The file is written in the syntax of your chosen shell. If you’ve already run the script from the Software page, and then decide to change your shell, you’ll want to re-run the script.
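For example (assuming your login shell happens to be /usr/bin/zsh – substitute your own), the name expands like this:

```shell
echo "$SHELL"                       # the path of your login shell, e.g. /usr/bin/zsh
basename /usr/bin/zsh               # strips the directory part, leaving: zsh
echo ".$(basename /usr/bin/zsh)rc"  # builds .zshrc, the file zsh reads at startup
```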
If you do decide to take the zsh route, I strongly encourage you to make use of oh my zsh. It provides some significant improvements over the default settings in zsh.
The ~ character is a shortcut for a home directory. Your home directory is just plain ~. Kevin’s is ~dmcgrath.
Nearly every other tool has some sort of configuration script, usually somewhere in your home directory. Take a look around, see what you find. Some specific directories to investigate:
- ~/.ssh/
- ~/.config/
- And of course the root of your home directory: ~/
Do keep in mind that dot files are by default hidden from ls. Look at the man page for ls to see which command line option will allow you to see all the files in a directory, including the hidden ones.
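A quick illustration in a throwaway directory (the paths here are arbitrary):

```shell
# Scratch directory with one normal file and one dot file
mkdir -p /tmp/dotdemo
cd /tmp/dotdemo
touch visible.txt .hidden

ls        # shows only visible.txt
ls -a     # also shows ., .., and .hidden
```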
Questions
- What does the tool basename do?
- What does .$(basename $SHELL)rc return?
- Why does this work?
File/Directory Manipulation
One of the most basic sets of functionality in a shell is moving around and manipulating files in some way. A list of useful commands for this can be seen below. For further details, take a look at the man page for each of them.
Unix commands take a combination of short form and long form options. Short form options are usually of the form -x, where x can be any single letter. Long form options are of the form --long-option. Many commands have long and short options for the same thing, but you’ll need to investigate the command’s man page for more details.
- ls: List the contents of a directory. Some useful options include:
  - -l for a more detailed listing
  - -a for including dot files
  - -t for sorting by time
  - -r for reverse sorting
- chmod: Change the mode (permissions) of a file or directory. Modes are octal constants.
- cp: Copy a file or directory (with -r) to a new location.
- mv: Move a file or directory to a new name or location.
- scp: Just like cp, but secure, and one end or the other can be a remote system.
- rmdir: Remove an empty directory. If the directory has anything in it (visible or not) this command will fail.
- rm: Remove a file or a directory (with -r). Useful options:
  - -r to recursively delete
  - -f to force a delete – don’t ask for confirmation. BE VERY CAREFUL WITH THIS, especially if combined with -r!
- mkdir: Make a new directory.
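A short, self-contained session exercising most of the commands above (the paths are throwaway examples):

```shell
mkdir -p /tmp/sandbox/src          # make directories
cd /tmp/sandbox
echo 'hello' > src/notes.txt

cp -r src backup                   # copy a whole directory with -r
mv src/notes.txt src/todo.txt      # rename (move) a file
chmod 600 src/todo.txt             # octal mode: owner read/write only

rmdir backup 2>/dev/null \
  || echo "rmdir failed: backup is not empty"
rm -r backup                       # -r removes the directory and its contents
```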
Viewing File Contents
After moving around and making directories and files, one of the most common things you’ll do on a Linux system is look at files. There are a variety of ways to do that, depending on your needs.
- less: An improved pager, with the ability to move up and down, rather than just down – less is more. A pager, in this case, is a utility that provides output one “page” at a time, for a page equal to some number of lines. The setup script I had you run installs and aliases a different tool in place of less, called bat, which is essentially less but with syntax highlighting.
- cat: Dump the entire contents of a file to standard out. This is especially useful if you want to copy a file on a Mac (pipe it to pbcopy) or in WSL2 (pipe it to clip.exe).
- hexyl: A nice syntax-highlighting hex dump utility.
- objdump: Displays a variety of information about object files – basically anything compiled and linked.
- nm: Displays the symbols from an executable file.
- readelf: Displays information about ELF files. ELF files are executables in Linux land. If you know what a PE file is, same idea, different OS.
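A few of these in action. /etc/passwd and /bin/ls are used only because they exist on virtually every Linux system; the binutils calls are guarded since readelf and nm may not be installed by default, and od stands in for hexyl here because it ships with coreutils:

```shell
head -3 /etc/passwd                     # peek at the first lines of a text file

od -A x -t x1z /etc/passwd | head -2    # hex dump via coreutils od
                                        # (hexyl is nicer, but not preinstalled)

# binutils tools, guarded in case they are not installed
command -v readelf >/dev/null && readelf -h /bin/ls || true   # the ELF header
command -v nm >/dev/null && nm -D /bin/ls | head || true      # dynamic symbols
```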
Questions
readelf, objdump, and nm are all part of the GNU binutils collection of tools for working with binaries. Which you would use in a given situation entirely depends on what you are trying to do. There is some overlap in output, but each brings unique strengths to the table.
man
man is the command which displays online manuals. This does not mean on the web – online in this case means on the current computer in use. I would suggest taking a look at the manual for man: man man
Reading man pages is a skill like any other. At first, it will seem difficult. As you practice more, you’ll start to recognize the way they are structured, gain an understanding of what’s in each section of a given man page, and how to make use of them to be more productive.
An example man page (man 3 fopen) with most of the DESCRIPTION section removed for brevity:
FOPEN(3) Linux Programmer's Manual FOPEN(3)
NAME
fopen, fdopen, freopen - stream open functions
SYNOPSIS
#include <stdio.h>
FILE *fopen(const char *pathname, const char *mode);
FILE *fdopen(int fd, const char *mode);
FILE *freopen(const char *pathname, const char *mode, FILE *stream);
Feature Test Macro Requirements for glibc (see feature_test_macros(7)):
fdopen(): _POSIX_C_SOURCE
DESCRIPTION
The fopen() function opens the file whose name is the string pointed to by pathname and associates a stream with it.
.
.
.
RETURN VALUE
       Upon successful completion fopen(), fdopen() and freopen() return a FILE pointer. Otherwise, NULL is returned and errno is set to indicate the error.
ERRORS
EINVAL The mode provided to fopen(), fdopen(), or freopen() was invalid.
The fopen(), fdopen() and freopen() functions may also fail and set errno for any of the errors specified for the routine malloc(3).
The fopen() function may also fail and set errno for any of the errors specified for the routine open(2).
The fdopen() function may also fail and set errno for any of the errors specified for the routine fcntl(2).
The freopen() function may also fail and set errno for any of the errors specified for the routines open(2), fclose(3), and fflush(3).
ATTRIBUTES
For an explanation of the terms used in this section, see attributes(7).
┌────────────────────────────────────────────────┬───────────────┬─────────┐
│ Interface │ Attribute │ Value │
├────────────────────────────────────────────────┼───────────────┼─────────┤
│ fopen(), fdopen(), freopen() │ Thread safety │ MT-Safe │
└────────────────────────────────────────────────┴───────────────┴─────────┘
CONFORMING TO
fopen(), freopen(): POSIX.1-2001, POSIX.1-2008, C89, C99.
fdopen(): POSIX.1-2001, POSIX.1-2008.
SEE ALSO
open(2), fclose(3), fileno(3), fmemopen(3), fopencookie(3), open_memstream(3)
COLOPHON
This page is part of release 5.05 of the Linux man-pages project. A description of the project, information about reporting bugs, and the latest
version of this page, can be found at https://www.kernel.org/doc/man-pages/.
GNU 2019-05-09 FOPEN(3)
Some important things to point out here:
- man pages always start with the name of whatever the page is describing. If the page covers a family of function calls or executables, it lists everything described therein.
- Synopsis shows invocation information. In the case of library calls, it may also include any necessary feature macros that control which functions are available.
- Description gives a detailed explanation of the tool or library. This includes the permissible values of parameters or options.
- Library call manpages will differ from executable man pages in that library calls will have the “RETURN VALUE” section, while executables have an “Exit status” section.
- Library calls will also have an “ERRORS” section which details the values that might be stored in errno.
- Additional information can include authors, where to report bugs, copyright information, and any other related man pages.
tar
tar is one of those tools that everyone loves to hate. It has an enormous collection of flags and switches, and in reality is being shoe-horned into its current role as “master compression utility.” Historically it was created to make tape backups of a filesystem. Since the usual compression tools only work on a single file, it is common to first combine all necessary files into a single meta-file: a tar ball. This tar ball is then compressed using the compression algorithm of your choice.
There are a few common ways to use tar:
- As a tool to compress a folder: tar cjvf folder.tar.bz2 folder/
- As a tool to uncompress an archive: tar xjvf archive.tar.bz2
-
As the first stage in a pipeline for compression: tar cvf - folder | bzip2 > folder.tar.bz2
All of the above assume bzip compression. There are multiple other variants:
- -a, --auto-compress: Use archive suffix to determine the compression program.
- -I, --use-compress-program=COMMAND: Filter data through COMMAND. It must accept the -d option, for decompression. The argument can contain command line options.
- -j, --bzip2: Filter the archive through bzip2(1).
- -J, --xz: Filter the archive through xz(1).
- --lzip: Filter the archive through lzip(1).
- --lzma: Filter the archive through lzma(1).
- --lzop: Filter the archive through lzop(1).
- --no-auto-compress: Do not use archive suffix to determine the compression program.
- -z, --gzip: Filter the archive through gzip(1).
- -Z, --compress: Filter the archive through compress(1).
- --zstd: Filter the archive through zstd(1).
While there are a large number of options, the most common are bzip2, gzip, and xz.
Creating an archive is done with -c, extraction with -x. The archive name is specified with -f, and -v requests verbose output.
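To make that concrete, here are the one-shot and pipeline forms side by side, using gzip and a throwaway directory; both should produce equivalent archives:

```shell
mkdir -p /tmp/tardemo/folder
echo data > /tmp/tardemo/folder/file.txt
cd /tmp/tardemo

tar czf folder.tar.gz folder/            # -c create, -z gzip, -f archive name
tar cf - folder | gzip > folder2.tar.gz  # '-' writes the tar ball to stdout

tar tzf folder.tar.gz                    # -t lists an archive's contents
```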
There are a lot of places you will need to know how to work with tar, so learning the basics of it is quite important.
Lots more information can be found in the man pages.
grep
grep is a tool which allows for regular expression matching of contents of a line or file. It is commonly known as a filter – it removes unwanted cruft and only allows the display or further processing of data that matches (or doesn’t match, in the case of an inverted search).
grep has multiple modes, which can be invoked with an alternative name or a specific flag. For instance, to use extended regular expression syntax, you can invoke it as egrep or call grep -E. Both operate identically. The different modes are:
- Extended regular expressions: -E
- Perl-compatible regular expressions: -P
- Fixed string matching: -F
- Default behavior, if you want to specify it as a flag (scripts and the like): -G
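A quick comparison of the modes on some sample input (printf just feeds grep a few lines):

```shell
printf 'cat\ncar\ncab\ndog\n' | grep -E '^ca[rt]$'   # extended regex: cat, car
printf 'cat\ncar\ncab\ndog\n' | grep -F 'cat'        # fixed string: cat only
printf 'cat\ncar\ncab\ndog\n' | grep -v 'ca'         # inverted match: dog
```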
grep is a tool you will likely make a ton of use of. That will require both a knowledge of regular expressions and an understanding of how grep works and is controlled by its flags. See the man page for grep for the latter, and you’re on your own for anything beyond very basic regular expressions. I’m a firm believer that if you try to solve a problem with regular expressions, you now have many problems, rather than just the one.
find
Ah, find. The true Swiss Army Knife of Linux. find is a great example of a simple tool that got way out of hand, yet still maintains a significant utility. While the feature creep is real, the features are just so dang useful!
find requires a path and a test (and sometimes an action). The path is just the root of the subtree in the filesystem you want to search, while the test can be…almost anything you can imagine.
- file name
- file type (directory, normal file, etc.)
- file permissions
- file age
- in epoch time
- in vague relative terms (like “yesterday”)
- when the file was last accessed
- when the file was last modified
- whether the file is empty
- whether the file is executable
- what kind of filesystem the file is on
- always false
- always true
- whether a file belongs to a group
- what inode the file is on
- file has $n$ hard links
- file is a soft link
- and on
- and on
- and on
Then, once you’ve written a test, what should you do with the files that match the test? The options are legion:
- print the matching file names in one of a dozen or so different ways
- delete the files
- execute an arbitrary command on the file
- quit after the first match
While that’s a relatively short list, I would direct your attention to the fact that find will execute a given command on every file that matches the pattern. Any command. Of your choosing. This could be as simple as a rename, or an extraction of $n$ lines of text. Or it could be as complex as a multi-stage pipeline that vastly transforms the input files into something altogether different. The choice is yours.
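A small sketch of the path/test/action pattern (the file names are made up for the demo):

```shell
mkdir -p /tmp/finddemo/sub
touch /tmp/finddemo/a.c /tmp/finddemo/b.txt /tmp/finddemo/sub/c.c

find /tmp/finddemo -name '*.c'    # test: file name; default action: print
find /tmp/finddemo -type d        # test: is a directory

# action: run an arbitrary command on each match; {} is the file, \; ends
# the command
find /tmp/finddemo -name '*.c' -exec wc -c {} \;
```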
See the man page for all the gory details of the above.
asciinema
asciinema is a tool for creating casts of a terminal session. Specifically, it can be used to record and play back a series of shell commands, in an interactive fashion, allowing the user to copy the commands. You’ll see this littered throughout this course, in places where a simple interactive session will provide significant instructional value.
Think of it as a way to record the steps and output of commands that you will run. For instance, a testing session or proof of functionality that is requested. Very useful tool.
By default, it uploads to the asciinema servers. But if you provide a file name, instead of uploading, it saves it to the named file.
┌─(~)────────────────────────────────────────────────────────────(dmcgrath@DESKTOP-2JUKDNG:pts/2)─┐
└─(17:50:35)──> asciinema rec -t "Introduction to man pages" -i 2 --overwrite man.cast ─┘
By default, it will not overwrite an existing file. As you can see above, I provide the --overwrite flag to have it replace the existing file. See the man page for details on the other options seen in the example invocation above.
Other Useful Tools
This list of utilities was generated by mining my own history file. I’ve left out anything that was very domain specific (reverse engineering, security, network), but they are mostly in reverse count order (most used at the top). I’m leaving them as a collection of tools you’ll want to look up in the man pages. Most can also be investigated with tool --help, but that is much more terse than the man pages.
Tools to pay specific attention to include make, curl, which, ps, and awk. So basically the top 5 most used in my history.
ps, awk, which, rsync, make, cmake, curl, wget, pygmentize, file, ip, du, df, md5sum, sha256sum, etc., kill, readelf, env, wc, ln, bc
I will leave you with one of my most used commands on servers I manage:
$ ps -efH --no-header | awk '{print $1}' | grep -Ev $(python3 -c 'import sys; print("|".join(sys.argv[1:]))' $(cut -f1 -d':' /etc/passwd)) | sort | uniq -c | sort -n
No, I don’t type that every time. I have it as an alias in my startup script, since it’s how I print out a list of users who are causing excessive load on the server. Play with it, see if you can improve on it. You can also see the use of inline Python combined with awk, cut, and grep.
Questions to Consider
What do you think the grep portion does? Why is that useful? What is an alias? Can you replace the awk pipeline component with a cut? What about the reverse? Take a look at a tool called paste. How might you use it to replace the use of python in this pipeline?