Part 1: Review of some basics

1 Commands, options, arguments & command line input
- 1.1 Command options
- 1.2 Getting help
- 1.3 Command line history and editing
- 1.4 Tab key completion
- 1.5 Multiple commands/lines
- 1.6 Literal characters and metacharacters
2 Basic text manipulation
- 2.1 Standard streams and piping
- 2.2 echo, head, tail, cat -n, wc
- 2.3 basic grep
- 2.4 What is text?
3 Other shell concepts
- 3.1 Environment variables
- 3.2 Quoting in the shell
4 Redirection
- 4.1 Errors, output and their streams
5 File systems, files, and file manipulation

Commands, options, arguments & command line input

The Intro Unix: The Bash shell and commands section introduced the bash REPL – a Read, Eval, Print Loop that processes lines of command input. To review, a command consists of:

The command name – any built-in Linux/Unix utility that performs a specific function
- or the name of a 3rd party program or user-written script
One or more (optional) options, usually noted with a leading dash ( - ) or double-dash ( -- )
- options affect how a command performs its processing
One or more command-line arguments, which are often (but not always) file names
- arguments specify what data the command works on

The shell evaluates the command line when it sees a linefeed (a.k.a. newline), which happens when you press Enter after typing the command.

Command options

Types of command options:

Short (1-character) options which can be provided separately, prefixed by a single dash ( - )
- or can be combined, prefixed by a single dash (e.g. ls -lah)
Long (multi-character/"word") options are prefixed with a double dash ( -- ) and must be supplied separately.
Many utilities have equivalent long and short options, both of which can have values.
- The short option and its value are usually separated by a space, but can also be run together (e.g. -f 2 or -f2)
- Strictly speaking, a long option and its value should be separated by an equal sign ( = ). Long options are a GNU convention rather than part of the POSIX standard (see https://en.wikipedia.org/wiki/POSIX), and many programs also let you use a space as the separator.
Options usually come before arguments, but may also be allowed after the arguments depending on the tool.
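
The option styles above can be tried with a small sketch. It assumes GNU coreutils (as found on most Linux systems) for the long --lines option of head, and the sample text is made up for illustration.

```shell
# Made-up sample data: three lines, each with two Tab-separated fields
sample=$(printf 'a\t1\nb\t2\nc\t3\n')

# Short option with its value separated by a space, or run together
short_spaced=$(printf '%s\n' "$sample" | cut -f 2)
short_joined=$(printf '%s\n' "$sample" | cut -f2)

# Long option with its value after an equal sign, or after a space
# (assumption: GNU head, which accepts both forms)
long_equals=$(printf '%s\n' "$sample" | head --lines=2)
long_spaced=$(printf '%s\n' "$sample" | head --lines 2)

# Short options given separately, or combined behind a single dash
separate=$(printf '%s\n' "$sample" | wc -l -c)
combined=$(printf '%s\n' "$sample" | wc -lc)

echo "$short_spaced"
echo "$long_equals"
echo "$combined"
```

Each pair of commands should produce identical output, which is a quick way to check how a particular tool treats its options.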

More at: Intro Unix: The Bash shell and commands: Command options

Getting help

To learn what options and arguments a command has:

In the Terminal, type in the command name then the --help long option (e.g. ls --help)
- Works for most Linux commands
  - 3rd party tools may use -h or -? or even /? instead
- May produce a lot of output, so you may need to scroll up quite a bit or pipe the output to a pager
  - e.g. ls --help | more; type q to quit or space for the next "page"
Use the built-in manual system (e.g. type man ls)
- This system uses the less pager (space advances the output by one screen/"page"; typing q will quit/exit the display)
Ask the Google, e.g. search for ls man page (can be easier to read)
- or ask ChatGPT or another chatbot
Consult our Intro Unix: Some Linux commands wiki page
- It lists many useful Linux commands along with some of their commonly used options

Command line history and editing

Sometimes you want to repeat a command you've entered before, possibly with some changes.

The built-in history command lists the commands you've entered, each with a number.
- You can re-execute any command in the history by typing an exclamation point ( ! ) then the number
- e.g. !15 re-executes the 15th command in your history.
- Only commands in your current bash session are in the history, but you can always save them for future reference, e.g. history > ~/history.2025-04-25.

Use Up arrow to retrieve any of the last 50+ commands you've typed, going backwards through your history.
- You can then edit the retrieved line, and hit Enter (even in the middle of the command), and the shell will process that command.
The Down arrow "scrolls" forward from where you are in the command history.

To move the cursor (the small thick bar that marks where you are on the command line):

Right arrow and Left arrow move the cursor forward or backward on the current command line.
Use Ctrl-a to jump the cursor to the start of the line.
Use Ctrl-e to jump the cursor to the end of the line.
Arrow keys are also modified by Ctrl- (Windows) or Option- (Mac)
- Ctrl-right-arrow (Windows) or Option-right-arrow (Mac) will skip by "words" forward
- Ctrl-left-arrow (Windows) or Option-left-arrow (Mac) will skip by "words" backward

Once the cursor is positioned where you want it:

Just type in any additional text you want
To delete text before the cursor, use:
- Ctrl-h or
  - Backspace key on Windows
  - Delete key on Mac
To delete text after the cursor, use:
- Ctrl-d or
  - Delete key on Windows
  - Function-Delete keys on Mac
Use Ctrl-k (kill) to delete everything on the line after the cursor

For more on how to edit text on the command line, see: Intro Unix: The Bash shell and commands: Command line history and editing

Tab key completion

Hitting Tab when entering command line text invokes shell completion, instructing the shell to try to guess what you're doing and finish the typing for you. It's almost magic!

On most modern Linux shells you use Tab completion by pressing:

single Tab – completes file or directory name up to any ambiguous part
- if nothing shows up, there is no unambiguous match
Tab twice – display all possible completions
- you then decide where to go next

Let's have some fun with our friend the Tab key. Follow along if you can, as we use the Tab key to see the /stor/work/CBRS_unix/fastq path.

ls /st                     # press Tab key - expands to /stor/ which 
                           #   is the only match
ls /stor/w                 # press Tab key again: expands to /stor/work/,
                           #   again the only match
ls /stor/work/C            # press Tab once - you hear a "bell" sound,                    
                           #   and nothing is displayed because 
                           #   there are multiple matches
ls /stor/work/C            # press Tab a 2nd time - all matching 
                           #   entries are listed
ls /stor/work/CB           # press Tab key - expands to 
                           #   /stor/work/CBRS_unix
ls /stor/work/CBRS_unix/   # press Tab twice to see all completions
ls /stor/work/CBRS_unix/f  # press Tab once - expands to 
                           #   /stor/work/CBRS_unix/fastq

Tab key completion also works on commands! Type "bowtie" and Tab twice to see all the programs in the bowtie2 and bowtie tool suites.

Multiple commands/lines

Like everything in Unix, the command line has similarities to a text file. And in Unix, all text file "lines" are terminated by a linefeed character (\n, also called a newline).

Note: The Unix linefeed (\n) line delimiter differs from the Windows default line ending, carriage-return + linefeed (\r\n), and from the bare carriage return (\r) used by some Mac text editors.

The shell executes command line input when it sees a linefeed, which happens when you press Enter after entering the command. But you can enter more than one command on a single line – just separate the commands with a semi-colon ( ; ).

Multiple commands on a line

ls -l haiku.txt; cat haiku.txt

You can also split a single command across multiple lines by adding a backslash ( \ ) at the end of the line you want to continue, before pressing Enter. Just make sure there are no characters after the backslash.

Split a command across multiple lines

student01@gsafcomp02:~$ ls haiku.txt \
> mobydick.txt

The shell indicates that it is not done with command-line input by displaying a greater than sign ( > ). You can enter more text, then press Enter when done. At any time during command input you can press Ctrl-c to get back to the command prompt. This is true whether you're entering a single command line or are at a > continuation.
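
Both forms can be tried without any files. The sketch below uses only echo, and captures the output in variables (names chosen just for this illustration) so the results are easy to compare.

```shell
# Two commands on one line, separated by a semicolon:
# each echo runs in turn, so two lines are produced
two_commands=$(echo "first"; echo "second")

# One command split across two lines with a trailing backslash:
# the shell joins the lines, so echo receives both words
# (nothing, not even a space, may follow the backslash)
one_command=$(echo "one" \
                   "two")

echo "$two_commands"
echo "$one_command"
```

The first variable holds two lines of output (one per command), while the second holds the single line "one two".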

For more information, see: Intro Unix: About command line input

Literal characters and metacharacters

In the bash shell, and in most tools and programming environments, there are two kinds of input:

literal characters, that just represent (and print as) themselves
- e.g. alphanumeric characters A-Z, a-z, 0-9
metacharacters - these are special characters that are associated with an operation in the environment
- e.g. the Enter key that emits a linefeed character to end the current line

There are many metacharacters in bash: # \ $ | ~ " ' [ ] to name a few.

We'll be emphasizing the different metacharacters and their usages – which can depend on the context where they're used – both in the bash command line and in commands/programs called from bash.
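
As a small illustration of the difference, the sketch below shows the dollar sign ( $ ) acting as a metacharacter, and two ways of making it literal. The variable name greeting is hypothetical, used only for this demo.

```shell
greeting="hello"     # hypothetical variable, only for this demo

# $ acts as a metacharacter: the shell substitutes the variable's value
unquoted=$(echo $greeting world)

# a backslash makes the following $ a literal character
escaped=$(echo \$greeting world)

# single quotes make every enclosed character literal
single=$(echo '$greeting world')

echo "$unquoted"     # hello world
echo "$escaped"      # $greeting world
echo "$single"       # $greeting world
```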

Basic text manipulation

Standard streams and piping

A key to text manipulation is understanding Unix streams. Every command and Unix program has three "built-in" streams: standard input, standard output and standard error.

Note that on the command line, all three of these standard streams are mapped to your Terminal.

Most programs/commands read input data from some source, then write output to some destination.

A data source can be a file, but can also be the standard input stream.
Similarly, a data destination can be a file but can also be a stream such as standard output.

The pipe operator ( | ) connects one program's standard output to the next program's standard input. The power of the Linux command line is due in no small part to the power of piping.

The key to the power of piping is that most Unix commands can accept input from standard input instead of from files. So, for example, these two expressions are nearly equivalent:

more jabberwocky.txt
cat jabberwocky.txt | more
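
One way to see why such expressions are only "nearly" equivalent is to compare wc reading a named file with wc reading standard input. This is a minimal sketch that creates its own throwaway file with made-up content, so it does not depend on jabberwocky.txt.

```shell
# Create a small temporary file so the sketch is self-contained
tmpfile=$(mktemp)
printf 'alpha\nbeta\ngamma\n' > "$tmpfile"

# wc is given a file name, so it reports the name along with the count
from_file=$(wc -l "$tmpfile")

# wc reads standard input, so it has no file name to report
from_pipe=$(cat "$tmpfile" | wc -l)

# Pipes can be chained: each command's standard output
# becomes the next command's standard input
second_line=$(cat "$tmpfile" | head -n 2 | tail -n 1)

echo "$from_file"
echo "$from_pipe"
echo "$second_line"

rm -f "$tmpfile"
```

The count is the same in both cases. Only the program reading from a file knows the file's name.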

More at: Intro Unix: Viewing text in files: Standard streams and piping

echo, head, tail, cat -n, wc

The head and tail commands can be used to view/extract specific parts of large files.

With no options, head shows the first 10 lines of its input and tail shows the last 10 lines.
- You can use the -n option to specify how many lines to view, or just put the number you want after a dash (e.g. head -5 for 5 lines or head -1 for 1 line).
To start viewing lines at a particular line number, use tail and put a plus sign (+) in front of the number (with or without the -n option).
The cat -n option adds line numbers to the text it displays, which can help orient you when dealing with large files
- head and tail do not have options to show line numbers

Use the wc (word count) command to count text lines (wc -l) or bytes (wc -c); for plain ASCII text, the byte count equals the character count.

echo is the bash command to output text.

echo -e says to enable interpretation of backslash escapes
- so, for example, \n is interpreted as a linefeed, and \t as a tab character
echo -n says don't output the trailing newline (linefeed) character

Examples:

head -n 5 haiku.txt                   # display the 1st 5 lines of 
                                      #   "haiku.txt"  
cat -n haiku.txt                      # display "haiku.txt" contents 
                                      #   with line numbers        
cat -n haiku.txt | tail -n 7          # display the last 7 lines of
                                      #   "haiku.txt"
cat -n haiku.txt | tail -n +6         # display text in "haiku.txt"
                                      #   starting at line 6
cat -n haiku.txt | tail +5 | head -3  # display the middle stanza of
                                      #   "haiku.txt" (lines 5-7)

wc -l haiku.txt                       # count lines in "haiku.txt" file
cat haiku.txt | wc -l                 # count lines of piped-in text

echo 'Hello world!' | wc -c           # count characters output by echo,
                                      #   including the trailing newline
echo -n 'Hello world!' | wc -c        # count characters output by echo,
                                      #   without the trailing newline
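
If haiku.txt is not at hand, the same ideas can be tried on generated text. The sketch below assumes the seq command is available (it is on most Linux systems) and writes the numbers 1 to 12, one per line, to a temporary file so that each line's content is also its line number.

```shell
# Temporary file with 12 lines containing the numbers 1..12
tmpfile=$(mktemp)
seq 1 12 > "$tmpfile"

first_three=$(head -n 3 "$tmpfile")           # lines 1-3
last_two=$(tail -n 2 "$tmpfile")              # lines 11-12
from_ten=$(tail -n +10 "$tmpfile")            # line 10 through the end
middle=$(tail -n +5 "$tmpfile" | head -n 3)   # lines 5-7

# echo adds a trailing newline unless -n is given
with_newline=$(echo 'Hello' | wc -c)          # 5 letters + newline
without_newline=$(echo -n 'Hello' | wc -c)    # 5 letters only

echo "$middle"
echo "$with_newline $without_newline"

rm -f "$tmpfile"
```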

basic grep

The word grep stands for global regular expression print, after the g/re/p command of the ed text editor.

In Unix, the grep program performs regular-expression text searching, and displays lines where text matching the pattern is found.

Basic usage: grep <pattern> [file] where <pattern> describes what to search for. Importantly, grep can also take its input on standard input.

There are many grep regular expression metacharacters that control how the search is performed. We'll see more in Part 4: Advanced text manipulation, and at the grep command.

grep -i will perform a case-insensitive search
grep -n will display line numbers where the pattern was matched

Because grep's metacharacters are different from metacharacters in bash, it is always a good idea to enclose the <pattern> in single quotes so that the shell treats it as literal text and passes it through as-is to grep.
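
The options and quoting advice above can be sketched as follows. The three lines of sample text are made up for illustration, and grep reads them from standard input.

```shell
# Made-up sample text, piped to grep on standard input
poem=$(printf 'The Walrus\nand the Carpenter\nwere walking\n')

# -i: case-insensitive, so both "The" and "the" match
ignore_case=$(printf '%s\n' "$poem" | grep -i 'the')

# -n: prefix each matching line with its line number
numbered=$(printf '%s\n' "$poem" | grep -n 'w')

# Single quotes pass the pattern to grep as-is; here $ is a
# grep metacharacter that anchors the match to the end of the line
ends_in_r=$(printf '%s\n' "$poem" | grep 'r$')

echo "$ignore_case"
echo "$numbered"
echo "$ends_in_r"
```

Note that the plain search for w finds only the third line, because without -i the capital W in "Walrus" does not match.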

More at Intro Unix: Introducing grep

What is text?

So what exactly is text? Inside of files, text isn't characters at all – it is all numbers (0's and 1's), because that's all computers know.

On standard Unix systems, each text character is stored as one byte – eight binary bits – in a format called ASCII (American Standard Code for Information Interchange). Eight bits can store 2^8 = 256 values, numbered 0 - 255. In its original form values 0 - 127 were used for standard ASCII characters. Now values 128 - 255 comprise an Extended set. See https://www.asciitable.com/

The non-printable ASCII characters we care most about are:

Tab (decimal 9, hexadecimal 0x9, octal 0o011)
- backslash escape: \t
Linefeed/Newline (decimal 10, hexadecimal 0xA, octal 0o012)
- backslash escape: \n
Carriage Return (decimal 13, hexadecimal 0xD, octal 0o015)
- backslash escape: \r

# display 2 lines of text using \n for newline and \t for Tab
echo -e "aa z\nbb\tcc"                  

# use the hexdump alias to view the hex values for the alphabetic
# and special characters
echo -e "aa z\nbb\tcc\r\nddd" | hexdump
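
Where the hexdump alias is not defined, the standard od (octal dump) utility can show the same byte values. This is a minimal sketch that assumes od is installed (it is part of POSIX), and uses printf, which always interprets backslash escapes, in place of echo -e.

```shell
# Show each byte of "a<Tab>b<CR><LF>" as a two-digit hexadecimal value.
#   od -An   suppresses the address column
#   od -tx1  prints one byte per value, in hexadecimal
# tr then removes the spaces and newline that od adds for layout.
bytes=$(printf 'a\tb\r\n' | od -An -tx1 | tr -d ' \n')
echo "$bytes"    # 61 (a) 09 (Tab) 62 (b) 0d (CR) 0a (LF), run together

# Count the bytes in the two-line example text:
# 4 + 1 (linefeed) + 2 + 1 (Tab) + 2 + 1 (linefeed) = 11
count=$(printf 'aa z\nbb\tcc\n' | wc -c)
echo "$count"
```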

Other shell concepts

Environment variables

Environment variables are just like variables in a programming language (in fact bash is a complete programming language): they are names that hold a value assigned to them. As with all programming language variables, they have two operations: