Unix (Linux/macOS) command line basics

This page covers the basics of navigating directories and manipulating files using the Unix command line. This includes terminals in any Linux or macOS environment, both of which are Unix-based or Unix-like. It also includes Linux environments enabled within Windows via Windows Subsystem for Linux.

Manual ("man") pages

Most Unix commands have a manual explaining their function and how they should be run, as well as a list of all optional arguments. It is a good idea to read a command's manual before attempting to run it for the first time. To open the manual for a given command, use the man command, followed by the name of the command in question. For example, to open the manual for the ls command, enter "man ls".

Use the up and down arrow keys to scroll through the manual, and press the q key to quit.

Manuals are also hosted online on sites like die.net. You can search these sites for manuals if you would prefer to read them in a separate window from your terminal. Note that in some rare cases, different versions of a command will have slightly different syntax and options, and a manual viewed online may not correspond exactly to the version installed in your terminal.

Basic concepts and command syntax

Each Unix command has its own specific options, its own input requirements, and its own output. However, most common commands use the same basic syntax, which looks like this:

command -options input

"Command" refers to the specific process that you would like to run.

"Options" refers to any number of optional arguments that modify the default command, which are provided following a hyphen ( - ) and which can often be used in combination with one another.

"Input" refers to any input file, directory, or string which the command will process or act upon. Some commands require an explicit input, while others have a default value that will be used if no input is provided.

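Unix is case sensitive, meaning command names, options, and input files or directories must be capitalized correctly. (One notable exception is the default macOS file system, which preserves the capitalization of file names but does not distinguish between names that differ only in case.) Commands are generally called using all lower case letters, to make them easier to type and remember, and it is a good idea to make directory and filenames all lower case as well.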

Navigating directories

When using the terminal, you navigate directories (i.e. folders) just like you would in Windows Explorer or macOS Finder. Just like Explorer and Finder, you will need to understand how directories and files are nested in order to find the files you want to work with. Navigating directories is all done using typed commands, rather than clicking on folder icons as in Explorer and Finder. Directories and files in a Unix file path are separated using forward slashes ( / ), unlike Windows, which uses backslashes ( \ ).

Identifying the current working directory (cwd)

Unlike Explorer or Finder, the terminal can only have one directory "open" at a time. This is called the current working directory, or cwd. In many default configurations, the terminal prompt displays the current working directory between the active user's name (to the cwd's left, separated by a colon) and the cursor (to the cwd's right, separated by the $ symbol), though the exact layout depends on how the shell is configured. Some terminals will display each element in a different color to make distinguishing them easier:

The pwd (print working directory) command can also be used to output the cwd.

Listing the contents of a directory (ls command)

Use the ls (list) command to output the contents of the cwd:

Add the -1 (hyphen followed by the number 1) option to the right of the command to display the output as one line per item:

Add the -l (lowercase L) option to display additional information about each item in the output, such as permissions and date of last modification:

If no explicit input is provided, the ls command will display the contents of the cwd. If a specific directory path is provided as input, the contents of that directory will be output instead.
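
As a minimal sketch of the pwd and ls commands described above, the following builds a scratch directory with mktemp so that the output is predictable. The names notes.txt and scans are made up for illustration, and the exact columns printed by ls -l vary from system to system.

```shell
# Build a throwaway directory so the example has known contents.
# notes.txt and scans are hypothetical names used only for this sketch.
demo=$(mktemp -d)
touch "$demo/notes.txt"
mkdir "$demo/scans"

cd "$demo"
pwd      # print the current working directory
ls       # list the contents of the cwd
ls -1    # same listing, one item per line
ls -l    # long format: permissions, owner, size, modification date

# With an explicit path as input, ls lists that directory instead of the cwd.
listing=$(ls -1 "$demo")
echo "$listing"
```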

The root and home directories

Unlike Windows, which separates files among different drives and assigns each drive a unique letter (such as C:), Unix has a unified file system, meaning every directory and file on every disk is nested below a single directory, which is called the root.  The root directory is represented by a single forward slash ( / ).

The root directory contains a number of system directories, one of which holds the users' home directories (/home on most Linux systems, /Users on macOS). It contains a subdirectory for each user on the system, and when a user launches the terminal the cwd will be set to their individual home directory by default. The current user's home directory is represented by a tilde (~).

Changing directories (cd command)

You can change the cwd to a new directory using the cd (change directory) command, followed by the name of the directory you would like to navigate to. Use cd and ls commands in conjunction with one another to list the contents of the cwd and navigate to subdirectories.

When typing the name of a file or directory in the terminal, you can use the Tab key to auto-complete the file or directory's name based on what you have already entered. For example, if the cwd contains a directory named "temp" and no other directories that start with "te", you can type "cd te" and hit the Tab key to auto-complete "cd temp". Since the terminal knows there is only one possible target beginning with "te" within the cwd, it can insert it for you. This is especially useful as a way of avoiding needing to type very long directory names. It also reduces opportunities for typos. If the Tab key is not auto-completing a directory or filename the way you expect, use the ls command to double-check that the file or directory you are targeting is spelled the way you think.

Remember that the terminal is case sensitive: if the directory is named "temp" (lowercase), the Tab key will not auto-fill it if you type "cd Te" (with a capital T).

The cwd's parent directory is represented using .. (two periods). To navigate "up" a directory, type "cd ..":

You can always jump directly from the cwd to any directory on the system (not just subdirectories of the cwd) by typing cd followed by the target directory's full path, also called an absolute path. Because an absolute path always starts at the root directory, it will always begin with a forward slash:
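
The cd variants described in this section can be sketched as follows. The temp and pdfs directories are hypothetical and are created inside a scratch directory so the example can be run anywhere.

```shell
# Hypothetical directory tree, created only for this sketch.
demo=$(mktemp -d)
mkdir -p "$demo/temp/pdfs"

cd "$demo/temp"   # absolute path: starts at the root, so it begins with /
cd pdfs           # relative path: a subdirectory of the cwd
cd ..             # move up to the parent directory (back to temp)
here=$(basename "$PWD")
echo "$here"

cd /              # jump straight to the root directory
root=$(pwd)
echo "$root"
```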

Finding, copying, moving, and renaming files

Finding files with ls

In addition to listing the contents of a directory, the ls command can also be used to search for specific files or file types. To list all the files in a directory that begin with a given string of characters, type ls followed by the string and an asterisk ( * ). The asterisk is a "wildcard" symbol that the system reads as "any character or number of characters". For example, to list all the files in the cwd that begin with "cardenal", type "ls cardenal*":

You can also use the asterisk wildcard at the beginning of a string to list all files that have a particular file extension:
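
A short sketch of both wildcard patterns, assuming a scratch directory that holds a few made-up file names:

```shell
# Hypothetical files, created only for this sketch.
demo=$(mktemp -d)
cd "$demo"
touch cardenal_001.tif cardenal_002.tif cardenal_notes.txt index.txt

# Everything whose name begins with "cardenal".
starts=$(ls cardenal*)
echo "$starts"

# Everything with the .tif file extension.
tifs=$(ls *.tif)
echo "$tifs"
```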

Finding files with find

The ls command will only search for files found in a single directory. To perform a search for files with a given name or format that crawls through subdirectories, use the find command.

Use the -name option followed by the search string (in quotes) to find all matching file or directory names:

The period at the beginning of each file path in the results represents the cwd.

You can use asterisk wildcard characters just like with the ls command:
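
The two find searches described above might look like this. The directory tree and file names are assumptions made for the sketch, and the period after find tells it to start searching from the cwd.

```shell
# Hypothetical nested directories, created only for this sketch.
demo=$(mktemp -d)
mkdir -p "$demo/a/b"
touch "$demo/report.txt" "$demo/a/report.txt" "$demo/a/b/scan.tif"
cd "$demo"

# Exact name match, searched through every subdirectory of the cwd.
find . -name "report.txt"
report_count=$(find . -name "report.txt" | wc -l | tr -d ' ')

# Wildcard match; the quotes stop the shell from expanding the asterisk
# before find has a chance to see it.
tifs=$(find . -name "*.tif")
echo "$tifs"
```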

Copying files (cp, scp, and rsync commands)

There are several different ways of copying files, depending on your needs. The most straightforward method is the cp (copy) command, which copies a source file to a target directory, using the following syntax:

cp /path/to/source/file.abc /path/to/target/directory/

By default, cp can copy individual files from a source location to a target directory, but it cannot copy an entire directory. To copy a directory, add the -r (recursive) option. The following command will copy the entire "source" directory into "target":

cp -r /path/to/source /path/to/target/

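To copy the contents of a source directory to a target directory (but not the source directory itself), add a slash to the end of the source directory path. Note that this trailing-slash behavior applies to the BSD version of cp found on macOS; the GNU version found on most Linux systems ignores the trailing slash, and there you can append /. to the source path instead (for example, /path/to/source/.):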

cp -r /path/to/source/ /path/to/target/
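
The following sketch runs the cp variants against a scratch directory. All file and directory names are hypothetical, and the last command uses the source/. form, which is assumed here as a portable way of copying only a directory's contents.

```shell
# Hypothetical source and target directories, created only for this sketch.
demo=$(mktemp -d)
mkdir -p "$demo/source/sub" "$demo/target" "$demo/contents_only"
touch "$demo/source/file.abc" "$demo/source/sub/inner.abc"

# Copy a single file into a target directory.
cp "$demo/source/file.abc" "$demo/target/"

# Copy the whole source directory; the result is target/source/...
cp -r "$demo/source" "$demo/target/"

# Copy only the contents of source (not the directory itself).
cp -r "$demo/source/." "$demo/contents_only/"

ls -1 "$demo/target" "$demo/contents_only"
```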

The scp (secure copy) command is basically identical to the cp command, but uses ssh encryption and is more suited to transfers between different machines. You should use scp rather than cp if you will be copying from one server or computer to another.

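The rsync (remote sync) command is similar to cp -r or scp -r. It is used to synchronize two copies of a directory that are stored in two separate locations, and can be used to copy an entire directory from one location to another. Like cp and scp, rsync requires a source and target, and like cp it needs an option such as -r (recursive) or -a (archive, which is recursive and also preserves permissions and modification times) in order to copy a directory:

rsync -a /path/to/source /path/to/target/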

rsync is very flexible and has many options for how files are copied and how progress is displayed during the process. See the rsync manual for more information about these options.

Moving and renaming files (mv command)

When you need to move rather than copy files from one location to another, use the mv (move) command. The command syntax is very similar to the cp command, in that you must provide a source file and target destination:

mv /path/to/source/file.abc /path/to/target/directory/

The mv command is also used to rename files. When used this way, the mv command "moves" a file from one path to another, even if that path is within the same directory. To rename a file, add the new name to the end of the target destination:

mv /path/to/source/file.abc /path/to/source/newFile.abc

You can combine these two uses of mv to move a file to a new directory and rename it at the same time:

mv /path/to/source/file.abc /path/to/target/directory/newFile.abc
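
The three uses of mv can be tried safely in a scratch directory. The file names below are made up for the sketch.

```shell
# Hypothetical files, created only for this sketch.
demo=$(mktemp -d)
mkdir "$demo/source" "$demo/target"
touch "$demo/source/one.abc" "$demo/source/two.abc" "$demo/source/three.abc"

# Move a file into another directory, keeping its name.
mv "$demo/source/one.abc" "$demo/target/"

# Rename a file in place.
mv "$demo/source/two.abc" "$demo/source/newTwo.abc"

# Move and rename in a single step.
mv "$demo/source/three.abc" "$demo/target/newThree.abc"

ls -1 "$demo/source" "$demo/target"
```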

Calculating file and directory size (du command)

The du (disk usage) command calculates the size of a file, directory, or set of directories. To calculate the size of an individual file, simply run du followed by the path to the file:

By default, du displays sizes in disk blocks (1024-byte blocks with the GNU version on Linux, 512-byte blocks on macOS) rather than in bytes. Use the -h option to display the size in a more human-readable format:

Running du followed by the path to a directory will calculate the size of all subdirectories within that location:

Add the -a option to calculate the size of subdirectories and files within a given directory:

Use the -s option to calculate the total size of a directory and its contents:
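
A sketch of the du options covered in this section, run against a small scratch directory. The names are hypothetical, and the sizes reported will differ between file systems, so no specific output is shown.

```shell
# Hypothetical directory with one small file, created only for this sketch.
demo=$(mktemp -d)
mkdir "$demo/sub"
printf 'hello\n' > "$demo/sub/file.txt"

du "$demo/sub/file.txt"      # size of a single file, in blocks
du -h "$demo/sub/file.txt"   # the same size in human-readable units
du "$demo"                   # one line per directory under $demo
du -a "$demo"                # one line per directory and per file

# Single grand total for the directory and everything inside it.
total=$(du -sh "$demo")
echo "$total"

# Count how many lines of the -a listing mention the file.
file_lines=$(du -a "$demo" | grep -c 'file.txt')
```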

Calculating free disk space (df command)

The df (disk free) command is used to calculate the amount of free space on a given volume. When run without any options or arguments, it will display the free space and disk usage of all mounted volumes, as well as the local disk:

This can be somewhat overwhelming. To display the free space and disk usage of one volume in particular, run df followed by the path to the volume's mount point. The mount point is listed in the final column, under "Mounted on" of the default df output. For example, if you have the dps volume mounted at /dps:

By default, df reports sizes in disk blocks (1K blocks on Linux, 512-byte blocks on macOS) rather than in bytes. As with du, use the -h option to produce a more human-readable display:
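
As a small sketch, the commands below query a single mount point. The root directory ( / ) is used here only because it exists on every system; substitute the mount point of the volume you are interested in, such as /dps.

```shell
# Free space and usage for the volume mounted at / (assumed mount point).
df /

# The same information in human-readable units.
df -h /

# Keep just the data line (the last line of output) for later use.
root_usage=$(df -h / | tail -n 1)
echo "$root_usage"
```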

Searching within files (grep command)

The grep command is used to search for a specific string within a plain text or CSV file. grep has many options including case-insensitive search, inverse search, and regular expression search. See /wiki/spaces/utldigitalstewardship/pages/43057645 for a full guide to using grep.

Displaying file contents (cat, less, and head/tail commands)

There are several commands that can be used to display the contents of plain text or CSV files right in the terminal window. Each displays information in slightly different ways.

To output every line from a file, run the cat (concatenate) command, followed by the path to a text or CSV file:

If you want to preview the contents of a file without necessarily outputting every line to the terminal, run the less command, followed by the path to a file. This will fill your terminal window with the contents of the file, stopping once the window is full. You can then scroll down (and back up) through the text file using the down and up arrow keys. Hit the q key to exit the less output screen.

To quickly output only the first lines in a file, run the head command, followed by the path to a file. By default, head will output the first 10 lines of the file, but you can also specify the number of lines to output using the -n option:

The tail command works just like head, but it displays the last lines of the file. Just like head, you can specify the number of lines to output using -n:
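
The cat, head, and tail commands can be compared side by side using a generated file. The 20-line file below is an assumption made for the sketch; less is left out because it is interactive.

```shell
# Hypothetical 20-line text file, created only for this sketch.
demo=$(mktemp -d)
i=1
while [ "$i" -le 20 ]; do
  echo "line $i"
  i=$((i + 1))
done > "$demo/lines.txt"

cat "$demo/lines.txt"     # every line in the file
head "$demo/lines.txt"    # the first 10 lines (the default)

first3=$(head -n 3 "$demo/lines.txt")   # only the first 3 lines
last2=$(tail -n 2 "$demo/lines.txt")    # only the last 2 lines
echo "$first3"
echo "$last2"
```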

Terminal multiplexer (tmux command)

Any time you have to run a process that will take a very long time, it's a good idea to use the tmux (terminal multiplexer) command to avoid accidentally interrupting the process. Using tmux you can begin a process, "detach" from the session to close the terminal window without killing the process, and then "reattach" to the session later to review the results of your process. tmux also allows you to run several commands simultaneously, one per tmux session.

To launch a new tmux session, simply enter tmux. Whenever the active terminal is a tmux session, a green stripe will be visible along the bottom of the terminal window:

The session number is the value shown on the left side of the stripe. The first session launched will be called 0, and each subsequent session will be assigned the next largest number.

To detach from the session without interrupting an active process or erasing the terminal history, enter Ctrl+b, followed by d. This will return you to the main terminal window. To reattach to a previous tmux session, run tmux attach -t, followed by the session number (remember that the first session will be number 0, not 1). If there is only one session open, tmux attach will attach to it. If you aren't sure whether there are any tmux sessions active, run tmux ls:

To scroll up and down through the terminal output of a tmux session, enter Ctrl+b, followed by [ (left square bracket). This will enable the cursor and allow you to scroll using the arrow keys or page up and page down. Press q to return to the last line of the terminal.

To close a tmux session (and kill whatever processes are running within it!), attach to the session and enter Ctrl+d. This cannot be undone, so before doing this, be sure that your process has finished, and that you have saved whatever terminal output you need.

Redirecting output and command pipelines (stdin and stdout)

You may encounter situations where you need to manipulate or reuse the output from a particular command. For example, the output from a given command may not be formatted the way you need it, or you may want to save the output from a command to a text file that you can review later. The terminal allows you to do this using stdin and stdout.

Every process on the terminal has a standard input and a standard output, called stdin and stdout for short. stdin is the stream of data a process reads while it runs, which by default is whatever is typed at the keyboard. stdout is the stream to which the process writes its results, which by default is the terminal window. Note that the text of a command, such as "ls -1 /dps/david/temp", is passed to the process as arguments rather than through stdin. In the following example, the command is underlined in red, while the stdout is all the text contained within the green box:

You can sometimes manipulate the format and structure of stdout using command options (such as "-1" in this example, which changes the output to a single directory or file per line), but that's still relatively limited, since the available options vary from one command to the next. Most of the time, if you need to manipulate or reuse a command's output, the best approach is to redirect stdout and/or pipe commands.

Redirecting stdout

The simplest way of redirecting stdout is to save it to a text file, which you can open, review, and manipulate using any text editor. To redirect stdout to a text file, use the > (greater than sign), followed by an output path. Think of it like an arrow pointing to an output file. In the following example, the stdout from "ls -1 /dps/david/temp" is saved as "/dps/david/tempcontents.txt":

You can open that file with a text editor and see that its contents look just like the output we'd ordinarily expect the command to produce in the terminal:

This redirection method will always create a new file at the output path you provide. If there is already a file at that location, it will be deleted and replaced with the stdout from your command, so be sure not to redirect to the location of any important existing file.

If you want to append your results to an existing file without overwriting it, use >> instead of >. In the following example, the contents of /dps/david/temp/ are saved to /dps/david/tempcontents.txt (shown in red), then the contents of /dps/david/temp/pdfs/ (shown in green) are added to the end of the same file:
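
Both redirection operators can be sketched with a scratch copy of the directory layout used in the examples above. The files a.txt and b.pdf are hypothetical stand-ins for real content.

```shell
# Hypothetical stand-in for /dps/david/temp, created only for this sketch.
demo=$(mktemp -d)
mkdir -p "$demo/temp/pdfs"
touch "$demo/temp/a.txt" "$demo/temp/pdfs/b.pdf"

# > creates the output file, replacing any existing file at that path.
ls -1 "$demo/temp" > "$demo/tempcontents.txt"

# >> appends to the end of the same file instead of replacing it.
ls -1 "$demo/temp/pdfs" >> "$demo/tempcontents.txt"

contents=$(cat "$demo/tempcontents.txt")
echo "$contents"
```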

Command pipelines

Another option is to redirect stdout straight into another command using a "pipeline" of commands. Pipelines take the stdout from one command and feed it directly into a second command as that command's stdin. Commands are separated using the pipe character ( | ), typed with Shift and the backslash key on a US keyboard layout. Command pipelines proceed from one command to the next from left to right, and there is no limit to the number of commands that can be chained together.

For example, if you wanted to find the first 25 lines in /dps/david/temp/manifests.txt that contain "tif", you could run the following command:

grep tif /dps/david/temp/manifests.txt

If you actually ran this command, however, you'd find that it produced far more than 25 lines. It also takes much longer than necessary, since it has to run through the entire file rather than stopping after 25 matches like you want. You could always save these results as a text file, open the file, and delete everything after the 25th match, but that's still quite time consuming.

A better approach is to combine the grep and head commands using a pipeline to stop the process after the 25th match:

In this example, the stdout from the grep command has been fed directly into the head command that follows it. Ordinarily, when you run head you have to point it to a specific file to be read as an input, but in a command pipeline, you can omit that piece in order to use the previous command's stdout as the input.

Remember that there is no limit to the number of commands you can combine in a pipeline. If you needed to strip these results of the first two columns and leave only the file paths, you could add the cut command:

The cut command receives the stdout from the previous head command and (using the -f3 option) cuts that list down to just the third column; by default, cut treats tabs as the column separator. If you then needed to count the number of characters in these 25 file paths, you could add the wc command:

The wc command receives the stdout from the previous cut command and (using the -c option) counts the number of bytes it contains, which for plain ASCII text is the same as the number of characters. If you wanted to save this final output to a text file, you can always redirect the final stdout using > followed by an output path.
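
The pipeline described above can be built up one stage at a time. The manifest below is a made-up, tab-separated stand-in for /dps/david/temp/manifests.txt, with the file path assumed to be in the third column.

```shell
# Hypothetical manifest: 30 lines ending in .tif plus one .pdf line.
demo=$(mktemp -d)
manifest="$demo/manifests.txt"
i=1
while [ "$i" -le 30 ]; do
  printf 'id%s\t100\t/scans/img%s.tif\n' "$i" "$i"
  i=$((i + 1))
done > "$manifest"
printf 'id31\t100\t/scans/doc31.pdf\n' >> "$manifest"

# Stage 1: keep only the first 25 matching lines.
grep tif "$manifest" | head -n 25

# Stage 2: reduce those lines to the third (tab-separated) column.
paths=$(grep tif "$manifest" | head -n 25 | cut -f3)
echo "$paths"

# Stage 3: count the bytes in those 25 paths.
grep tif "$manifest" | head -n 25 | cut -f3 | wc -c
```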

Looping through a file (while read method)

If you need to open a text file and perform some action using the value stored in each line, you can use the while read method to construct a loop. The loop reads a line from a text file, stores the value of the line, performs some action using the value, and then moves on to the next line to repeat the process. To do this, you will need to provide a "while read" statement, the command(s) you want to run using each line, and the path to the file containing the lines you want to act on.

For example, the file /dps/david/temp/msg_files.txt contains a list of .msg files stored within /dps/david/temp/test:

If you needed to calculate the size in bytes of each one of these files, you could run wc -c on each file individually, or you could integrate that command into a loop:

while read -r line; do wc -c "$line"; done < /dps/david/temp/msg_files.txt

Every loop of this kind is made up of three main elements, separated by semicolons:

  1. A "while read" statement ("while read -r line" in the above example)
    1. "while read -r" tells the terminal to read each line in an input file (that will be named at the end)
    2. "line" is the name of the variable that will be used as a stand-in for the actual value of each line in the file. This variable name is entirely arbitrary, and you can choose whatever variable name seems most logical to you, as long as you reference it correctly in the next part of the loop.
  2. A "do" statement ("do wc -c "$line"" in the above example)
    1. "do" tells the terminal to run a given command for each line in the input file
    2. "wc -c" is the command we have chosen to run in this case. Any terminal command can be incorporated into the loop, using its original options and syntax requirements. You can also build command pipelines and redirect output as part of the loop.
    3. "$line" (a dollar sign followed by the variable name, wrapped in quotes so that paths containing spaces are handled correctly) refers to the variable that was assigned in the first part of the loop, used here as the input for the "wc -c" command. If you changed the variable name to "abc" in the first part, you would change it to "$abc" in this part.
  3. A "done" statement ("done < /dps/david/temp/msg_files.txt" in the above example)
    1. "done" closes the loop
    2. "< /dps/david/temp/msg_files.txt" is the input file containing the lines to be read and stored as a variable in part one, then acted upon in part two. Think of < as an arrow feeding the contents of the text file into the preceding script.

In the above example, the terminal opens the input file indicated at the very end of the script, reads the first line and stores the value found there as "line", runs "wc -c" on that value, outputs the result, then moves on to the value in the next line. It does this until it reaches the end of the input text file, at which point the loop closes.

While the specific command(s) you run using this kind of loop may be different from the above example, all while read loops adhere to this basic structure.
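
To try this structure without touching real data, the sketch below builds its own list file. The names one.txt, two.txt, and file_list.txt are hypothetical. The second loop also shows that the output of a whole loop can be redirected to a file.

```shell
# Hypothetical input files and list file, created only for this sketch.
demo=$(mktemp -d)
printf 'abc\n' > "$demo/one.txt"           # 4 bytes
printf 'hello world\n' > "$demo/two.txt"   # 12 bytes
printf '%s\n' "$demo/one.txt" "$demo/two.txt" > "$demo/file_list.txt"

# Run wc -c once for every path listed in file_list.txt.
while read -r line; do wc -c "$line"; done < "$demo/file_list.txt"

# Same loop, but print only the byte counts and save them to a file.
while read -r line; do wc -c < "$line"; done < "$demo/file_list.txt" > "$demo/sizes.txt"

# Some versions of wc pad their output with spaces, so strip them.
sizes=$(tr -d ' ' < "$demo/sizes.txt")
echo "$sizes"
```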

macOS: Hidden .DS_Store and ._AppleDouble files

macOS creates, among other hidden files, .DS_Store files and sometimes "AppleDouble" files on drives or network shares that are not formatted with an Apple file system.

.DS_Store files are created by macOS Finder and store metadata about folders, including for instance Finder window size/position/configuration. Apple used to provide a technical bulletin to explain the use of these files and how to prevent them from being created on network shares, but the support document has been removed. These files typically can be safely removed.

"AppleDouble" files appear to be duplicates of other files existing in the same folder, prepended with dot and underscore (._). These files contain so-called Resource forks of a file, which are saved as a separate file on non-Apple file systems: https://en.wikipedia.org/wiki/AppleSingle_and_AppleDouble_formats. Exercise care when working with these files, as they might contain metadata that is not included in the Data fork portion of a file.