This page should serve as a reference for the many "things Linux" we use in this course. It is by no means complete – Linux is **huge** – but offers introductions to many important topics.

...

  • Macs and Linux have a Terminal program built-in
  • Windows options:

Use ssh (secure shell) to log in to a remote computer.

Code Block (bash): SSH to a remote computer
# General form:
ssh <user_name>@<full_host_name>

# For example
ssh abattenh@ls6.tacc.utexas.edu

...

  • Just type in any additional text you want
  • To delete text after the cursor, use: Ctrl-d or:
    • Delete key on Windows
    • Function-Delete keys on Macintosh
  • To delete text before the cursor, use: Ctrl-h or:
    • Backspace key on Windows
    • Delete key on Macintosh
  • Use Ctrl-k (kill) to delete everything on the line after the cursor
  • Use Ctrl-y (yank) to paste the last killed text at the cursor

...

  • samtools view converts the binary small.bam file to text and writes alignment record lines one at a time to standard output.
    • -F 0x4 option says to filter out any records where the 0x4 flag bit is 1 (set)
    • since the 0x4 flag bit is set (1) for unmapped records, this says to only report records where the query sequence did map to the reference
  • | head -1000
    • the pipe connects the standard output of samtools view to the standard input of head
    • the -1000 option says to only write the first 1000 lines of input to standard output
  • | cut -f 5
    • the pipe connects the standard output of head to the standard input of cut
    • the -f 5 option says to only write the 5th field of each input line to standard output (input fields are tab-delimited by default)
      • the 5th field of an alignment record is an integer representing the alignment mapping quality
      • the resulting output will have one integer per line (and 1000 lines)
  • | sort -n
    • the pipe connects the standard output of cut to the standard input of sort
    • the -n option says to sort input lines according to numeric sort order
    • the resulting output will be 1000 numeric values, one per line, sorted from lowest to highest
  • | uniq -c
    • the pipe connects the standard output of sort to the standard input of uniq
    • the -c option says to just count groups of adjacent lines with the same value (that's why they must be sorted) and report the total for each group
    • the resulting output will be one line for each group that uniq sees
    • each line will have the text for the group (here the unique mapping quality values) and a count of lines in each group
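
The last three stages of this pipeline can be tried without samtools or a BAM file. The following is a minimal sketch under that assumption: the five tab-delimited records are made up for illustration (they are not real alignment records), with a pretend mapping quality in field 5.

```shell
# Hypothetical stand-in for alignment records: 5 tab-delimited fields,
# with a made-up mapping quality value in field 5.
records=$(printf 'r1\t0\tchr1\t100\t60\nr2\t0\tchr1\t200\t0\nr3\t16\tchr2\t300\t60\nr4\t0\tchr2\t400\t30\nr5\t16\tchr3\t500\t60\n')

# Same final stages as described above:
# keep field 5, sort numerically, then count each group of equal values.
mapq_counts=$(printf '%s\n' "$records" | cut -f 5 | sort -n | uniq -c)
printf '%s\n' "$mapq_counts"
```

With these made-up records, uniq reports three groups: one record with quality 0, one with quality 30, and three with quality 60. The count comes first on each line, followed by the value.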

...

Viewing text in files

cat, more or less

The most basic way to view file data is the cat command. While the name comes from its ability to concatenate one or more files, it can be used to output the contents of a single file. For example:

Code Block
cat ~/.profile

# or, to see line numbers in the output:
cat -n ~/.profile
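
To see the concatenation that gives cat its name, here is a small sketch that uses two throwaway files. The file names and contents are made up for illustration.

```shell
# Create two small sample files in a temporary directory (illustrative only).
tmpdir=$(mktemp -d)
printf 'line A1\nline A2\n' > "$tmpdir/a.txt"
printf 'line B1\n' > "$tmpdir/b.txt"

# With several file arguments, cat writes their contents one after another.
combined=$(cat "$tmpdir/a.txt" "$tmpdir/b.txt")
echo "$combined"

# Clean up the temporary files.
rm -r "$tmpdir"
```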

Using cat by itself is fine for small files, but it writes out everything in the file without stopping. So for larger files, use a pager such as more or less. A pager reads text and outputs only one "page" of text at a time, then waits for you to ask it to advance. A "page" of text is the number of lines that fit in your visible Terminal window.

Using the more pager:

Code Block
more ~/.bashrc
  • Press the spacebar to see the next page.
  • If there is additional output, you'll see the --More-- indicator again; if not, the command prompt appears again.

  • To end the more display, just type q (quit) or Ctrl-c.

Using the less pager:

Code Block
less ~/.bashrc

# to see line numbers in the output:
less -N ~/.bashrc

# to use case-insensitive matching:
less -I ~/.bashrc

Basic navigation in less:

  • Use q to quit less at any time
  • space or Ctrl-f advances one page forward; Ctrl-b goes back one page
  • down arrow goes down (forward) one line; up arrow goes up (backward) one line

Searching in less:

  • /<pattern> – search for <pattern> in forward direction
    • n – goes to the next match of <pattern>
    • N – goes to the previous match of <pattern>
  • ?<pattern> – search for <pattern> in backward direction
    • n – previous match going back
    • N – next match going forward

Introducing grep

Another method of text searching is the grep program, whose name comes from the ed editor command g/re/p (globally search for a regular expression and print). In Unix, the grep program performs regular-expression text searching, and displays lines where the pattern text is found.

Nearly every programming language offers grep functionality, where a pattern you specify – a regular expression or regex – describes how the search is performed. 

There are many grep regular expression metacharacters that control how the search is performed (see the grep command).  

Basic usage is: grep '<pattern>' <file> where

  • '<pattern>' (usually enclosed in single quotes) is the text to search for; in the simplest case it contains only alphanumeric characters (A-Z, a-z, 0-9).

Common options:

  • grep -i will perform a case-insensitive search
  • grep -n will display line numbers where the pattern was matched
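
A minimal sketch of basic usage and these two options, using a small throwaway file. The file name and contents are made up for illustration.

```shell
# Create a small sample file in a temporary directory (contents are illustrative only).
tmpdir=$(mktemp -d)
sample="$tmpdir/sample.txt"
printf 'Hello world\nhello again\nGoodbye\n' > "$sample"

# Case-sensitive search: only the line containing lowercase "hello" matches.
exact=$(grep 'hello' "$sample")

# -i ignores case, so the "Hello" and "hello" lines both match.
nocase=$(grep -i 'hello' "$sample")

# -n prefixes each matching line with its line number and a colon.
numbered=$(grep -n 'Goodbye' "$sample")

printf '%s\n' "$exact" "$nocase" "$numbered"

# Clean up the temporary files.
rm -r "$tmpdir"
```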

head and tail

Two other commands that are useful for viewing text are head and tail.

  • With no options, head shows the first 10 lines of its input and tail shows the last 10 lines.
  • Use the -n option followed by a number to specify how many lines to view
    • or just put the number you want after a dash (e.g. -5 for 5 lines or -1 for 1 line)
  • Use the tail -n +<integer> syntax to display all input starting from that line number

Examples:

Code Block
head ~/.bashrc           # view the 1st 10 file lines
head -n 2 ~/.bashrc      # view the 1st 2 file lines
head -5 ~/.bashrc        # view the 1st 5 file lines

tail ~/.bashrc           # view the last 10 file lines
tail -n 3 ~/.bashrc      # view the last 3 file lines
tail -1 ~/.bashrc        # view the last line of the file

# view 7 lines of text starting at line 20
tail -n +20 ~/.bashrc | head -7

Since head and tail do not have an option to display line numbers, you can pipe in text that includes line numbers with cat -n:

Code Block
cat -n ~/.bashrc | head -4       # view the 1st 4 lines w/line numbers
cat -n ~/.bashrc | tail -5       # view the last 5 lines w/line numbers

# view 6 lines of text starting at line 25
cat -n ~/.bashrc | tail -n +25 | head -6
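
Because the contents of ~/.bashrc differ from one account to another, it can be easier to check these commands against predictable input. The sketch below assumes the seq command is available; seq 1 30 prints the numbers 1 through 30, one per line, so each line's text is also its line number.

```shell
# First 3 lines and last 2 lines of the 30-line input.
first3=$(seq 1 30 | head -3)
last2=$(seq 1 30 | tail -n 2)

# tail -n +20 starts output at line 20; head -7 then keeps 7 lines (20 through 26).
window=$(seq 1 30 | tail -n +20 | head -7)

echo "$first3"
echo "$last2"
echo "$window"
```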

More Linux concepts

Environment variables

Environment variables are just like variables in a programming language (in fact, bash is a complete programming language): they are "pointers" that reference data assigned to them. In bash, you assign an environment variable as shown below:

Code Block (bash): Set an environment variable
export varname="Some value, here it's a string"
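
As a brief sketch of how such a variable can then be used: the name varname and its value are simply the ones from the example above, and the child bash process is only there to illustrate what export does.

```shell
# Same assignment as in the example above; the value is arbitrary.
export varname="Some value, here it's a string"

# Refer to the variable's value by putting $ in front of its name.
echo "$varname"

# Because the variable was exported, a child process (here a new bash) sees it too.
child_view=$(bash -c 'echo "$varname"')
echo "$child_view"
```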

...

Navigation and operations in nano are similar to those we discussed in Command line editing.

You can just type in text, and navigate around using arrow keys (up/down/left/right). A couple of other navigation shortcuts:

...

  • To delete text after the cursor, use Ctrl-d or:
    • Delete key on Windows
    • Function-Delete keys on Macintosh
  • To delete text before the cursor, use Ctrl-h or:
    • Backspace key on Windows
    • Delete key on Macintosh
  • Use Ctrl-k (kill) to delete everything on the line
    • This is different from Ctrl-k on the command line where it deletes everything after the cursor
  • Use Ctrl-u (uncut) to paste the just-killed text at the cursor
    • Recall this operation is Ctrl-y (yank) for command line editing

...

  • Ctrl-x/Ctrl-s - write out the file
  • Ctrl-x/Ctrl-c - exit emacs

You can just type in text, and navigate around using arrow keys. A couple of other navigation shortcuts:

...