This page should serve as a reference for the many "things Linux" we use in this course. It is by no means complete – Linux is **huge** – but offers introductions to many important topics.
...
- Macs and Linux have a Terminal program built-in
- Windows options:
- Windows 10+
- Command Prompt and PowerShell programs have ssh and scp (may require latest Windows updates)
- Start menu → Search for Command
- Putty – http://www.chiark.greenend.org.uk/~sgtatham/putty/download.html
- simple Terminal and file copy programs
- download either the Putty installer or just putty.exe (Terminal) and pscp.exe (secure copy client)
- Windows Subsystem for Linux – Windows 10 Professional includes an Ubuntu-like bash shell
- See https://docs.microsoft.com/en-us/windows/wsl/install-win10
- We recommend the Ubuntu Linux distribution, but any Linux distribution will have an SSH client
Use ssh (secure shell) to log in to a remote computer.
| Code Block |
|---|
# General form:
ssh <user_name>@<full_host_name>
# For example:
ssh abattenh@ls6.tacc.utexas.edu |
...
- samtools view converts the binary small.bam file to text and writes alignment record lines one at a time to standard output.
- -F 0x4 option says to filter out any records where the 0x4 flag bit is 1 (set)
- since the 0x4 flag bit is set (1) for unmapped records, this says to only report records where the query sequence did map to the reference
- | head -1000
- the pipe connects the standard output of samtools view to the standard input of head
- the -1000 option says to only write the first 1000 lines of input to standard output
- | cut -f 5
- the pipe connects the standard output of head to the standard input of cut
- the -f 5 option says to only write the 5th field of each input line to standard output (input fields are tab-delimited by default)
- the 5th field of an alignment record is an integer representing the alignment mapping quality
- the resulting output will have one integer per line (and 1000 lines)
- | sort -n
- the pipe connects the standard output of cut to the standard input of sort
- the -n option says to sort input lines according to numeric sort order
- the resulting output will be 1000 numeric values, one per line, sorted from lowest to highest
- | uniq -c
- the pipe connects the standard output of sort to the standard input of uniq
- the -c option says to count groups of adjacent lines with the same value (that's why they must be sorted) and report the total for each group
- the resulting output will be one line for each group that uniq sees
- each line will have the text for the group (here the unique mapping quality values) and a count of lines in each group
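The cut | sort | uniq -c portion of this pipeline can be tried without a BAM file or samtools. The sketch below is a stand-in, not the course's command: it uses printf to fabricate five tab-delimited records whose 5th field plays the role of the mapping quality. The record names and values are made up for illustration.

```shell
# Made-up, tab-delimited records; field 5 stands in for mapping quality.
records=$(printf 'r1\t0\tchr1\t100\t60\nr2\t0\tchr1\t200\t0\nr3\t16\tchr2\t50\t60\nr4\t0\tchr2\t75\t37\nr5\t16\tchr1\t300\t60\n')

# Keep only field 5, sort numerically, then count each distinct value.
mapq_counts=$(printf '%s\n' "$records" | cut -f 5 | sort -n | uniq -c)
printf '%s\n' "$mapq_counts"
```

Each output line holds a count followed by the value it counts. With this sample data, the value 60 is reported three times, while 0 and 37 are reported once each.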
Viewing text in files
cat, more or less
The most basic way to view file data is the cat command. While the name comes from its ability to concatenate one or more files, it can be used to output the contents of a single file. For example:
| Code Block |
|---|
cat ~/.profile
# or, to see line numbers in the output:
cat -n ~/.profile |
Using cat by itself is fine for small files, but it writes the entire file to the Terminal without stopping. For larger files, use a pager such as more or less. A pager outputs only one "page" of text at a time, then waits for you to ask it to advance. A "page" of text is the number of lines that fit in your visible Terminal window.
Using the more pager:
| Code Block |
|---|
more ~/.bashrc |
- Press the spacebar to see the next page.
If there is additional output, you'll see the --More-- indicator again; if not, the command prompt appears again.
- To end the more display, just type q (quit) or Ctrl-c.
Using the less pager:
| Code Block |
|---|
less ~/.bashrc
# to see line numbers in the output:
less -N ~/.bashrc
# to use case-insensitive matching:
less -I ~/.bashrc |
Basic navigation in less:
- Use q to quit less at any time
- space or Ctrl-f advances one page forward; Ctrl-b goes back one page
- down arrow goes down (forward) one line; up arrow goes up (backward) one line
Searching in less:
- /<pattern> – search for <pattern> in forward direction
- n – goes to the next match of <pattern>
- N – goes to the previous match of <pattern>
- ?<pattern> – search for <pattern> in backward direction
- n – previous match going back
- N – next match going forward
Introducing grep
Another method of text searching is the grep program, whose name comes from the ed editor command g/re/p (globally search for a regular expression and print). In Unix, the grep program performs regular-expression text searching and displays the lines where the pattern text is found.
Nearly every programming language offers grep functionality, where a pattern you specify – a regular expression or regex – describes how the search is performed.
There are many grep regular expression metacharacters that control how the search is performed (see the grep command).
Basic usage is: grep '<pattern>' <file> where
- '<pattern>' (usually enclosed in single quotes), in the simplest case, contains just alphanumeric characters (A-Z, a-z, 0-9).
Common options:
- grep -i will perform a case-insensitive search
- grep -n will display line numbers where the pattern was matched
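As a minimal sketch of these options, the commands below search a small scratch file. The file and its contents are invented for illustration; only grep and its -i and -n options come from the text above.

```shell
# Create a small scratch file to search (contents are made up).
demo_file=$(mktemp)
printf 'Alpha one\nbeta two\nALPHA three\ngamma four\n' > "$demo_file"

# Case-sensitive search: only the line containing "Alpha" matches.
exact=$(grep 'Alpha' "$demo_file")

# -i ignores case, so the "Alpha" and "ALPHA" lines both match.
any_case=$(grep -i 'alpha' "$demo_file")

# -n prefixes each matching line with its line number and a colon.
numbered=$(grep -n -i 'alpha' "$demo_file")

printf '%s\n' "$exact" "$any_case" "$numbered"
rm -f "$demo_file"
```

Options can be combined, as in the last search, where the matches are reported as lines 1 and 3 of the scratch file.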
head and tail
Two other commands that are useful for viewing text are head and tail.
- With no options, head shows the first 10 lines of its input and tail shows the last 10 lines.
- Use the -n option followed by a number to specify how many lines to view
- or just put the number you want after a dash (e.g. -5 for 5 lines or -1 for 1 line)
- use the tail -n +<integer> syntax to display all input starting from that line
Examples:
| Code Block |
|---|
head ~/.bashrc # view the 1st 10 file lines
head -n 2 ~/.bashrc # view the 1st 2 file lines
head -5 ~/.bashrc # view the 1st 5 file lines
tail ~/.bashrc # view the last 10 file lines
tail -n 3 ~/.bashrc # view the last 3 file lines
tail -1 ~/.bashrc # view the last line of the file
# view 7 lines of text starting at line 20
tail -n +20 ~/.bashrc | head -7 |
Since head and tail do not have an option to display line numbers, you can pipe in text that includes line numbers with cat -n:
| Code Block |
|---|
cat -n ~/.bashrc | head -4 # view the 1st 4 lines w/line numbers
cat -n ~/.bashrc | tail -5 # view the last 5 lines w/line numbers
# view 6 lines of text starting at line 25
cat -n ~/.bashrc | tail -n +25 | head -6 |
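The cat -n | tail -n +<start> | head idiom can be wrapped in a small shell function so the starting line and line count become arguments. This is only a sketch: show_lines is a hypothetical helper name, not a standard command, and the sample file is generated here with seq and sed purely for demonstration.

```shell
# show_lines FILE START COUNT
# Hypothetical helper: print COUNT lines of FILE starting at line START,
# each prefixed with its line number by cat -n.
show_lines() {
    cat -n "$1" | tail -n +"$2" | head -n "$3"
}

# Build a 30-line sample file (line N contains the text "line N").
sample=$(mktemp)
seq 1 30 | sed 's/^/line /' > "$sample"

# View 3 lines of text starting at line 25.
range=$(show_lines "$sample" 25 3)
printf '%s\n' "$range"
rm -f "$sample"
```

Because cat -n numbers the lines before tail and head trim them, the numbers shown are the original line numbers in the file (here 25 through 27).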
More Linux concepts
Environment variables
...
Navigation and operations in nano are similar to those we discussed in Command line editing.
You can just type in text, and navigate around using arrow keys (up/down/left/right). A couple of other navigation shortcuts:
...