...

  1. When you type a command, only locations that are in your PATH variable are searched for an executable command matching that name.
  2. When the command is found in your PATH, the computer immediately substitutes the location where it was found for the short command name you entered, and stops searching.
    1. This means that locations early in your PATH are always searched first. In some extreme circumstances, if you add a bunch of locations with lots of files to the front of your PATH, you can actually slow down your entire computer, so try to limit the PATH variable to directories containing executable files.
  3. The module system always assumes that when you load a module, you intend to use it, and thus puts the executables for that module at the front of your PATH. This is one of the reasons we try to limit the number of modules we load by default (it puts other commands further back in the list). 
  4. In your .bashrc file, modules are loaded first (including samtools). 
  5. After modules are loaded, we further manipulate your PATH variable several times. The last section involving breseq has 2 alternative manipulations:
    1. The first which you can see we have commented out:

      No Format
      # export PATH=$BI/breseq/bin:$PATH
      1. This command says make the variable PATH equal to the variable BI plus /breseq/bin and then add on the existing value of $PATH
    2. The second we actually use.

      No Format
      export PATH=$PATH:$BI/breseq/bin
      1. This command says make the variable PATH equal to the existing value of $PATH and then add on BI plus /breseq/bin
  6. Warning
    titleOne of the most important lessons you can ever learn

    Any time you manipulate your PATH variable, make sure that you include $PATH somewhere on the right side of the equals sign, separated from the other locations by a colon (:). It can go before them, after them, or in the middle of 2 different locations. As we are explaining right now, there are reasons and times to put it in different relative places, but if you fail to include it (or include a typo in it by calling it, say, $PTAH), you can actually remove access to all existing commands, including the most basic things like "ls" and "mkdir".
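
The effect of putting $PATH before or after a new location can be tried safely with a small sketch like the one below. It is only an illustration: the temporary directories and the command name demo_tool are hypothetical and are not part of the tutorial setup, and every PATH change happens inside a subshell, so your real PATH is never modified.

```bash
# Build two throwaway directories that each hold a script named demo_tool
# (demo_tool and both directories are made up for this demonstration).
demo_root=$(mktemp -d)
mkdir -p "$demo_root/first/bin" "$demo_root/second/bin"
printf '#!/bin/sh\necho "first copy"\n'  > "$demo_root/first/bin/demo_tool"
printf '#!/bin/sh\necho "second copy"\n' > "$demo_root/second/bin/demo_tool"
chmod +x "$demo_root/first/bin/demo_tool" "$demo_root/second/bin/demo_tool"

# Pretend the "first" copy is already on the PATH.
base_path="$demo_root/first/bin:$PATH"

# Appending the new location: the copy that was already on the PATH is still found first.
appended=$(PATH="$base_path:$demo_root/second/bin"; command -v demo_tool)

# Prepending the new location: the new copy is found first.
prepended=$(PATH="$demo_root/second/bin:$base_path"; command -v demo_tool)

# Leaving the existing PATH out entirely: even mkdir can no longer be found.
lost=$(PATH="$demo_root/second/bin"; command -v mkdir || echo "mkdir not found")

echo "appended  -> $appended"    # path ends in first/bin/demo_tool
echo "prepended -> $prepended"   # path ends in second/bin/demo_tool
echo "no \$PATH  -> $lost"

rm -rf "$demo_root"
```

Because each assignment sits inside $( ... ), the change disappears as soon as that line finishes, which makes this a low-risk way to experiment before editing your .bashrc.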

Expand
titleReload the module version of samtools which is the correct one for the tutorial.

Simply reload samtools using the module system, then check the version and which copy of samtools is now being used.

Code Block
languagebash
titlestill stuck?
collapsetrue
module load samtools
samtools  # check output for version information
which samtools
 
#required output:
/opt/apps/intel18/samtools/1.6/bin/samtools

If this doesn't seem familiar or make sense, get my attention on Zoom.

Warning
titleAs alluded to in the introduction, this tutorial is designed to run (and will actually only run) with one of these 2 versions. That version is the module version. At the end of this tutorial there is an optional suggestion to try to use the bioITeam version of samtools, but for now...

execute the following command and make sure you get the 2nd line as output:

No Format
tacc:~$ which samtools
/opt/apps/intel18/samtools/1.6/bin/samtools

If you see something different, get my attention; otherwise the tutorial will not work.
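
If you would rather have the shell do the comparison for you, something along these lines should work. This is a sketch, not part of the original tutorial: the function name check_tool_path is made up, and it uses the standard command -v lookup in place of which.

```bash
# check_tool_path is a hypothetical helper: it compares where a command
# resolves on your PATH with the location you expect.
check_tool_path() {
    tool=$1
    expected=$2
    # command -v prints the path that would be run; empty if nothing is found
    actual=$(command -v "$tool") || actual=""
    if [ "$actual" = "$expected" ]; then
        echo "OK: $tool resolves to $actual"
    else
        echo "MISMATCH: $tool resolves to '${actual:-nothing}', expected $expected" >&2
        return 1
    fi
}

# The expected path is the one shown in this tutorial.
check_tool_path samtools /opt/apps/intel18/samtools/1.6/bin/samtools ||
    echo "Try reloading the module: module load samtools"
```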

Prepare your directories

...

Tip
titleDoing more with which

A lot of text was just devoted to the $PATH variable, how it's manipulated, and how to investigate it in a few different ways. This is because PATH issues are fairly common, especially when you start experimenting with multiple different programs that may have similar underlying requirements.

Code Block
languagebash
titlewhich has its own -h option that can be used to see what other potentially useful options it has
which -a samtools

When you have multiple results, the top line is the one that will be used unless you specify the entire path to the command. This can be useful in diagnosing what is going wrong, particularly when you have an error message that says a command didn't work and which -a tells you that you have multiple different instances available.
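
To see why the top line wins, it can help to walk the PATH yourself. The sketch below mimics what which -a reports by checking each PATH directory in order; the function name list_all_matches is hypothetical, and the real which may differ in details such as how it treats aliases.

```bash
# list_all_matches is a made-up helper that prints every executable with the
# given name, in the same order the shell searches the PATH.
list_all_matches() (
    # The body runs in a subshell, so the IFS and set -f changes stay local.
    name=$1
    IFS=:      # split $PATH on colons
    set -f     # do not expand wildcards while splitting
    for dir in $PATH; do
        [ -n "$dir" ] || dir=.    # an empty PATH entry means the current directory
        if [ -f "$dir/$name" ] && [ -x "$dir/$name" ]; then
            printf '%s\n' "$dir/$name"
        fi
    done
)

# The first line printed (if any) is the copy the shell would run.
list_all_matches samtools
```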


Prepare your directories

Since the $SCRATCH directory on lonestar is effectively infinite for our purposes, we're going to copy the relevant files from our mapping tutorial into a new directory for this tutorial. This should help you identify what files came from what tutorial if you look back at it in the future. Let's copy over just the read alignment file in the SAM format and the reference genome in FASTA format to a new directory called GVA_samtools_tutorial.

...

First, you need to index the reference file. (This isn't the same as indexing it for read mapping. It's indexing it so that SAMtools can quickly jump to a certain base in the reference.)

...

The following 3 commands are used to convert from SAM to BAM format, sort the BAM file, and index the BAM file. As you might guess, this is computationally intense and as such must be run on an idev node or submitted as a job (more on this on Friday). If you want to submit them to the job queue, you will want to separate them with a ";" to ensure that they run sequentially on the same node rather than simultaneously, as each uses the output of the previous command. Under no circumstances should you run this on the head node.

Warning
titleDo not run on head node

Use showq -u to verify you are still on the idev node. If not, and you need help getting a new idev node, see this tutorial.

Code Block
titleUse this command to restart an idev session if you are not on one
collapsetrue
idev -m 180 -r CCBB_Day_2 -A UT-2015-05-18

Code Block
languagebash
titleCommands to be executed in order...
samtools view -b -S -o SRR030257.bam SRR030257.sam
samtools sort SRR030257.bam -o SRR030257.sorted.bam
samtools index SRR030257.sorted.bam
Tip
This is a really common sequence of commands, so you might want to add it to your personal cheat sheet.
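
For a cheat sheet, the three steps can be wrapped in a small function. This is one possible sketch rather than the tutorial's own method: the name sam_to_sorted_bam is hypothetical, and it chains the commands with && instead of ";" so that a failed step stops the chain instead of passing a missing file to the next command.

```bash
# sam_to_sorted_bam is a made-up convenience wrapper around the 3 commands above.
# It takes the file prefix (for example SRR030257) and expects PREFIX.sam to exist.
sam_to_sorted_bam() {
    prefix=$1
    # Each step uses the output of the previous one; && stops at the first failure.
    samtools view -b -S -o "$prefix.bam" "$prefix.sam" &&
    samtools sort "$prefix.bam" -o "$prefix.sorted.bam" &&
    samtools index "$prefix.sorted.bam"
}

# Usage (on an idev node, never on the head node):
#   sam_to_sorted_bam SRR030257
```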


Examine the output of the previous commands to get an idea of what's going on. Here are some prompts for how to do that:

...