Trimmomatic - GVA2020

Overview

As mentioned in the introduction tutorial as well as the read processing tutorial, read processing can have a huge impact on downstream work. While cutadapt, which was introduced in the read processing tutorial, is great for quick evaluation or for dealing with a single bad sample, it is not as robust as some other trimmers, particularly when it comes to removing sequence that you know shouldn't be present but that may exist in odd orientations (such as adapter sequences from the library preparation).

A note on the adapter file used here

The adapter file listed here is likely the correct one to use for standard library preps generated in the last few years, but it may not be appropriate for all library preps (such as single-end sequencing adapters or Nextera-based preps). Look to both the Trimmomatic documentation and your experimental procedures at the bench to figure out whether the adapter file is sufficient or whether you need to create your own.


Learning objectives:

  1. Install trimmomatic

  2. Set up a small script to work around the annoying java invocation

  3. Remove adapter sequences from some plasmids and evaluate the effect on read quality or assembly.

Installing trimmomatic

Trimmomatic's home page can be found at this link, which includes links to the paper discussing the program and to a user manual. Trimmomatic is far above average as programs go: most will not have a user manual, may not have been updated since they were originally published, etc. This is what makes it such a good tool.

Checking for installation

To check whether the .jar file is already in $HOME/local/bin (a location that is already in your path), try the following java invocation to pull up the help information.
java -jar $HOME/local/bin/trimmomatic-0.39.jar

If the above command works, jump down to the section on making a bash script. Otherwise, continue with the next section to install the program.
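If you would rather test for the file before invoking java, the check can be sketched as a small shell function. The function name check_trimmomatic_jar is made up for this example; the default path is the one used in this tutorial.

```shell
# Hypothetical helper (not part of Trimmomatic): report whether the jar
# is where this tutorial expects it. Pass a different path as $1 if needed.
check_trimmomatic_jar() {
    jar="${1:-$HOME/local/bin/trimmomatic-0.39.jar}"
    if [ -f "$jar" ]; then
        echo "found: $jar"
    else
        echo "missing: $jar"
        return 1
    fi
}

# Usage:
# check_trimmomatic_jar || echo "Continue with the installation section."
```

A "found" message only tells you the file exists; the java invocation above is still the real test that the program runs.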

Installing using wget

In a new web browser window/tab, navigate to the Trimmomatic home page. In the Downloading Trimmomatic section, right-click on the 'binary' link for version 0.39 and copy that link address.

Which to choose: binary files or uncompiled source code?

The binary files will be what you want 100 out of 100 times. That is likely to stay true until you work with a specific program closely enough to identify bugs in it, submit them to the developers, have the developers actually respond (most programs are not in active development) and try to address them, and then get asked to build the program from the source code to check different scenarios.

Use the wget command to download the link you just copied to a new folder named src in your $WORK directory.

Use the mkdir command to create a folder named 'src' inside of your $WORK directory, then move into it.
mkdir $WORK/src
cd $WORK/src

If you already have a src directory, you'll get a very benign error message stating that the folder already exists and thus cannot be created.
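If you want to avoid that message altogether, mkdir's -p option creates the folder only when it is missing and stays silent otherwise. A minimal sketch, wrapped in a function whose name (make_src_dir) is made up for this example:

```shell
# Hypothetical helper: create a directory if needed, then move into it.
# mkdir -p does not complain when the directory already exists.
make_src_dir() {
    mkdir -p "$1" && cd "$1"
}

# Usage in this tutorial would be:
# make_src_dir "$WORK/src"
```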

The wget command is very simple. It has 2 parts: 1. the command 'wget', and 2. the location of the file you want to download.
wget http://www.usadellab.org/cms/uploads/supplementary/Trimmomatic/Trimmomatic-0.39.zip
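If you end up rerunning this step, a small wrapper can skip the download when the file is already in the current folder. This is only a sketch: fetch_if_missing is a made-up name, and the check is for a non-empty file, so it will not notice a partial download.

```shell
# Hypothetical helper: download a URL with wget unless a non-empty file
# with the same name is already in the current directory.
fetch_if_missing() {
    url="$1"
    file="$(basename "$url")"   # e.g. Trimmomatic-0.39.zip
    if [ -s "$file" ]; then
        echo "already downloaded: $file"
    else
        wget "$url"
    fi
}

# Usage:
# fetch_if_missing http://www.usadellab.org/cms/uploads/supplementary/Trimmomatic/Trimmomatic-0.39.zip
```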

You should see a download bar showing you that the file has begun downloading. When it is complete, the ls command will show you a new compressed file named "Trimmomatic-0.39.zip". Next we need to uncompress this file and copy the executable file to a location already in our $PATH variable.

unzip Trimmomatic-0.39.zip
cp Trimmomatic-0.39/trimmomatic-0.39.jar $HOME/local/bin

If you don't see the zip file, or are unable to cd into the Trimmomatic-0.39 directory after unzipping it, let the instructor know.
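A quick way to confirm the unzip step worked is to look for the jar inside the new folder. The sketch below also lists the adapters folder, on the assumption that the 0.39 zip unpacks one next to the jar (it should, but check your own copy); check_unzipped is a made-up name for this example.

```shell
# Hypothetical helper: confirm the unzipped folder holds the jar, then
# list the adapter files that (we assume) ship alongside it.
check_unzipped() {
    dir="${1:-Trimmomatic-0.39}"
    if [ ! -f "$dir/trimmomatic-0.39.jar" ]; then
        echo "jar not found in $dir"
        return 1
    fi
    echo "jar found"
    ls "$dir/adapters"
}

# Usage, run from inside $WORK/src:
# check_unzipped
```

The adapter files listed here are the place to start when deciding whether a provided adapter file fits your library prep.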

To verify that you have placed the .jar file in a location that is already in your path, repeat the java invocation from the installation check to pull up the help information.
java -jar $HOME/local/bin/trimmomatic-0.39.jar

When you compare how wordy and complicated that is to the other programs you have encountered in the course, it makes sense that we would want a simpler way of accessing the program, which is exactly what we will set up next.