...
In order to run breseq, we need to install it. If you think this sounds like a great opportunity to use conda, you are right! Searching https://anaconda.org/ for breseq returns 2 different results. One is in the bioconda channel, which we have worked with multiple times, and has been downloaded >19,000 times. The other is from a contributor with a limited track record, has not been updated in more than 3.5 years, and has only been accessed 6 times. As with apps on your phone, the answer to "which is the correct one" should be rather obvious.
...
language | bash |
---|
title | Check that you have access to breseq |
---|
...
If you try the basic installation command, you might notice the following important lines:
No Format |
---|
The following packages will be UPDATED:
...
openssl 1.0.2u-h7b6447c_0 --> 1.1.1k-h27cfd23_0
...
The following packages will be DOWNGRADED:
samtools 1.9-h10a08f8_12 --> 1.7-1 |
Recall that in our SNV tutorial, we went through a lot of trouble to install samtools version 1.9 and that part of the solution was preventing openssl version 1.1 from installing. Based on this (and our desire to maintain these tools), it is a better idea to set up a new environment with breseq in it as we did for SVDetect.
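One way to preview this kind of conflict before committing to it is conda's --dry-run flag, which prints the install plan without changing the environment. The sketch below is only an illustration: list_downgrades is a hypothetical helper name (not part of conda), and it assumes the plan text looks like the excerpt above, with section headers that start with "The following packages" and package lines that contain "-->".

```shell
# list_downgrades: hypothetical helper, not a conda command.
# Reads a conda install plan on stdin and prints the names of packages
# listed under "The following packages will be DOWNGRADED:".
list_downgrades() {
  awk '
    /will be DOWNGRADED:/ { in_block = 1; next }  # entering the DOWNGRADED section
    /^The following/      { in_block = 0 }        # any other section header ends it
    in_block && /-->/     { print $1 }            # first field is the package name
  '
}

# Usage (assumes conda is on your PATH; --dry-run installs nothing):
#   conda install -c bioconda breseq --dry-run | list_downgrades
```

If samtools shows up in that list while you are in the environment from the SNV tutorial, that is the cue to build a separate environment rather than proceed.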
As breseq is an "all-in-one" tool that works to map, call variants, sift signal from noise, and provide basic visualization, you may think that breseq is the only program you would use for the analysis of appropriate samples. Do not forget that read QC and read processing actually take place before running breseq. Therefore, a full environment (such as the one I use in my own analysis) might contain fastqc, trimmomatic, and breseq. Nicely, all 3 programs are in the bioconda channel. Using what you have learned so far, see if you can create a new environment with these 3 programs.
Code Block |
---|
language | bash |
---|
title | You can name your new environment anything you want. My suggestion would be GVA-breseq, so you remember both that it was part of this class and what is in it. |
---|
collapse | true |
---|
|
conda create --name GVA-breseq -c bioconda fastqc trimmomatic breseq
conda activate GVA-breseq |
Using what you know so far, see if you can figure out which versions of the 3 programs you have installed.
Expand |
---|
title | Click here for expected answer |
---|
|
No Format |
---|
(GVA-breseq) tacc:/scratch/0004/train402/GVA_samtools_tutorial$ fastqc --version
FastQC v0.11.9
(GVA-breseq) tacc:/scratch/0004/train402/GVA_samtools_tutorial$ trimmomatic -version
0.39
(GVA-breseq) tacc:/scratch/0004/train402/GVA_samtools_tutorial$ breseq --version
breseq 0.35.7
|
|
breseq should now run using the breseq command. Running breseq without any options will show you what the command expects.
...
Code Block |
---|
language | bash |
---|
title | By this point in the course you should not need to expand this box to see the suggested solution. You should continue expanding boxes such as this to make sure you are not drifting too far |
---|
collapse | true |
---|
|
mkdir $SCRATCH/GVA_breseq_lambda_mixed_pop
cp $BI/ngs_course/lambda_mixed_pop/data/* $SCRATCH/GVA_breseq_lambda_mixed_pop |
Note |
---|
title | Possible errors on idev nodes |
---|
| As mentioned over Zoom, this is one instance where I know for sure that copying these files while on an idev node may not work, giving an Input/Output error. If you are already in an idev session and this does not work, just use the logout command to exit the idev session and retry the copy command. If both methods fail, please get my attention so we can figure out what is going on. By a similar token, if you actually are on an idev node and are able to transfer the files, please let me know, as it may help figure out what the real source of the error is. |
As mentioned yesterday, you cannot copy from the BioITeam (because it is on corral-repl) while on an idev node. Log out of your idev session, copy the files, and be sure to start a new idev session, as breseq should not be run on the head node. |
Code Block |
---|
title | Now use the ls command to see what files were copied. Again, you should not need to expand this to get the output listed below. |
---|
collapse | true |
---|
|
ls $SCRATCH/GVA_breseq_lambda_mixed_pop
|
...
Because this data set is relatively small (roughly 100x coverage of a 48,000 bp genome), a breseq run will take < 5 minutes. It is still computationally intense enough that it should not be run on the head node, especially since we have reservations and there's no reason not to use them.
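The coverage figure quoted above is simple arithmetic: mean coverage is the total number of sequenced bases divided by the genome length, so roughly 100x over 48,000 bp corresponds to about 4.8 million bases of reads. As a rough sketch, the calculation could be done directly from the FASTQ file. Here estimate_coverage is a hypothetical helper name, and it assumes a standard FASTQ with exactly 4 lines per read.

```shell
# estimate_coverage: hypothetical helper.
#   $1 = FASTQ file (assumed to use 4 lines per record)
#   $2 = genome length in bp
# Prints mean coverage = total sequenced bases / genome length.
estimate_coverage() {
  awk -v g="$2" '
    NR % 4 == 2 { bases += length($0) }         # line 2 of each record is the sequence
    END         { printf "%.1f\n", bases / g }  # mean depth across the genome
  ' "$1"
}

# Example call, using the approximate genome size quoted above:
#   estimate_coverage lambda_mixed_population.fastq 48000
```

This ignores reads that fail to map, so treat the result as an upper estimate of the coverage breseq will actually report.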
Warning |
---|
title | Remember to make sure you are on an idev node |
---|
|
For reasons discussed numerous times throughout the course already, please be sure you are on an idev node. Remember that the hostname command and showq -u can be used to check whether you are on one of the login nodes or one of the compute nodes. If you need more information or help re-launching an idev session, please see this tutorial. |
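The hostname check can be wrapped in a small guard, sketched below. This is an assumption-laden illustration rather than an official TACC tool: is_login_node is a hypothetical helper, and it assumes that login-node hostnames begin with "login". Compare against what hostname actually prints on your system before relying on it.

```shell
# is_login_node: hypothetical helper.
# ASSUMPTION: login (head) node hostnames start with "login".
is_login_node() {
  case "$1" in
    login*) return 0 ;;  # looks like a login node
    *)      return 1 ;;  # anything else is treated as a compute node
  esac
}

if is_login_node "$(hostname)"; then
  echo "On a login node: start an idev session before running breseq."
else
  echo "Not on a login node: OK to run breseq here."
fi
```

The same guard works in reverse for the copy step earlier, where you want to be on a login node rather than in an idev session.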
Code Block |
---|
language | bash |
---|
title | breseq command |
---|
|
cd $SCRATCH/GVA_breseq_lambda_mixed_pop
breseq -r lambda.gbk lambda_mixed_population.fastq
|
...