Versions Compared

Key

  • This line was added.
  • This line was removed.
  • Formatting was changed.

...

File Name

Description

Sample

SRR030252_1.fastq SRR030252_2.fastq

Paired-end Illumina 36-bp reads

0K generation evolved E. coli strain

SRR030253_1.fastq SRR030253_2.fastq

Paired-end Illumina 36-bp reads

2K generation evolved E. coli strain

SRR030254_1.fastq SRR030254_2.fastq

Paired-end Illumina 36-bp reads

5K generation evolved E. coli strain

SRR030255_1.fastq SRR030255_2.fastq

Paired-end Illumina 36-bp reads

10K generation evolved E. coli strain

SRR030256_1.fastq SRR030256_2.fastq

Paired-end Illumina 36-bp reads

15K generation evolved E. coli strain

SRR030257_1.fastq SRR030257_2.fastq

Paired-end Illumina 36-bp reads

20K generation evolved E. coli strain

SRR030258_1.fastq SRR030258_2.fastq

Paired-end Illumina 36-bp reads

40K generation evolved E. coli strain

NC_012967.1.fastafasta NC_012967.1.gbk


Reference Genome

E. coli B str. REL606

...

And its parts can be explained as follows:

partpurpose
-j 9use 9 processors/threads speeding things up by less than a factor of 9 as discussed in earlier tutorials 
-o run_output/<xx>kdirects all output to the run_output directory, AND creates a new directory with 2 digits (<XX>) followed by a K for individual samples data. If we don't include the second part referencing the individual sample, breseq would write the output from all of the runs on top of one other. The program will undoubtedly get confused, possibly crash, but definitely not be analyzable
<ID>this is just used to denote read1 and or read2 ... note that in our acutal commands they reference the fastq files, and are supplied without an option
&> logs/<XX>00K.log.txtRedirect both the standard output and the standard error streams to a file called <XX>00k.log.txt. and put that file in a directory named logs. The &> are telling the command line to send the streams to that file.
Info
titleWhy do we use -j 9?
  1. Each stampede2 compute node has 68 processors available.
  2. We have 7 samples to run, so by requesting 9 processors, we allow all 7 samples to start at the same time leaving 5 unused processors.
  3. If we had requested 10 processors for each sample, only 6 samples would start initially and the 7th would start after the first finishes.

...

Again while in nano you will edit most of the same lines you edited in the in the breseq tutorial. Note that most of these lines have additional text to the right of the line. This commented text is present to help remind you what goes on each line, leaving it alone will not hurt anything, removing it may make it more difficult for you to remember what the purpose of the line is

Line numberAs isTo be
16

#SBATCH -J jobName

#SBATCH -J mutli_sample_breseq
17

#SBATCH -n 1

#SBATCH -n 7

21

#SBATCH -t 12:00:00

#SBATCH -t 6:00:00

22

##SBATCH --mail-user=ADD

#SBATCH --mail-user=<YourEmailAddress>

23

##SBATCH --mail-type=all

#SBATCH --mail-type=all

27

conda activate GVA2021

conda activate GVA-breseq

31

export LAUNCHER_JOB_FILE=commands

export LAUNCHER_JOB_FILE=breseq_commands

The changes to lines 22 and 23 are optional but will give you an idea of what types of email you could expect from TACC if you choose to use these options. Just be sure to pay attention to these 2 lines starting with a single # symbol after editing them.

...

Code Block
languagebash
titlesubmit the job to run on the que
sbatch fastqcbreseq.slurm


Evaluating the run

Generic information about checking the status of a run in the job que system can be found in our earlier tutorial: Stampede2 Breseq Tutorial GVA2021#Checkingyoursubmittedjob, but that has to do with TACC running the command, not with breseq finishing the analysis and how to verify it completed successfully. As noted above, our commands included "&>" followed by a file name so that all of the information that would normally print to the screen, instead printed to a log file. You can use the less command to evaluate any of the log files to see the information that would have been printed to the screen if we had run the commands interactively on an idev node. As you may have noticed from the earlier breseq runs, the final line of the breseq run is "+++ SUCCESSFULLY COMPLETED" when things have finished without errors. Since we have 7 samples, it is useful to check for this all at once:

...