
...

File Name	Description	Sample
`SRR030252_1.fastq SRR030252_2.fastq`	Paired-end Illumina 36-bp reads	0K generation evolved E. coli strain
`SRR030253_1.fastq SRR030253_2.fastq`	Paired-end Illumina 36-bp reads	2K generation evolved E. coli strain
`SRR030254_1.fastq SRR030254_2.fastq`	Paired-end Illumina 36-bp reads	5K generation evolved E. coli strain
`SRR030255_1.fastq SRR030255_2.fastq`	Paired-end Illumina 36-bp reads	10K generation evolved E. coli strain
`SRR030256_1.fastq SRR030256_2.fastq`	Paired-end Illumina 36-bp reads	15K generation evolved E. coli strain
`SRR030257_1.fastq SRR030257_2.fastq`	Paired-end Illumina 36-bp reads	20K generation evolved E. coli strain
`SRR030258_1.fastq SRR030258_2.fastq`	Paired-end Illumina 36-bp reads	40K generation evolved E. coli strain
`NC_012967.1.fasta NC_012967.1.gbk`	Reference Genome	E. coli B str. REL606

...

And its parts can be explained as follows:

part	purpose
-j 9	Use 9 processors/threads. This speeds things up, though by less than a factor of 9, as discussed in earlier tutorials.
-o run_output/<XX>K	Directs all output to the run_output directory and creates a new subdirectory inside it, named with 2 digits (<XX>) followed by a K, for each sample's data. Without this sample-specific part, breseq would write the output from all of the runs on top of one another. The program would likely get confused and possibly crash, and the results would not be analyzable.
<ID>	Denotes read 1 and/or read 2. Note that in our actual commands these are the fastq file names, and they are supplied without an option.
&> logs/<XX>00K.log.txt	Redirects both the standard output and the standard error streams to a file called <XX>00K.log.txt and puts that file in a directory named logs. The &> tells the command line to send both streams to that file.
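To make the parts in the table concrete, here is a minimal sketch that writes one breseq command per sample into a breseq_commands file. The SRR accessions and reference file come from the file table above. The output labels (00K, 02K, ...), the log file names, and the choice of `-r NC_012967.1.gbk` as the reference are assumptions made for illustration, so adjust them to match your actual commands.

```bash
# Sketch only: the labels (00K, 02K, ...) and log names are assumed, not taken from the tutorial.
# Each entry pairs an SRR accession with the label used for its output directory.
samples="SRR030252:00K SRR030253:02K SRR030254:05K SRR030255:10K SRR030256:15K SRR030257:20K SRR030258:40K"

# Make sure the log directory exists before any command redirects into it.
mkdir -p run_output logs

# Start with an empty commands file, then append one line per sample.
: > breseq_commands
for pair in $samples; do
  id=${pair%%:*}      # text before the colon, e.g. SRR030252
  label=${pair##*:}   # text after the colon, e.g. 00K
  printf '%s\n' "breseq -j 9 -r NC_012967.1.gbk -o run_output/${label} ${id}_1.fastq ${id}_2.fastq &> logs/${label}.log.txt" >> breseq_commands
done

# Show the generated commands for a visual check.
cat breseq_commands
```

Because every line gets its own run_output subdirectory and its own log file, no two runs write to the same place.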

Info

title	Why do we use -j 9?

Each stampede2 compute node has 68 processors available.
We have 7 samples to run, so by requesting 9 processors for each sample, we allow all 7 samples to start at the same time (7 x 9 = 63), leaving 5 unused processors.
If we had requested 10 processors for each sample, only 6 samples would start initially and the 7th would start after the first finishes.
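The arithmetic behind this choice can be sketched in a few lines of shell. The function name `concurrent_runs` is made up for this example. The 68 processors and 7 samples are the numbers given above.

```bash
# Sketch: how many breseq runs can start at once on one node for a given -j value.
cores=68     # processors per compute node, as stated above
samples=7    # number of samples to run

concurrent_runs() {
  threads=$1
  fit=$(( cores / threads ))            # whole runs that fit on the node
  if [ "$fit" -gt "$samples" ]; then    # never more runs than samples
    fit=$samples
  fi
  idle=$(( cores - fit * threads ))     # processors left unused
  printf '%s\n' "-j ${threads}: ${fit} runs start together, ${idle} processors idle"
}

concurrent_runs 9    # -j 9: 7 runs start together, 5 processors idle
concurrent_runs 10   # -j 10: 6 runs start together, 8 processors idle
```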

...

Again, while in nano you will edit most of the same lines you edited in the breseq tutorial. Note that most of these lines have additional text to the right of the line. This commented text is there to remind you what goes on each line: leaving it alone will not hurt anything, while removing it may make it harder to remember the purpose of the line.

Line number	As is	To be
16	#SBATCH -J jobName	#SBATCH -J multi_sample_breseq
17	#SBATCH -n 1	#SBATCH -n 7
21	#SBATCH -t 12:00:00	#SBATCH -t 6:00:00
22	##SBATCH --mail-user=ADD	#SBATCH --mail-user=<YourEmailAddress>
23	##SBATCH --mail-type=all	#SBATCH --mail-type=all
27	conda activate GVA2021	conda activate GVA-breseq
31	export LAUNCHER_JOB_FILE=commands	export LAUNCHER_JOB_FILE=breseq_commands

The changes to lines 22 and 23 are optional, but they will give you an idea of what types of email you can expect from TACC if you choose to use these options. If you do edit them, make sure both lines start with a single # symbol afterwards.
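One way to double-check the edited lines is to list any #SBATCH directives that still begin with two # symbols, since those are treated as ordinary comments rather than directives. The helper name `disabled_sbatch_lines` is made up for this sketch, and the file name is whatever you called your own slurm file.

```bash
# Sketch: print (with line numbers) the SBATCH lines that still start with "##".
# Pass the name of your slurm file as the only argument.
disabled_sbatch_lines() {
  grep -n '^##SBATCH' "$1"
}

# Example usage (replace the placeholder with your own file name):
#   disabled_sbatch_lines <your_file>.slurm
```

If you chose to enable the email options, lines 22 and 23 should no longer appear in the output.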

...

Code Block

language	bash
title	submit the job to run on the queue

sbatch fastqcbreseq.slurm

Evaluating the run

Generic information about checking the status of a run in the job queue system can be found in our earlier tutorial: Stampede2 Breseq Tutorial GVA2021#Checkingyoursubmittedjob. That tutorial covers whether TACC has run the command, not whether breseq finished the analysis or how to verify that it completed successfully. As noted above, our commands included "&>" followed by a file name, so all of the information that would normally print to the screen was instead written to a log file. You can use the less command to view any of the log files and see what would have been printed to the screen if we had run the commands interactively on an idev node. As you may have noticed from the earlier breseq runs, the final line of a breseq run is "+++ SUCCESSFULLY COMPLETED" when things have finished without errors. Since we have 7 samples, it is useful to check for this all at once:

...
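A complementary check, sketched here on the assumption that the logs sit in a logs directory and end in .log.txt, is to list only the log files that do not contain the success message. The helper name `incomplete_logs` is made up for this example. It relies on grep's `-L` option, which prints the names of files with no matching line.

```bash
# Sketch: list log files that lack the breseq success message.
# The optional argument is the log directory and defaults to "logs".
incomplete_logs() {
  grep -L "SUCCESSFULLY COMPLETED" "${1:-logs}"/*.log.txt
}

# Example usage from the directory that holds logs/:
#   incomplete_logs
```

An empty result means every log contains the success line. Any file that is listed is worth opening with less to see where the run stopped.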
