Introduction To TACC Short Course Cheats

Update Your Profile

On Lonestar:

echo 'source /corral-repl/utexas/BioITeam/bin/profile_ngs_course.bash' >> ~/.profile

source ~/.profile

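This changes your PATH to give you access to databases and custom scripts set up by members of the BioITeam at UT. Note that the echo command uses a double >>, not a single >. This is important: >> appends the line to ~/.profile, while > would overwrite the whole file.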
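Running the echo line twice adds the source line twice. A minimal sketch of a guard against that is below; append_once is a hypothetical helper name, not something the BioITeam profile provides.

```shell
# Hypothetical helper: append a line to a file only if that exact line
# is not already there.
# $1 = line to add, $2 = file to add it to
append_once() {
    line=$1
    file=$2
    touch "$file"    # make sure the file exists before grep reads it
    # -q: quiet, -x: match the whole line, -F: treat the pattern as a fixed string
    if ! grep -qxF -- "$line" "$file"; then
        printf '%s\n' "$line" >> "$file"    # >> appends, it never overwrites
    fi
}
```

For this course it would be called as: append_once 'source /corral-repl/utexas/BioITeam/bin/profile_ngs_course.bash' ~/.profile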

 

A Parametric Job Example

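Prepare a directory to run the job in, and split each short read file into chunks of 1000 lines (250 reads, since a FASTQ record is four lines). The later steps assume this produces four chunks per file, numbered 00 through 03.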

  • mkdir BWA_Example
  • cd BWA_Example
  • cp /work/01863/benni/IntroToTacc/BWA_Example/SRR* .
  • split -d -l 1000 SRR1580546_1.fastq reads_R1-
  • split -d -l 1000 SRR1580546_2.fastq reads_R2-
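Before going on, it can help to confirm that the split produced the chunks you expect. The sketch below counts the lines in each chunk and complains if a chunk does not hold whole FASTQ records; check_chunks is a hypothetical helper name, and the four-lines-per-record rule assumes unwrapped FASTQ.

```shell
# Hypothetical helper: print the line count of each chunk and return
# non-zero if any chunk is not a multiple of 4 lines (a FASTQ record
# is assumed to be exactly 4 lines).
check_chunks() {
    status=0
    for chunk in "$@"; do
        # tr strips the padding some wc implementations put before the number
        lines=$(wc -l < "$chunk" | tr -d ' ')
        echo "$chunk: $lines lines, $((lines / 4)) reads"
        if [ $((lines % 4)) -ne 0 ]; then
            echo "warning: $chunk ends mid-record" >&2
            status=1
        fi
    done
    return $status
}
```

In the BWA_Example directory it would be called as: check_chunks reads_R1-* reads_R2-*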

 

Create a new text file for a list of commands to run:

nano bwa_commands.txt

Copy the 4 lines below into your editing window:

bwa mem /corral-repl/utexas/BioITeam/tmp/benni/hg19_ref_index/Homo_sapiens.GRCh37.60.dna.fasta reads_R1-00 reads_R2-00 > test00.sam

bwa mem /corral-repl/utexas/BioITeam/tmp/benni/hg19_ref_index/Homo_sapiens.GRCh37.60.dna.fasta reads_R1-01 reads_R2-01 > test01.sam

bwa mem /corral-repl/utexas/BioITeam/tmp/benni/hg19_ref_index/Homo_sapiens.GRCh37.60.dna.fasta reads_R1-02 reads_R2-02 > test02.sam

bwa mem /corral-repl/utexas/BioITeam/tmp/benni/hg19_ref_index/Homo_sapiens.GRCh37.60.dna.fasta reads_R1-03 reads_R2-03 > test03.sam

Then exit nano using ^X (Ctrl-X), making sure to save the file.
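If you would rather not copy and paste, the same four lines can be generated with a loop. This is only a sketch: make_bwa_commands is a hypothetical helper name, and the chunk suffixes 00 to 03 are hard-coded on the assumption that split produced exactly four chunks per file.

```shell
# Hypothetical helper: write one "bwa mem" command per chunk pair.
# $1 = path to the reference FASTA, $2 = commands file to write
make_bwa_commands() {
    ref=$1
    out=$2
    : > "$out"    # start from an empty file
    for i in 00 01 02 03; do
        # The > inside the format string is written literally into the
        # commands file; it is not a redirection performed here.
        printf 'bwa mem %s reads_R1-%s reads_R2-%s > test%s.sam\n' \
            "$ref" "$i" "$i" "$i" >> "$out"
    done
}
```

It would be called as: make_bwa_commands /corral-repl/utexas/BioITeam/tmp/benni/hg19_ref_index/Homo_sapiens.GRCh37.60.dna.fasta bwa_commands.txt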

 

Generate the launcher script with:

launcher_creator.py -a DNAdenovo -n bwa_test -t 00:10:00 -q normal -m 'module load bwa/0.7.7' -w 2 -j bwa_commands.txt

and submit it using qsub:

qsub bwa_test.sge

Check that it's in the queue:

qstat

 

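When the job finishes, you should have four SAM files that contain the alignment information: test00.sam through test03.sam. Each one starts with header lines (lines beginning with @). To combine the files using cat, first get rid of the headers. (The proper way of combining them would be to use samtools sort and merge, but that's just barely beyond our scope today.)

awk '! /^@/' test00.sam > test_noheader-00.sam

Repeat the command for test01.sam, test02.sam, and test03.sam, then concatenate the results (this prints to the terminal; redirect with > to save the output to a file):

cat test_noheader*.sam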
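The header stripping and the concatenation can also be done in one pass over all the SAM files. A minimal sketch follows; combine_sam_bodies and the output name combined.sam are hypothetical, chosen only for illustration.

```shell
# Hypothetical helper: strip the header lines (those starting with @)
# from every input SAM file and concatenate what remains.
# $1 = output file, remaining arguments = input SAM files
combine_sam_bodies() {
    out=$1
    shift
    : > "$out"    # start from an empty output file
    for sam in "$@"; do
        awk '! /^@/' "$sam" >> "$out"    # keep only the alignment lines
    done
}
```

It would be called as: combine_sam_bodies combined.sam test0*.sam. The result has no header at all, so tools that expect one may not accept it; that is one reason samtools merge is the proper route.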