
Code Block
languagebash
titleHow to display all available options of the launcher_creator.py script
collapsetrue
launcher_creator.py -h
Short option | Long option | Required | Description
-n | name | Yes | The name of the job.
-t | time | Yes | Time allotment for the job; format must be hh:mm:ss.
-b | Bash commands | -b OR -j must be used | String of Bash commands to execute.
-j | Command list | -b OR -j must be used | Filename of a list of commands to be distributed to the nodes.
-a | allocation | Yes, if you have more than one | The allocation you want to charge the run to. If you only have one allocation you don't need this option.
-q | queue | Default: Development | The queue to submit to, like 'normal' or 'largemem', etc. You will usually want to change this to 'normal'.
-w | wayness | Optional | The number of jobs in a job list you want to give to each node. (Default is 12 for Lonestar, 16 for Stampede.)
-N | number of nodes | Optional | Specifies a certain number of nodes to use. You probably don't need this option, as the launcher calculates how many nodes you need based on the job list (or Bash command string) you submit. It sometimes comes in handy when writing pipelines.
-m | modules | Optional | String of module management commands. 'module load launcher' is always in the launcher, so there's no need to include that. Think of all the times in class that you had to type 'module load xxxxx' while on the idev node; the same is true for the launcher script. As you become more familiar with the types of analysis you will be doing, you will likely change your .bashrc file to limit the things you have to specify here.
-e | email | Optional | Your email address if you want to receive an email from Lonestar when your job starts and ends. If you set the environment variable EMAIL_ADDRESS, it will be used when you don't put anything after -e.
-l | launcher | Optional | Filename of the launcher. (Default is <name>.sge)
-s | stdout | Optional | Setting this flag outputs the name of the launcher to stdout.
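As a rough sketch of how wayness relates to node count, if each node takes `wayness` jobs, the launcher needs about the ceiling of jobs/wayness nodes. The arithmetic below is an illustration of that relationship, not the actual launcher_creator.py code:

```bash
# Illustrative arithmetic only -- not the real launcher_creator.py logic.
jobs=31       # e.g. the number of lines in your commands file
wayness=12    # jobs per node (the Lonestar default)
# integer ceiling division: (jobs + wayness - 1) / wayness
nodes=$(( (jobs + wayness - 1) / wayness ))
echo "$jobs jobs at wayness $wayness need about $nodes nodes"
```

With 31 jobs and the default wayness of 12, this estimates 3 nodes, which is why -N is usually unnecessary: the launcher can work this out from the job list itself.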


Code Block
languagebash
titleHow to make a sample commands file
collapsetrue
# remember that things after the # sign are ignored by bash and most other programming languages
cds  # move to your scratch directory
nano commands
 
# the following lines should be entered into nano
echo "My name is _____ and today's date is:" > BDIB.output.txt
date >> BDIB.output.txt
echo "I have just demonstrated that I know how to redirect output to a new file, and to append things to an already created file. Or at least that's what I think I did" >> BDIB.output.txt
 
echo "I'm going to test this by counting the number of lines in the file that I am writing to. So if the next line reads 4 I know I'm on the right track" >> BDIB.output.txt
wc -l BDIB.output.txt >> BDIB.output.txt
 
echo "I know that normally I would be typing commands on each line of this file, to be executed on a compute node instead of the head node so that my programs run faster, in parallel, and do not slow down others or risk my TACC account being locked" >> BDIB.output.txt
 
echo "I'm currently in my scratch directory on Lonestar. There are 2 main ways of getting here: cds and cd \$SCRATCH:" >> BDIB.output.txt
pwd >> BDIB.output.txt
 
echo "Over the last week I've conducted multiple different types of analysis on a variety of sample types and under different conditions. Each of the exercises was taken from the website https://wikis.utexas.edu/display/bioiteam/Genome+Variant+Analysis+Course+20162017" >> BDIB.output.txt
 
echo "Using the ls command I'm now going to try to remind you (my future self) what tutorials I did" >> BDIB.output.txt
 
ls -1 >> BDIB.output.txt
 
echo "The contents of those directories (representing the data I downloaded and the work I did) are as follows:" >> BDIB.output.txt
ls */* >> BDIB.output.txt
 
echo "The commands that I have run on the head node are:" >> BDIB.output.txt
history >> BDIB.output.txt
 
echo "The contents of this, my commands file, which I will use in the launcher_creator.py script, are:" >> BDIB.output.txt
cat commands >> BDIB.output.txt
 
echo "Finally, I will generate a job .slurm file using the launcher_creator.py script with the following command:" >> BDIB.output.txt
echo 'launcher_creator.py -w 1 -N 1 -n "what_i_did_at_BDIB_2017" -t 00:02:00 -a "UT-2015-05-18"' >> BDIB.output.txt  # this creates a what_i_did_at_BDIB_2017.slurm file for a job that will run for 2 minutes
echo "And I will send this job to the queue using the command: sbatch what_i_did_at_BDIB_2017.slurm" >> BDIB.output.txt  # this actually submits the job to the queue manager and, if everything has gone right, adds it to the development queue

 
ctrl-o  # keyboard command to write your nano output
ctrl-x  # keyboard command to close the nano interface
 
wc -l commands  # verify the number of lines in your commands file; expect 31. If you get a much larger number, edit the file with nano so each command is on a single line as they appear above.
launcher_creator.py -w 1 -N 1 -n "what_i_did_at_BDIB_2017" -t 00:02:00 -a "UT-2015-05-18"
sbatch what_i_did_at_BDIB_2017.slurm
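The > and >> redirection used throughout the commands file can be tried on any machine with a throwaway file (demo.txt here is an arbitrary name for illustration):

```bash
# '>' creates (or truncates) a file; '>>' appends to an existing one
echo "first line"  > demo.txt   # file now has 1 line
echo "second line" >> demo.txt  # file now has 2 lines
wc -l < demo.txt                # prints 2
rm demo.txt                     # clean up
```

If the second echo had used > instead of >>, demo.txt would contain only "second line", which is exactly the mistake the commands file is guarding against with its line-count check.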

Interrogating the launcher queue

Here are some of the common commands that you can run and what they will do or tell you:

Command | Purpose | Output(s)
showq -u | Shows only your jobs | Lists all of your currently submitted jobs; a state of "qw" means the job is still queued and has not run yet, while "r" means it is currently running.
scancel <job-ID> | Deletes a submitted job before it is finished running (note: you can only get the job-ID by using showq -u) | There is no confirmation here, so be sure you are deleting the correct job. There is nothing worse than accidentally deleting a job that has sat in the queue a long time because you forgot something on a job you just submitted.
showq | You are a nosy person and want to see everyone who has submitted a job | Typically a huge list of jobs, and not actually very informative.

If the queue is moving very quickly you may not see much output, but don't worry, there will be plenty of opportunity once you are working on your own data.

 

Evaluating your first job submission

Based on our example you may have expected 1 new file to be created during the job submission (BDIB.output.txt), but you will also find 2 extra files: what_i_did.e(job-ID) and what_i_did.o(job-ID). When things have worked well, these files are typically ignored; when your job fails, they offer insight into why, so you can fix things and resubmit.

Many times while working with NGS data you will find yourself with intermediate files. Two of the more difficult challenges of analysis are deciding which files you want to keep and remembering what each intermediate file represents. Your commands files can serve as a quick reminder of what you did so you can always go back and reproduce the data, and arbitrary file endings (.output in this case) can remind you what type of file you are looking at. Since we've learned that the scratch directory is not backed up and is purged, see if you can turn your intermediate files into a single final file using the cat command, then copy the new final file, the .slurm file you created, and the 2 extra files to work. This way you should be able to come back and regenerate all the intermediate files if needed, and also see your final product.

Code Block
languagebash
titleMake a single final file using the cat command and copy it to a useful work directory
collapsetrue
# remember that things after the # sign are ignored by bash
cat BDIB.output.txt > end_of_class_job_submission.final.output
mkdir $WORK/BDIB_GVA_2017
mkdir $WORK/BDIB_GVA_2017/end_of_course_summary/  # each directory must be made in order to avoid getting a "no such file or directory" error
cp end_of_class_job_submission.final.output $WORK/BDIB_GVA_2017/end_of_course_summary/
cp what_i_did* $WORK/BDIB_GVA_2017/end_of_course_summary/  # grabs the 2 output files TACC generated about your job run, as well as the .slurm file you created to tell it how to run your commands file
cp commands $WORK/BDIB_GVA_2017/end_of_course_summary/
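As an aside on why each directory above must be made in order: plain mkdir creates only one level at a time and fails if the parent does not exist, while mkdir -p (not used in the block above, but standard) creates the whole path at once. A throwaway sketch with arbitrary names:

```bash
# mkdir -p creates parent and child directories in one step
mkdir -p demo_dir/summary
echo "combined results" > final.txt   # stand-in for a cat-combined final file
cp final.txt demo_dir/summary/        # copy it into the nested directory
ls demo_dir/summary                   # prints final.txt
rm -r demo_dir final.txt              # clean up
```

Either approach works; the tutorial's two-step mkdir just makes the directory structure explicit.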


 

Return to GVA2017 to work on any additional tutorials you are interested in.