...

TACC resources are partitioned into queues: named sets of compute nodes with different characteristics. The main ones on ls6 are listed below. Generally you use development (-q development) when you first test your code, then normal once you're sure your commands will execute properly.

...

  • When you run a batch job, your project allocation gets "charged" for the time your job runs, in the currency of SUs (Service Units).
  • SUs are based on node hours; usually 1 SU = 1 "standard" node hour.

Tip: Job tasks should have similar expected runtimes

Jobs should consist of tasks that will run for approximately the same length of time. This is because every node in your job stays allocated until the longest-running task (the one that finishes last) completes, so the total node hours for your job are the number of nodes multiplied by that task's run time.

For example, if you specify 100 commands, each on its own node, and 99 finish in 2 seconds but one runs for 24 hours, you'll be charged for 100 x 24 node hours even though the total amount of work performed was only ~24 hours.
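The arithmetic in this example can be sketched in a few lines of shell. This is an illustration only: charged_node_hours is a hypothetical helper (not a TACC tool), and it assumes each command runs on its own node, run times in whole hours, and 1 SU = 1 node hour.

```bash
# Hypothetical helper: node hours charged = number of nodes x run time (hours)
# of the longest-running task. Assumes one task per node and whole hours.
charged_node_hours() {
  echo $(( $1 * $2 ))
}

# 100 nodes, longest task runs 24 hours
charged_node_hours 100 24   # prints 2400
```

Compare that with the roughly 24 node hours of work actually performed: nearly all of the charge pays for nodes that sat idle waiting for the last task.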

...

tasks per node (wayness)   cores available to each task   memory available to each task
1                          128                            ~256 GB
2                          64                             ~128 GB
4                          32                             ~64 GB
8                          16                             ~32 GB
16                         8                              ~16 GB
32                         4                              ~8 GB
64                         2                              ~4 GB
128                        1                              ~2 GB
  • In launcher_creator.py, wayness is specified by the -w argument.
    • the default is 128 (one task per core)
  • A special case is when you have only 1 command in your job.
    • In that case, it doesn't matter what wayness you request.
    • Your job will run on one compute node, and have all cores available.

Your choice of the wayness parameter will depend on the nature of the work you are performing: its computational intensity, its memory requirements, and its ability to take advantage of multi-processing/multi-threading (e.g. bwa -t option or hisat2 -p option).
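As a rough aid for picking a wayness value, the per-task figures in the table can be recomputed from the node totals. The sketch below assumes 128 cores and about 256 GB of memory per node (the wayness 1 row of the table); cores_per_task and mem_per_task_gb are hypothetical helper names, not part of launcher_creator.py.

```bash
# Node totals taken from the wayness 1 row of the table (assumed values)
NODE_CORES=128
NODE_MEM_GB=256

# Hypothetical helpers: divide the node's resources evenly among its tasks
cores_per_task()  { echo $(( NODE_CORES / $1 )); }
mem_per_task_gb() { echo $(( NODE_MEM_GB / $1 )); }

for wayness in 1 4 16 128; do
  echo "wayness=$wayness cores=$(cores_per_task "$wayness") mem=~$(mem_per_task_gb "$wayness")GB"
done
```

For instance, a command run with 8 threads (such as one using the bwa -t option) matches a wayness of 16, where each task gets 8 cores and about 16 GB.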

...

Code Block
languagebash
cat cmd*log

# or, for a listing ordered by command number (the 2nd space-separated field)
cat cmd*log | sort -k 2,2n

The vertical bar ( | ) above is the pipe operator, which connects one program's standard output to the next program's standard input.

[Figure: piping.png]

(Read more about the sort command at Some Linux commands: cut, sort, uniq, and more about Piping)

You should see something like the output below.

...

Notice that there are 4 different host names. This expression:

Code Block
languagebash
# the host (node) name is in the 11th field
cat cmd*log | awk '{print $11}' | sort | uniq -c

should produce output something like this (read more about Piping a histogram)

Code Block
languagebash
   4 c303-005.ls6.tacc.utexas.edu
   4 c303-006.ls6.tacc.utexas.edu
   4 c304-005.ls6.tacc.utexas.edu
   4 c304-006.ls6.tacc.utexas.edu
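The histogram expression above generalizes to any field. Here is a minimal sketch that wraps it in a hypothetical function, field_histogram, and runs it on made-up sample lines (these are not real log output, and the host name sits in the 3rd field rather than the 11th).

```bash
# Hypothetical helper: count how often each value occurs in one
# whitespace-separated field of the lines read from standard input.
# Usage: field_histogram FIELD_NUMBER
field_histogram() {
  awk -v col="$1" '{print $col}' | sort | uniq -c
}

# Made-up sample input: the 3rd field plays the role of the host name
printf '%s\n' \
  'task 1 nodeA' \
  'task 2 nodeB' \
  'task 3 nodeA' \
  'task 4 nodeA' | field_histogram 3
```

This reports a count of 3 for nodeA and 1 for nodeB. sort has to come before uniq -c because uniq only collapses adjacent identical lines. On the job logs, the equivalent call would be cat cmd*log | field_histogram 11.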

Some best practices

...

Here's an example directory structure

$WORK/my_project
                /01.original      # contains or links to original fastq files
                /02.fastq_prep    # run fastq QC and trimming jobs here
                /03.alignment     # run alignment jobs here
                /04.gene_counts   # analyze gene overlap here
                /51.test1         # play around with stuff here
                /52.test2         # play around with other stuff here

...

Code Block
languagebash
titleRelative path syntax
cd $WORK/my_project/02.fastq_prep
ls ../01.original/my_raw_sequences.fastq.gz

...

Code Block
languagebash
titleSymbolic link to relative path
cd $WORK/my_project/02.fastq_prep
ln -sf ../01.original fq
ls ./fq/my_raw_sequences.fastq.gz
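To try the relative-path and symbolic-link pattern without touching a real project, the same steps can be run in a throwaway directory. This is a sketch under stated assumptions: a directory from mktemp -d stands in for the project area, demo_root is a hypothetical variable name, and the fastq file is an empty placeholder.

```bash
# A temporary directory stands in for the real project location (assumption)
demo_root=$(mktemp -d)

# Create two of the numbered stage directories and an empty placeholder file
mkdir -p "$demo_root/my_project/01.original" "$demo_root/my_project/02.fastq_prep"
touch "$demo_root/my_project/01.original/my_raw_sequences.fastq.gz"

# Link to the sibling directory using a relative path
cd "$demo_root/my_project/02.fastq_prep"
ln -sf ../01.original fq

# Both paths reach the same file
ls ../01.original/my_raw_sequences.fastq.gz
ls ./fq/my_raw_sequences.fastq.gz
```

Because the link target is stored as the relative path ../01.original, the link keeps working if the whole my_project directory is moved or copied elsewhere.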

...

Code Block
languagebash
titleRelative path exercise
# navigate through the symbolic link in your Home directory
cd ~/scratch/core_ngs/slurm/simple
ls ../wayness
ls ../..
ls -l ~/.bashrc

...