sge-tutorial
Introduction
This is a short overview of how to use the batch queuing system on the CCBB clusters. It is a work in progress, so please let us know if anything is incorrect or confusing. In the tutorial below, text that is set off from the main text in a block like this
This is a sample line.
is the literal text of one or more commands typed at the UNIX prompt and their resulting output, or the literal contents of a text file. Text in angle brackets like <text> indicates that you should replace the brackets and their contents with something meaningful to your job script.
Basic Overview
Processing data on the cluster is a 5 step process consisting of
- Naming the processing run and creating a project directory
- Uploading data files to the project directory
- Devising a proper processing scheme
- What programs are to be used
- What needs to be copied over to the node for processing
- What needs to be copied back to be saved
- Writing and submitting a job script, or submitting a request for an interactive login to a node
- Once processing is done, reviewing the logs to make sure no problems occurred
The last step is critical. You cannot assume that because your job finished it actually completed successfully. Any number of things can go wrong, such as a node crashing, a resource (disk, memory) being exhausted, programming errors, or usage errors. We cannot predict or control when these things happen, so you should be prepared for them. As mentioned below, the output and error messages generated during a job run on the cluster are captured into output and error files for you to review.
Depending on what software is needed, we have some pre-written scripts that you will find in /share/scripts (or via \\files.ccbb.utexas.edu\scripts). We can help write others for you as needed, or you can use these as templates. Each of our scripts has several variables that tell the script where to find source data, config files, and other parameters that alter the job. Changing just these few items is enough to have a workable script that you can submit.
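As a purely illustrative sketch (the variable names below are hypothetical; the actual templates in /share/scripts may differ), the top of such a script might look like this:
#!/bin/bash
#$ -S /bin/bash
# Variables you would edit for your own run (names are hypothetical)
SOURCE_DATA=$HOME/Data/FrogProject/FrogFile      # input data to copy to the node
CONFIG_FILE=$HOME/Data/FrogProject/run.config    # configuration file for the program
RESULT_DIR=$HOME/FrogProject/Results             # where results are copied back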
Whether you use our scripts or write your own, it's advisable to run just a few jobs at a time while you determine how much of shared resources such as RAM and disk space you need.
Submitting Jobs
In some cases, you may just want to run a very quick job, or otherwise need to run some commands without packaging them up in a job script. If the footprint of the jobs is light, they can be run directly on the headnode after logging in. If the jobs are going to use a lot of memory, a large percentage of available CPU time (i.e., be CPU bound), or a large percentage of available input/output bandwidth (i.e., be disk bound), then it is better to run them within SGE. If you run commands on the headnode, and those have any significant amount of runtime, please monitor their CPU usage. If too many CPU hogs pile up on the headnode, it quickly becomes very painful for interactive users trying to move files around, write scripts, and submit them. If you still do not know enough about your data processing to script it, it's possible to use the '''qlogin''' command to request a slot from SGE. Here is an example.
[cdupree@phylocluster sge]$ qlogin
waiting for interactive job to be scheduled ...
Your interactive job 4799 has been successfully scheduled.
Establishing /opt/gridengine/bin/rocks-qlogin.sh session to host compute-0-2.local ...
Last login: Sun Nov 18 11:29:25 2007 from phylocluster3.local
Rocks Compute Node
Rocks 4.1 (Fuji)
Profile built 20:27 01-Nov-2007
Kickstarted 16:35 01-Nov-2007
[me@compute-0-2 ~]$
Here SGE has logged me into node compute-0-2, and now I'm sitting at another shell prompt. When I'm done with my processing, I type '''exit''' and am returned to the head node:
Connection to compute-0-2.local closed.
/opt/gridengine/bin/rocks-qlogin.sh exited with exit code 0
[cdupree@phylocluster sge]$
Please avoid tying up a slot, and exit when you are done with your processing. If jobs start queuing up, and it's determined that an interactive session is not actively being used, then that session will be killed to free up the slot.
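If you do run short commands directly on the headnode rather than through qlogin or a job script, you can keep an eye on their CPU usage with standard Linux tools, for example:
top -u your_eid                            # interactive view of your processes (press q to quit)
ps -u your_eid -o pid,%cpu,%mem,etime,cmd  # one-shot listing with CPU and memory usage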
By far the nicest way to run jobs is to write a job script and submit it for a non-interactive run. This lets you log out of the head node and walk away while the job runs. Here is a simple script, in a file called tut1:
#!/bin/bash
echo Hello World
This script, when run, just outputs "Hello World" and then exits. To submit the job into the queue you would use the command
qsub -S /bin/bash tut1
Right after doing this, you can type
qstat -u your_eid
and you should see something like this
job-ID  prior    name  user  state  submit/start at       queue  slots  ja-task-ID
-----------------------------------------------------------------------------------------------------------------
 74952  0.00000  tut1  eid   qw     12/02/2007 11:55:54           1
which shows that the job is queued up and waiting to be scheduled (state qw). If you type qstat again, the job may seem stuck in this state, and it may stay that way for a few minutes. Once slots are available the job will run, and this particular job will exit very quickly after that.
We can make one quick simplification to the script. The -S /bin/bash qsub option specifies the shell which should be used to interpret the job script. We can make the submission process a bit easier by exploiting the fact that SGE will interpret lines beginning with ''#$'' as arguments to the qsub command. Thus, we can modify tut1 to become tut2:
#!/bin/bash
#$ -S /bin/bash
echo Hello World
You might wonder why you still need the shebang line ''#!/bin/bash'' if SGE is just going to ignore it. In fact, it's only there so that your job script can also be run as a normal shell script.
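For example, assuming the script is saved as tut2 in the current directory, you can test it locally before submitting it:
chmod +x tut2    # make the script executable
./tut2           # runs locally, printing Hello World
qsub tut2        # submits it to SGE; the embedded #$ line supplies -S /bin/bash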
Writing Job Scripts
More About Job Script Options
There are a number of commonly useful qsub options (or, if you'd rather, options to embed in your scripts). We've already seen ''-S /bin/bash'', which specifies the shell used to interpret the job script. You can also use ''-q <queue>'' to select a queue (read below to determine what queues exist on your system, and replace ''<queue>'' with one of those). You can actually specify a particular node with ''<queue>@<node>'', but generally this is not needed.
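For example, either form below submits tut2 to a specific queue (the queue name here is just one from the list later in this tutorial):
qsub -q all.q tut2
or, embedded in the script itself:
#$ -q all.q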
Now look back in your home directory. You should see two files named something like this:
tut1.e6216 tut1.o6216
As mentioned in the [UNIX tutorial], every UNIX process has the concept of standard input, from which it reads input; standard output, to which it writes normal output; and standard error, to which the program can write critical error messages (note that which of these output channels a program uses is its own choice; most user-written programs tend to use only standard output). As the file names above suggest, SGE sets your job script up so that when it runs, all programs started in the script will use ''<scriptname>.e<job id>'' for error messages and ''<scriptname>.o<job id>'' for output messages. These can be overridden by using
-e <filename> -o <otherfilename>
which hard codes the names of the error and output files. A related option is to override SGE's use of the script name as the job name; this is done with the option
-N <jobname>
Two final options, which are generally useful, instruct SGE to mail you when important events occur. These are
-m beasn -M <email>
with 'beasn' having the following meaning
- b - beginning of job
- e - end of job
- a - job is aborted or rescheduled
- s - job is suspended
- n - never
Choose whatever set of options you wish.
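Putting several of these options together, the top of a job script might look like this (the job name, output file names, and email address are placeholders to replace):
#!/bin/bash
#$ -S /bin/bash
#$ -N MyJob
#$ -o MyJob.out       # standard output file
#$ -e MyJob.err       # standard error file
#$ -m ea              # mail at end of job and if aborted
#$ -M <email>         # replace <email> with your email address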
Note that SGE depends on seeing ''#$'' at the beginning of a line in order to find its options. If you temporarily need to disable an option you can do so by separating the ''#'' and ''$'', by putting a space at the beginning of the line, or by putting in an extra ''#'' to make ''##$''.
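For example, only the first and third lines below are seen by SGE; the middle one has been disabled:
#$ -N MyJob
##$ -q bigmem      # the extra # makes SGE skip this option
#$ -m ea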
Shell Environment Variables
SGE adds the following shell variables to the environment of each job it runs (a short example of using them follows the list):
- $HOME home directory of the user on execution machine
- $USER user ID of job owner
- $JOB_ID current job ID
- $JOB_NAME current job name (see -N option)
- $HOSTNAME name of the execution host
- $TASK_ID array job task index number
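For example, a job script can use these variables directly; the line below writes a short identifying message into the job's output file:
echo "Job $JOB_NAME (ID $JOB_ID) running as $USER on $HOSTNAME"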
Bare Bones Script
Generally a processing run will consist of the following steps:
- Create local working directory
- Transfer input files (data and science config files) to working directory
- Perform science processing
- Copy results back to home directory
Here is a bare bones script that accomplishes this.
#!/bin/bash
# SGE Options
#$ -S /bin/bash
#$ -N MyJob

# Create Working Directory
WDIR=/state/partition1/$USER/$JOB_NAME-$JOB_ID
mkdir -p $WDIR
if [ ! -d $WDIR ]
then
  echo $WDIR not created
  exit
fi
cd $WDIR

# Copy Data and Config Files
cp $HOME/Data/FrogProject/FrogFile .

# Put your Science related commands here
/share/apps/runsforever FrogFile

# Copy Results Back to Home Directory
RDIR=$HOME/FrogProject/Results/$JOB_NAME-$JOB_ID
mkdir -p $RDIR
cp NobelPrizeWinningResults $RDIR

# Cleanup
rm -rf $WDIR
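Assuming this script were saved in a file called, say, frogjob (the name is just an example), you would submit and monitor it like any other job:
qsub frogjob
qstat -u your_eid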
Monitoring Queue Status
You can use the ''qstat'' command. By default this shows a list of jobs running in the queue and their state. The output will look like this
job-ID  prior    name  user  state  submit/start at       queue                      slots  ja-task-ID
-----------------------------------------------------------------------------------------------------------------
 74950  0.55500  tut1  eid   qw     12/01/2007 20:42:45   queue@compute-0-6.local    1
and lists out the job ID, priority, name, user, state, submission time, queue, and CPU slots used by the job. You can view an individual job using the
qstat -j <job_id>
option. The output in this case is much more verbose, and includes information about the state of the job and queuing considerations. You can also use
qstat -u <user_id>
to see only your jobs. One final option is to use the
-f
option to see the status of the queues on the systems. Note that if you are on one of the clusters with more than one queue (see below), then with -f you must also select a queue using
-q <queue_name>
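For example, on a system with more than one queue you might check the state of a single queue like this (the queue name here is one from the list below):
qstat -f -q all.q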
Deleting Jobs
It might happen that you realize there is a mistake with a job, or you want to move it to another queue. In either case, you can use the ''qdel'' command to remove the job. This works whether the job is still waiting in the queue or is already running. The syntax is
qdel <jobid>
You can also use
-u <user_id>
to remove all of the jobs you have on a particular system.
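For example, using the job ID from the earlier qstat output:
qdel 74952            # remove one specific job
qdel -u your_eid      # remove all of your jobs on this system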
Cluster Queues
Some of the CCBB clusters have special queues set up. This is typically done so that the research group owning a cluster can preempt the use of a node by a non-group member. Sometimes it is done so that a set number of slots can be reserved for a class to use to complete homework assignments or labs. These queues tend to be ephemeral in nature, though. Below is the list of queues.
Note that on systems with multiple queues you must use -q with qsub to select the queue which you wish to use. Otherwise SGE will use all available queues, and this is probably not what you want to happen.
- queues on phylocluster
- all.q - all users
- phylo - old 4 processor, 2 GB RAM systems (being phased out)
- bigmem - 16 processor, 64 GB system (all can use)
- kirkp - 16 processor, 64 GB systems (all can use, but overridden by kirkp-lab queue use)
- kirkp-lab - same nodes as kirkp. When a job is placed on a node in this queue, the same node is suspended in kirkp. This stops currently running jobs from running in that queue on the node, and it stops further submissions onto that node. This queue should only be used if you need to ensure that you are reserving all of the resources of a node, or need to override the use of a node which is full. This queue is restricted to use by Kirkpatrick lab members only.
- wilke - queue for Wilke lab users
- test - uses 1 of the older phylo nodes. This queue is intended for people who need to test a job script or other short-running program. Jobs in the test queue are terminated after 15 minutes of wall clock runtime. (Note: activation of this queue is pending.)
Linux-related Problems & Solutions
Some users may find that their script doesn't work as they expected. The reason is often just a Linux basics issue.
1. Some programs require you to include the full path of the input file or other files.
The full path is also called the absolute path.
Use ls $PWD/filename.extension to get the full path of a file. It looks like /root-level/second-level/third-level/filename.extension.
Paths beginning with ./ or ../ are relative paths; also note that some programs do not expand ~ or $HOME themselves, so when in doubt give the absolute path.
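For example (the directory, file name, and home directory layout shown here are hypothetical):
$ cd ~/Data/FrogProject
$ ls $PWD/FrogFile
/home/your_eid/Data/FrogProject/FrogFile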
2. How can I reserve a whole node so that only I can use the cores on that node, for example to run a multi-threaded program?
Use qsub -pe serial 16 qsub_Job_script_name.qsub
where 16 is the number of cores you need.
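The same request can also be embedded in the job script itself using the #$ convention described earlier, e.g.:
#$ -pe serial 16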