MPI BLAST
The mpiBLAST module allows one to perform large, distributed BLAST database searches that could be prohibitively time-consuming on single-node computing systems.
You will learn from this exercise:
The basics of configuring mpiBLAST
Setting up a parallel environment (-pe) variable
Launching MPI jobs using ibrun
Set up .ncbirc
mkdir $WORK/mpiblast
chmod a+rw $WORK/mpiblast
Create a file named .ncbirc in your home directory and open it (e.g. nano ~/.ncbirc). Edit it to look like the following, substituting the path to your own $WORK/mpiblast directory:
[mpiBLAST]
Shared=/work/01374/vaughn/mpiblast
Local=/tmp
[NCBI]
data=/work/01374/vaughn/mpiblast/data
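The two directory paths in .ncbirc can be derived from $WORK rather than typed by hand. The following is a minimal sketch, assuming the usual INI-style layout with bracketed section names. The scratch file name ncbirc.example is a placeholder chosen for this example, and nothing is written to ~/.ncbirc until you copy the file there yourself.

```shell
# Sketch: build an .ncbirc body from $WORK instead of hard-coding the path.
# Falls back to $HOME when $WORK is unset (e.g. outside TACC).
MPIBLAST_DIR="${WORK:-$HOME}/mpiblast"

# Write to a scratch file first so an existing ~/.ncbirc is not clobbered.
# "ncbirc.example" is a hypothetical name used only in this example.
cat > ncbirc.example <<EOF
[mpiBLAST]
Shared=${MPIBLAST_DIR}
Local=/tmp
[NCBI]
data=${MPIBLAST_DIR}/data
EOF

# Show the result so the paths can be checked by eye.
cat ncbirc.example
```

Once the paths look right, install the file with cp ncbirc.example ~/.ncbirc.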
Set up work directory and create an mpiBLAST-formatted database
cp -R /work/01374/vaughn/home/lonestar/UTRC/tutorial02 $WORK
cd $WORK/tutorial02
cp -R data $WORK/mpiblast/
module load mpiblast
$TACC_MPIBLAST_BIN/mpiformatdb -i plantrefseq.fa --nfrags=24
mv plantrefseq.fa.* $WORK/mpiblast/
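Before submitting a job, it is worth confirming that the formatted files actually arrived in the shared directory named in .ncbirc. A small sketch follows. count_db_files is a helper defined here for illustration only. It counts files matching plantrefseq.fa.* and makes no assumption about the particular extensions mpiformatdb produces.

```shell
# Count the files that belong to a formatted database in a directory.
# Usage: count_db_files DIR DBNAME
count_db_files() {
    dir=$1
    db=$2
    n=0
    for f in "$dir/$db".*; do
        # The glob stays literal when nothing matches, so test for existence.
        if [ -e "$f" ]; then
            n=$((n + 1))
        fi
    done
    echo "$n"
}

# Falls back to the current directory when $WORK is unset.
found=$(count_db_files "${WORK:-.}/mpiblast" plantrefseq.fa)
echo "plantrefseq.fa.* files in shared directory: $found"
```

A count of zero suggests the mv step found nothing to move, in which case the search will most likely fail to locate the database.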
Write and submit an MPI-based SGE script for mpiBLAST
#!/bin/bash
# You will learn from this exercise:
## The basics of configuring mpiBLAST
## Setting up a parallel environment (-pe) variable
## Launching MPI jobs using ibrun
#$ -V
#$ -cwd
#$ -N tut2-mpiblast
#$ -A 20111206BIO
#$ -j y
#$ -pe 12way 48
#$ -q development
#$ -l h_rt=00:15:00
# Set up module
module load mpiblast
# ibrun is the MPI launcher at TACC
# euc_assembly.fa contains 721 transcript assemblies from Eucalyptus that are 2kb or longer
# plantrefseq.fa is the reference peptide set from NCBI refseq
## Corresponds to the database we built using mpiformatdb
# We are asking blastx to emit tabular results ( -m 9 )
ibrun -n 24 -o 0 $TACC_MPIBLAST_BIN/mpiblast -p blastx \
-i euc_assembly.fa -d plantrefseq.fa -m 9 \
-o euc_assembly.blast9
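The numbers in the job script relate to one another as worked through below. This is a back-of-the-envelope sketch, not output from TACC or mpiBLAST. It rests on two assumptions: that each node has 12 cores (which the 12way setting suggests for this system), and that mpiBLAST sets aside two ranks for scheduling and writing results, as its documentation describes, leaving the rest as search workers.

```shell
# Values taken from the job script above.
slots=48            # second number in "-pe 12way 48"
tasks_per_node=12   # the "12way" part
ranks=24            # "ibrun -n 24"
nfrags=24           # "mpiformatdb --nfrags=24"

# Assumption: 12 cores per node, so slots / 12 gives the node count.
cores_per_node=12
nodes=$((slots / cores_per_node))

# Assumption: mpiBLAST keeps two ranks for scheduling and output.
reserved=2
workers=$((ranks - reserved))

# Ceiling division: the most fragments any single worker has to search.
max_frags_per_worker=$(( (nfrags + workers - 1) / workers ))

echo "nodes=$nodes available_slots=$((nodes * tasks_per_node)) workers=$workers max_frags_per_worker=$max_frags_per_worker"
```

Under these assumptions the request covers 4 nodes and 48 task slots, of which ibrun -n 24 -o 0 uses the first 24. With 22 workers sharing 24 fragments, at least two workers search a second fragment.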