MPI BLAST

The mpiBLAST module allows one to perform large, distributed BLAST database searches that could be prohibitively time-consuming on single-node computing systems.

You will learn from this exercise:

  • The basics of configuring mpiBLAST

  • Setting up a parallel environment (-pe) variable

  • Launching MPI jobs using ibrun

Set up .ncbirc

  1. mkdir $WORK/mpiblast

  2. chmod a+rw $WORK/mpiblast

  3. Create a file named .ncbirc in your home directory and open it (e.g. nano ~/.ncbirc). Edit it to look similar to the following, substituting the path to your own $WORK/mpiblast directory:


[mpiBLAST]
Shared=/work/01374/vaughn/mpiblast
Local=/tmp

[NCBI]
data=/work/01374/vaughn/mpiblast/data
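
If you would rather not type the paths by hand, the file can be generated from $WORK. The following is a minimal sketch, not part of the original exercise. It assumes the section names are written as bracketed INI-style headers, falls back to $HOME when $WORK is unset, and uses NCBIRC_PATH as a hypothetical override for the output location.

```shell
#!/bin/bash
# Sketch: generate .ncbirc from $WORK instead of editing it by hand.
# NCBIRC_PATH is a hypothetical override introduced for this example only.
work_dir="${WORK:-$HOME}"                 # assumption: fall back to $HOME if $WORK is unset
ncbirc="${NCBIRC_PATH:-$HOME/.ncbirc}"

# Keep a copy of any existing file rather than overwriting it silently.
if [ -f "$ncbirc" ]; then
    cp "$ncbirc" "$ncbirc.bak"
fi

# Unquoted EOF so that $work_dir is expanded inside the here-document.
cat > "$ncbirc" <<EOF
[mpiBLAST]
Shared=$work_dir/mpiblast
Local=/tmp

[NCBI]
data=$work_dir/mpiblast/data
EOF

# Show the result so the paths can be checked by eye.
cat "$ncbirc"
```

Check that the printed Shared and data paths point at the directory you created in step 1 before moving on.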

Set up work directory and create an mpiBLAST-formatted database

  1. cp -R /work/01374/vaughn/home/lonestar/UTRC/tutorial02 $WORK

  2. cd $WORK/tutorial02

  3. cp -R data $WORK/mpiblast/

  4. module load mpiblast

  5. $TACC_MPIBLAST_BIN/mpiformatdb -i plantrefseq.fa --nfrags=24

  6. mv plantrefseq.fa.* $WORK/mpiblast/
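
Before submitting a job, it can be worth confirming that step 6 actually moved the database files into the shared directory. The sketch below only counts files matching the same plantrefseq.fa.* pattern used in step 6. The helper name count_db_files is hypothetical, and the sketch makes no assumption about how many files mpiformatdb produces per fragment.

```shell
#!/bin/bash
# count_db_files DIR DBNAME
# Print how many files named DBNAME.* exist in DIR (0 if none).
count_db_files() {
    local dir="$1" db="$2" n=0 f
    for f in "$dir/$db".*; do
        # An unmatched glob stays literal, so test that the path really exists.
        if [ -e "$f" ]; then
            n=$((n + 1))
        fi
    done
    echo "$n"
}

# Usage: count the database files in the shared mpiBLAST directory.
# Falls back to the current directory if $WORK is unset (assumption).
count_db_files "${WORK:-.}/mpiblast" plantrefseq.fa
```

A count of 0 means the move did not happen or the files went to a different directory than the one named in .ncbirc.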

Write and submit an MPI-based SGE script for mpiBLAST


#!/bin/bash

# You will learn from this exercise:
## The basics of configuring mpiBLAST
## Setting up a parallel environment (-pe) variable
## Launching MPI jobs using ibrun

#$ -V

#$ -cwd
#$ -N tut2-mpiblast
#$ -A 20111206BIO
#$ -j y
#$ -pe 12way 48
#$ -q development
#$ -l h_rt=00:15:00

# Set up module
module load mpiblast

# ibrun is the MPI launcher at TACC
# euc_assembly.fa contains 721 transcript assemblies from Eucalyptus that are 2kb or longer
# plantrefseq.fa is the reference peptide set from NCBI refseq
## Corresponds to the database we built using mpiformatdb
# We are asking blastx to emit tabular results ( -m 9 )

ibrun -n 24 -o 0 $TACC_MPIBLAST_BIN/mpiblast -p blastx \
    -i euc_assembly.fa -d plantrefseq.fa -m 9 \
    -o euc_assembly.blast9
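
The relationship between the -pe request and the ibrun task count can be checked with a little arithmetic. This is a sketch under two assumptions that the exercise does not state explicitly: that "12way" means 12 MPI tasks per node with the trailing number giving the total cores requested, and that each node has 12 cores.

```shell
#!/bin/bash
# Values copied from the job script above.
tasks_per_node=12    # the "12way" part of "-pe 12way 48"
cores_requested=48   # the number after "12way"
mpi_tasks=24         # the value passed to "ibrun -n"

# Assumption: 12 cores per node; adjust for other systems.
cores_per_node=12

nodes=$(( cores_requested / cores_per_node ))
slots=$(( nodes * tasks_per_node ))

echo "nodes=$nodes slots=$slots mpi_tasks=$mpi_tasks"

# ibrun cannot start more tasks than the parallel environment provides.
if [ "$mpi_tasks" -gt "$slots" ]; then
    echo "error: ibrun -n $mpi_tasks exceeds the $slots available task slots" >&2
fi
```

Under these assumptions the request covers 4 nodes and 48 task slots, of which ibrun -n 24 -o 0 uses the first 24.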

 

Links