...

Code Block
languagebash
titlesuggested directory set up
cds
cp -r $BI/gva_course/structural_variation/data GVA_sv_tutorial
cd GVA_sv_tutorial
Note
titlePossible Input/Output errors experienced with the above command

At least one student encountered an "Input/Output" error message with the above command; the files were copied, but the files were


This is Illumina mate-paired data (having a larger insert size than paired-end data) from genome re-sequencing of an E. coli clone.

...

Info
titlePeople have previously asked what makes bowtie2 better/different than 'bowtie'

In short, no. A detailed explanation can be found at this link. This is mentioned here mostly for posterity: years ago, when multiple different mappers were used (including bowtie and bowtie2), it was pointed out that SV is very difficult to detect when reads are mapped with bowtie, because it does not identify discordantly mapped read pairs.

Analyze read mapping distribution

The first step is to look at all mapped read pairs and whittle the list down to only those that have unusual insert sizes (distances between the two reads in a pair).
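
As a rough illustration of what "unusual insert size" means in practice, the sketch below prints SAM records whose absolute template length (TLEN, column 9 of the SAM format) exceeds a cutoff. The function name filter_by_insert_size and the 5000 bp default are hypothetical, chosen only for illustration; SVDetect performs its own, more careful analysis of the pairs.

```bash
# Hypothetical helper: print SAM alignment records whose absolute
# template length (TLEN, column 9) is greater than a cutoff.
# Usage: filter_by_insert_size <file.sam> [cutoff_bp]
filter_by_insert_size() {
    local sam="$1"
    local cutoff="${2:-5000}"   # illustrative default, not a recommended value
    awk -F'\t' -v cutoff="$cutoff" '
        /^@/ { next }                      # skip SAM header lines
        {
            tlen = ($9 < 0) ? -$9 : $9     # TLEN is signed; take the absolute value
            if (tlen > cutoff) print
        }
    ' "$sam"
}
```

For example, filter_by_insert_size 61FTVAAXX.ab.sam 5000 | head would show the first few records above the cutoff. Both reads of a pair carry the same absolute TLEN, so each pair normally appears twice.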

...

Code Block
titleCreate the file svdetect.conf with this text
linenumberstrue
<general>
input_format=sam
sv_type=all
mates_orientation=RF
read1_length=35
read2_length=35
mates_file=/scratch/#####/<USERNAME>/GVA_sv_tutorial/61FTVAAXX.ab.sam
cmap_file=/scratch/#####/<USERNAME>/GVA_sv_tutorial/NC_012967.1.lengths
num_threads=48
</general>

<detection>
split_mate_file=0
window_size=2000
step_length=1000
</detection>

<filtering>
split_link_file=0
nb_pairs_threshold=3
strand_filtering=1
</filtering>

<bed>
  <colorcode>
    255,0,0=1,4
    0,255,0=5,10
    0,0,255=11,100000
  </colorcode>
</bed>
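
Because the mates_file and cmap_file entries must be edited to match your own scratch directory, a mistyped path is an easy mistake to make. The following is a minimal sketch, not part of SVDetect, that reads those two keys from the configuration file and reports whether each file exists. The function name check_conf_paths is hypothetical, and the sketch assumes each key starts at the beginning of its line, as in the file above.

```bash
# Hypothetical helper: report whether the files named by mates_file= and
# cmap_file= in an SVDetect configuration file exist.
# Usage: check_conf_paths svdetect.conf
check_conf_paths() {
    local conf="$1"
    local status=0
    local key path
    for key in mates_file cmap_file; do
        # keep the text after "key=" on the first line that starts with the key
        path=$(sed -n "s/^${key}=//p" "$conf" | head -n 1)
        if [ -e "$path" ]; then
            echo "ok: $key -> $path"
        else
            echo "MISSING: $key -> $path"
            status=1
        fi
    done
    return "$status"
}
```

Running check_conf_paths svdetect.conf before starting SVDetect prints one line per key and returns a non-zero status if either file is missing.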

...

Code Block
titleCommands to run SVDetect
SVDetect linking -conf svdetect.conf
SVDetect filtering -conf svdetect.conf
SVDetect links2SV -conf svdetect.conf
Warning
titlePossible errors on idev nodes

While reviewing these tutorials, I found that these commands would not execute in idev sessions, for unknown reasons. By chance, an idev session timed out without my noticing, and I found that the commands did run on the head node. Try the above commands one at a time; if you see error messages like the following, log out of your idev session with the logout command and then execute them one at a time on the head node. While this is not the best citizenship, the program gave no indication of being a problem on the head node.

Feedback from other students indicates that this problem is not limited to me and that multiple people are experiencing it. Running these commands on the head node should be acceptable, for reasons that will be discussed on Zoom.

No Format
Config/General.pm did not return a true value at /corral-repl/utexas/BioITeam/bin/SVDetect line 48.
BEGIN failed--compilation aborted at /corral-repl/utexas/BioITeam/bin/SVDetect line 48.
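
The advice to run the three steps one at a time and stop at the first error can be sketched as a small loop. This is only an illustration: the function name run_svdetect_steps and its optional second argument (the program to call, defaulting to SVDetect) are hypothetical conveniences, not part of SVDetect itself.

```bash
# Hypothetical helper: run the three SVDetect steps in order and stop
# as soon as one of them exits with an error.
# Usage: run_svdetect_steps svdetect.conf [program]
run_svdetect_steps() {
    local conf="$1"
    local prog="${2:-SVDetect}"   # assumed to be on your PATH unless given
    local step
    for step in linking filtering links2SV; do
        echo "running: $prog $step -conf $conf"
        if ! "$prog" "$step" -conf "$conf"; then
            echo "step '$step' failed; stopping" >&2
            return 1
        fi
    done
}
```

Calling run_svdetect_steps svdetect.conf is equivalent to typing the three commands above by hand, except that later steps are skipped once an earlier one fails.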


Take a look at the resulting file: 61FTVAAXX.ab.sam.links.filtered.sv.txt. Another downside of command-line applications is that, while you can print files to the screen, the formatting is not always easy to read. On the plus side, in about 95% of cases you can copy the output directly from the terminal window into Excel to make better sense of what the columns actually are.
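
If you would rather stay in the terminal, one option is to list the column names one per line with their positions. The sketch below assumes the results file is tab-delimited and that its first line is a header; the function name list_columns is hypothetical.

```bash
# Hypothetical helper: print each tab-separated field of a file's first
# line on its own row, prefixed with its column number.
# Usage: list_columns <file>
list_columns() {
    awk -F'\t' 'NR == 1 { for (i = 1; i <= NF; i++) print i ": " $i; exit }' "$1"
}
```

For example, list_columns 61FTVAAXX.ab.sam.links.filtered.sv.txt prints a numbered list that can then be used with cut -f to pull out individual columns.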

...

Optional: Install SVDetect

We have already installed SVDetect for you, as installation is a bit difficult (though still much easier than the alternatives listed in the introduction). You can verify its location with which SVDetect; it should be found in your $PATH under $BI/bin. One of the advantages (or disadvantages) of using a communal resource is that someone else updates all the necessary programs and packages for you. Alternatively, you can make a personal copy of the program using the following commands. Note that this is presented mostly to underscore how spoiled we are with modules and the BioITeam.

Install SVDetect scripts

Navigate to the SVDetect project page

More information:

Download the code onto TACC.

Code Block
wget -N http://downloads.sourceforge.net/project/svdetect/SVDetect/0.80/SVDetect_r0.8b.tar.gz
tar -xvzf SVDetect_r*.tar.gz
cd SVDetect_r*

Move the Perl scripts and make them executable

Code Block
mkdir -p "$HOME/local/bin"   # make sure the destination directory exists
chmod 775 bin/SVDetect scripts/BAM_preprocessingPairs.pl
cp bin/SVDetect scripts/BAM_preprocessingPairs.pl "$HOME/local/bin"

Install required Perl modules

SVDetect requires a few Perl modules to be installed. In the default TACC environment, you can use the cpan shell to install most well-behaved Perl modules (with the exception of some complicated ones that require other libraries to be installed or code to be compiled). Here's how:

Code Block
titleInstall Perl modules required for SVDetect
# This cannot be done from an idev session
login1$ cpan
# Choose yes to do as much automatically as possible, and choose 'local::lib' as the way to install modules, since you don't have admin rights on TACC
...
cpan[4]> install Config::General
...
cpan[4]> install Tie::IxHash
...
cpan[4]> install Parallel::ForkManager
...
cpan[4]> quit
login1$
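
After the cpan session, it can be reassuring to confirm that Perl can actually load each module. The sketch below relies only on perl's standard -M and -e options; the function name check_perl_modules is hypothetical.

```bash
# Hypothetical helper: try to load each named Perl module and report
# whether it was found.
# Usage: check_perl_modules Module::One Module::Two ...
check_perl_modules() {
    local status=0
    local mod
    for mod in "$@"; do
        # "perl -MName -e 1" exits non-zero if the module cannot be loaded
        if perl -M"$mod" -e 1 2>/dev/null; then
            echo "ok: $mod"
        else
            echo "MISSING: $mod"
            status=1
        fi
    done
    return "$status"
}
```

For the modules installed above, the call would be check_perl_modules Config::General Tie::IxHash Parallel::ForkManager.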

Return to GVA2020 course page.