Structural Variant (SV) calling with SVDetect 2019

Overview

Most approaches for predicting structural variants require paired-end or mate-pair reads. They use the distribution of distances separating the two reads of each pair to find outliers, and they also look for pairs with incorrect orientations. As mentioned during several of the presentations, many researchers choose to ignore these types of mutations. Combined with the increased difficulty of identifying them accurately, this means the community is less settled on the "best" way to analyze them. Here we present a tutorial on SVDetect, chosen for the quality of its instructions and its ease of installation, despite its use of relatively hefty configuration files.
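
To make the "outlier distance" idea concrete, here is a minimal sketch. It is not how SVDetect itself works internally. It assumes a toy two-column, tab-separated input (pair ID, insert size) rather than a real SAM file, and the function name flag_insert_outliers and the default cutoff of 3 standard deviations are hypothetical choices made for illustration. Read orientation is not checked here.

```shell
# Hypothetical helper (illustration only, not part of SVDetect).
# Usage: flag_insert_outliers FILE [K]
# FILE: tab-separated lines of "pair_id<TAB>insert_size" (toy format, not SAM).
# Prints every pair whose insert size lies more than K standard deviations
# from the mean insert size (K defaults to 3, an assumed cutoff).
flag_insert_outliers() {
    awk -F'\t' -v k="${2:-3}" '
        # First pass over the input: remember each pair and accumulate sums.
        { id[NR] = $1; size[NR] = $2; sum += $2; sumsq += $2 * $2 }
        END {
            if (NR == 0) exit
            mean = sum / NR
            var  = sumsq / NR - mean * mean      # population variance
            sd   = (var > 0) ? sqrt(var) : 0
            # Report pairs whose distance from the mean exceeds k * sd.
            for (i = 1; i <= NR; i++) {
                d = size[i] - mean
                if (d < 0) d = -d
                if (d > k * sd) print id[i] "\t" size[i]
            }
        }' "$1"
}
```

Because the mean and standard deviation are themselves pulled by extreme values, a sketch like this only behaves sensibly when outliers are rare compared to normal pairs, which is the usual situation in a re-sequencing experiment.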

Other possible tools:

  • BreakDancer - hard to install prerequisites on TACC. Requires installing libgd and the notoriously difficult GD Perl module.
  • PEMer - hard to install prerequisites on TACC. Requires "ROOT" package.

Good discussion of some of the issues of predicting structural variation:

Example: E. coli genome with structural variation

Here's an E. coli genome re-sequencing sample where a key mutation producing a new structural variant was responsible for a new phenotype involving citrate, something the Barrick lab has studied.

suggested directory set up
cds
cp -r $BI/gva_course/structural_variation/data GVA_sv_tutorial
cd GVA_sv_tutorial

This is Illumina mate-paired data (having a larger insert size than paired-end data) from genome re-sequencing of an E. coli clone.

File Name           | Description                                                                                                                              | Sample
--------------------|------------------------------------------------------------------------------------------------------------------------------------------|----------------------------
61FTVAAXX_2_1.fastq | Paired-end Illumina, first of mate-pair, FASTQ format                                                                                    | Re-sequenced E. coli genome
61FTVAAXX_2_2.fastq | Paired-end Illumina, second of mate-pair, FASTQ format                                                                                   | Re-sequenced E. coli genome
NC_012967.1.fasta   | Reference genome in FASTA format                                                                                                         | E. coli B strain REL606
NC_012967.1.lengths | Simple tab-delimited file giving the size of the reference, which SVDetect needs; it is provided so you don't have to create it yourself |
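
If you ever need sequence lengths for a different reference, they can be computed from the FASTA file. The sketch below is a hedged illustration: the function name fasta_lengths is hypothetical, and it prints only two columns (sequence name, length). The exact column layout SVDetect expects is an assumption to verify, so compare the output against the provided NC_012967.1.lengths before using it.

```shell
# Hypothetical helper (illustration only).
# Usage: fasta_lengths REFERENCE.fasta
# Prints "name<TAB>length" for each sequence in the FASTA file.
# NOTE: the column layout SVDetect requires may differ; check it against
# the provided NC_012967.1.lengths file.
fasta_lengths() {
    awk '
        # Header line: emit the previous record, then start a new one.
        /^>/ {
            if (name != "") print name "\t" len
            name = substr($1, 2)     # drop the leading ">" and any description
            len = 0
            next
        }
        # Sequence line: strip whitespace and carriage returns, then count bases.
        { gsub(/[ \t\r]/, ""); len += length($0) }
        END { if (name != "") print name "\t" len }
    ' "$1"
}
```

For example, fasta_lengths NC_012967.1.fasta would print one line for the single REL606 chromosome.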

Map data using bowtie2

First we need to (surprise!) map the data. This will hopefully reinforce the bowtie2 tutorial you just completed, but if you are feeling adventurous you could use BWA as optional reinforcement.

Do not run on head node

Make sure you are on an idev node using the command: showq -u

If you need a new idev node