Genome Variant Analysis Course 2019

Course Overview

We will meet daily in PAR room 105 (https://utdirect.utexas.edu/apps/campus/buildings/nlogon/facilities/UTM/PAR/). If you have any trouble finding the room or building, please check your email for contact information and directions.

The course consists of two ~90-minute sections per day for 4 days, with the goal of teaching you how to perform standard next-generation sequencing analysis to identify genomic variants. This will be accomplished through presentations covering information essential to all types of analysis, guided tutorials to reinforce the essential concepts, and self-guided tutorials to help you learn the skills most specific to your own analysis. By the end of this course, we hope to achieve the following goals:

  1. Teach you the different ways next-generation sequencing libraries are constructed, and the advantages and disadvantages of each type.

  2. Familiarize you with how the Texas Advanced Computing Center (TACC) can be used to simplify and speed up your data analysis.

  3. Teach you the basics of read mapping in both individuals and populations, and identifying variants within individuals and rare variants within populations.

  4. Provide reference materials covering enough breadth to give you a starting point for your own data analysis, and enough experience that you can begin that analysis on your own.

Your Instructor

Name

Initials

Affiliation

Expertise

Daniel Deatherage

DD

Barrick Lab

Unix, Python, NGS Library Prep, Capture, Rare Variant Identification

A nod to the past

I think it is important to acknowledge the great deal of help in creating these web pages and materials that came from previous instructors of the Intro to NGS Bioinformatics course taught in 2013 and the Genome Variant Analysis Course taught in 2014-2016. Two individuals warrant special mention: Scott Hunicke-Smith, the former director of the GSAF, and Jeffrey Barrick. They were the driving force behind this class for a number of years, and many of the tutorials presented here were originally developed by them or adapted from their work.

 

Course Schedule

Tuesday, May 28th. Day 1 – "The Basics"

Presentation:

Tutorial: Introduction to Linux and Lonestar5

Presentation:

Tutorial: Evaluating raw sequencing data

Wednesday, May 29th. Day 2 – "Principles of Variant calling"

Presentation: 

Tutorial: Using Bowtie2 to map reads

Presentation:  

Presentation: 

Tutorial: Using samtools to identify SNVs

Tutorial: Using SVDetect to identify SVs

Bonus Presentation: 

Thursday, May 30th. Day 3 – Visualization and User specific tutorials

Presentation: 

Bonus Presentation:   

Tutorial: Visualization: Bacterial genome variants the easiest way – breseq

Tutorial: Visualization: Integrated Genome Viewer Tutorial

 

At this point in the course, you have the basic tools that will help you regardless of what type of research you are involved in. The remainder of the course covers topics that are more specific to different research areas. They are divided into broad categories to help you decide which ones you want to complete during the remaining time. If you are unsure, just ask and I'll help identify the ones that may be most applicable to your work.

Bacterial Centric Tutorials

Tutorial: Advanced Breseq 

Tutorial: Evaluating Error Correction Using Breseq

Human and Higher Eukaryote Centric Tutorials

Tutorial: Human Trios Analysis

Tutorial: Annovar Analysis

Tutorial: Comparing Multiple samples

Method based Tutorials that may be of help regardless of sample type

Tutorial: Genome Assembly

Tutorial: Exome Capture Metrics

Tutorial: Error Correction (Molecular Indexing)

Friday, May 31st. Day 4 – User specific tutorials (continued) and TACC the normal way

The first half of today's class will be a continuation of the tutorials that you are most interested in. As was the case yesterday, choose your own tutorial, and please don't hesitate to ask which tutorials would be good for you to work on given your data! After the break, we will go over a brief review to put things back in perspective, and then give you a tutorial on how to do things the 'normal way' on TACC, which means using the job submission system and commands files. The rest of the time is yours to go through tutorials and ask any remaining questions.

Presentation:

Tutorial: Job Submissions and end of class summary of actions

Tutorial: MultiQC - FastQC summary tool for multiple samples

Tutorial: GATK