Your Instructors

Anna Battenhouse, Associate Research Scientist, Iyer Lab, abattenhouse@utexas.edu
- BA English literature, 1978
- Commercial software development 1982 – 2005
- Joined Iyer Lab 2007 (“retirement career”)
- BS Biochemistry, UT Austin, 2013
Amelia Weber Hall, PhD, Iyer Lab, ameliahall@utexas.edu
- BS Molecular Genetics, 2007 (University of Rochester)
- PhD Microbiology, 2017 (University of Texas at Austin)
- Laboratory Technician at UT 2007-2010
Dakota Derryberry, M.S., dakotaz@utexas.edu
- BA Biology, University of Chicago, 2009
- MS Computational Biology, University of Texas at Austin, 2017
Benni Goetz, M.S., Research Engineering/Scientist Associate III, benni@utexas.edu
- Joined the Bioinformatics Consulting Group in 2012

http://iyerlab.org/ Dr. Vishy Iyer, PI
Main focus is functional genomics
- large-scale transcriptional reprogramming in response to diverse stimuli
- ENCODE consortium collaborator
- work in human and yeast
Research methods include microarrays (Dr. Iyer was co-inventor)
high-throughput sequencing (since 2007) especially ChIP-seq also RNA-seq, RIP-seq, MNase-seq ... we now have > 1,700 800 NGS datasets

...

Hands-on, tutorial style – learn by doing
- common bioinformatics tools & file formats
Introduce NGS vocabulary
- both high-level view and practice with specific tools
Cover the NGS basics
- the first few things you'll do after receiving raw sequences
  - raw sequence preparation
  - alignment to reference
  - basic alignment analysis
Understand and practice required skills
- Get you comfortable with Linux and TACC – your best "frenemies"
- Make you self-sufficient enough in 4 days to become experts over time
- Show some "best practices" for working with NGS data

...

The initial fastq files are big (100s of MB to GB) – and they're just the start.
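To get a feel for that scale, the reads in a FASTQ file can be counted directly. The sketch below assumes the standard FASTQ layout of 4 lines per record (no wrapped sequence lines) and handles gzip-compressed files by checking for a ".gz" suffix. The function name count_fastq_reads and the file name in the usage note are hypothetical, not part of the course materials.

```python
import gzip


def count_fastq_reads(path):
    """Count reads in a FASTQ file, assuming exactly 4 lines per record."""
    # Pick a gzip-aware opener for compressed files, plain open() otherwise.
    opener = gzip.open if str(path).endswith(".gz") else open
    with opener(path, "rt") as handle:
        # Stream the file line by line so memory use stays flat for large files.
        n_lines = sum(1 for _ in handle)
    if n_lines % 4 != 0:
        raise ValueError(
            f"{path}: {n_lines} lines is not a multiple of 4; "
            "file may be truncated or not plain 4-line FASTQ"
        )
    return n_lines // 4
```

For example, count_fastq_reads("sample.fastq.gz") would return the number of reads in that (hypothetical) file. Streaming matters here: a file of this size should never be read into memory in one piece.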

...

2008 – Yeast heat shock remodeling of chromatin
- 2 yeast datasets
- less than 2 million sequences
2010 – Allelic bias in CTCF binding
- 13 CTCF datasets from 3 GM cell lines
- ~200 million sequences
2012 – Transcription factor data analysis (ENCODE2)
- 32 ChIP-seq datasets gathered over 3 years (3 TFs across 11 cell lines)
- ~ 1 billion sequences
2013 – miRNA overexpression effects
- 42 RNAseq datasets (7 conditions)
- ~ 2.6 billion sequences
2014 – eQTL analysis of CTCF binding
- 52 very deeply sequenced CTCF datasets
- ~ 8 billion sequences
2017 (in progress review) – Functional analysis of glioblastoma tumors and cell lines
- > 400 datasets so far nearly 500 datasets in total (ChIP-seq, RNAseq, miRNAseq, 4C, exome/genome sequencing)
- > 22 billion sequences
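
The growth across these projects is easier to see as a ratio. The short sketch below uses the approximate sequence counts from the project list above. The first and last figures are bounds ("less than 2 million", "> 22 billion"), so the results are rough, and the names projects and fold_increase are illustrative only.

```python
# Approximate sequence counts per project, taken from the list above.
# The 2008 value is an upper bound and the 2017 value a lower bound.
projects = [
    (2008, 2e6),
    (2010, 200e6),
    (2012, 1e9),
    (2013, 2.6e9),
    (2014, 8e9),
    (2017, 22e9),
]


def fold_increase(counts):
    """Return (year, count / first count) for each project."""
    first = counts[0][1]
    return [(year, n / first) for year, n in counts]


for year, fold in fold_increase(projects):
    print(f"{year}: {fold:,.0f}x the 2008 project")
```

Because the 2008 count is an upper bound and the 2017 count a lower bound, the 2017 project is at least 11,000 times the size of the 2008 one.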

...