TACC - High Performance Computing for NextGen Sequence Analysis
Oct 8, 2012
Location: ACES 2.402; 9 am - 2 pm (CDT)
Your Instructors
Name | Initials | Affiliation | |
---|---|---|---|
Matt Vaughn | MWV | Manager, TACC Life Sciences | |
John Fonner | JF | Research Associate, TACC Life Sciences | |
Scott Hunicke-Smith | | Director, GSAF | in absentia |
Jeff Barrick | | Asst. Prof. Biochemistry | in absentia |
Learning objectives
- Role of TACC for UT System researchers
- Logging into Lonestar and other TACC systems
- Batch vs interactive computing on Lonestar
- The TACC software environment
- File systems available on TACC Lonestar
- Parallelism strategies for increasing efficiency of NextGen analyses
- Use of idev interactive sessions on TACC systems
- Moving files via SFTP and SCP
Outline
Linux and Lonestar Refresher Course (9:00-10:00)
- Introduction to TACC
- Linux basics with Lonestar
- Logging in via SSH
- Command-line tricks (tab completion and the history)
- Essential Linux commands
- Wildcards and special file names
- Using options with Linux commands
- Getting help
- Extra: Printable cheat sheet of common Linux commands
- Lonestar Essentials
- The login (or head) nodes
- Acceptable uses for login nodes
- What not to use login nodes for
- Lonestar file systems
- Running compute jobs on Lonestar
- Batch (qsub)
- Interactive (idev)
- Editing files
- Using nano on the command line
- Demo: Using TextWrangler (Mac), Notepad++ (Windows), or gEdit (Linux) from a desktop environment
- Finding and using software
- The module system
- avail, list, load, swap, unload, key
- Linux PATHs
- Extra: List of genomics modules available at TACC
- Extra: Installing your own Linux software
Break (10:00-10:15)
Delving into HPC-oriented NGS analysis (10:15-11:15)
- Tutorial: Read mapping with BWA and Bowtie
- Supplemental material
- Presentation: Introduction to Read Mapping (Barrick)
Speeding up your analyses using parallel computing (11:15-11:50)
- Introduction: Parallelism Strategies (PDF)
- Tutorial: Using threads to speed up mapping on a single compute node
- Tutorial: Using the TACC Parametric Launcher to speed up mapping (or any other naively parallel task) across multiple nodes
- Bonus example: Using the Launcher to automate Rscript analyses
Set-up for Afternoon (11:50-12:00)
- Pre-flight instructions for Interactive Computing session
- Please complete before going to lunch!
Lunch (12:00-1:00)
- Please return promptly!
Variant Calling with SAMtools (1:00-1:40)
- Tutorial: Variant calling using SAMtools
- Calling SNPs and Indels
- Inspecting alignments supporting a variant using tview
- Filtering variants
- Finding summary statistics
- Supplemental material
Downloading and uploading files using SFTP (1:40-1:50)
- The sftp and scp commands
- Demo: Using Cyberduck on the desktop
- Supplemental material