...

  • Go to the SRA search page: http://www.ncbi.nlm.nih.gov/sra
  • Type in SRX112044 and click Search
  • On the experiment summary page, click SRR390925
    • takes you to the Run browser where you can see example reads
  • Under "Download", "Run" click "ftp" under .sra
    • save the file locally
  • Open a Terminal window, change into the directory where the file was stored
  • Copy the file from your local machine to TACC

    scp SRR390925.sra username@stampede.tacc.utexas.edu:~/
    
    • the colon ( : ) after the hostname indicates this is a remote destination
    • the ~/ indicates your home directory
  • Log in to Stampede:

    ssh username@stampede.tacc.utexas.edu
    
    • check that the file is in your home directory

      stamp:~ ls
      SRR390925.sra
      
  • Find the SRA toolkit module

    stamp:~ module spider sratoolkit
    
      ----------------------------------------------------------------------------
      sratoolkit: sratoolkit/2.1.9
      ----------------------------------------------------------------------------
        Description:
          The SRA Toolkit and SDK from NCBI is a collection of tools and
          libraries for using data in the INSDC Sequence Read Archives.
    
        This module can be loaded directly: module load sratoolkit/2.1.9
    
        Help:
          The sratoolkit module file defines the following environment variables:
          TACC_SRATOOLKIT_DIR for the location of the sratoolkit distribution.
    
          Version 2.1.9
    
  • Load the module

    stamp:~ module load sratoolkit
    
  • Invoke fastq-dump with no arguments to get basic usage

    stamp:~ fastq-dump
    
    Usage:
      /opt/apps/sratoolkit/2.1.9//fastq-dump [options] [ -A ] <accession>
      /opt/apps/sratoolkit/2.1.9//fastq-dump [options] <path [path...]>
    
    Use option --help for more information
    
    /opt/apps/sratoolkit/2.1.9//fastq-dump : 2.1.9
    
  • Extract to fastq

    stamp:~ $TACC_SRATOOLKIT_DIR/fastq-dump SRR390925.sra
    Written 1981132 spots for SRR390925.sra
    Written 1981132 spots total
    
  • Look at some data

    stamp:~ ls
    SRR390925.fastq  SRR390925.sra
    stamp:~ head SRR390925.fastq
    @SRR390925.1 ROCKFORD:1:1:0:1260 length=36
    NCAACAAGTTTCTTTGGTTATTAACTACGACTTACC
    +SRR390925.1 ROCKFORD:1:1:0:1260 length=36
    #CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC
    @SRR390925.2 ROCKFORD:1:1:0:293 length=36
    NAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
    +SRR390925.2 ROCKFORD:1:1:0:293 length=36
    ####################################
    @SRR390925.3 ROCKFORD:1:1:0:330 length=36
    NAAAAAAAAAAAAAAAAAAAAAAAATAAAAAAAAAA
    
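    Each read in the output above occupies four lines: a header starting with @, the sequence, a separator starting with +, and the quality string. A minimal sketch of pulling out just one of those lines per read with awk, assuming the file follows this 4-line layout; two_reads.fastq is a made-up scratch file holding the first two records shown above:

    ```bash
    # Two records copied from the head output, saved to a scratch file
    # (two_reads.fastq is a hypothetical name used only for this sketch)
    cat > two_reads.fastq <<'EOF'
    @SRR390925.1 ROCKFORD:1:1:0:1260 length=36
    NCAACAAGTTTCTTTGGTTATTAACTACGACTTACC
    +SRR390925.1 ROCKFORD:1:1:0:1260 length=36
    #CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC
    @SRR390925.2 ROCKFORD:1:1:0:293 length=36
    NAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
    +SRR390925.2 ROCKFORD:1:1:0:293 length=36
    ####################################
    EOF

    # Line 2 of every 4-line record is the sequence
    awk 'NR % 4 == 2' two_reads.fastq

    # Line 4 of every 4-line record is the quality string
    awk 'NR % 4 == 0' two_reads.fastq
    ```

    On the full file, the same awk commands should work with SRR390925.fastq in place of the scratch file; piping the result through head keeps the output short.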
  • Count lines and number of reads (fastq has 4 lines/read)

    stamp:~ wc -l SRR390925.fastq
    7924528 SRR390925.fastq
    stamp:~ echo $((7924528 / 4))
    1981132
    
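The line count and the division by 4 can be combined into a single step. A minimal sketch, assuming every read occupies exactly four lines (no wrapped sequences); count_reads is a hypothetical helper name, not part of the SRA toolkit, and sample.fastq is a made-up two-read file used only to try it out:

```bash
# count_reads is a hypothetical helper, not part of the SRA toolkit.
# It assumes each read occupies exactly 4 lines of the FASTQ file.
count_reads() {
    local lines
    lines=$(wc -l < "$1")     # total number of lines in the file
    echo $(( lines / 4 ))     # 4 lines per read
}

# Build a tiny two-read FASTQ file to test the helper
cat > sample.fastq <<'EOF'
@read1
ACGT
+
IIII
@read2
TTGA
+
IIII
EOF

count_reads sample.fastq
```

Run against SRR390925.fastq, the helper should report the same number of reads as the "Written ... spots" line printed by fastq-dump.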

...