...
| Expand | |||||
|---|---|---|---|---|---|
| |||||
There are 1,152,831 overlapping reads in 6,141 non-0 gene annotations. |
Use bedtools coverage to create a signal track
A signal track is a bedGraph (BED3+) file with an entry for every base in a defined set of regions (see https://genome.ucsc.edu/goldenpath/help/bedgraph.html). bedGraph files can be visualized in the Broad's IGV (Integrative Genomics Viewer) application (https://software.broadinstitute.org/software/igv/download) or in the UCSC Genome Browser (https://genome.ucsc.edu/).
The bedtools coverage function (https://bedtools.readthedocs.io/en/latest/content/tools/coverage.html), with the -d (per-base depth) option produces output that can be made into a bedGraph. Here we'll analyze the per-base coverage of yeast RNAseq reads in our merged yeast gene regions.
Make sure you're in an idev session, then prepare a directory for this exercise.
| Code Block | ||||
|---|---|---|---|---|
| ||||
idev -m 120 -N 1 -A OTH21164 -r CoreNGSday5
module load biocontainers
module load bedtools
mkdir -p $SCRATCH/core_ngs/bedtools_coverage
cp $CORENGS/catchup/bedtools_merge/merged*bed $SCRATCH/core_ngs/bedtools_coverage/
cp $CORENGS/yeast_rnaseq/yeast_mrna.sort.filt.bam* $SCRATCH/core_ngs/bedtools_coverage/ |
Then calling bedtools coverage is easy. The "A" file will be our gene regions, and the "B" file will be the yeast RNAseq reads. We also use the -d (per-base depth) and -s (force "strandedness") options.
| Code Block | ||
|---|---|---|
| ||
cds; cd core_ngs/bedtools_coverage
bedtools coverage -s -d -a merged.good.sc_genes.bed -b yeast_mrna.sort.filt.bam > yeast_mrna.gene_coverage.txt |
It will complain a bit because our genes file includes the yeast plasmid "2-micron" but the RNAseq BAM doesn't include that contig. We'll ignore that warning.
The bedtools coverage output is a bit strange. It lists each region in the A file, followed by information from the B reads. Here the column order will be gene_chrom gene_start gene_end gene_name gene_score gene_strand offset_in_the_gene_region read_overlap count.
Let's look at coverage for gene YAL067C:
| Code Block | ||
|---|---|---|
| ||
cat yeast_mrna.gene_coverage.txt | grep -P 'YAL067C' | head -50 |
Will look like this:
| Code Block |
|---|
chrI 7234 9016 YAL067C 1 - 1 0
chrI 7234 9016 YAL067C 1 - 2 0
chrI 7234 9016 YAL067C 1 - 3 0
chrI 7234 9016 YAL067C 1 - 4 0
chrI 7234 9016 YAL067C 1 - 5 0
chrI 7234 9016 YAL067C 1 - 6 0
chrI 7234 9016 YAL067C 1 - 7 0
chrI 7234 9016 YAL067C 1 - 8 0
chrI 7234 9016 YAL067C 1 - 9 0
chrI 7234 9016 YAL067C 1 - 10 0
chrI 7234 9016 YAL067C 1 - 11 0
chrI 7234 9016 YAL067C 1 - 12 0
chrI 7234 9016 YAL067C 1 - 13 0
chrI 7234 9016 YAL067C 1 - 14 0
chrI 7234 9016 YAL067C 1 - 15 0
chrI 7234 9016 YAL067C 1 - 16 0
chrI 7234 9016 YAL067C 1 - 17 1
chrI 7234 9016 YAL067C 1 - 18 1
chrI 7234 9016 YAL067C 1 - 19 1
chrI 7234 9016 YAL067C 1 - 20 1
chrI 7234 9016 YAL067C 1 - 21 1
chrI 7234 9016 YAL067C 1 - 22 1
chrI 7234 9016 YAL067C 1 - 23 1
chrI 7234 9016 YAL067C 1 - 24 1
chrI 7234 9016 YAL067C 1 - 25 1
chrI 7234 9016 YAL067C 1 - 26 1
chrI 7234 9016 YAL067C 1 - 27 1
chrI 7234 9016 YAL067C 1 - 28 1
chrI 7234 9016 YAL067C 1 - 29 1
chrI 7234 9016 YAL067C 1 - 30 1
chrI 7234 9016 YAL067C 1 - 31 1
chrI 7234 9016 YAL067C 1 - 32 1
chrI 7234 9016 YAL067C 1 - 33 1
chrI 7234 9016 YAL067C 1 - 34 1
chrI 7234 9016 YAL067C 1 - 35 1
chrI 7234 9016 YAL067C 1 - 36 1
chrI 7234 9016 YAL067C 1 - 37 1
chrI 7234 9016 YAL067C 1 - 38 2
chrI 7234 9016 YAL067C 1 - 39 2
chrI 7234 9016 YAL067C 1 - 40 2
chrI 7234 9016 YAL067C 1 - 41 3
chrI 7234 9016 YAL067C 1 - 42 3
chrI 7234 9016 YAL067C 1 - 43 3
chrI 7234 9016 YAL067C 1 - 44 3
chrI 7234 9016 YAL067C 1 - 45 4
chrI 7234 9016 YAL067C 1 - 46 4
chrI 7234 9016 YAL067C 1 - 47 4
chrI 7234 9016 YAL067C 1 - 48 4
chrI 7234 9016 YAL067C 1 - 49 4
chrI 7234 9016 YAL067C 1 - 50 4 |