Overview

FastQC was presented on the first day of the class as the go-to tool for quality control of fastq files. Checking each report individually, however, is daunting when there are many files, and evaluating files one at a time can introduce its own artifacts or biases. MultiQC works directly on FastQC reports to quickly generate a summary report, which makes it easier both to identify samples that differ from the rest of a group and to make global decisions about how to treat a set of files.

Learning Objectives

In this tutorial, we will:

  1. work with some simple bash scripting from the command line (for loops) to generate FastQC reports for many files at once and look at 200+ plasmid samples.
  2. work with MultiQC to make decisions about read preprocessing.
  3. identify outlier files that are clearly different from the group as a whole and determine how to deal with these files.


Get some data


Use a bash for loop on the command line to generate a FastQC command for all plasmid samples
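
As a rough sketch of what such a loop can look like, the example below prints one fastqc command per read file and saves the commands to a text file. The directory names (Raw_Reads, fastqc_out), the commands file name, and the .fastq.gz suffix are assumptions made for illustration and are not taken from the course data; file names are also assumed to contain no spaces. Only the -o (output directory) option of fastqc is used.

```shell
# Sketch only: directory names and the .fastq.gz suffix are assumptions.

# Print one fastqc command per fastq file found in a directory.
#   $1: directory holding the raw reads (hypothetical name)
#   $2: directory where FastQC should write its reports
make_fastqc_commands() {
    for fq in "$1"/*.fastq.gz; do
        # If nothing matches, the glob stays literal, so skip it.
        [ -e "$fq" ] || continue
        printf 'fastqc -o %s %s\n' "$2" "$fq"
    done
}

# FastQC expects the output directory to exist already.
mkdir -p fastqc_out

# Write the commands to a file and count them as a sanity check.
make_fastqc_commands Raw_Reads fastqc_out > fastqc_commands.txt
wc -l < fastqc_commands.txt
```

The number printed should match the number of fastq files you expect. Once the commands look right, they can be run one after another with `bash fastqc_commands.txt`.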


Run the MultiQC tool on all FastQC output
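
A minimal sketch of this step is shown below. It assumes the FastQC results were written to a directory called fastqc_out and that the summary should go to multiqc_report; both names are hypothetical. The small wrapper function first checks that FastQC archives (files ending in _fastqc.zip) are present, then points multiqc at the directory and uses -o to choose where the report is written.

```shell
# Sketch only: directory names are assumptions.

# Run MultiQC on a directory of FastQC output.
#   $1: directory containing FastQC results
#   $2: directory for the MultiQC report
run_multiqc() {
    # Count the FastQC result archives so an empty directory is caught early.
    n_reports=$(find "$1" -name '*_fastqc.zip' 2>/dev/null | wc -l | tr -d ' ')
    if [ "$n_reports" -eq 0 ]; then
        echo "No FastQC archives found in $1; run the fastqc commands first." >&2
        return 1
    fi
    echo "Summarizing $n_reports FastQC reports from $1"
    multiqc "$1" -o "$2"
}

run_multiqc fastqc_out multiqc_report || echo "MultiQC was not run."
```

When it succeeds, MultiQC writes an HTML report (multiqc_report.html by default) and a multiqc_data directory into the chosen output directory. The HTML file can be copied to your local computer and opened in a web browser.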


Evaluate MultiQC report


Optional Exercise

Using information gained from the MultiQC report, modify the bash loop used for the FastQC commands so that it generates commands to preprocess the raw reads and improve their quality.


Return to the Genome Variant Analysis Course 2019 Home Page
