A modular tool to aggregate results from bioinformatics analyses across many samples into a single report.
Report generated on 2022-07-07, 08:54 based on data in:
/vast/scratch/users/cmero.ma/projects/G000204/AGRF_CAGRF220410419_HFVGHDSX3/results/QC/fastQC
/vast/scratch/users/cmero.ma/projects/G000204/AGRF_CAGRF220410419_HFVGHDSX3/results/QC/fastqScreen
/vast/scratch/users/cmero.ma/projects/G000204/AGRF_CAGRF220410419_HFVGHDSX3/results/QC/qualimap
/vast/scratch/users/cmero.ma/projects/G000204/AGRF_CAGRF220410419_HFVGHDSX3/results/QC/samtools
General Statistics
Showing 18/18 rows and 16/27 columns.

Sample Name | M Reads Mapped | % GC | Ins. size | ≥ 30X | Median cov | Mean cov | % Aligned | Error rate | M Non-Primary | M Reads Mapped | % Mapped | % Proper Pairs | M Total seqs | % Dups | % GC | M Seqs |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0-K12Rep1_HFVGHDSX3_CGGCTAAT-CTCGTTCT_L004 | 156.6 | 50% | 300 | 99.9% | 4746.0X | 4993.7X | 97.2% | 0.47% | 0.0 | 156.6 | 97.2% | 95.6% | 161.1 | 93.4% | 49% | 80.6 |
0-K12Rep1_HFVGHDSX3_CGGCTAAT-CTCGTTCT_L004_R2 | | | | | | | | | | | | | | 90.4% | 49% | 80.6 |
0-K12Rep2_HFVGHDSX3_ATCGATCG-TGGAAGCA_L004 | 113.2 | 50% | 255 | 99.9% | 3399.0X | 3603.0X | 96.8% | 0.38% | 0.0 | 113.1 | 96.8% | 95.9% | 116.9 | 94.2% | 49% | 58.4 |
0-K12Rep2_HFVGHDSX3_ATCGATCG-TGGAAGCA_L004_R2 | | | | | | | | | | | | | | 92.9% | 49% | 58.4 |
1-10-K12Rep1_HFVGHDSX3_CGCATGAT-AAGCCTGA_L004 | 3755.9 | 50% | 269 | 100.0% | 113949.0X | 119687.7X | 96.8% | 0.46% | 0.0 | 3755.0 | 96.8% | 95.3% | 3877.8 | 90.4% | 49% | 1938.9 |
1-10-K12Rep1_HFVGHDSX3_CGCATGAT-AAGCCTGA_L004_R2 | | | | | | | | | | | | | | 90.0% | 49% | 1938.9 |
1-K12Rep1_HFVGHDSX3_CTTGTCGA-CGATGTTC_L004 | 313.1 | 50% | 268 | 100.0% | 9592.0X | 9980.7X | 97.1% | 0.45% | 0.0 | 313.0 | 97.1% | 95.7% | 322.3 | 93.8% | 49% | 161.1 |
1-K12Rep1_HFVGHDSX3_CTTGTCGA-CGATGTTC_L004_R2 | | | | | | | | | | | | | | 91.4% | 49% | 161.1 |
1-K12Rep2_HFVGHDSX3_TTCCAAGG-CTACAAGG_L004 | 305.7 | 50% | 268 | 100.0% | 9300.0X | 9743.5X | 97.0% | 0.47% | 0.0 | 305.6 | 97.0% | 95.8% | 315.0 | 93.8% | 49% | 157.5 |
1-K12Rep2_HFVGHDSX3_TTCCAAGG-CTACAAGG_L004_R2 | | | | | | | | | | | | | | 91.4% | 49% | 157.5 |
10-K12Rep1_HFVGHDSX3_CTGATCGT-GCGCATAT_L004 | 43.7 | 50% | 261 | 100.0% | 1319.0X | 1388.9X | 95.3% | 0.55% | 0.0 | 43.6 | 95.3% | 94.0% | 45.8 | 93.7% | 49% | 22.9 |
10-K12Rep1_HFVGHDSX3_CTGATCGT-GCGCATAT_L004_R2 | | | | | | | | | | | | | | 91.6% | 49% | 22.9 |
10-K12Rep2_HFVGHDSX3_ACTCTCGA-CTGTACCA_L004 | 40.8 | 50% | 256 | 99.9% | 1224.0X | 1298.3X | 95.4% | 0.53% | 0.0 | 40.8 | 95.4% | 94.1% | 42.8 | 94.2% | 49% | 21.4 |
10-K12Rep2_HFVGHDSX3_ACTCTCGA-CTGTACCA_L004_R2 | | | | | | | | | | | | | | 92.1% | 49% | 21.4 |
5-K12Rep1_HFVGHDSX3_TGAGCTAG-GAACGGTT_L004 | 99.8 | 50% | 257 | 100.0% | 3037.0X | 3176.1X | 96.3% | 0.46% | 0.0 | 99.7 | 96.3% | 95.0% | 103.6 | 93.9% | 49% | 51.8 |
5-K12Rep1_HFVGHDSX3_TGAGCTAG-GAACGGTT_L004_R2 | | | | | | | | | | | | | | 91.8% | 49% | 51.8 |
5-K12Rep2_HFVGHDSX3_GAGACGAT-ACCGGTTA_L004 | 130.7 | 50% | 283 | 100.0% | 3963.0X | 4166.3X | 96.4% | 0.48% | 0.0 | 130.7 | 96.4% | 94.8% | 135.5 | 94.4% | 49% | 67.8 |
5-K12Rep2_HFVGHDSX3_GAGACGAT-ACCGGTTA_L004_R2 | | | | | | | | | | | | | | 92.4% | 49% | 67.8 |
QualiMap
QualiMap is a platform-independent application to facilitate the quality control of alignment sequencing data and its derivatives like feature counts.
Coverage histogram
Distribution of the number of locations in the reference genome with a given depth of coverage.
For a set of DNA or RNA reads mapped to a reference sequence, such as a genome or transcriptome, the depth of coverage at a given base position is the number of high-quality reads that map to the reference at that position (Sims et al. 2014).
Bases of a reference sequence (y-axis) are grouped by their depth of coverage (0×, 1×, …, N×) (x-axis). This plot shows the frequency of coverage depths relative to the reference sequence for each read dataset, which provides an indirect measure of the level and variation of coverage depth in the corresponding sequenced sample.
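As a rough illustration of how such a histogram is built, the sketch below counts how many reference bases fall at each depth. It is not QualiMap's implementation; the function name `coverage_histogram` and the ten-base example are hypothetical, and the input is assumed to be a plain list with one depth value per reference position.

```python
from collections import Counter

def coverage_histogram(depths):
    """Map each coverage depth to the number of reference bases at that depth.

    `depths` is assumed to hold one integer per reference position.
    """
    counts = Counter(depths)
    # Sort by depth so the result reads like the x-axis of the plot.
    return dict(sorted(counts.items()))

# Hypothetical per-base depths for a ten-base reference (illustrative only).
example_depths = [0, 3, 3, 4, 4, 4, 5, 5, 6, 0]
histogram = coverage_histogram(example_depths)
```

Here each key corresponds to a position on the x-axis (depth) and each value to the height on the y-axis (number of bases).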
If reads are randomly distributed across the reference sequence, this plot should resemble a Poisson distribution (Lander & Waterman 1988), with a peak indicating approximate depth of coverage, and more uniform coverage depth being reflected in a narrower spread. The optimal level of coverage depth depends on the aims of the experiment, though it should at minimum be sufficiently high to adequately address the biological question; greater uniformity of coverage is generally desirable, because it increases breadth of coverage for a given depth of coverage, allowing equivalent results to be achieved at a lower sequencing depth (Sampson et al. 2011; Sims et al. 2014). However, it is difficult to achieve uniform coverage depth in practice, due to biases introduced during sample preparation (van Dijk et al. 2014), sequencing (Ross et al. 2013) and read mapping (Sims et al. 2014).
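The Poisson expectation described above can be sketched numerically. The example below is only an idealized model under the random-placement assumption, not part of QualiMap or MultiQC; the function names, the mean depth of 30.5 and the one-megabase reference length are all hypothetical. The variance-to-mean ratio is included as one possible summary of spread, on the assumption that values well above 1 indicate less uniform coverage than the Poisson model predicts.

```python
import math

def poisson_pmf(depth, mean_depth):
    """Probability that a base has exactly `depth` reads under a Poisson model.

    Requires mean_depth > 0.
    """
    # Work in log space so large depths do not overflow the factorial.
    log_p = depth * math.log(mean_depth) - mean_depth - math.lgamma(depth + 1)
    return math.exp(log_p)

def expected_histogram(mean_depth, reference_length, max_depth):
    """Expected number of bases at each depth from 0 to max_depth."""
    return [reference_length * poisson_pmf(k, mean_depth)
            for k in range(max_depth + 1)]

def dispersion_index(depths):
    """Variance-to-mean ratio of observed per-base depths (about 1 for Poisson)."""
    mean = sum(depths) / len(depths)
    variance = sum((d - mean) ** 2 for d in depths) / len(depths)
    return variance / mean

# Hypothetical values chosen only to illustrate the shape of the curve.
expected = expected_histogram(mean_depth=30.5, reference_length=1_000_000, max_depth=100)
peak_depth = max(range(len(expected)), key=expected.__getitem__)
```

The peak of the expected curve sits near the mean depth, which is why the peak of the observed histogram can be read as an approximate depth of coverage.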
This plot may include a small peak for regions of the reference sequence with zero depth of coverage. Such regions may be absent from the given sample (due to a deletion or structural rearrangement), present in the sample but not successfully sequenced (due to bias in sequencing or preparation), or sequenced but not successfully mapped to the reference (due to the choice of mapping algorithm, the presence of repeat sequences, or mismatches caused by variants or sequencing errors). Related factors cause most datasets to contain some unmapped reads (Sims et al. 2014).
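To judge whether a zero-coverage peak is larger than chance alone would produce, one can compare it with the fraction of uncovered bases expected under the same idealized Poisson model, which is e^(-c) for mean depth c. The sketch below assumes that model holds; the function names and arguments are hypothetical and are not taken from QualiMap.

```python
import math

def expected_zero_fraction(mean_depth):
    """Fraction of bases expected to have zero coverage under a Poisson model."""
    return math.exp(-mean_depth)

def excess_zero_bases(observed_zero_bases, reference_length, mean_depth):
    """Observed zero-coverage bases minus the number expected by chance."""
    expected = reference_length * expected_zero_fraction(mean_depth)
    return observed_zero_bases - expected
```

At high mean depth the expected fraction is vanishingly small, so under this model essentially any zero-coverage peak would point to one of the non-random causes listed above rather than to sampling variation.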