    Showing docs for version 3.7-0


    AnalyzeCovariates

    Create plots to visualize base recalibration results

    Category: Diagnostics and Quality Control Tools

    Traversal: LocusWalker

    PartitionBy: LOCUS


    Overview

    This tool generates plots for assessing the quality of a recalibration run performed with BaseRecalibrator.

    Input

    The tool can take up to three different sets of recalibration tables. The resulting plots will be overlaid on top of each other to make comparisons easy.


    Set           Argument  Label   Color  Description
    Original      -before   BEFORE  Pink   First-pass recalibration tables obtained by applying BaseRecalibrator to the original alignment.
    Recalibrated  -after    AFTER   Blue   Second-pass recalibration tables obtained by applying BaseRecalibrator to the alignment recalibrated using the first-pass tables.
    Input         -BQSR     BQSR    Black  Any recalibration table without a specific role.

    You need to specify at least one set. Multiple sets need to have the same values for the following parameters:

    covariate (order is not important), no_standard_covs, run_without_dbsnp, solid_recal_mode, solid_nocall_strategy, mismatches_context_size, mismatches_default_quality, deletions_default_quality, insertions_default_quality, maximum_cycle_value, low_quality_tail, default_platform, force_platform, quantizing_levels and binary_tag_name
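
Because overlaid tables must share these parameter values, it can be useful to compare them before plotting. The sketch below is only an illustration: it assumes each recalibration table is a plain-text GATKReport whose "Arguments" table starts at a line beginning with `#:GATKTable:Arguments:` and ends at the first blank line. The functions `recal_args` and `recal_args_match` are hypothetical helpers, not part of GATK.

```shell
# Hypothetical helpers (not part of GATK). Assumes a plain-text GATKReport in
# which the Arguments table follows a "#:GATKTable:Arguments:" title line and
# is terminated by a blank line.

# Print "name value" pairs from the Arguments table, sorted by name.
recal_args() {
    awk '
        /^#:GATKTable:Arguments:/ { in_args = 1; next }  # title line of the table
        in_args && NF == 0        { exit }               # blank line ends the table
        in_args && $1 != "Argument" {                    # skip the column header row
            if ($1 == "covariate") {
                # Covariate order is not important, so sort the comma-separated list.
                n = split($2, c, ",")
                for (i = 2; i <= n; i++) {
                    v = c[i]
                    for (j = i - 1; j >= 1 && c[j] > v; j--) c[j + 1] = c[j]
                    c[j + 1] = v
                }
                $2 = c[1]
                for (i = 2; i <= n; i++) $2 = $2 "," c[i]
            }
            print $1, $2
        }
    ' "$1" | sort
}

# Succeed (exit status 0) only if both tables list identical argument values.
recal_args_match() {
    [ "$(recal_args "$1")" = "$(recal_args "$2")" ]
}
```

For example, `recal_args_match recal2.table recal3.table || echo "parameters differ" >&2` could be run before passing both tables to AnalyzeCovariates. The comparison covers every row of the Arguments table, which may be stricter than the parameter list above.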

    Output

    A pdf document with plots that show the quality of the recalibration, and an optional csv file that contains a table with all the data required to generate those plots.

    Usage examples

    Plot a single recalibration table

     java -jar GenomeAnalysisTK.jar \
          -T AnalyzeCovariates \
          -R myreference.fasta \
          -BQSR myrecal.table \
          -plots BQSR.pdf
     

    Plot before (first pass) and after (second pass) recalibration tables to compare them

     java -jar GenomeAnalysisTK.jar \
          -T AnalyzeCovariates \
          -R myreference.fasta \
          -before recal2.table \
          -after recal3.table \
          -plots recalQC.pdf
     

    Plot up to three recalibration tables for comparison

    
     # You can ignore the before/after semantics completely if you like (if you do, add -ignoreLMT
     # to avoid a possible warning), but all tables must have been generated using the same parameters.
     # Any two of the three table arguments (-BQSR, -before, -after) can be omitted.
    
     java -jar GenomeAnalysisTK.jar \
          -T AnalyzeCovariates \
          -R myreference.fasta \
          -ignoreLMT \
          -BQSR recal1.table \
          -before recal2.table \
          -after recal3.table \
          -plots myrecals.pdf
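
Since any subset of the three table arguments is accepted as long as at least one is given, a wrapper script can assemble the command from whichever tables are available. The following is a minimal sketch under that assumption; `analyze_covariates_cmd` is a hypothetical helper, it only prints the command rather than running it, and it does not handle file names containing spaces.

```shell
# Hypothetical helper (not part of GATK): print an AnalyzeCovariates command
# for whichever tables are supplied. Pass an empty string to omit a table.
# Usage: analyze_covariates_cmd REF PLOTS BQSR BEFORE AFTER
analyze_covariates_cmd() {
    ref=$1; plots=$2; bqsr=$3; before=$4; after=$5

    # At least one of the three table sets is required.
    if [ -z "$bqsr$before$after" ]; then
        echo "error: specify at least one recalibration table" >&2
        return 1
    fi

    cmd="java -jar GenomeAnalysisTK.jar -T AnalyzeCovariates -R $ref"
    if [ -n "$bqsr" ];   then cmd="$cmd -BQSR $bqsr";     fi
    if [ -n "$before" ]; then cmd="$cmd -before $before"; fi
    if [ -n "$after" ];  then cmd="$cmd -after $after";   fi
    echo "$cmd -plots $plots"
}
```

Calling `analyze_covariates_cmd myreference.fasta recalQC.pdf "" recal2.table recal3.table` prints a before/after comparison command equivalent to the second usage example.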
     

    Full BQSR quality assessment pipeline

     # Generate the first pass recalibration table file
     # (the -knownSites inputs are optional but recommended)
     java -jar GenomeAnalysisTK.jar \
          -T BaseRecalibrator \
          -R reference.fasta \
          -I myinput.bam \
          -knownSites bundle/my-trusted-snps.vcf \
          -knownSites bundle/my-trusted-indels.vcf \
          -o firstpass.table
    
     # Generate the second pass recalibration table file
     java -jar GenomeAnalysisTK.jar \
          -T BaseRecalibrator \
          -R reference.fasta \
          -I myinput.bam \
          -knownSites bundle/my-trusted-snps.vcf \
          -knownSites bundle/my-trusted-indels.vcf \
          -BQSR firstpass.table \
          -o secondpass.table
    
     # Finally generate the plots and also keep a copy of the csv (the -csv argument is optional)
     java -jar GenomeAnalysisTK.jar \
          -T AnalyzeCovariates \
          -R reference.fasta \
          -before firstpass.table \
          -after secondpass.table \
          -csv BQSR.csv \
          -plots BQSR.pdf
     

    Additional Information

    Read filters

    These Read Filters are automatically applied to the data by the Engine before processing by AnalyzeCovariates.


    Command-line Arguments

    Engine arguments

    All tools inherit arguments from the GATK Engine's "CommandLineGATK" argument collection, which can be used to modify various aspects of the tool's function. For example, the -L argument directs the GATK engine to restrict processing to specific genomic intervals, and the -rf argument applies additional read filters to exclude some of the data from the analysis.

    AnalyzeCovariates specific arguments

    This table summarizes the command-line arguments that are specific to this tool. For more details on each argument, see the list below the table.

    Argument name(s)                           Default value  Summary

    Optional Inputs
    --afterReportFile, -after                  NA             file containing the BQSR second-pass report file
    --beforeReportFile, -before                NA             file containing the BQSR first-pass report file

    Optional Outputs
    --intermediateCsvFile, -csv                NA             location of the csv intermediate file
    --plotsReportFile, -plots                  NA             location of the output report

    Optional Flags
    --ignoreLastModificationTimes, -ignoreLMT  false          do not emit warning messages related to suspicious last modification time order of inputs

    Argument details

    Arguments in this list are specific to this tool. Other arguments are shared with other tools (e.g. command-line GATK arguments); see Engine arguments above.


    --afterReportFile / -after

    file containing the BQSR second-pass report file
    File containing the recalibration tables from the second pass.

    File  NA


    --beforeReportFile / -before

    file containing the BQSR first-pass report file
    File containing the recalibration tables from the first pass.

    File  NA


    --ignoreLastModificationTimes / -ignoreLMT

    do not emit warning messages related to suspicious last modification time order of inputs
    If true, no warning is shown when the last-modification times of the before and after input files suggest that they have been reversed.

    boolean  false
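
The warning this flag suppresses concerns file timestamps: a first-pass table is normally written before the second-pass table. A rough shell equivalent of that sanity check is sketched below. `check_table_order` is a hypothetical helper, and the exact condition GATK applies internally is not stated on this page, so treat it as an approximation.

```shell
# Hypothetical helper (not part of GATK): warn when the "before" table is newer
# than the "after" table, which suggests the two arguments were swapped.
check_table_order() {
    before=$1
    after=$2
    # -nt is true when the first file has a later modification time.
    if [ "$before" -nt "$after" ]; then
        echo "WARNING: $before is newer than $after; were -before and -after reversed?" >&2
        return 1
    fi
    return 0
}
```

Running `check_table_order firstpass.table secondpass.table` before AnalyzeCovariates gives the same kind of hint without relying on the tool's own warning.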


    --intermediateCsvFile / -csv

    location of the csv intermediate file
    Output csv file name.

    File  NA


    --plotsReportFile / -plots

    location of the output report
    Output report file name.

    File  NA