  • Showing docs for version 3.7-0


    CallableLoci

    Collect statistics on callable, uncallable, poorly mapped, and other parts of the genome

    Category Diagnostics and Quality Control Tools

    Traversal LocusWalker

    PartitionBy LOCUS


    Overview

    A very common question about an NGS read set is which areas of the genome are considered callable. This tool considers the coverage at each locus and emits either a per-base state or a summary interval BED file that partitions the genomic intervals into the following callable states:

    REF_N
    The reference base was an N, which is not considered callable by the GATK
    PASS
    The base satisfied the minimum depth for calling and had less than --maxDepth coverage, so it avoids the EXCESSIVE_COVERAGE state
    NO_COVERAGE
    Absolutely no reads were seen at this locus, regardless of the filtering parameters
    LOW_COVERAGE
    There were fewer than --minDepth bases at the locus, after applying filters
    EXCESSIVE_COVERAGE
    More than --maxDepth reads were seen at the locus, indicating some sort of mapping problem
    POOR_MAPPING_QUALITY
    More than --maxFractionOfReadsWithLowMAPQ of the reads at the locus had low mapping quality, indicating that the reads are poorly mapped
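
    The state definitions above can be read as a small decision procedure. The Python sketch below is one possible reading, not the GATK implementation. The function name classify_locus, its parameter names, the order of the checks, the handling of boundary values (>= versus >), and the treatment of a negative maxDepth as "no limit" are all assumptions. The default values are the ones listed in the argument table on this page.

```python
def classify_locus(ref_is_n, raw_depth, low_mapq_depth, qc_depth,
                   min_depth=4, max_depth=-1,
                   min_depth_for_low_mapq=10,
                   max_fraction_low_mapq=0.1):
    """Return a callable-state name for one locus (illustrative sketch only).

    raw_depth      -- all reads seen at the locus
    low_mapq_depth -- reads whose MAPQ counts as "low" (see --maxLowMAPQ)
    qc_depth       -- bases passing the MAPQ and base-quality thresholds (QC+)
    """
    if ref_is_n:
        return "REF_N"
    if raw_depth == 0:
        return "NO_COVERAGE"
    # Assumed boundaries: at least min_depth_for_low_mapq reads, and a
    # low-MAPQ fraction strictly above the threshold.
    if (raw_depth >= min_depth_for_low_mapq
            and low_mapq_depth / raw_depth > max_fraction_low_mapq):
        return "POOR_MAPPING_QUALITY"
    if qc_depth < min_depth:
        return "LOW_COVERAGE"
    # Assumption: a negative max_depth (the default, -1) disables this check.
    if max_depth >= 0 and qc_depth > max_depth:
        return "EXCESSIVE_COVERAGE"
    return "PASS"


print(classify_locus(False, raw_depth=30, low_mapq_depth=0, qc_depth=30))  # PASS
```

    Taking the depth counts as inputs keeps the sketch independent of how reads are actually traversed. Exactly which depth (raw or QC+) each threshold is compared against should be checked against the tool itself.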

    Input

    A BAM file containing exactly one sample.

    Output

    A file with the callable status of each base, and a summary table giving the number of examined bases in each callable state

    Usage example

      java -jar GenomeAnalysisTK.jar \
         -T CallableLoci \
         -R reference.fasta \
         -I myreads.bam \
         -summary table.txt \
         -o callable_status.bed
     

    would produce a BED file that looks like:

         20 10000000 10000864 PASS
         20 10000865 10000985 POOR_MAPPING_QUALITY
         20 10000986 10001138 PASS
         20 10001139 10001254 POOR_MAPPING_QUALITY
         20 10001255 10012255 PASS
         20 10012256 10012259 POOR_MAPPING_QUALITY
         20 10012260 10012263 PASS
         20 10012264 10012328 POOR_MAPPING_QUALITY
         20 10012329 10012550 PASS
         20 10012551 10012551 LOW_COVERAGE
         20 10012552 10012554 PASS
         20 10012555 10012557 LOW_COVERAGE
         20 10012558 10012558 PASS
     
    as well as a summary table that looks like:

                            state nBases
                            REF_N 0
                             PASS 996046
                      NO_COVERAGE 121
                     LOW_COVERAGE 928
               EXCESSIVE_COVERAGE 0
             POOR_MAPPING_QUALITY 2906
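
    As a rough way to work with these two outputs, the Python sketch below totals the bases per state from BED lines and parses the summary table. The function names are hypothetical. It also assumes that the end coordinate is inclusive, which is inferred only from the sample above (adjacent intervals do not share coordinates, and single-base intervals have equal start and end); standard BED files use half-open intervals, so this should be verified against real output.

```python
from collections import Counter


def bases_per_state(bed_lines):
    """Sum interval lengths per state from CallableLoci-style BED lines."""
    totals = Counter()
    for line in bed_lines:
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        start, end, state = int(fields[1]), int(fields[2]), fields[3]
        # Assumption: the end coordinate is inclusive (see lead-in).
        totals[state] += end - start + 1
    return totals


def read_summary(summary_lines):
    """Parse the 'state nBases' summary table into a dict of state -> count."""
    counts = {}
    for line in summary_lines:
        fields = line.split()
        if len(fields) != 2 or fields[0] == "state":
            continue  # skip the header and blank lines
        counts[fields[0]] = int(fields[1])
    return counts


# First three intervals from the sample BED output above.
sample_bed = [
    "20 10000000 10000864 PASS",
    "20 10000865 10000985 POOR_MAPPING_QUALITY",
    "20 10000986 10001138 PASS",
]
print(dict(bases_per_state(sample_bed)))
```

    Under the inclusive-end assumption, per-state totals computed from a complete BED file should be comparable with the nBases column of the summary table for the same run.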
     

    Additional Information

    Read filters

    These Read Filters are automatically applied to the data by the Engine before processing by CallableLoci.

    Downsampling settings

    This tool applies the following downsampling settings by default.


    Command-line Arguments

    Engine arguments

    All tools inherit arguments from the GATK Engine's "CommandLineGATK" argument collection, which can be used to modify various aspects of the tool's function. For example, the -L argument directs the GATK engine to restrict processing to specific genomic intervals, and the -rf argument allows you to apply certain read filters to exclude some of the data from the analysis.

    CallableLoci specific arguments

    This table summarizes the command-line arguments that are specific to this tool. For more details on each argument, see the list further down below the table or click on an argument name to jump directly to that entry in the list.

    Argument name(s)                          Default value  Summary
    Required Outputs
    --summary                                 NA             Name of file for output summary
    Optional Outputs
    --out, -o                                 stdout         An output file created by the walker. Will overwrite contents if file exists
    Optional Parameters
    --maxDepth                                -1             Maximum read depth before a locus is considered poorly mapped
    --maxFractionOfReadsWithLowMAPQ, -frlmq   0.1            If the fraction of reads at a base with low mapping quality exceeds this value, the site may be poorly mapped
    --maxLowMAPQ, -mlmq                       1              Maximum value for MAPQ to be considered a problematic mapped read.
    --minBaseQuality, -mbq                    20             Minimum quality of bases to count towards depth.
    --minMappingQuality, -mmq                 10             Minimum mapping quality of reads to count towards depth.
    Advanced Parameters
    --format                                  BED            Output format
    --minDepth                                4              Minimum QC+ read depth before a locus is considered callable
    --minDepthForLowMAPQ, -mdflmq             10             Minimum read depth before a locus is considered a potential candidate for poorly mapped

    Argument details

    Arguments in this list are specific to this tool. Keep in mind that other arguments are available that are shared with other tools (e.g. command-line GATK arguments); see Engine arguments above.


    --format / -format

    Output format
    The output of this tool will be written in this format. The recommended option is BED.

    The --format argument is an enumerated type (OutputFormat), which can have one of the following values:

    BED
    The output will be written as a BED file. There's a BED element for each continuous run of a single callable state (PASS, REF_N, and so on). This is the recommended format.
    STATE_PER_BASE
    Emit chr start stop state quads for each base. Produces a potentially disastrously large amount of output.

    OutputFormat  BED


    --maxDepth / -maxDepth

    Maximum read depth before a locus is considered poorly mapped
    If the QC+ depth exceeds this value, the site is considered to have EXCESSIVE_COVERAGE

    int  -1  [ [ -∞  ∞ ] ]


    --maxFractionOfReadsWithLowMAPQ / -frlmq

    If the fraction of reads at a base with low mapping quality exceeds this value, the site may be poorly mapped
    If the number of reads at this site is greater than minDepthForLowMAPQ and the fraction of reads with low mapping quality exceeds this fraction then the site has POOR_MAPPING_QUALITY.

    double  0.1  [ [ -∞  ∞ ] ]


    --maxLowMAPQ / -mlmq

    Maximum value for MAPQ to be considered a problematic mapped read.
    Reads with MAPQ between this value and mmq are not sufficiently well mapped for calling but aren't indicative of mapping problems. For example, if maxLowMAPQ = 1 and mmq = 20, then reads with MAPQ == 0 are poorly mapped and reads with MAPQ >= 20 are considered as contributing to calling, whereas reads with MAPQ >= 1 and < 20 are not bad in and of themselves but aren't sufficiently good to contribute to calling. In effect these reads are invisible, driving the base to the NO_ or LOW_COVERAGE states

    byte  1  [ [ -∞  ∞ ] ]
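
    The three-way split described for --maxLowMAPQ can be sketched as a small Python helper. The function name mapq_bin and the bin labels are hypothetical. The sketch treats a MAPQ equal to maxLowMAPQ as low, following the "maximum value" wording of the argument summary; the worked example above instead places MAPQ 1 in the middle band, so that boundary is an assumption to be checked against the tool.

```python
def mapq_bin(mapq, max_low_mapq=1, min_mapping_quality=10):
    """Place a read's MAPQ in one of three bins (illustrative sketch only)."""
    if mapq >= min_mapping_quality:
        return "usable"  # counts towards depth for calling
    # Assumption: MAPQ equal to max_low_mapq still counts as low.
    if mapq <= max_low_mapq:
        return "poorly_mapped"
    # Neither good enough for calling nor a sign of mapping problems.
    return "invisible"


# Worked example from the text: maxLowMAPQ = 1, mmq = 20.
for q in (0, 10, 20):
    print(q, mapq_bin(q, max_low_mapq=1, min_mapping_quality=20))
```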


    --minBaseQuality / -mbq

    Minimum quality of bases to count towards depth.
    Bases with quality below minBaseQuality are viewed as not sufficiently high quality to contribute to the PASS state

    byte  20  [ [ -∞  ∞ ] ]


    --minDepth / -minDepth

    Minimum QC+ read depth before a locus is considered callable
    If the number of QC+ bases (on reads with MAPQ > minMappingQuality and with base quality > minBaseQuality) exceeds this value and is less than maxDepth the site is considered PASS.

    int  4  [ [ -∞  ∞ ] ]
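
    The QC+ depth used by --minDepth can be illustrated with a short Python sketch. The function name qc_plus_depth and the representation of a pileup as (MAPQ, base quality) pairs are hypothetical. The wording on this page alternates between strict (>) and inclusive (>=) comparisons; the sketch uses >=, matching the worked example under --maxLowMAPQ, and that choice is an assumption.

```python
def qc_plus_depth(pileup, min_mapping_quality=10, min_base_quality=20):
    """Count bases that pass both quality thresholds.

    pileup -- iterable of (mapq, base_quality) pairs, one per read base
              covering the locus (hypothetical representation)
    """
    return sum(
        1
        for mapq, base_quality in pileup
        if mapq >= min_mapping_quality and base_quality >= min_base_quality
    )


# Four read bases at one locus; only the first and last pass both thresholds.
example_pileup = [(60, 30), (60, 10), (5, 30), (20, 25)]
print(qc_plus_depth(example_pileup))
```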


    --minDepthForLowMAPQ / -mdflmq

    Minimum read depth before a locus is considered a potential candidate for poorly mapped
    We don't want to consider a site as POOR_MAPPING_QUALITY just because it has two reads, and one has low MAPQ. We won't assign a site to the POOR_MAPPING_QUALITY state unless there are at least minDepthForLowMAPQ reads covering the site.

    int  10  [ [ -∞  ∞ ] ]


    --minMappingQuality / -mmq

    Minimum mapping quality of reads to count towards depth.
    Reads with MAPQ > minMappingQuality are treated as usable for variation detection, contributing to the PASS state.

    byte  10  [ [ -∞  ∞ ] ]


    --out / -o

    An output file created by the walker. Will overwrite contents if file exists

    PrintStream  stdout


    --summary / -summary

    Name of file for output summary
    Callable loci summary counts will be written to this file.

    R File  NA