    Showing docs for version 3.7-0


    SelectVariants

    Select a subset of variants from a larger callset

    Category Variant Manipulation Tools

    Traversal LocusWalker

    PartitionBy LOCUS


    Overview

    Often, a VCF containing many samples and/or variants will need to be subset in order to facilitate certain analyses (e.g. comparing and contrasting cases vs. controls; extracting variant or non-variant loci that meet certain requirements, displaying just a few samples in a browser like IGV, etc.). SelectVariants can be used for this purpose.

    There are many different options for selecting subsets of variants from a larger callset; the usage examples below illustrate the most common ones.

    There are also several options for recording the original values of certain annotations that are recalculated when subsetting the callset, trimming alleles, and so on.

    Input

    A variant call set from which to select a subset.

    Output

    A new VCF file containing the selected subset of variants.

    Usage examples

    Select two samples out of a VCF with many samples

     java -jar GenomeAnalysisTK.jar \
       -T SelectVariants \
       -R reference.fasta \
       -V input.vcf \
       -o output.vcf \
       -sn SAMPLE_A_PARC \
       -sn SAMPLE_B_ACTG
     

    Select two samples and any sample that matches a regular expression

     java -jar GenomeAnalysisTK.jar \
       -T SelectVariants \
       -R reference.fasta \
       -V input.vcf \
       -o output.vcf \
       -sn SAMPLE_1_PARC \
       -sn SAMPLE_1_ACTG \
       -se 'SAMPLE.+PARC'
     

    Exclude two samples and any sample that matches a regular expression:

     java -jar GenomeAnalysisTK.jar \
       -R ref.fasta \
       -T SelectVariants \
       --variant input.vcf \
       -o output.vcf \
       -xl_sn SAMPLE_1_PARC \
       -xl_sn SAMPLE_1_ACTG \
       -xl_se 'SAMPLE.+PARC'
     

    Select any sample that matches a regular expression and sites where the QD annotation is more than 10:

     java -Xmx2g -jar GenomeAnalysisTK.jar \
       -T SelectVariants \
       -R reference.fasta \
       -V input.vcf \
       -o output.vcf \
       -se 'SAMPLE.+PARC' \
       -select "QD > 10.0"
     

    Select any sample that matches a regular expression and sites where the QD annotation is not more than 10 (-invertSelect inverts the -select criteria, not the sample selection):

     java  -jar GenomeAnalysisTK.jar \
       -R ref.fasta \
       -T SelectVariants \
       --variant input.vcf \
       -o output.vcf \
       -se 'SAMPLE.+PARC' \
       -select "QD > 10.0" \
       -invertSelect
     

    Select a sample and exclude non-variant loci and filtered loci (trim remaining alleles by default):

     java -jar GenomeAnalysisTK.jar \
       -T SelectVariants \
       -R reference.fasta \
       -V input.vcf \
       -o output.vcf \
       -sn SAMPLE_1_ACTG \
       -env \
       -ef
     

    Select a sample, subset remaining alleles, but don't trim:

     java -jar GenomeAnalysisTK.jar \
       -T SelectVariants \
       -R reference.fasta \
       -V input.vcf \
       -o output.vcf \
       -sn SAMPLE_1_ACTG \
       -env \
       -noTrim
    

    Select a sample and restrict the output vcf to a set of intervals:

     java -jar GenomeAnalysisTK.jar \
       -T SelectVariants \
       -R reference.fasta \
       -V input.vcf \
       -o output.vcf \
       -L /path/to/my.interval_list \
       -sn SAMPLE_1_ACTG
     

    Select all calls missed in my vcf, but present in HapMap (useful to take a look at why these variants weren't called in my dataset):

     java -jar GenomeAnalysisTK.jar \
       -T SelectVariants \
       -R reference.fasta \
       -V hapmap.vcf \
       --discordance myCalls.vcf \
       -o output.vcf \
       -sn mySample
     

    Select all calls made by both myCalls and theirCalls (useful to take a look at what is consistent between two callers):

     java -jar GenomeAnalysisTK.jar \
       -T SelectVariants \
       -R reference.fasta \
       -V myCalls.vcf \
       --concordance theirCalls.vcf \
       -o output.vcf \
       -sn mySample
     

    Generate a VCF of all the variants that are mendelian violations. The optional argument '-mvq' restricts the selection to sites where every member of the trio has a genotype quality (GQ) of at least 50:

     java -jar GenomeAnalysisTK.jar \
       -T SelectVariants \
       -R reference.fasta \
       -V input.vcf \
       -ped family.ped \
       -mv -mvq 50 \
       -o violations.vcf
     

    Generate a VCF of all the variants that are not mendelian violations. Here '-invMv' inverts the selection made by '-mv', and the optional argument '-mvq' sets the genotype quality (GQ) threshold (50 in this example) that all trio members must reach for a site to count as a violation:

     java -jar GenomeAnalysisTK.jar \
       -T SelectVariants \
       -R reference.fasta \
       -V input.vcf \
       -ped family.ped \
       -mv -mvq 50 -invMv \
       -o non_violations.vcf
     

    Create a set with 50% of the total number of variants in the variant VCF:

     java -jar GenomeAnalysisTK.jar \
       -T SelectVariants \
       -R reference.fasta \
       -V input.vcf \
       -o output.vcf \
       -fraction 0.5
     

    Select only indels between 2 and 5 bases long from a VCF:

     java -jar GenomeAnalysisTK.jar \
       -T SelectVariants \
       -R reference.fasta \
       -V input.vcf \
       -o output.vcf \
       -selectType INDEL \
       --minIndelSize 2 \
       --maxIndelSize 5
     

    Exclude indels from a VCF:

     java -Xmx2g -jar GenomeAnalysisTK.jar \
       -R ref.fasta \
       -T SelectVariants \
       --variant input.vcf \
       -o output.vcf \
       --selectTypeToExclude INDEL
     

    Select only multi-allelic SNPs and MNPs from a VCF (i.e. SNPs with more than one allele listed in the ALT column):

     java -jar GenomeAnalysisTK.jar \
       -T SelectVariants \
       -R reference.fasta \
       -V input.vcf \
       -o output.vcf \
       -selectType SNP -selectType MNP \
       -restrictAllelesTo MULTIALLELIC
     

    Select IDs in fileKeep and exclude IDs in fileExclude:

     java -jar GenomeAnalysisTK.jar \
       -R ref.fasta \
       -T SelectVariants \
       --variant input.vcf \
       -o output.vcf \
       -IDs fileKeep \
       -xlIDs fileExclude
     

    Select sites where between 2 and 5 samples, and between 10 and 60 percent of all samples, are filtered at the genotype level:

     java -jar GenomeAnalysisTK.jar \
       -R ref.fasta \
       -T SelectVariants \
       --variant input.vcf \
       --maxFilteredGenotypes 5 \
       --minFilteredGenotypes 2 \
       --maxFractionFilteredGenotypes 0.60 \
       --minFractionFilteredGenotypes 0.10
     

    Set filtered genotypes to no-call (./.):

     java -jar GenomeAnalysisTK.jar \
       -R ref.fasta \
       -T SelectVariants \
       --variant input.vcf \
       --setFilteredGtToNocall
     

    Additional Information

    Read filters

    These Read Filters are automatically applied to the data by the Engine before processing by SelectVariants.

    Parallelism options

    This tool can be run in multi-threaded mode using the -nt option (TreeReducible).


    Command-line Arguments

    Engine arguments

    All tools inherit arguments from the GATK Engine's "CommandLineGATK" argument collection, which can be used to modify various aspects of the tool's function. For example, the -L argument directs the GATK engine to restrict processing to specific genomic intervals, and the -rf argument allows you to apply certain read filters to exclude some of the data from the analysis.

    SelectVariants specific arguments

    This table summarizes the command-line arguments that are specific to this tool. For more details on each argument, see the list further down below the table.

    Argument name(s)                                 Default value  Summary
    Required Inputs
    --variant, -V                                    NA             Input VCF file
    Optional Inputs
    --concordance, -conc                             none           Output variants also called in this comparison track
    --discordance, -disc                             none           Output variants not called in this comparison track
    --exclude_sample_expressions, -xl_se             []             List of sample expressions to exclude
    --exclude_sample_file, -xl_sf                    []             List of samples to exclude
    --sample_file, -sf                               NA             File containing a list of samples to include
    Optional Outputs
    --out, -o                                        stdout         File to which variants should be written
    Optional Parameters
    --exclude_sample_name, -xl_sn                    []             Exclude genotypes from this sample
    --excludeIDs, -xlIDs                             NA             List of variant IDs to exclude
    --keepIDs, -IDs                                  NA             List of variant IDs to select
    --maxFilteredGenotypes                           2147483647     Maximum number of samples filtered at the genotype level
    --maxFractionFilteredGenotypes                   1.0            Maximum fraction of samples filtered at the genotype level
    --maxIndelSize                                   2147483647     Maximum size of indels to include
    --maxNOCALLfraction                              1.0            Maximum fraction of samples with no-call genotypes
    --maxNOCALLnumber                                2147483647     Maximum number of samples with no-call genotypes
    --mendelianViolationQualThreshold, -mvq          0.0            Minimum GQ score for each trio member to accept a site as a violation
    --minFilteredGenotypes                           0              Minimum number of samples filtered at the genotype level
    --minFractionFilteredGenotypes                   0.0            Minimum fraction of samples filtered at the genotype level
    --minIndelSize                                   0              Minimum size of indels to include
    --remove_fraction_genotypes, -fractionGenotypes  0.0            Select a fraction of genotypes at random from the input and set them to no-call
    --restrictAllelesTo                              ALL            Select only variants of a particular allelicity
    --sample_expressions, -se                        NA             Regular expression to select multiple samples
    --sample_name, -sn                               []             Include genotypes from this sample
    --select_random_fraction, -fraction              0.0            Select a fraction of variants at random from the input
    --selectexpressions, -select                     []             One or more criteria to use when selecting the data
    --selectTypeToExclude, -xlSelectType             []             Do not select certain type of variants from the input file
    --selectTypeToInclude, -selectType               []             Select only a certain type of variants from the input file
    Optional Flags
    --excludeFiltered, -ef                           false          Don't include filtered sites
    --excludeNonVariants, -env                       false          Don't include non-variant sites
    --forceValidOutput                               false          Forces output VCF to be compliant to up-to-date version
    --invertMendelianViolation, -invMv               false          Output non-mendelian violation sites only
    --invertselect, -invertSelect                    false          Invert the selection criteria for -select
    --keepOriginalAC                                 false          Store the original AC, AF, and AN values after subsetting
    --keepOriginalDP                                 false          Store the original DP value after subsetting
    --mendelianViolation, -mv                        false          Output mendelian violation sites only
    --preserveAlleles, -noTrim                       false          Preserve original alleles, do not trim
    --removeUnusedAlternates, -trimAlternates        false          Remove alternate alleles not present in any genotypes
    --setFilteredGtToNocall                          false          Set filtered genotypes to no-call

    Argument details

    Arguments in this list are specific to this tool. Keep in mind that other arguments are available that are shared with other tools (e.g. command-line GATK arguments); see Engine arguments above.


    --concordance / -conc

    Output variants also called in this comparison track
    A site is considered concordant if (1) we are not looking for specific samples and there is a variant called in both the variant and concordance tracks or (2) every sample present in the variant track is present in the concordance track and they have the same genotype call.

    This argument supports reference-ordered data (ROD) files in the following formats: BCF2, VCF, VCF3

    RodBinding[VariantContext]  none


    --discordance / -disc

    Output variants not called in this comparison track
    A site is considered discordant if there exists some sample in the variant track that has a non-reference genotype and either the site isn't present in this track, the sample isn't present in this track, or the sample is called reference in this track.

    This argument supports reference-ordered data (ROD) files in the following formats: BCF2, VCF, VCF3

    RodBinding[VariantContext]  none


    --exclude_sample_expressions / -xl_se

    List of sample expressions to exclude
    Using a regular expression allows you to match multiple sample names that have that pattern in common. Note that sample exclusion takes precedence over inclusion, so that if a sample is in both lists it will be excluded. This argument can be specified multiple times in order to use multiple different matching patterns.

    Set[String]  []


    --exclude_sample_file / -xl_sf

    List of samples to exclude
    Sample names should be in a plain text file listing one sample name per line. Note that sample exclusion takes precedence over inclusion, so that if a sample is in both lists it will be excluded. This argument can be specified multiple times in order to provide multiple sample list files.

    Set[File]  []


    --exclude_sample_name / -xl_sn

    Exclude genotypes from this sample
    Note that sample exclusion takes precedence over inclusion, so that if a sample is in both lists it will be excluded. This argument can be specified multiple times in order to provide multiple sample names.

    Set[String]  []
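
    The precedence rule above can be pictured as a set difference: the final sample set is the include list minus the exclude list. The sketch below mimics that with grep on two made-up sample lists; the file names are hypothetical and this only illustrates the rule, it is not how the tool is implemented.

```shell
# Hypothetical include and exclude lists, one sample name per line.
workdir=$(mktemp -d)
printf 'SAMPLE_1_PARC\nSAMPLE_1_ACTG\nSAMPLE_2_PARC\n' > "$workdir/include.txt"
printf 'SAMPLE_1_PARC\n' > "$workdir/exclude.txt"

# Exclusion wins: drop every included name that also appears in the exclude
# list, matching whole lines (-x) as fixed strings (-F).
selected=$(grep -vxF -f "$workdir/exclude.txt" "$workdir/include.txt" | paste -s -d ',' -)
echo "$selected"
```

    Here SAMPLE_1_PARC appears in both lists, so only the other two names remain.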


    --excludeFiltered / -ef

    Don't include filtered sites
    If this flag is enabled, sites that have been marked as filtered (i.e. have anything other than `.` or `PASS` in the FILTER field) will be excluded from the output.

    boolean  false
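
    To make the definition of a filtered site concrete, the following sketch applies the same test (FILTER is neither `.` nor `PASS`) to a toy VCF with awk. The records are invented for illustration, and the awk one-liner is only an approximation of the flag's effect, not the tool's code.

```shell
# Toy VCF with three sites: one PASS, one filtered as LowQual, one unfiltered (.).
workdir=$(mktemp -d)
{
  printf '##fileformat=VCFv4.2\n'
  printf '#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n'
  printf '1\t100\t.\tA\tT\t50\tPASS\t.\n'
  printf '1\t200\t.\tG\tC\t12\tLowQual\t.\n'
  printf '1\t300\t.\tC\tG\t40\t.\t.\n'
} > "$workdir/toy.vcf"

# Keep only records whose FILTER column (field 7) is "." or "PASS".
unfiltered_positions=$(awk -F '\t' '
  /^#/ { next }
  $7 == "." || $7 == "PASS" { print $2 }
' "$workdir/toy.vcf" | paste -s -d ',' -)
echo "$unfiltered_positions"
```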


    --excludeIDs / -xlIDs

    List of variant IDs to exclude
    If a file containing a list of IDs is provided to this argument, the tool will not select variants whose ID field is present in this list of IDs. The matching is done by exact string matching. The expected file format is simply plain text with one ID per line.

    File  NA


    --excludeNonVariants / -env

    Don't include non-variant sites

    boolean  false


    --forceValidOutput / NA

    Forces output VCF to be compliant to up-to-date version
    If this argument is provided, the output will be compliant with the version in the header, however it will also cause the tool to run slower than without the argument. Without the argument the header will be compliant with the up-to-date version, but the output in the body may not be compliant. If an up-to-date input file is used, then the output will also be up-to-date regardless of this argument.

    boolean  false


    --invertMendelianViolation / -invMv

    Output non-mendelian violation sites only
    If this flag is enabled, this tool will select only variants that do not correspond to a mendelian violation as determined on the basis of family structure. Requires passing a pedigree file using the engine-level `-ped` argument.

    Boolean  false


    --invertselect / -invertSelect

    Invert the selection criteria for -select
    If this flag is enabled, the tool outputs the sites that do not satisfy the -select expressions.

    boolean  false


    --keepIDs / -IDs

    List of variant IDs to select
    If a file containing a list of IDs is provided to this argument, the tool will only select variants whose ID field is present in this list of IDs. The matching is done by exact string matching. The expected file format is simply plain text with one ID per line.

    File  NA
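
    As a rough illustration of the expected file format, the sketch below writes a keep-list with one ID per line and uses awk to mimic exact string matching on the ID column of a toy VCF body. The file names and records are made up, and the awk filter only approximates the behavior described above.

```shell
# Hypothetical keep-list: plain text, one variant ID per line.
workdir=$(mktemp -d)
printf 'rs123\nrs456\n' > "$workdir/fileKeep"

# Toy VCF body (tab-separated): CHROM POS ID REF ALT
printf '1\t100\trs123\tA\tT\n1\t200\trs1234\tG\tC\n1\t300\trs456\tC\tG\n' > "$workdir/toy.vcf"

# Keep records whose ID (field 3) exactly equals a line of the list.
# rs1234 is not kept: a shared prefix is not an exact match.
kept=$(awk -F '\t' 'NR == FNR { ids[$0] = 1; next } ($3 in ids)' \
  "$workdir/fileKeep" "$workdir/toy.vcf" | cut -f 3 | paste -s -d ' ' -)
echo "$kept"
```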


    --keepOriginalAC / -keepOriginalAC

    Store the original AC, AF, and AN values after subsetting
    When subsetting a callset, this tool recalculates the AC, AF, and AN values corresponding to the contents of the subset. If this flag is enabled, the original values of those annotations will be stored in new annotations called AC_Orig, AF_Orig, and AN_Orig.

    boolean  false
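
    The reason these values change after subsetting can be seen in a small sketch. It recomputes AC, AN, and AF from the GT fields of a toy biallelic record, first for all three samples and then for a two-sample subset. The record, the helper name allele_stats, and the restriction to biallelic diploid sites are assumptions made for the example; the tool's own calculation may differ in detail.

```shell
# Toy record with three samples: 0/1, 1/1, 0/0 (biallelic site).
workdir=$(mktemp -d)
printf '1\t100\t.\tA\tT\t50\tPASS\t.\tGT\t0/1\t1/1\t0/0\n' > "$workdir/toy.vcf"

# allele_stats (hypothetical helper): count called alleles (AN) and ALT
# alleles (AC) over sample columns 10..NF, then print AF = AC / AN.
allele_stats() {
  awk -F '\t' '
    /^#/ { next }
    {
      ac = 0; an = 0
      for (i = 10; i <= NF; i++) {
        split($i, fields, ":")
        n = split(fields[1], alleles, "[/|]")
        for (j = 1; j <= n; j++) {
          if (alleles[j] == ".") continue   # no-call alleles are not counted
          an++
          if (alleles[j] == "1") ac++
        }
      }
      printf "AC=%d;AN=%d;AF=%.2f\n", ac, an, (an > 0 ? ac / an : 0)
    }'
}

full=$(allele_stats < "$workdir/toy.vcf")
# Subset to the first and third samples (columns 10 and 12).
subset=$(cut -f 1-10,12 "$workdir/toy.vcf" | allele_stats)
echo "all samples: $full"
echo "subset:      $subset"
```

    Dropping the homozygous-ALT sample halves the allele frequency, which is why keeping the original values alongside the recalculated ones can be useful.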


    --keepOriginalDP / -keepOriginalDP

    Store the original DP value after subsetting
    When subsetting a callset, this tool recalculates the site-level (INFO field) DP value corresponding to the contents of the subset. If this flag is enabled, the original value of the DP annotation will be stored in a new annotation called DP_Orig.

    boolean  false


    --maxFilteredGenotypes / NA

    Maximum number of samples filtered at the genotype level
    If this argument is provided, select sites where at most a maximum number of samples are filtered at the genotype level.

    int  2147483647  [ [ -∞  ∞ ] ]


    --maxFractionFilteredGenotypes / NA

    Maximum fraction of samples filtered at the genotype level
    If this argument is provided, select sites where a fraction or less of the samples are filtered at the genotype level.

    double  1.0  [ [ -∞  ∞ ] ]


    --maxIndelSize / NA

    Maximum size of indels to include
    If this argument is provided, indels that are larger than the specified size will be excluded.

    int  2147483647  [ [ -∞  ∞ ] ]


    --maxNOCALLfraction / NA

    Maximum fraction of samples with no-call genotypes
    If this argument is provided, select sites where at most the given fraction of samples have no-call genotypes.

    double  1.0  [ [ -∞  ∞ ] ]


    --maxNOCALLnumber / NA

    Maximum number of samples with no-call genotypes
    If this argument is provided, select sites where at most the given number of samples have no-call genotypes.

    int  2147483647  [ [ -∞  ∞ ] ]
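
    A minimal sketch of the no-call count, assuming that a genotype counts as no-call when its GT field is `./.`, `.|.`, or `.` (how the tool treats partially called genotypes is not covered here). The toy records and the threshold of 1 are invented for the example.

```shell
# Toy VCF body with three samples per site.
workdir=$(mktemp -d)
{
  printf '1\t100\t.\tA\tT\t50\tPASS\t.\tGT\t0/1\t./.\t0/0\n'
  printf '1\t200\t.\tG\tC\t50\tPASS\t.\tGT\t./.\t./.\t1/1\n'
  printf '1\t300\t.\tC\tG\t50\tPASS\t.\tGT\t0/0\t0/1\t1/1\n'
} > "$workdir/toy.vcf"

max_nocall=1   # analogous to passing --maxNOCALLnumber 1

# Count no-call genotypes per site and keep sites at or below the limit.
kept=$(awk -F '\t' -v max="$max_nocall" '
  /^#/ { next }
  {
    nocall = 0
    for (i = 10; i <= NF; i++) {
      split($i, fields, ":")
      if (fields[1] == "./." || fields[1] == ".|." || fields[1] == ".") nocall++
    }
    if (nocall <= max) print $2
  }' "$workdir/toy.vcf" | paste -s -d ',' -)
echo "$kept"
```

    Dividing the count by the number of samples gives the quantity that --maxNOCALLfraction compares against.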


    --mendelianViolation / -mv

    Output mendelian violation sites only
    If this flag is enabled, this tool will select only variants that correspond to a mendelian violation as determined on the basis of family structure. Requires passing a pedigree file using the engine-level `-ped` argument.

    Boolean  false


    --mendelianViolationQualThreshold / -mvq

    Minimum GQ score for each trio member to accept a site as a violation
    This argument specifies the genotype quality (GQ) threshold that all members of a trio must have in order for a site to be accepted as a mendelian violation. Note that the `-mv` flag must be set for this argument to have an effect.

    double  0.0  [ [ -∞  ∞ ] ]


    --minFilteredGenotypes / NA

    Minimum number of samples filtered at the genotype level
    If this argument is provided, select sites where at least a minimum number of samples are filtered at the genotype level.

    int  0  [ [ -∞  ∞ ] ]


    --minFractionFilteredGenotypes / NA

    Minimum fraction of samples filtered at the genotype level
    If this argument is provided, select sites where a fraction or more of the samples are filtered at the genotype level.

    double  0.0  [ [ -∞  ∞ ] ]


    --minIndelSize / NA

    Minimum size of indels to include
    If this argument is provided, indels that are smaller than the specified size will be excluded.

    int  0  [ [ -∞  ∞ ] ]
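
    The two size limits are inclusive: indels smaller than the minimum or larger than the maximum are dropped. The sketch below illustrates this on toy biallelic indels, assuming that indel size is the absolute difference between the REF and ALT allele lengths. That definition and the records are assumptions for the example, not taken from the tool's source.

```shell
# Toy indel-only VCF body (tab-separated): CHROM POS ID REF ALT
workdir=$(mktemp -d)
{
  printf '1\t100\t.\tA\tAAA\n'       # 2-base insertion
  printf '1\t200\t.\tATTTTTT\tA\n'   # 6-base deletion
  printf '1\t300\t.\tG\tGT\n'        # 1-base insertion
  printf '1\t400\t.\tACCCCC\tA\n'    # 5-base deletion
} > "$workdir/indels.vcf"

# Keep indels whose size lies in [min, max], as with
# --minIndelSize 2 --maxIndelSize 5.
kept=$(awk -F '\t' -v min=2 -v max=5 '
  {
    size = length($5) - length($4)
    if (size < 0) size = -size
    if (size >= min && size <= max) print $2
  }' "$workdir/indels.vcf" | paste -s -d ',' -)
echo "$kept"
```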


    --out / -o

    File to which variants should be written

    VariantContextWriter  stdout


    --preserveAlleles / -noTrim

    Preserve original alleles, do not trim
    The default behavior of this tool is to remove bases common to all remaining alleles after subsetting operations have been completed, leaving only their minimal representation. If this flag is enabled, the original alleles will be preserved as recorded in the input VCF.

    boolean  false


    --remove_fraction_genotypes / -fractionGenotypes

    Select a fraction of genotypes at random from the input and set them to no-call
    The value of this argument should be a number between 0 and 1 specifying the fraction of genotypes to be randomly selected from the input callset and set to no-call (./.). Note that this is done using a probabilistic function, so the final result is not guaranteed to carry the exact fraction requested. Can be used for large fractions.

    double  0.0  [ [ -∞  ∞ ] ]


    --removeUnusedAlternates / -trimAlternates

    Remove alternate alleles not present in any genotypes
    When this flag is enabled, all alternate alleles that are not present in the (output) samples will be removed. Note that this even extends to biallelic SNPs - if the alternate allele is not present in any sample, it will be removed and the record will contain a '.' in the ALT column. Note also that sites-only VCFs, by definition, do not include the alternate allele in any genotype calls.

    boolean  false


    --restrictAllelesTo / -restrictAllelesTo

    Select only variants of a particular allelicity
    When this argument is used, we can choose to include only multiallelic or biallelic sites, depending on how many alleles are listed in the ALT column of a VCF. For example, a multiallelic record such as: 1 100 . A AAA,AAAAA will be excluded if `-restrictAllelesTo BIALLELIC` is used, because there are two alternate alleles, whereas a record such as: 1 100 . A T will be included in that case, but would be excluded if `-restrictAllelesTo MULTIALLELIC` is used. Valid options are ALL (default), MULTIALLELIC or BIALLELIC.

    The --restrictAllelesTo argument is an enumerated type (NumberAlleleRestriction), which can have one of the following values:

    ALL
    BIALLELIC
    MULTIALLELIC

    NumberAlleleRestriction  ALL
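
    The distinction can be sketched by counting comma-separated entries in the ALT column. The example below classifies the two records quoted above; it is a simplification that ignores special cases such as an ALT of `.`, and is not the tool's own logic.

```shell
# The two example records from the text (tab-separated): CHROM POS ID REF ALT
workdir=$(mktemp -d)
printf '1\t100\t.\tA\tAAA,AAAAA\n1\t100\t.\tA\tT\n' > "$workdir/toy.vcf"

# More than one ALT allele means the site is multiallelic.
labels=$(awk -F '\t' '
  /^#/ { next }
  {
    n = split($5, alts, ",")
    print $5, (n > 1 ? "MULTIALLELIC" : "BIALLELIC")
  }' "$workdir/toy.vcf" | paste -s -d ';' -)
echo "$labels"
```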


    --sample_expressions / -se

    Regular expression to select multiple samples
    Using a regular expression allows you to match multiple sample names that have that pattern in common. This argument can be specified multiple times in order to use multiple different matching patterns.

    Set[String]  NA
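
    To preview which sample names a pattern would catch before running the tool, a quick check with grep can help. The sample names below are made up, and grep -E matches the pattern anywhere in the name; whether the tool anchors the expression is not stated here, so treat this as an approximation.

```shell
# Hypothetical sample names, one per line.
samples='SAMPLE_1_PARC
SAMPLE_2_PARC
SAMPLE_1_ACTG'

# Same expression as in the usage examples above.
matched=$(printf '%s\n' "$samples" | grep -E 'SAMPLE.+PARC' | paste -s -d ',' -)
echo "$matched"
```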


    --sample_file / -sf

    File containing a list of samples to include
    Sample names should be in a plain text file listing one sample name per line. This argument can be specified multiple times in order to provide multiple sample list files.

    Set[File]  NA


    --sample_name / -sn

    Include genotypes from this sample
    This argument can be specified multiple times in order to provide multiple sample names.

    Set[String]  []


    --select_random_fraction / -fraction

    Select a fraction of variants at random from the input
    The value of this argument should be a number between 0 and 1 specifying the fraction of total variants to be randomly selected from the input callset. Note that this is done using a probabilistic function, so the final result is not guaranteed to carry the exact fraction requested. Can be used for large fractions.

    double  0.0  [ [ -∞  ∞ ] ]
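
    One way to picture why the exact fraction is not guaranteed: if each record is kept independently with the requested probability, the number of records kept varies from run to run. The sketch below does this with awk on a generated toy VCF. The per-record independent draw, the helper name sample_fraction, and the seed are assumptions for illustration, not a description of the tool's internals.

```shell
# Generate a toy VCF with 1000 records.
workdir=$(mktemp -d)
toy="$workdir/toy.vcf"
{
  printf '##fileformat=VCFv4.2\n'
  printf '#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n'
  awk 'BEGIN { for (i = 1; i <= 1000; i++) printf "1\t%d\t.\tA\tT\t50\tPASS\t.\n", i }'
} > "$toy"

# sample_fraction FRACTION SEED VCF (hypothetical helper):
# header lines always pass; each record passes with probability FRACTION.
sample_fraction() {
  awk -v frac="$1" -v seed="$2" '
    BEGIN { srand(seed) }
    /^#/ { print; next }
    rand() < frac { print }
  ' "$3"
}

kept=$(sample_fraction 0.5 42 "$toy" | grep -vc '^#')
echo "kept $kept of 1000 records"
```

    The count printed will usually be close to 500 but rarely exactly 500, and it depends on the awk implementation and seed.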


    --selectexpressions / -select

    One or more criteria to use when selecting the data
    See example commands above for detailed usage examples. Note that these expressions are evaluated *after* the specified samples are extracted and the INFO field annotations are updated.

    ArrayList[String]  []


    --selectTypeToExclude / -xlSelectType

    Do not select certain type of variants from the input file
    This argument excludes particular kinds of variants out of a list. If left empty, there is no type selection and all variant types are considered for other selection criteria. Valid types are INDEL, SNP, MIXED, MNP, SYMBOLIC, NO_VARIATION. Can be specified multiple times.

    List[Type]  []


    --selectTypeToInclude / -selectType

    Select only a certain type of variants from the input file
    This argument selects particular kinds of variants out of a list. If left empty, there is no type selection and all variant types are considered for other selection criteria. Valid types are INDEL, SNP, MIXED, MNP, SYMBOLIC, NO_VARIATION. Can be specified multiple times.

    List[Type]  []


    --setFilteredGtToNocall / NA

    Set filtered genotypes to no-call
    If this argument is provided, set filtered genotypes to no-call (./.).

    boolean  false


    --variant / -V

    Input VCF file
    Variants from this VCF file are used by this tool as input. The file must at least contain the standard VCF header lines, but can be empty (i.e., no variants are contained in the file).

    This argument supports reference-ordered data (ROD) files in the following formats: BCF2, VCF, VCF3

    R RodBinding[VariantContext]  NA