Consequence Prioritization
Depending on how a genetic variant is represented (e.g. right- vs. left-aligned), its effect on the transcript may differ, so the same variant can receive different sets of consequence annotations. Even though both representations correspond to exactly the same haplotype, one may be annotated as more pathogenic than the other. We hypothesize that a more pathogenic annotation is less likely to reflect the realized functional impact when the same haplotype can also be interpreted in a less pathogenic way. We therefore report the set of consequences with the lowest priority.
The priority of a set of consequences is determined by the highest-priority consequence within that set.
Conceptual Priority Tiers
Consequence annotations are ranked using two principles: (1) variant pathogenicity and (2) annotation definition. Annotations are first ranked by pathogenicity, from least to most likely to cause harm. Within that ranking, annotations defined as the loss of a specific sequence are placed further down the list. As a result, if some representation of the variant does not result in the loss of that specific sequence, that representation is preferred.
Prioritization Ordering
(Very Low Probability of Harm)
- synonymous_variant
- start_retained_variant
- stop_retained_variant
- transcript_variant
- non_coding_transcript_variant
- non_coding_transcript_exon_variant
- mature_miRNA_variant
- coding_sequence_variant
- intron_variant
- upstream_gene_variant
- downstream_gene_variant
- exon_variant
(Low Probability of Harm)
- splice_region_variant
- splice_donor_region_variant
- splice_donor_5th_base_variant
- splice_polypyrimidine_tract_variant
- exonic_splice_region_variant
- regulatory_region_variant
- three_prime_UTR_variant
- five_prime_UTR_variant
- protein_altering_variant
(High Probability of Harm)
- missense_variant
- inframe_indel
- inframe_insertion
- stop_gained
- frameshift_variant
- inframe_deletion
- start_lost
- stop_lost
- splice_donor_variant
- splice_acceptor_variant
Consequences not in the above list are treated as lowest priority.
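The selection rule can be sketched in a few lines of Python. This is a minimal illustration and not the tool's actual implementation: the names PRIORITY_ORDER, consequence_rank, set_priority and select_consequences are hypothetical. The sketch assumes the list above runs from lowest priority (top) to highest priority (bottom), both across and within tiers. It also assumes that ties between candidate sets resolve to the first candidate given.

```python
from typing import Iterable, Sequence

# Ordering copied from the list above, lowest priority first (assumed direction).
PRIORITY_ORDER = [
    # (Very Low Probability of Harm)
    "synonymous_variant", "start_retained_variant", "stop_retained_variant",
    "transcript_variant", "non_coding_transcript_variant",
    "non_coding_transcript_exon_variant", "mature_miRNA_variant",
    "coding_sequence_variant", "intron_variant", "upstream_gene_variant",
    "downstream_gene_variant", "exon_variant",
    # (Low Probability of Harm)
    "splice_region_variant", "splice_donor_region_variant",
    "splice_donor_5th_base_variant", "splice_polypyrimidine_tract_variant",
    "exonic_splice_region_variant", "regulatory_region_variant",
    "three_prime_UTR_variant", "five_prime_UTR_variant",
    "protein_altering_variant",
    # (High Probability of Harm)
    "missense_variant", "inframe_indel", "inframe_insertion", "stop_gained",
    "frameshift_variant", "inframe_deletion", "start_lost", "stop_lost",
    "splice_donor_variant", "splice_acceptor_variant",
]

# Listed consequences get ranks 1..N; rank 0 is reserved for unlisted ones.
RANK = {name: i + 1 for i, name in enumerate(PRIORITY_ORDER)}


def consequence_rank(consequence: str) -> int:
    """Rank of one consequence; unlisted consequences are lowest priority (0)."""
    return RANK.get(consequence, 0)


def set_priority(consequences: Iterable[str]) -> int:
    """A set's priority is that of its highest-priority member."""
    return max((consequence_rank(c) for c in consequences), default=0)


def select_consequences(candidate_sets: Sequence[Sequence[str]]) -> Sequence[str]:
    """Pick the candidate set with the lowest set priority (first one on ties)."""
    return min(candidate_sets, key=set_priority)


# Hypothetical example: two representations of the same haplotype.
one_alignment = ["start_lost"]
other_alignment = ["five_prime_UTR_variant"]
print(select_consequences([one_alignment, other_alignment]))
```

In this hypothetical example the five_prime_UTR_variant set is selected, because its highest-priority member sits above start_lost in the ordering.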
Disabling Consequence Prioritization
To turn off consequence prioritization, pass the --disable-consequence-prioritization command line option.
In that case, the consequence list for the left-aligned variant is reported.
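The effect of the option can be sketched as a simple switch. Only the flag name comes from this document. The argparse wiring and the function names build_parser and reported_consequences are assumptions made for illustration, and the tool's real argument handling may differ.

```python
import argparse
from typing import Sequence


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser exposing only the option described above."""
    parser = argparse.ArgumentParser()
    parser.add_argument(
        "--disable-consequence-prioritization",
        action="store_true",
        help="report the consequence list for the left-aligned variant",
    )
    return parser


def reported_consequences(
    left_aligned: Sequence[str],
    prioritized: Sequence[str],
    args: argparse.Namespace,
) -> Sequence[str]:
    """Return the left-aligned list when prioritization is disabled."""
    # argparse stores the flag with dashes converted to underscores.
    if args.disable_consequence_prioritization:
        return left_aligned
    return prioritized


args = build_parser().parse_args(["--disable-consequence-prioritization"])
print(reported_consequences(["frameshift_variant"], ["inframe_insertion"], args))
```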
References
- Ensembl Variant Effect Predictor consequence definitions and prioritization context: https://www.ensembl.org/info/genome/variation/prediction/predicted_data.html