Version: 3.2.5

Nirvana JSON File Format

Overview

Conventions

In the Nirvana JSON representation, we try to maximize the amount of useful information that is relayed in the output file. As such, we have several conventions that are useful to know about:

  • With boolean key/value pairs, we only output the keys that have a true value. For example, there is no reason to display "isStructuralVariant":false a few million times when annotating a small variant VCF.
  • VCF files commonly use a period (.) as a placeholder for missing data (e.g. for allele depths (AD)). When transferring data from the VCF file to the JSON, Nirvana treats periods like empty or null strings and therefore does not output those entries.
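
A minimal sketch of what these conventions mean for a consumer, assuming the JSON has already been loaded into Python dictionaries. The helper name `flag` is hypothetical and not part of Nirvana; it simply treats an absent boolean key as false, and the last line shows that omitted fields should be read with a default.

```python
import json


def flag(obj: dict, key: str) -> bool:
    """Return a boolean annotation, treating an absent key as False."""
    return obj.get(key, False) is True


# A trimmed variant object: only the boolean keys that are true are present.
variant = json.loads('{"vid": "2:48010488:A", "isReferenceMinorAllele": true}')

print(flag(variant, "isReferenceMinorAllele"))  # key present
print(flag(variant, "isStructuralVariant"))     # key omitted, so treated as False

# Fields whose VCF value was "." are omitted too, so read them with .get().
print(variant.get("phylopScore"))
```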

JSON Layout

In general, each position corresponds to a row in the original VCF file.

For each gene that was referenced in the transcripts found in the positions section, there will be additional gene-level annotation in the gene section.

{ 
"header":{
"annotator":"Nirvana 3.2.5",
"creationTime":"2022-12-05 16:43:41",
"genomeAssembly":"GRCh37",
"schemaVersion":6,
"dataVersion":"91.26.50",
"dataSources":[
{
"name":"VEP",
"version":"91",
"description":"RefSeq",
"releaseDate":"2018-03-05"
},
{
"name":"ClinVar",
"version":"20190204",
"description":"A freely accessible, public archive of reports of the relationships among human variations and phenotypes, with supporting evidence",
"releaseDate":"2019-02-04"
}
],
"samples":[
"NA12878",
"NA12891",
"NA12892"
]
},

Field | Type | Notes
annotator | string | the name of the annotator and the current version
creationTime | string | yyyy-MM-dd hh:mm:ss
genomeAssembly | string | see possible values below
schemaVersion | integer | incremented whenever the core structure of the JSON file introduces breaking changes
dataVersion | string |
dataSources | object array | see Data Source entry below
samples | string array | the order of these sample names will be used throughout the JSON file when enumerating samples

Data Source

Field | Type | Notes
name | string |
version | string |
description | string | optional description of the data source
releaseDate | string | yyyy-MM-dd

Genome Assemblies

  • GRCh37
  • GRCh38
  • hg19
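
As an illustration of how the header fields fit together, the sketch below checks the genome assembly against the values listed above and builds a lookup from sample name to sample index. The function name `summarize_header` and the shape of its return value are assumptions made for this example only.

```python
import json

KNOWN_ASSEMBLIES = {"GRCh37", "GRCh38", "hg19"}


def summarize_header(header: dict) -> dict:
    """Validate the assembly and index the data sources and samples."""
    assembly = header["genomeAssembly"]
    if assembly not in KNOWN_ASSEMBLIES:
        raise ValueError(f"unexpected genome assembly: {assembly}")
    # Map each data source name to its version string.
    sources = {s["name"]: s["version"] for s in header.get("dataSources", [])}
    # The order of sample names is reused wherever samples are enumerated.
    sample_index = {name: i for i, name in enumerate(header.get("samples", []))}
    return {"assembly": assembly, "sources": sources, "sample_index": sample_index}


header = json.loads("""
{
  "annotator": "Nirvana 3.2.5",
  "genomeAssembly": "GRCh37",
  "schemaVersion": 6,
  "dataSources": [
    {"name": "VEP", "version": "91"},
    {"name": "ClinVar", "version": "20190204"}
  ],
  "samples": ["NA12878", "NA12891", "NA12892"]
}
""")

summary = summarize_header(header)
print(summary["sources"])
print(summary["sample_index"])
```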

Positions

"positions":[ 
{
"chromosome":"chr2",
"position":48010488,
"repeatUnit":"GGCCCC",
"refRepeatCount":3,
"svEnd":48020488,
"refAllele":"G",
"altAlleles":[
"A",
"GT"
],
"quality":461,
"filters":[
"PASS"
],
"ciPos":[
-170,
170
],
"ciEnd":[
-175,
175
],
"svLength":1000,
"strandBias":1.23,
"jointSomaticNormalQuality":29,
"cytogeneticBand":"2p16.3",

Field | Type | Variant Type | Notes
chromosome | string | all | exactly as displayed in the vcf
position | integer | all | exactly as displayed in the vcf (1-based notation). Range: 1 - 250 million
repeatUnit | string | STR | provided by ExpansionHunter
refRepeatCount | integer | STR | provided by ExpansionHunter
svEnd | integer | SV |
refAllele | string | all | exactly as displayed in the vcf
altAlleles | string array | all | exactly as displayed in the vcf
quality | float | all | exactly as displayed in the vcf (normally an integer, but some variant callers use floating point; has been observed as high as 500k)
filters | string array | all | exactly as displayed in the vcf
ciPos | integer array | SV |
ciEnd | integer array | SV |
svLength | integer | SV |
strandBias | float | small variant | provided by GATK (from SB)
jointSomaticNormalQuality | integer | SV | provided by the Manta variant caller (SOMATICSCORE)
cytogeneticBand | string | all | e.g. 17p13.1

1000 Genomes (SV)

"oneKg":[
{
"chromosome":"1",
"begin":1595369,
"end":1612441,
"variantType": "copy_number_variation",
"id": "esv3635753;esv3635754;esv3635755;esv3635756;esv3635757",
"allAn": 5008,
"allAc": 2702,
"allAf": 0.539537,
"afrAf": 0.6052,
"amrAf": 0.3675,
"eurAf": 0.5357,
"easAf": 0.5368,
"sasAf": 0.5797,
"reciprocalOverlap": 0.07555
}
],

Field | Type | Notes
chromosome | string |
begin | integer |
end | integer |
variantType | string |
id | string |
allAn | integer | allele number for all populations. Non-zero integer.
allAc | integer | allele count for all populations. Integer.
allAf | floating point | allele frequency for all populations. Range: 0 - 1.0
afrAf | floating point | allele frequency for the African super population. Range: 0 - 1.0
amrAf | floating point | allele frequency for the Ad Mixed American super population. Range: 0 - 1.0
eurAf | floating point | allele frequency for the European super population. Range: 0 - 1.0
easAf | floating point | allele frequency for the East Asian super population. Range: 0 - 1.0
sasAf | floating point | allele frequency for the South Asian super population. Range: 0 - 1.0
reciprocalOverlap | floating point | Range: 0 - 1.0

Samples

"samples":[
{
"genotype":"0/1",
"variantFrequencies":[
0.333,
0.5
],
"totalDepth":57,
"genotypeQuality":12,
"copyNumber":3,
"repeatUnitCounts":[
10,
20
],
"alleleDepths":[
10,
20,
30
],
"failedFilter":true,
"splitReadCounts":[
10,
20
],
"pairedEndReadCounts":[
10,
20
],
"diseaseAffectedStatuses":[
"-"
],
"artifactAdjustedQualityScore":89.3,
"likelihoodRatioQualityScore":78.2
}
]

Field | Type | Notes
genotype | string |
repeatNumbers | string | ExpansionHunter-specific
repeatNumberSpans | string | ExpansionHunter-specific
variantFrequencies | float array | range: 0 - 1.0. One value per alternate allele
totalDepth | integer | non-negative integer values
genotypeQuality | integer | non-negative integer values. Typically maxes out at 99
copyNumber | integer | non-negative integer values
alleleDepths | integer array | non-negative integer values
failedFilter | bool |
splitReadCounts | integer array | Manta-specific
pairedEndReadCounts | integer array | Manta-specific
lossOfHeterozygosity | bool |
deNovoQuality | float |
mpileupAlleleDepths | int array | SMN1-specific
silentCarrierHaplotype | string | SMN1-specific
paralogousEntrezGeneIds | int array | SMN1-specific
paralogousGeneCopyNumbers | int array | SMN1-specific
diseaseClassificationSources | string array | SMN1-specific
diseaseIds | string array | SMN1-specific
diseaseAffectedStatuses | string array | SMN1-specific
proteinAlteringVariantPositions | int array | SMN1-specific
isCompoundHetCompatible | bool | SMN1-specific
artifactAdjustedQualityScore | float | PEPE-specific. Range: 0 - 100.0
likelihoodRatioQualityScore | float | PEPE-specific. Range: 0 - 100.0

Empty Samples

If a sample does not contain any entries, we will create a sample object that contains the isEmpty key. This ensures that sample ordering is preserved while indicating that a sample is intentionally empty.

"samples":[ 
{
"isEmpty":true
}
],
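
Because empty samples keep their slot, the sample objects in a position can be paired with the sample names from the header by index. The sketch below shows one way to do that; `samples_by_name` is a hypothetical helper, and representing an empty sample as `None` is a choice made for this example.

```python
def samples_by_name(sample_names: list, position: dict) -> dict:
    """Pair header sample names with a position's sample objects by order."""
    paired = {}
    for name, sample in zip(sample_names, position.get("samples", [])):
        # An intentionally empty sample carries only "isEmpty": true.
        paired[name] = None if sample.get("isEmpty", False) else sample
    return paired


sample_names = ["NA12878", "NA12891", "NA12892"]
position = {
    "samples": [
        {"genotype": "0/1", "totalDepth": 57},
        {"isEmpty": True},
        {"genotype": "0/0", "totalDepth": 40},
    ]
}

paired = samples_by_name(sample_names, position)
for name, sample in paired.items():
    print(name, "empty" if sample is None else sample["genotype"])
```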

Variants

"variants":[ 
{
"vid":"2:48010488:A",
"chromosome":"chr2",
"begin":48010488,
"end":48010488,
"isReferenceMinorAllele":true,
"isStructuralVariant":true,
"refAllele":"G",
"altAllele":"A",
"variantType":"SNV",
"isDecomposedVariant":true,
"isRecomposedVariant":true,
"hgvsg":"NC_000002.11:g.48010488G>A",
"phylopScore":0.459

Field | Type | Notes
vid | string | see Variant Identifiers
chromosome | string |
begin | int | 1-based non-negative integer values. Range: 1 - 250 million
end | int | 1-based non-negative integer values. Range: 1 - 250 million
isReferenceMinorAllele | bool | true when this is a reference minor allele
isStructuralVariant | bool | true when the variant is a structural variant
refAllele | string | parsimonious representation of the reference allele
altAllele | string | parsimonious representation of the alternate allele
variantType | string | uses Sequence Ontology sequence alterations
isDecomposedVariant | bool | true when the decomposed variant has been used to create another recomposed variant
isRecomposedVariant | bool | true when the variant is recomposed from two or more decomposed variants
hgvsg | string | HGVS g. notation
phylopScore | float | phyloP conservation score. Range: -14.08 to 6.424

Reference Minor Alleles

Nirvana supports annotating reference minor alleles. In such a case, refAllele will be replaced by the global major allele and altAllele will be replaced with the original reference allele.

Flagging Decomposed & Recomposed Variants

When two or more decomposed variants are recomposed into an MNV, the decomposed variants will be marked with "isDecomposedVariant":true.

Similarly, the recomposed variant will be shown as a new VCF position. This recomposed variant will be flagged with "isRecomposedVariant":true.
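
The two flags can be used to separate the building blocks of an MNV from the recomposed variant itself. The following is a small sketch under the assumption that variants are plain dictionaries; the function name and the `vid` values are placeholders invented for this example, not real Nirvana identifiers.

```python
def split_by_composition(variants: list) -> dict:
    """Group variant IDs by their decomposed / recomposed flags."""
    groups = {"decomposed": [], "recomposed": [], "other": []}
    for variant in variants:
        # Absent flags mean false, following the boolean convention.
        if variant.get("isRecomposedVariant", False):
            groups["recomposed"].append(variant["vid"])
        elif variant.get("isDecomposedVariant", False):
            groups["decomposed"].append(variant["vid"])
        else:
            groups["other"].append(variant["vid"])
    return groups


# Placeholder identifiers, for illustration only.
variants = [
    {"vid": "example-snv-1", "isDecomposedVariant": True},
    {"vid": "example-snv-2", "isDecomposedVariant": True},
    {"vid": "example-mnv", "isRecomposedVariant": True},
    {"vid": "example-unrelated"},
]

groups = split_by_composition(variants)
print(groups)
```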

Transcripts

"transcripts":[
{
"transcript":"ENST00000445503.1",
"source":"Ensembl",
"bioType":"nonsense_mediated_decay",
"codons":"gGg/gAg",
"aminoAcids":"G/E",
"cdnaPos":"268",
"cdsPos":"116",
"exons":"1/9",
"introns":"1/8",
"proteinPos":"39",
"geneId":"ENSG00000116062",
"hgnc":"MSH6",
"consequence":[
"missense_variant",
"NMD_transcript_variant"
],
"hgvsc":"ENST00000445503.1:c.116G>A",
"hgvsp":"ENSP00000405294.1:p.(Gly39Glu)",
"geneFusion":{
"exon":6,
"intron":5,
"fusions":[
{
"hgvsc":"ETV6{ENST00000396373.4}:c.1_1009+3402_RUNX1{ENST00000437180.1}:c.58+568_1443",
"exon":3,
"intron":2
},
{
"hgvsc":"ETV6{ENST00000396373.4}:c.1_1009+3402_RUNX1{ENST00000300305.3}:c.58+568_1443",
"exon":2,
"intron":1
}
]
},
"isCanonical":true,
"polyPhenScore":0.95,
"polyPhenPrediction":"probably damaging",
"proteinId":"ENSP00000405294.1",
"siftScore":0.61,
"siftPrediction":"tolerated",
"completeOverlap":true
}
]

Field | Type | Notes
transcript | string | transcript ID. e.g. ENST00000445503.1
source | string | RefSeq / Ensembl
bioType | string | descriptions of the biotypes from Ensembl
codons | string |
aminoAcids | string |
cdnaPos | string |
cdsPos | string |
exons | string | exons affected by the variant
introns | string | introns affected by the variant
proteinPos | string |
geneId | string | gene ID. e.g. ENSG00000116062
hgnc | string | gene symbol. e.g. MSH6
consequence | string array | Sequence Ontology Consequences
hgvsc | string | HGVS coding nomenclature
hgvsp | string | HGVS protein nomenclature
geneFusion | object | see Gene Fusions entry below
isCanonical | bool | true when this is a canonical transcript
polyPhenScore | float | range: 0 - 1.0
polyPhenPrediction | string | see possible values below
proteinId | string | protein ID. e.g. ENSP00000405294.1
siftScore | float | range: 0 - 1.0
siftPrediction | string | see possible values below
completeOverlap | bool | true when this transcript is completely overlapped by the variant

PolyPhen

  • probably damaging
  • possibly damaging
  • benign
  • unknown

SIFT

  • tolerated
  • deleterious
  • tolerated - low confidence
  • deleterious - low confidence
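
A common use of the transcript fields is to keep only canonical transcripts that carry a given consequence. The sketch below is one possible way to express that; `canonical_with_consequence` is a hypothetical helper, and the second transcript ID is a made-up placeholder added to show a non-canonical entry being skipped.

```python
def canonical_with_consequence(transcripts: list, consequence: str) -> list:
    """Return canonical transcripts whose consequence list contains the term."""
    return [
        t
        for t in transcripts
        if t.get("isCanonical", False) and consequence in t.get("consequence", [])
    ]


transcripts = [
    {
        "transcript": "ENST00000445503.1",
        "hgnc": "MSH6",
        "consequence": ["missense_variant", "NMD_transcript_variant"],
        "isCanonical": True,
        "polyPhenPrediction": "probably damaging",
        "siftPrediction": "tolerated",
    },
    {
        # Placeholder entry: "isCanonical" is omitted, so it counts as false.
        "transcript": "EXAMPLE_TRANSCRIPT",
        "hgnc": "MSH6",
        "consequence": ["missense_variant"],
    },
]

hits = canonical_with_consequence(transcripts, "missense_variant")
for t in hits:
    print(t["transcript"], t["polyPhenPrediction"], t["siftPrediction"])
```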

Gene Fusions

Field | Type | Notes
exon | int | actual exon where the breakpoint was located
intron | int | actual intron where the breakpoint was located
fusions | object array | see Fusion entry below

Fusion

Field | Type | Notes
exon | int | actual exon where the other breakpoint was located
intron | int | actual intron where the other breakpoint was located
hgvsc | string | HGVS coding nomenclature describing the two genes and the transcripts that are fused along with

Regulatory Regions

"regulatoryRegions":[ 
{
"id":"ENSR00001542175",
"type":"promoter",
"consequence":[
"regulatory_region_variant"
]
}
]

Field | Type | Notes
id | string |
type | string | see possible values below
consequence | string array | see possible values below

Regulatory Types

  • CTCF_binding_site
  • enhancer
  • open_chromatin_region
  • promoter
  • promoter_flanking_region
  • TF_binding_site

Regulatory Consequences

  • regulatory_region_variant
  • regulatory_region_ablation
  • regulatory_region_amplification
  • regulatory_region_truncation

ClinVar

"clinvar":[
{
"id":"RCV000030258.4",
"reviewStatus":"reviewed by expert panel",
"alleleOrigins":[
"germline"
],
"refAllele":"G",
"altAllele":"A",
"phenotypes":[
"Lynch syndrome"
],
"medGenIds":[
"C1333990"
],
"omimIds":[
"120435"
],
"significance":[
"benign"
],
"lastUpdatedDate":"2017-05-01",
"isAlleleSpecific":true
}
]

Field | Type | Notes
id | string | ClinVar ID
reviewStatus | string | see possible values below
alleleOrigins | string array | see possible values below
refAllele | string |
altAllele | string |
phenotypes | string array |
medGenIds | string array | MedGen IDs
omimIds | string array | OMIM IDs
orphanetIds | string array | Orphanet IDs
significance | string array | see possible values below
lastUpdatedDate | string | yyyy-MM-dd
pubMedIds | string array | PubMed IDs
isAlleleSpecific | bool | true when the current variant alternate allele matches the ClinVar alternate allele

reviewStatus:

  • no assertion provided
  • no assertion criteria provided
  • criteria provided, single submitter
  • practice guideline
  • classified by multiple submitters
  • criteria provided, conflicting interpretations
  • criteria provided, multiple submitters, no conflicts
  • no interpretation for the single variant

alleleOrigins:

  • unknown
  • other
  • germline
  • somatic
  • inherited
  • paternal
  • maternal
  • de-novo
  • biparental
  • uniparental
  • not-tested
  • tested-inconclusive

significance:

  • uncertain significance
  • not provided
  • benign
  • likely benign
  • likely pathogenic
  • pathogenic
  • drug response
  • histocompatibility
  • association
  • risk factor
  • protective
  • affects
  • conflicting data from submitters
  • other
  • no interpretation for the single variant
  • conflicting interpretations of pathogenicity
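
The `isAlleleSpecific` flag and the `significance` values can be combined to pick out ClinVar entries that match the annotated allele and report a pathogenic classification. This is a sketch only: the helper name, the choice of which significance values count as "pathogenic", and the second entry's ID are all assumptions made for illustration.

```python
# Assumption for this example: these two values are treated as pathogenic.
PATHOGENIC = {"pathogenic", "likely pathogenic"}


def allele_specific_pathogenic(entries: list) -> list:
    """Return IDs of allele-specific entries with a pathogenic significance."""
    return [
        entry["id"]
        for entry in entries
        if entry.get("isAlleleSpecific", False)
        and PATHOGENIC.intersection(entry.get("significance", []))
    ]


entries = [
    {
        "id": "RCV000030258.4",
        "significance": ["benign"],
        "isAlleleSpecific": True,
    },
    {
        # Placeholder ID, not a real ClinVar record.
        "id": "EXAMPLE_RCV",
        "significance": ["likely pathogenic"],
        "isAlleleSpecific": True,
    },
    {
        # Pathogenic, but "isAlleleSpecific" is absent, so it is skipped.
        "id": "EXAMPLE_RCV_OTHER_ALLELE",
        "significance": ["pathogenic"],
    },
]

matches = allele_specific_pathogenic(entries)
print(matches)
```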

1000 Genomes

"oneKg":{
"allAf":0.200879,
"afrAf":0.210287,
"amrAf":0.139769,
"easAf":0.275794,
"eurAf":0.181909,
"sasAf":0.173824,
"allAn":5008,
"afrAn":1322,
"amrAn":694,
"easAn":1008,
"eurAn":1006,
"sasAn":978,
"allAc":1006,
"afrAc":278,
"amrAc":97,
"easAc":278,
"eurAc":183,
"sasAc":170
}

Field | Type | Notes
allAf | float | allele frequency for all populations. Range: 0 - 1.0
allAc | int | allele count for all populations. Integer.
allAn | int | allele number for all populations. Non-zero integer.
afrAf | float | allele frequency for the African super population. Range: 0 - 1.0
afrAc | int | allele count for the African super population. Integer.
afrAn | int | allele number for the African super population. Non-zero integer.
amrAf | float | allele frequency for the Ad Mixed American super population. Range: 0 - 1.0
amrAc | int | allele count for the Ad Mixed American super population. Integer.
amrAn | int | allele number for the Ad Mixed American super population. Non-zero integer.
easAf | float | allele frequency for the East Asian super population. Range: 0 - 1.0
easAc | int | allele count for the East Asian super population. Integer.
easAn | int | allele number for the East Asian super population. Non-zero integer.
eurAf | float | allele frequency for the European super population. Range: 0 - 1.0
eurAc | int | allele count for the European super population. Integer.
eurAn | int | allele number for the European super population. Non-zero integer.
sasAf | float | allele frequency for the South Asian super population. Range: 0 - 1.0
sasAc | int | allele count for the South Asian super population. Integer.
sasAn | int | allele number for the South Asian super population. Non-zero integer.
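
The three kinds of fields are related: an allele frequency is the allele count divided by the allele number. The sketch below recomputes `allAf` from the sample values shown above as a sanity check. The function name and the tolerance are choices made for this example, and the check assumes the reported frequency is rounded to six decimal places, as in the sample.

```python
def allele_frequency(allele_count: int, allele_number: int) -> float:
    """Compute an allele frequency as count / number."""
    if allele_number <= 0:
        raise ValueError("allele number must be a non-zero positive integer")
    return allele_count / allele_number


one_kg = {"allAf": 0.200879, "allAc": 1006, "allAn": 5008}

computed = allele_frequency(one_kg["allAc"], one_kg["allAn"])
print(round(computed, 6), one_kg["allAf"])

# The reported value agrees with the recomputed one up to rounding.
consistent = abs(computed - one_kg["allAf"]) < 1e-6
print(consistent)
```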

gnomAD (genomes)

"gnomad":{ 
"coverage":20,
"allAf":0.190317,
"afrAf":0.222876,
"amrAf":0.121394,
"easAf":0.239802,
"finAf":0.136833,
"nfeAf":0.181282,
"asjAf":0.258278,
"othAf":0.186094,
"allAn":30796,
"afrAn":8664,
"amrAn":832,
"easAn":1618,
"finAn":3486,
"nfeAn":14916,
"asjAn":302,
"othAn":978,
"allAc":5861,
"afrAc":1931,
"amrAc":101,
"easAc":388,
"finAc":477,
"nfeAc":2704,
"asjAc":78,
"othAc":182,
"allHc":561,
"afrHc":208,
"amrHc":6,
"easHc":42,
"finHc":31,
"nfeHc":242,
"asjHc":13,
"othHc":19,
"failedFilter":true
}

Field | Type | Notes
coverage | int | average coverage (non-negative integer values)
allAf | float | allele frequency for all populations. Range: 0 - 1.0
allAc | int | allele count for all populations. Integer.
allAn | int | allele number for all populations. Non-zero integer.
allHc | int | count of homozygous individuals for all populations. Non-negative integer.
afrAf | float | allele frequency for the African / African American population. Range: 0 - 1.0
afrAc | int | allele count for the African / African American population. Integer.
afrAn | int | allele number for the African / African American population. Non-zero integer.
afrHc | int | count of homozygous individuals for African / African American population. Non-negative integer.
amrAf | float | allele frequency for the Latino population. Range: 0 - 1.0
amrAc | int | allele count for the Latino population. Integer.
amrAn | int | allele number for the Latino population. Non-zero integer.
amrHc | int | count of homozygous individuals for Latino population. Non-negative integer.
easAf | float | allele frequency for the East Asian population. Range: 0 - 1.0
easAc | int | allele count for the East Asian population. Integer.
easAn | int | allele number for the East Asian population. Non-zero integer.
easHc | int | count of homozygous individuals for East Asian population. Non-negative integer.
finAf | float | allele frequency for the Finnish population. Range: 0 - 1.0
finAc | int | allele count for the Finnish population. Integer.
finAn | int | allele number for the Finnish population. Non-zero integer.
finHc | int | count of homozygous individuals for Finnish population. Non-negative integer.
nfeAf | float | allele frequency for the Non-Finnish European population. Range: 0 - 1.0
nfeAc | int | allele count for the Non-Finnish European population. Integer.
nfeAn | int | allele number for the Non-Finnish European population. Non-zero integer.
nfeHc | int | count of homozygous individuals for Non-Finnish European population. Non-negative integer.
othAf | float | allele frequency for the Other population. Range: 0 - 1.0
othAc | int | allele count for the Other population. Integer.
othAn | int | allele number for the Other population. Non-zero integer.
othHc | int | count of homozygous individuals for Other population. Non-negative integer.
asjAf | float | allele frequency for the Ashkenazi Jewish population. Range: 0 - 1.0
asjAc | int | allele count for the Ashkenazi Jewish population. Integer.
asjAn | int | allele number for the Ashkenazi Jewish population. Non-zero integer.
asjHc | int | count of homozygous individuals for the Ashkenazi Jewish population. Non-negative integer.
failedFilter | bool | True if this variant failed any filters (Note: we do not list the failed filters)
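
When filtering on population frequency, it is often useful to look at the highest frequency across the individual populations rather than `allAf` alone. The sketch below does this for a gnomAD object; the helper name and the `POPULATIONS` tuple are assumptions for this example, and populations missing from the object are simply skipped.

```python
# Population prefixes used by the gnomAD fields ("sas" is listed for exomes).
POPULATIONS = ("afr", "amr", "eas", "fin", "nfe", "asj", "oth", "sas")


def max_population_af(gnomad: dict):
    """Return (population, frequency) for the highest per-population AF."""
    freqs = {
        pop: gnomad[f"{pop}Af"] for pop in POPULATIONS if f"{pop}Af" in gnomad
    }
    if not freqs:
        return None
    top = max(freqs, key=freqs.get)
    return top, freqs[top]


gnomad = {
    "allAf": 0.190317,
    "afrAf": 0.222876,
    "amrAf": 0.121394,
    "easAf": 0.239802,
    "finAf": 0.136833,
    "nfeAf": 0.181282,
    "asjAf": 0.258278,
    "othAf": 0.186094,
    "failedFilter": True,
}

top = max_population_af(gnomad)
print(top)
```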

gnomAD (exomes)

"gnomadExome":{ 
"coverage":20,
"allAf":0.190317,
"afrAf":0.222876,
"amrAf":0.121394,
"easAf":0.239802,
"finAf":0.136833,
"nfeAf":0.181282,
"asjAf":0.258278,
"othAf":0.186094,
"allAn":30796,
"afrAn":8664,
"amrAn":832,
"easAn":1618,
"finAn":3486,
"nfeAn":14916,
"asjAn":302,
"othAn":978,
"allAc":5861,
"afrAc":1931,
"amrAc":101,
"easAc":388,
"finAc":477,
"nfeAc":2704,
"asjAc":78,
"othAc":182,
"allHc":561,
"afrHc":208,
"amrHc":6,
"easHc":42,
"finHc":31,
"nfeHc":242,
"asjHc":13,
"othHc":19,
"failedFilter":true
}

Field | Type | Notes
coverage | int | average coverage (non-negative integer values)
allAf | float | allele frequency for all populations. Range: 0 - 1.0
allAc | int | allele count for all populations. Integer.
allAn | int | allele number for all populations. Non-zero integer.
allHc | int | count of homozygous individuals for all populations. Non-negative integer.
afrAf | float | allele frequency for the African / African American population. Range: 0 - 1.0
afrAc | int | allele count for the African / African American population. Integer.
afrAn | int | allele number for the African / African American population. Non-zero integer.
afrHc | int | count of homozygous individuals for African / African American population. Non-negative integer.
amrAf | float | allele frequency for the Latino population. Range: 0 - 1.0
amrAc | int | allele count for the Latino population. Integer.
amrAn | int | allele number for the Latino population. Non-zero integer.
amrHc | int | count of homozygous individuals for Latino population. Non-negative integer.
easAf | float | allele frequency for the East Asian population. Range: 0 - 1.0
easAc | int | allele count for the East Asian population. Integer.
easAn | int | allele number for the East Asian population. Non-zero integer.
easHc | int | count of homozygous individuals for East Asian population. Non-negative integer.
finAf | float | allele frequency for the Finnish population. Range: 0 - 1.0
finAc | int | allele count for the Finnish population. Integer.
finAn | int | allele number for the Finnish population. Non-zero integer.
finHc | int | count of homozygous individuals for Finnish population. Non-negative integer.
nfeAf | float | allele frequency for the Non-Finnish European population. Range: 0 - 1.0
nfeAc | int | allele count for the Non-Finnish European population. Integer.
nfeAn | int | allele number for the Non-Finnish European population. Non-zero integer.
nfeHc | int | count of homozygous individuals for Non-Finnish European population. Non-negative integer.
othAf | float | allele frequency for the Other population. Range: 0 - 1.0
othAc | int | allele count for the Other population. Integer.
othAn | int | allele number for the Other population. Non-zero integer.
othHc | int | count of homozygous individuals for Other population. Non-negative integer.
asjAf | float | allele frequency for the Ashkenazi Jewish population. Range: 0 - 1.0
asjAc | int | allele count for the Ashkenazi Jewish population. Integer.
asjAn | int | allele number for the Ashkenazi Jewish population. Non-zero integer.
asjHc | int | count of homozygous individuals for the Ashkenazi Jewish population. Non-negative integer.
sasAf | float | allele frequency for the South Asian population. Range: 0 - 1.0
sasAc | int | allele count for the South Asian population. Integer.
sasAn | int | allele number for the South Asian population. Non-zero integer.
sasHc | int | count of homozygous individuals for the South Asian population. Non-negative integer.
failedFilter | bool | True if this variant failed any filters (Note: we do not list the failed filters)

dbSNP

"dbsnp":[
"rs1042821"
]

Field | Type | Notes
dbsnp | string array | dbSNP rsIDs