Promoter AI
Professional data source
This is a Professional data source and is not available freely. Please contact annotation_support@illumina.com if you would like to obtain it.
Overview
Promoter AI is one of several AI-based annotation data sources from Illumina. It scores possible variants in the promoter region, defined as the 500 nt upstream and downstream of a transcript's transcription start site (TSS). Each annotation reports the score of the variant together with the transcript and gene that the promoter region corresponds to. The gene and transcript data are generated from the GENCODE Release 39 gene annotation (which corresponds to Ensembl release 105).
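The promoter window described above can be sketched as a simple coordinate check. This is a minimal illustration, not the method used to build the data source: the function names are hypothetical, the window is assumed to include both ends, and the sign convention `(pos - tss_pos) * strand` is an assumption chosen because it reproduces the strand_adj_dist_tss values in the sample TSV rows shown under Parsing.

```python
PROMOTER_HALF_WIDTH = 500  # nt on each side of the TSS, per the definition above


def strand_adjusted_distance(pos, tss_pos, strand):
    """Signed offset of a position from the TSS, flipped for the -1 strand.

    Assumed formula: it matches strand_adj_dist_tss in the sample TSV rows,
    but it is not taken from the Promoter AI documentation.
    """
    if strand not in (1, -1):
        raise ValueError("strand must be 1 or -1")
    return (pos - tss_pos) * strand


def in_promoter_region(pos, tss_pos, strand):
    """True if the position is within 500 nt of the TSS (bounds assumed inclusive)."""
    return abs(strand_adjusted_distance(pos, tss_pos, strand)) <= PROMOTER_HALF_WIDTH


# First sample row: chr9:73364 on the -1 strand, TSS at 73864.
print(strand_adjusted_distance(73364, 73864, -1))
print(in_promoter_region(73364, 73864, -1))
```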
Transcript and Gene IDs from Promoter AI
Illumina Connected Annotation reports transcript annotations from its own transcript and gene cache, which can differ from the gene models used by Promoter AI (GENCODE Release 39). As a result, the transcript ID and gene ID shown in the Promoter AI annotation may not match those shown elsewhere in the annotation result.
Parsing
TSV File
gene_id chrom pos ref alt strand gene transcript_id tss_pos strand_adj_dist_tss score_x score_y score
ENSG00000181404 chr9 73364 G A -1 WASHC1 ENST00000642633.1 73864 500 -0.0032 -0.0298 -0.0165
ENSG00000181404 chr9 73364 G C -1 WASHC1 ENST00000642633.1 73864 500 -0.0372 -0.052000000000000005 -0.0446
ENSG00000181404 chr9 73364 G T -1 WASHC1 ENST00000642633.1 73864 500 0.0202 -0.0315 -0.0056500000000000005
ENSG00000181404 chr9 73365 G A -1 WASHC1 ENST00000642633.1 73864 499 -0.015 -0.0643 -0.03965
ENSG00000181404 chr9 73365 G C -1 WASHC1 ENST00000642633.1 73864 499 0.0285 -0.0035 0.0125
ENSG00000181404 chr9 73365 G T -1 WASHC1 ENST00000642633.1 73864 499 -0.0095 -0.0352 -0.022350000000000002
ENSG00000181404 chr9 73366 G A -1 WASHC1 ENST00000642633.1 73864 498 0.0068 -0.0565 -0.02485
ENSG00000181404 chr9 73366 G C -1 WASHC1 ENST00000642633.1 73864 498 0.0596 0.0854 0.07250000000000001
ENSG00000181404 chr9 73366 G T -1 WASHC1 ENST00000642633.1 73864 498 -0.0086 -0.0009 -0.00475
ENSG00000181404 chr9 73367 C A -1 WASHC1 ENST00000642633.1 73864 497 0.0091 -0.0068 0.0011500000000000004
...
From the file, we extract the columns gene_id, transcript_id, strand, strand_adj_dist_tss, and score.
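The column extraction described above can be sketched with Python's standard csv module. This is only an illustration under stated assumptions: the file is assumed to be tab-delimited with the header row shown, the function name `extract_promoter_ai_fields` is hypothetical, and the sample text simply reuses two of the rows listed above.

```python
import csv
import io

HEADER = ["gene_id", "chrom", "pos", "ref", "alt", "strand", "gene",
          "transcript_id", "tss_pos", "strand_adj_dist_tss",
          "score_x", "score_y", "score"]
ROWS = [
    ["ENSG00000181404", "chr9", "73364", "G", "A", "-1", "WASHC1",
     "ENST00000642633.1", "73864", "500", "-0.0032", "-0.0298", "-0.0165"],
    ["ENSG00000181404", "chr9", "73366", "G", "C", "-1", "WASHC1",
     "ENST00000642633.1", "73864", "498", "0.0596", "0.0854",
     "0.07250000000000001"],
]
# Assumption: columns are separated by tabs, as the TSV name suggests.
SAMPLE_TSV = "\n".join("\t".join(fields) for fields in [HEADER, *ROWS]) + "\n"


def extract_promoter_ai_fields(tsv_text):
    """Yield only the five extracted columns for each data row, with types applied."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    for row in reader:
        yield {
            "gene_id": row["gene_id"],
            "transcript_id": row["transcript_id"],
            "strand": int(row["strand"]),
            "strand_adj_dist_tss": int(row["strand_adj_dist_tss"]),
            "score": float(row["score"]),
        }


for record in extract_promoter_ai_fields(SAMPLE_TSV):
    print(record)
```

The remaining columns (chrom, pos, ref, alt, gene, tss_pos, score_x, score_y) are read by the parser but dropped from the yielded record in this sketch.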
JSON Output
"promoterAI": [
  {
    "strand": 1,
    "distanceFromTss": -292,
    "geneId": "ENSG00000274391",
    "transcriptId": "ENST00000618007.5",
    "score": 0.032
  }
]
Field | Type | Notes |
---|---|---|
strand | int | Strand of the transcript |
distanceFromTss | int | Number of nucleotides from the TSS of the corresponding transcript |
geneId | string | Gene ID |
transcriptId | string | Transcript ID (Ensembl) |
score | decimal | Calculated Promoter AI score |
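As a rough sketch of how the fields in the table might be read, the snippet below parses the fragment shown above with Python's json module. The fragment is wrapped in braces so that it parses on its own; where the promoterAI key sits inside a full annotation file is not shown here, so that wrapper and the helper name `promoter_ai_entries` are assumptions.

```python
import json

# The documented fragment, wrapped in an object so it is valid standalone JSON.
SAMPLE_JSON = """
{
  "promoterAI": [
    {
      "strand": 1,
      "distanceFromTss": -292,
      "geneId": "ENSG00000274391",
      "transcriptId": "ENST00000618007.5",
      "score": 0.032
    }
  ]
}
"""


def promoter_ai_entries(parsed):
    """Return the list under "promoterAI", or an empty list if the key is absent."""
    return parsed.get("promoterAI", [])


for entry in promoter_ai_entries(json.loads(SAMPLE_JSON)):
    print(entry["transcriptId"], entry["geneId"], entry["strand"],
          entry["distanceFromTss"], entry["score"])
```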
Interpreting Score
Promoter AI scores range from -1 to 1. A positive score indicates that the variant has an over-expression effect on the target gene, while a negative score indicates an under-expression effect. A score between -0.05 and 0.05 indicates no effect on the target gene.
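The interpretation rule above can be written as a small helper. This is a sketch only: the function name and the label strings are hypothetical, and treating the boundary values -0.05 and 0.05 as "no effect" is an assumption, since the text does not say whether the bounds are inclusive.

```python
NO_EFFECT_THRESHOLD = 0.05  # scores within +/- this value are read as no effect


def classify_promoter_ai_score(score):
    """Map a Promoter AI score in [-1, 1] to a coarse expression-effect label."""
    if not -1.0 <= score <= 1.0:
        raise ValueError("Promoter AI scores are expected to lie between -1 and 1")
    # Assumption: the bounds -0.05 and 0.05 themselves count as no effect.
    if abs(score) <= NO_EFFECT_THRESHOLD:
        return "no effect"
    return "over-expression" if score > 0 else "under-expression"


# Scores taken from the examples on this page, plus one hypothetical strong score.
for value in (0.032, -0.0165, 0.0725, -0.3):
    print(value, classify_promoter_ai_score(value))
```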