DANN
Overview
DANN uses the same feature set and training data as CADD (Combined Annotation-Dependent Depletion) to train a deep neural network (DNN). CADD is an algorithm designed to annotate both coding and non-coding variants, and has been shown to outperform other annotation algorithms. CADD is built on Support Vector Machines (SVMs); DANN replaces the SVM with a deep neural network, which lets it capture non-linear relationships among the features. DANN achieves about a 19% relative reduction in the error rate and about a 14% relative increase in the area under the curve (AUC) metric over CADD’s SVM methodology.
Publication
Quang, Daniel, Yifei Chen, and Xiaohui Xie. DANN: a deep learning approach for annotating the pathogenicity of genetic variants. Bioinformatics 31(5), 761-763 (2015). https://doi.org/10.1093/bioinformatics/btu703
TSV File
Example
chr     grch37_pos  ref     alt     DANN
1       10001       T       A       0.16461391399220135
1       10001       T       C       0.4396994049749739
1       10001       T       G       0.38108629377072734
1       10002       A       C       0.36182020272810128
1       10002       A       G       0.44413258111779291
1       10002       A       T       0.16812846819989813
Parsing
From the TSV file, we are interested in all columns:
- chr
- grch37_pos
- ref
- alt
- DANN
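As a rough illustration of reading these five columns, the following Python sketch parses rows shaped like the example above. It is not the actual parser: the names `DannRecord` and `parse_dann` are hypothetical, and it assumes one header line starting with `chr` followed by whitespace-delimited rows.

```python
import io
from typing import Iterator, NamedTuple, TextIO


class DannRecord(NamedTuple):
    """One row of the DANN TSV file (hypothetical container for illustration)."""
    chrom: str
    pos: int      # grch37_pos column, kept as an integer
    ref: str
    alt: str
    score: float  # DANN column


def parse_dann(handle: TextIO) -> Iterator[DannRecord]:
    """Yield one record per data row, skipping blank lines and the header."""
    for line in handle:
        # split() with no argument accepts tabs as well as runs of spaces
        fields = line.split()
        if not fields or fields[0] == "chr":
            continue
        chrom, pos, ref, alt, score = fields
        yield DannRecord(chrom, int(pos), ref, alt, float(score))


sample = (
    "chr\tgrch37_pos\tref\talt\tDANN\n"
    "1\t10001\tT\tA\t0.16461391399220135\n"
    "1\t10001\tT\tC\t0.4396994049749739\n"
)
records = list(parse_dann(io.StringIO(sample)))
```

For the real file, the same function can be given any open text handle instead of the in-memory `sample`.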
GRCh38 liftover
The data is not available for GRCh38 on the DANN website. We performed a liftover from GRCh37 to GRCh38 using CrossMap.
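The exact liftover procedure is not described here. As one possible preparation step, the sketch below converts a DANN row into a BED-style line, which is one of the interval formats CrossMap accepts. The function name `to_bed_line` is hypothetical, and choosing BED as the intermediate format is an assumption. Chromosome names may also need to match the naming used in the chain file.

```python
def to_bed_line(chrom: str, pos: int, ref: str, alt: str, score: str) -> str:
    """Convert a 1-based single-nucleotide position into a BED-style line.

    BED intervals are 0-based and half-open, so position p becomes [p - 1, p).
    The ref, alt, and score values are carried along as extra columns so they
    can be reattached to the lifted coordinates afterwards.
    """
    return "\t".join([chrom, str(pos - 1), str(pos), ref, alt, score])


bed_line = to_bed_line("1", 10001, "T", "A", "0.16461391399220135")
```

The score is passed as a string here so that the original digits are written back unchanged.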
Known Issues
None
Download URL
https://cbcl.ics.uci.edu/public_data/DANN/
JSON Output
"dannScore": 0.27
| Field | Type | Notes | 
|---|---|---|
| dannScore | float | Range: 0 - 1.0 |
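To show how a raw score could end up in this shape, here is a minimal Python sketch. The two-decimal rounding is an assumption inferred from the `0.27` example above, and `dann_json_fragment` is a hypothetical helper, not part of any existing API.

```python
import json


def dann_json_fragment(score: float, ndigits: int = 2) -> str:
    """Return a '"dannScore": <value>' fragment for a score in the range 0 - 1.0."""
    if not 0.0 <= score <= 1.0:
        raise ValueError(f"DANN score out of range: {score}")
    # Rounding precision is an assumption based on the example output
    return '"dannScore": ' + json.dumps(round(score, ndigits))


fragment = dann_json_fragment(0.16461391399220135)
```

Scores outside the documented range raise an error rather than being clamped, so malformed input rows are surfaced early.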