pr_data.Rd
Simpler interface to retrieve a data.frame of PR metrics from a happy_result object.
pr_data(happy_result, var_type = c("both", "snv", "indel"), filter = c("ALL", "PASS", "SEL"), subtype = c("*", "C16_PLUS", "C1_5", "C6_15", "D16_PLUS", "D1_5", "D6_15", "I16_PLUS", "I1_5", "I6_15"), subset = "*", quietly = TRUE)
| Argument | Description |
|---|---|
| happy_result | a happy result loaded via |
| var_type | subset for either insertions and deletions |
| filter | include all records (ALL), only passing (PASS) or with selective filters applied (SEL) |
| subtype | variant subtype of the form |
| subset | when run with stratification regions, the subset is the region ID. |
| quietly | suppress info messages |
a data.frame of Precision-Recall metrics for the
selected subset
Subsets: hap.py v0.3.7+ writes subsets TS_contained and
TS_boundary by default, corresponding to truth variants
well contained or at the boundary of confident regions. In some
truthsets, those in TS_boundary will show worse performance
metrics due to issues with variant representation or a partial
haplotype description.
Subtypes: Indel subtypes are of the form [IDC]length_range,
where the first letter indicates the variant classification: I insertion,
D deletion, and C complex. Hap.py bins the lengths of these records
into ranges by ALT allele length in basepairs: 1_5, 6_15 and 16_PLUS.
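The naming scheme above can be illustrated with a small base-R sketch that splits a subtype label into its classification letter and length range. `parse_subtype` is a hypothetical helper written for this illustration and is not part of happyR; it assumes only the nine labels listed in the usage (it rejects the wildcard `*`).

```r
# Hypothetical helper (not part of happyR): split a subtype label such as
# "D16_PLUS" into its variant classification and length range.
parse_subtype <- function(subtype) {
  classes <- c(I = "insertion", D = "deletion", C = "complex")

  # Accept only labels of the form [IDC] followed by a known length bin
  ok <- grepl("^[IDC](1_5|6_15|16_PLUS)$", subtype)
  if (!all(ok)) {
    stop("unrecognised subtype: ", paste(subtype[!ok], collapse = ", "))
  }

  data.frame(
    subtype = subtype,
    class = unname(classes[substr(subtype, 1, 1)]),  # first letter
    length_range = substring(subtype, 2),            # remainder of the label
    stringsAsFactors = FALSE
  )
}

parsed <- parse_subtype(c("D16_PLUS", "I1_5", "C6_15"))
print(parsed)
```

A check like this could be run before passing a label to the `subtype` argument, so that a typo fails early with a clear message.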
# figure out prefix from pkg install location
happy_input <- system.file("extdata", "happy_demo.summary.csv", package = "happyR")
happy_prefix <- sub(".summary.csv", "", happy_input)

# load happy result
hapdata <- read_happy(happy_prefix)

# long deletion PR curve
del_pr <- pr_data(hapdata, var_type = "indel", subtype = "D16_PLUS")