pr_data.Rd
Simpler interface to retrieve a data.frame of PR metrics from a happy_result object.
pr_data(happy_result, var_type = c("both", "snv", "indel"), filter = c("ALL", "PASS", "SEL"), subtype = c("*", "C16_PLUS", "C1_5", "C6_15", "D16_PLUS", "D1_5", "D6_15", "I16_PLUS", "I1_5", "I6_15"), subset = "*", quietly = TRUE)
happy_result | a happy result loaded via read_happy |
---|---|
var_type | subset for either insertions and deletions (indel), SNVs (snv), or keep both (both) |
filter | include all records (ALL), only passing (PASS) or with selective filters applied (SEL) |
subtype | variant subtype of the form [IDC]length_range |
subset | when run with stratification regions, the subset is the region ID |
quietly | suppress info messages |
a data.frame of Precision-Recall metrics for the selected subset
Subsets: hap.py v0.3.7+ writes subsets TS_contained and TS_boundary by default, corresponding to truth variants well contained or at the boundary of confident regions. In some truthsets, those in TS_boundary will show worse performance metrics due to issues with variant representation or a partial haplotype description.
Subtypes: Indel subtypes are of the form [IDC]length_range, where the first letter indicates the variant classification: I insertion; D deletion; and C complex. Hap.py bins the lengths of these records into ranges by ALT allele length in basepairs: 1_5, 6_15 and 16_PLUS.
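The subtype naming scheme described above can be illustrated with a small base R sketch. The helper parse_subtype is hypothetical and not part of happyR. It only assumes the [IDC]length_range form and the three length bins listed above, and returns NA for labels that do not match, such as the wildcard "*".

```r
# Hypothetical helper (not part of happyR): split a subtype label such as
# "D16_PLUS" into its classification letter and length range.
parse_subtype <- function(subtype) {
  # one class letter followed by one of the three documented length bins
  pattern <- "^([IDC])(1_5|6_15|16_PLUS)$"
  ok <- grepl(pattern, subtype)
  classes <- c(I = "insertion", D = "deletion", C = "complex")

  # labels that do not match the pattern (e.g. "*") become NA
  letter <- ifelse(ok, sub(pattern, "\\1", subtype), NA_character_)
  data.frame(
    subtype = subtype,
    class = unname(classes[letter]),
    length_range = ifelse(ok, sub(pattern, "\\2", subtype), NA_character_),
    stringsAsFactors = FALSE
  )
}

parsed <- parse_subtype(c("D16_PLUS", "I1_5", "C6_15", "*"))
print(parsed)
```

Here "D16_PLUS" is read as a deletion in the 16_PLUS length bin, which is the subtype selected in the example for this function.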
# figure out prefix from pkg install location
happy_input <- system.file("extdata", "happy_demo.summary.csv", package = "happyR")
happy_prefix <- sub(".summary.csv", "", happy_input)

# load happy result
hapdata <- read_happy(happy_prefix)

# long deletion PR curve
del_pr <- pr_data(hapdata, var_type = "indel", subtype = "D16_PLUS")