Estimate highest density intervals (HDIs) from performance counts

Estimate highest density intervals and success rates from hap.py counts using a Binomial model and empirical Bayes. See package docs for details on method implementation.

estimate_hdi(df, successes_col, totals_col, group_cols, aggregate_only = TRUE,
  significance = 0.05, sample_size = 1e+05, max_alpha1 = 1000)

Arguments

df	A `data.frame`. Required columns: `Replicate.Id`, `Subset`, and the columns specified in the `group_cols` argument.
successes_col	Name of the column that contains success counts.
totals_col	Name of the column that contains total counts.
group_cols	Vector of columns to group counts by. Observations within the same group will be treated as replicates.
aggregate_only	Estimate HDIs for the aggregate replicate only (speeds up execution). Default: TRUE.
significance	Significance level for HDI estimation. Default: 0.05 (= 95% HDIs).
sample_size	Number of observations to draw from the Beta posterior to estimate HDIs. Default: 1e5.
max_alpha1	Upper bound for the alpha hyperparameter in the aggregate Beta posterior. Default: 1000.

Value

A data.frame with performance counts, model hyperparameters, success rate and HDI estimates.

Examples


# NOT RUN {
hdi <- estimate_hdi(df, successes_col = 'TRUTH.TP', totals_col = 'TRUTH.TOTAL',
                   group_cols = c('Group.Id', 'Subset', 'Type', 'Subtype'))
# }
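
The following base R sketch illustrates what a sampling-based HDI is; it is not the package's implementation. It assumes a flat Beta(1, 1) prior in place of the empirical Bayes prior, and the function name `sample_hdi` is hypothetical. The `significance` and `sample_size` arguments mirror the ones documented above.

```r
# Illustrative sketch only: NOT how estimate_hdi() is implemented.
# Assumption: a flat Beta(1, 1) prior instead of an empirical Bayes prior.
sample_hdi <- function(successes, totals, significance = 0.05, sample_size = 1e5) {
  # Posterior of the success rate for a Binomial likelihood and a Beta(1, 1) prior
  draws <- sort(rbeta(sample_size, 1 + successes, 1 + totals - successes))

  # Number of sorted draws an interval must span to hold the requested mass
  n_in <- round((1 - significance) * sample_size)
  n_windows <- sample_size - n_in

  # Width of every candidate interval; the HDI is the narrowest one
  widths <- draws[(n_in + 1):sample_size] - draws[1:n_windows]
  lo <- which.min(widths)

  c(lower    = draws[lo],
    estimate = (1 + successes) / (2 + totals),  # posterior mean
    upper    = draws[lo + n_in])
}

set.seed(42)
sample_hdi(successes = 950, totals = 1000)
```

Increasing `sample_size` makes the interval bounds more stable across runs at the cost of execution time.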
