Apply an equivocal zone to a binary classification model.
Source: R/adjust-equivocal-zone.R
Equivocal zones describe intervals of predicted probabilities that are deemed too uncertain or ambiguous to be assigned a hard class. Rather than predicting a hard class when the probability is very close to a threshold, tailors using this adjustment predict "[EQ]".
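Concretely, the zone is the interval (threshold - value, threshold + value). Below is a minimal base-R sketch of that routing logic using a hypothetical helper; it is an illustration only, not the package's implementation, and the boundary handling is an assumption:
# hypothetical helper showing how one probability would be routed;
# the real adjustment acts on whole prediction columns via a tailor
classify_with_zone <- function(prob_class1, threshold = 1 / 2, value = 1 / 4) {
  if (abs(prob_class1 - threshold) < value) {
    "[EQ]"                                  # too close to the threshold
  } else if (prob_class1 >= threshold) {
    "Class1"
  } else {
    "Class2"
  }
}
classify_with_zone(0.70)  # inside (0.25, 0.75), so "[EQ]"
classify_with_zone(0.80)  # outside the zone, so "Class1"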
Arguments
- x: A tailor().
- value: A numeric value (between zero and 1/2) or hardhat::tune(). The value is the size of the buffer around the threshold.
- threshold: A numeric value (between zero and one) or hardhat::tune(). Defaults to adjust_probability_threshold(threshold) if previously set in x, or 1/2 if not.
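To illustrate the threshold default, here is a sketch using only functions referenced on this page (it assumes the tailor package is attached and a pipe is available, as in the Examples below):
# the equivocal zone here is centered at 0.6 rather than 1/2, since a
# probability threshold was set earlier in the same tailor
tlr_custom <-
  tailor() %>%
  adjust_probability_threshold(threshold = 0.6) %>%
  adjust_equivocal_zone(value = 0.1)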
Details
This function transforms the class prediction column, estimate, to have type
class_pred from probably::class_pred(). You can loosely think of this
column type as a factor, except that it has a possible entry [EQ] that is
not a level and will be excluded from performance metric calculations.
As a result, the output column has the same levels as the input, plus a
possible [EQ] entry that tidymodels functions know to exclude from further
analyses.
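As a small sketch of the class_pred type itself (assuming probably's class_pred() constructor, where which marks the equivocal entries):
library(probably)
cp <- class_pred(
  factor(c("Class1", "Class2", "Class1"), levels = c("Class1", "Class2")),
  which = 2  # flag the second prediction as equivocal
)
cp          # prints with an [EQ] entry
levels(cp)  # "[EQ]" is not a level: only "Class1" and "Class2"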
Data Usage
This adjustment doesn't require estimation, so the same data used to train it
with fit() can safely be passed to predict(); fitting this adjustment just
collects metadata on the supplied column names and does not risk data leakage.
Examples
library(dplyr)
#>
#> Attaching package: ‘dplyr’
#> The following objects are masked from ‘package:stats’:
#>
#> filter, lag
#> The following objects are masked from ‘package:base’:
#>
#> intersect, setdiff, setequal, union
library(modeldata)
head(two_class_example)
#> truth Class1 Class2 predicted
#> 1 Class2 0.003589243 0.9964107574 Class2
#> 2 Class1 0.678621054 0.3213789460 Class1
#> 3 Class2 0.110893522 0.8891064779 Class2
#> 4 Class1 0.735161703 0.2648382969 Class1
#> 5 Class2 0.016239960 0.9837600397 Class2
#> 6 Class1 0.999275071 0.0007249286 Class1
# `predicted` gives hard class predictions based on probabilities
two_class_example %>% count(predicted)
#> predicted n
#> 1 Class1 277
#> 2 Class2 223
# when probabilities are within (.25, .75), consider them equivocal
tlr <-
tailor() %>%
adjust_equivocal_zone(value = 1 / 4)
tlr
#>
#> ── tailor ──────────────────────────────────────────────────────────────────────
#> A binary postprocessor with 1 adjustment:
#>
#> • Add equivocal zone of size 0.25.
# fit by supplying column names. situate in a modeling workflow
# with `workflows::add_tailor()` to avoid having to do so manually
tlr_fit <- fit(
tlr,
two_class_example,
outcome = c(truth),
estimate = c(predicted),
probabilities = c(Class1, Class2)
)
tlr_fit
#>
#> ── tailor ──────────────────────────────────────────────────────────────────────
#> A binary postprocessor with 1 adjustment:
#>
#> • Add equivocal zone of size 0.25. [trained]
# adjust hard class predictions
predict(tlr_fit, two_class_example) %>% count(predicted)
#> # A tibble: 3 × 2
#> predicted n
#> <clss_prd> <int>
#> 1 [EQ] 86
#> 2 Class1 229
#> 3 Class2 185
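To inspect or drop the [EQ] predictions downstream, a hedged sketch (it assumes probably's is_equivocal() and that coercing a class_pred to a factor turns [EQ] entries into NA):
library(probably)
preds <- predict(tlr_fit, two_class_example)
sum(is_equivocal(preds$predicted))  # how many predictions fell in the zone
preds %>%
  mutate(predicted = as.factor(predicted)) %>%  # [EQ] becomes NA (assumed)
  count(predicted)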