Re-calibrate classification probability predictions
Source: R/adjust-probability-calibration.R
Calibration is the process of adjusting a model's outputted probabilities to match the observed frequencies of events. This technique aims to ensure that when a model predicts a certain probability for an outcome, that probability accurately reflects the true likelihood of that outcome occurring.
Arguments
- x: A tailor().
- method: Character. One of "logistic", "multinomial", "beta", "isotonic", "isotonic_boot", or "none", corresponding to a function from the probably package: probably::cal_estimate_logistic(), probably::cal_estimate_multinomial(), etc., respectively. The default is "logistic" which, despite its name, fits a generalized additive model. Note that when fit.tailor() is called, the value may be changed to "none" if there is insufficient data.
- ...: Optional arguments to pass to the corresponding function in the probably package. These arguments must be named; a short sketch follows this list.
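For example, a minimal sketch of forwarding a named option: the times argument noted in the Details section controls the number of bootstrap resamples for "isotonic_boot" (25 below is an arbitrary illustrative value).
library(tailor)
# `times` is passed through `...` to the probably function for the
# chosen method; 25 is an arbitrary value for illustration
tailor() |>
  adjust_probability_calibration(method = "isotonic_boot", times = 25)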
Value
An updated tailor() containing the new operation.
Details
The "logistic" and "multinomial" methods fit models that predict the observed
classes as a function of the predicted class probabilities. These models
remove any overt systematic trends from the linear predictor and correct new
predictions. The underlying code fits that model using mgcv::gam()
.
If smooth = FALSE
is passed to the ...
, it uses stats::glm()
for binary
outcomes or nnet::multinom()
for 3+ classes.
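For illustration only (this is not tailor's internal code), the same kind of model can be fit directly: a binomial GAM of the observed class on a smooth of one predicted probability, using two_class_example from the modeldata package.
library(mgcv)
library(modeldata)
# Roughly the kind of model that method = "logistic" fits: a GAM of the
# observed class on a smooth of the predicted probability
gam_cal <- gam(truth ~ s(Class1), data = two_class_example, family = binomial())
# type = "response" gives the re-calibrated probability of the second
# factor level (Class2 here)
head(predict(gam_cal, type = "response"))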
The isotonic method uses stats::isoreg() to force the predicted probabilities to increase with the observed outcome class. This creates a step function that maps new predictions to values that are monotonically increasing with the binary (0/1) form of the outcome. One side effect is that there are fewer, perhaps far fewer, unique predicted probabilities. For 3+ classes, this is done using a one-versus-all strategy that ensures that the probabilities add to 1.0. The "isotonic_boot" method resamples the data and generates multiple isotonic regressions that are averaged and used to correct the predictions. The averaged result may not be perfectly monotonic, but the number of unique calibrated predictions increases with the number of bootstrap samples (controlled by passing the times argument via ...).
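A minimal sketch of the isotonic idea using stats::isoreg() directly (an illustration, not tailor's implementation):
library(modeldata)
# Regress the binary (0/1) form of the outcome on the predicted probability
x <- two_class_example$Class1
y <- as.integer(two_class_example$truth == "Class1")
iso <- isoreg(x, y)
# The fit is a step function; new predictions map to relatively few
# unique calibrated values
step_fn <- as.stepfun(iso)
length(unique(step_fn(x)))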
Beta calibration (Kull et al., 2017) assumes that the probability estimates follow a Beta distribution. This leads to a sigmoidal model that can be fit to the data via maximum likelihood. There are a few different ways to parameterize the model; see the parameters option of betacal::beta_calibration() to select a specific sigmoidal model.
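As a sketch, the option names below (shape_params, location_params) are assumptions based on probably::cal_estimate_beta(), which maps them to betacal's parameterizations; verify them against your installed version of probably.
library(tailor)
# Assumed option names; check ?probably::cal_estimate_beta before use
tailor() |>
  adjust_probability_calibration(method = "beta", shape_params = 2, location_params = 1)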
Data Usage
This adjustment requires estimation and, as such, different subsets of data should be used to train it and to evaluate its predictions.
Note that, when calling fit.tailor(), if the calibration data have zero or one row, the method is changed to "none".
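As a sketch, any splitting approach works; here the rsample package (an assumption, not used elsewhere on this page) produces the two subsets. The Examples below use a base-R split instead.
library(rsample)
library(modeldata)
set.seed(1)
split <- initial_split(two_class_example, prop = 3/4)
d_calibration <- training(split)  # fit the calibration on this subset
d_test <- testing(split)          # evaluate its predictions on this one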
References
Kull, Meelis, Telmo Silva Filho, and Peter Flach. "Beta calibration: a well-founded and easily implemented improvement on logistic calibration for binary classifiers." Artificial Intelligence and Statistics, PMLR, 2017.
https://aml4td.org/chapters/cls-metrics.html#sec-cls-calibration
Examples
library(tailor)
library(modeldata)
# split example data
set.seed(1)
in_rows <- sample(c(TRUE, FALSE), nrow(two_class_example), replace = TRUE)
d_calibration <- two_class_example[in_rows, ]
d_test <- two_class_example[!in_rows, ]
head(d_calibration)
#> truth Class1 Class2 predicted
#> 1 Class2 0.003589243 0.9964107574 Class2
#> 3 Class2 0.110893522 0.8891064779 Class2
#> 4 Class1 0.735161703 0.2648382969 Class1
#> 6 Class1 0.999275071 0.0007249286 Class1
#> 7 Class1 0.999201149 0.0007988510 Class1
#> 8 Class1 0.812351997 0.1876480026 Class1
# specify calibration
tlr <-
tailor() |>
adjust_probability_calibration(method = "logistic")
# train the tailor on a subset of the data
tlr_fit <- fit(
tlr,
d_calibration,
outcome = c(truth),
estimate = c(predicted),
probabilities = c(Class1, Class2)
)
# apply to predictions on another subset of data
head(d_test)
#> truth Class1 Class2 predicted
#> 2 Class1 0.6786210540 0.321378946 Class1
#> 5 Class2 0.0162399603 0.983760040 Class2
#> 9 Class2 0.4570372595 0.542962741 Class2
#> 10 Class2 0.0976377342 0.902362266 Class2
#> 16 Class1 0.9919162773 0.008083723 Class1
#> 17 Class2 0.0004250834 0.999574917 Class2
predict(tlr_fit, d_test)
#> # A tibble: 244 × 4
#> truth Class1 Class2 predicted
#> <fct> <dbl[1d]> <dbl[1d]> <fct>
#> 1 Class1 0.459 0.541 Class2
#> 2 Class2 0.0396 0.960 Class2
#> 3 Class2 0.333 0.667 Class2
#> 4 Class2 0.0954 0.905 Class2
#> 5 Class1 0.964 0.0359 Class1
#> 6 Class2 0.0328 0.967 Class2
#> 7 Class2 0.112 0.888 Class2
#> 8 Class1 0.665 0.335 Class1
#> 9 Class1 0.957 0.0434 Class1
#> 10 Class2 0.0534 0.947 Class2
#> # ℹ 234 more rows