Truncate the range of numeric predictions — adjust_numeric_range

Truncating ranges involves limiting the output of a model to a specific range of values, typically to avoid extreme or unrealistic predictions. This technique can help improve the practical applicability of a model's outputs by constraining them within reasonable bounds based on domain knowledge or physical limitations.
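To make the idea concrete, here is a minimal base R sketch of truncation. The helper clamp_range() and the sample values are hypothetical, introduced only for illustration; this is not how the package implements the adjustment.

```r
# Hypothetical helper, not part of tailor: clamp a numeric vector to
# the interval [lower_limit, upper_limit].
clamp_range <- function(pred, lower_limit = -Inf, upper_limit = Inf) {
  # Treat NA limits as "no truncation" (an assumption mirroring the
  # documented argument contract).
  if (is.na(lower_limit)) lower_limit <- -Inf
  if (is.na(upper_limit)) upper_limit <- Inf
  stopifnot(lower_limit <= upper_limit)
  # Raise values below the lower limit, then cap values above the upper limit.
  pmin(pmax(pred, lower_limit), upper_limit)
}

preds <- c(-0.934, 0.134, 1.36, 1.53)
clamped <- clamp_range(preds, lower_limit = 0, upper_limit = 1.5)
clamped
#> [1] 0.000 0.134 1.360 1.500
```

Values already inside the interval pass through unchanged; only values outside it are moved to the nearest limit.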

Usage

adjust_numeric_range(x, lower_limit = -Inf, upper_limit = Inf)

Arguments

x: A tailor().
lower_limit, upper_limit: A numeric value, NA (for no truncation), or hardhat::tune(). The defaults, -Inf and Inf, also apply no truncation.

Value

An updated tailor() containing the new operation.

Data Usage

This adjustment doesn't require estimation, so the same data passed to fit() can also be passed to predict(). Fitting this adjustment only records metadata about the supplied column names, so it does not risk data leakage.

Examples

library(tailor)
library(tibble)

# create example data
set.seed(1)
d <- tibble(y = rnorm(100), y_pred = y/2 + rnorm(100))
d
#> # A tibble: 100 × 2
#>         y y_pred
#>     <dbl>  <dbl>
#>  1 -0.626 -0.934
#>  2  0.184  0.134
#>  3 -0.836 -1.33 
#>  4  1.60   0.956
#>  5  0.330 -0.490
#>  6 -0.820  1.36 
#>  7  0.487  0.960
#>  8  0.738  1.28 
#>  9  0.576  0.672
#> 10 -0.305  1.53 
#> # ℹ 90 more rows

# specify the range adjustment
tlr <-
  tailor() |>
  adjust_numeric_range(lower_limit = 1)

# train tailor by passing column names.
tlr_fit <- fit(tlr, d, outcome = y, estimate = y_pred)

predict(tlr_fit, d)
#> # A tibble: 100 × 2
#>         y y_pred
#>     <dbl>  <dbl>
#>  1 -0.626   1   
#>  2  0.184   1   
#>  3 -0.836   1   
#>  4  1.60    1   
#>  5  0.330   1   
#>  6 -0.820   1.36
#>  7  0.487   1   
#>  8  0.738   1.28
#>  9  0.576   1   
#> 10 -0.305   1.53
#> # ℹ 90 more rows
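
The example above sets only a lower limit. As a minimal sketch under the same setup, both limits can be supplied at once. The limits -1 and 1 and the object names tlr_both, tlr_both_fit, and d_both are assumptions chosen for illustration, not recommendations.

```r
library(tailor)
library(tibble)

# same example data as above
set.seed(1)
d <- tibble(y = rnorm(100), y_pred = y/2 + rnorm(100))

# truncate predictions on both sides (limits are illustrative assumptions)
tlr_both <-
  tailor() |>
  adjust_numeric_range(lower_limit = -1, upper_limit = 1)

# fitting only records the column names; no estimation takes place
tlr_both_fit <- fit(tlr_both, d, outcome = y, estimate = y_pred)
d_both <- predict(tlr_both_fit, d)

# every adjusted prediction should now fall within [-1, 1]
range(d_both$y_pred)
```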