Addendum: rsample

tidymodels is a fairly new collection of packages that serves as a companion to the tidyverse. The vision, as I understand it, is that tidymodels will eventually replace modelr.

One of the tidymodels packages, rsample, provides some of the same (re)sampling tools that modelr provides. I stuck with modelr in the bootstrapping reading because the functionality provided by rsample is more flexible than we really need, and I didn’t want to obscure the main point with more programming-related technicalities. But if you’d like to understand how rsample works, here are some brief notes to get you going. First of all, install the package:

install.packages("rsample")

Here’s a toy tibble to get things going. It has just one column, containing the numbers 1 through 6.

library(tibble)  # provides tibble() and as_tibble(); also loaded by library(tidyverse)
df <- tibble(x = 1:6)

The role of the modelr object type “resample” is played by the rsample object type “rsplit.” To construct an rsplit object, the basic constructor is make_splits. This function takes two arguments: the first is a named list of the form

list(analysis = ..., assessment = ...)

where both ... are vectors of integer row indices, and the second is the tibble.

For example:

library(rsample)
indices <- list(analysis = 1:3, assessment = 4:6)
split <- make_splits(indices, df)

This creates an rsplit object named split which contains (pointers to) rows 1 through 3 in its “analysis group” and (pointers to) rows 4 through 6 in its “assessment group.” As far as we’re concerned in this class, the assessment group is irrelevant. If you run as_tibble(split), you’ll get a tibble that just contains rows 1 through 3.

Note. If you really want to specify which group to make a tibble out of, you can run as_tibble(analysis(split)) and as_tibble(assessment(split)).
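
To make the two groups concrete, here is a short sketch that rebuilds the split from above and pulls out each group. It assumes the tibble package is installed alongside rsample; the names analysis_rows and assessment_rows are just illustrative.

```r
library(tibble)
library(rsample)

df <- tibble(x = 1:6)
split <- make_splits(list(analysis = 1:3, assessment = 4:6), df)

# Extract the rows that the split assigns to each group
analysis_rows <- analysis(split)
assessment_rows <- assessment(split)

analysis_rows$x    # values of x in rows 1 through 3
assessment_rows$x  # values of x in rows 4 through 6
```

Both extractor functions return ordinary data, so anything you would do with df itself can be done with either group.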

There is no dedicated rsample function that draws a single bootstrap resample, that is, a sample of rows taken with replacement (i.e., a function analogous to modelr::resample_bootstrap), but we can recreate this functionality as follows:

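resample_bootstrap <- function(df) {
    n <- nrow(df)
    # Draw n row indices with replacement: these form the analysis group
    drawn <- sample.int(n, n, replace = TRUE)
    # Rows that were never drawn form the assessment group
    left_out <- setdiff(seq_len(n), drawn)
    make_splits(list(analysis = drawn, assessment = left_out), df)
}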

as_tibble(resample_bootstrap(df))

There is a built-in rsample function which takes multiple bootstrap resamples, analogous to modelr::bootstrap. It is the function bootstraps (note the extra s in the name).

resamples <- bootstraps(df, times = 100)
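
The object returned by bootstraps is a tibble with one row per resample: a list-column named splits that holds the rsplit objects, and an id column that labels them. As a rough sketch of how you might work with it, the following computes the mean of x in each resample. The name boot_means and the choice of the mean as the statistic are illustrative assumptions, not something taken from the reading.

```r
library(tibble)
library(rsample)

set.seed(1)  # make the resampling reproducible
df <- tibble(x = 1:6)
resamples <- bootstraps(df, times = 100)

# One mean per resample, computed on each split's analysis group
boot_means <- vapply(
    resamples$splits,
    function(s) mean(analysis(s)$x),
    numeric(1)
)

length(boot_means)                      # one value per resample
quantile(boot_means, c(0.025, 0.975))   # rough percentile interval
```

Base R's vapply is used here only to keep the example free of extra dependencies; purrr::map_dbl would do the same job if you already have the tidyverse loaded.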

With this understanding, you should now be able to go through the bootstrapping reading using rsample in place of modelr.