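crossv_kfold splits the data into k exclusive partitions and uses each partition in turn as the test set of a test-training pair, training on the remaining k - 1 partitions. crossv_mc generates n random partitions, each holding out a proportion test of the data for testing and using the rest for training.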

crossv_mc(data, n, test = 0.2, id = ".id")

crossv_kfold(data, k = 5, id = ".id")

Arguments

data

A data frame

n

Number of test-training pairs to generate (an integer).

test

Proportion of observations that should be held out for testing (a double).

id

Name of variable that gives each model a unique integer id.

k

Number of folds (an integer).

Value

A data frame with n rows (crossv_mc) or k rows (crossv_kfold) and columns test and train, plus the id column named by id. test and train are list-columns containing resample() objects.
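
To make the list-columns less abstract, here is a minimal sketch of pulling one resample object out of the result and turning it back into ordinary R objects. It assumes the as.integer() and as.data.frame() methods that modelr provides for resample objects; the variable names (cv, first_test, idx, held_out) are illustrative only.

```r
library(modelr)

cv <- crossv_kfold(mtcars, k = 5)

# Each cell of the list-columns is a resample object: a reference to the
# original data plus the integer row indices it selects.
first_test <- cv$test[[1]]

# Row indices into mtcars for the first held-out fold.
idx <- as.integer(first_test)

# Materialise those rows as a regular data frame.
held_out <- as.data.frame(first_test)
```

Because a resample object stores only indices, keeping many splits in one data frame is cheap; the rows are copied only when a split is converted with as.data.frame().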

Examples

library(modelr)  # provides crossv_kfold(), crossv_mc() and rmse()

cv1 <- crossv_kfold(mtcars, 5)
cv1
#> # A tibble: 5 x 3
#>   train          test           .id
#>   <list>         <list>         <chr>
#> 1 <S3: resample> <S3: resample> 1
#> 2 <S3: resample> <S3: resample> 2
#> 3 <S3: resample> <S3: resample> 3
#> 4 <S3: resample> <S3: resample> 4
#> 5 <S3: resample> <S3: resample> 5

library(purrr)
# Fit one model per training set, then score each on its matching test set.
cv2 <- crossv_mc(mtcars, 100)
models <- map(cv2$train, ~ lm(mpg ~ wt, data = .))
errs <- map2_dbl(models, cv2$test, rmse)
hist(errs)
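
The difference between the two schemes described at the top of this page can also be checked directly. The following is a rough sketch, not part of the original examples: it assumes as.integer() returns the row indices of a resample object, and the names kf, mc, kf_test_rows and mc_test_rows are made up for illustration.

```r
library(modelr)

n_rows <- nrow(mtcars)

# k-fold: the k test sets are disjoint and together cover every row once.
kf <- crossv_kfold(mtcars, k = 5)
kf_test_rows <- unlist(lapply(kf$test, as.integer), use.names = FALSE)
covers_all_rows <- length(kf_test_rows) == n_rows &&
  all(sort(kf_test_rows) == seq_len(n_rows))

# Monte Carlo: each split is drawn independently, so a given row can fall
# in the test set of several splits, or of none.
mc <- crossv_mc(mtcars, n = 10, test = 0.2)
mc_test_rows <- unlist(lapply(mc$test, as.integer), use.names = FALSE)
times_held_out <- table(mc_test_rows)
```

Here covers_all_rows should be TRUE for any k-fold split, while times_held_out varies from run to run because the Monte Carlo partitions are random.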