This function is based on caret's train function, which fits models (in our case, different smoothing algorithms) to data across a range of parameter values (in our case, different smoothness parameters).
Usage
train_smooth_data(
  ...,
  x = NULL,
  y = NULL,
  sm_method,
  preProcess = NULL,
  weights = NULL,
  metric = ifelse(is.factor(y), "Accuracy", "RMSE"),
  maximize = ifelse(metric %in% c("RMSE", "logLoss", "MAE", "logLoss"), FALSE, TRUE),
  trControl = caret::trainControl(method = "cv"),
  tuneGrid = NULL,
  tuneLength = ifelse(trControl$method == "none", 1, 3),
  return_trainobject = FALSE
)
Arguments
- ...
Arguments passed to smooth_data. These arguments cannot overlap with any of those to be tuned.
- x
A vector of predictor values to smooth along (e.g. time).
- y
A vector of response values to be smoothed (e.g. density).
- sm_method
Argument specifying which smoothing method should be used. Options include "moving-average", "moving-median", "loess", "gam", and "smooth.spline".
- preProcess
A string vector that defines a pre-processing of the predictor data. The default is no pre-processing. See train for more details.
- weights
A numeric vector of case weights. This argument currently does not affect any train_smooth_data models.
- metric
A string that specifies what summary metric will be used to select the optimal model. By default, possible values are "RMSE" and "Rsquared" for regression. See train for more details.
- maximize
A logical: should the metric be maximized or minimized?
- trControl
A list of values that define how this function acts. See train and trainControl for more details.
- tuneGrid
A data frame with possible tuning values, or a named list containing vectors with possible tuning values. If a data frame, the columns should be named the same as the tuning parameters. If a list, the elements of the list should be named the same as the tuning parameters. If a list, expand.grid will be used to make all possible combinations of tuning parameter values.
- tuneLength
An integer denoting the amount of granularity in the tuning parameter grid. By default, this argument is the number of levels for each tuning parameter that should be generated. If trControl has the option search = "random", this is the maximum number of tuning parameter combinations that will be generated by the random search. (NOTE: If given, this argument must be named.)
- return_trainobject
A logical indicating whether the entire result of train should be returned, or only the results element.
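The list form of tuneGrid described above can be illustrated with base R alone. This is only a sketch of how expand.grid turns a named list into a data frame of combinations; the names param_a and param_b are hypothetical placeholders, not actual tuning parameters of any smoothing method.

```r
# Hypothetical tuning parameters, named only for illustration
tune_list <- list(param_a = c(1, 3, 5), param_b = c(0.1, 0.5))

# expand.grid() accepts a named list and returns one row per combination
tune_df <- expand.grid(tune_list)

nrow(tune_df)   # 3 * 2 = 6 combinations
names(tune_df)  # "param_a" "param_b"
```

Supplying tune_df directly as a data frame should therefore be equivalent to supplying tune_list, since the column names match the list names.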
Value
If return_trainobject = FALSE (the default), a data frame with the values of all tuning parameter combinations and the training error rate for each combination (i.e. the results element of the output of train).
If return_trainobject = TRUE, the output of train.
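A minimal sketch of working with the default return value, assuming the gcplyr and caret packages are installed. The data are synthetic, and tuning window_width_n for the "moving-average" method is an assumption made for illustration; check the smooth_data documentation for the arguments each method accepts.

```r
library(gcplyr)  # assumed to provide train_smooth_data()

set.seed(1)
# Synthetic logistic growth curve with noise (illustrative data only)
x <- 1:50
y <- 1 / (1 + exp(-(x - 25) / 4)) + rnorm(50, sd = 0.02)

# Assumption: window_width_n is a smooth_data argument that can be tuned
res <- train_smooth_data(
  x = x, y = y,
  sm_method = "moving-average",
  tuneGrid = list(window_width_n = c(1, 3, 5, 9))
)

# With the default metric, pick the row with the lowest cross-validated RMSE
best <- res[which.min(res$RMSE), ]
```

Because the default metric for a numeric y is "RMSE", the row in best should correspond to the tuning value that train would select.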
Details
See train for more information.
The default method is k-fold cross-validation (trControl = caret::trainControl(method = "cv")).
For less variable, but more computationally costly, cross-validation, users may choose to increase the number of folds. This can be done by altering the number argument in trainControl, or by setting method = "LOOCV" for leave-one-out cross-validation, where the number of folds is equal to the number of data points.
For less variable, but more computationally costly, cross-validation, users may alternatively choose method = "repeatedcv" for repeated k-fold cross-validation.
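The resampling options described above might be set up as follows. This is a sketch assuming the caret package is installed; the specific fold and repeat counts are arbitrary values chosen for illustration.

```r
library(caret)  # provides trainControl()

# k-fold cross-validation with more folds (20 is an arbitrary choice)
ctrl_more_folds <- trainControl(method = "cv", number = 20)

# Leave-one-out cross-validation: one fold per data point
ctrl_loocv <- trainControl(method = "LOOCV")

# Repeated k-fold cross-validation: 10 folds, repeated 5 times
ctrl_repeated <- trainControl(method = "repeatedcv", number = 10, repeats = 5)
```

Any of these objects can then be passed to train_smooth_data through the trControl argument.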
For more control, advanced users may wish to call train directly, using makemethod_train_smooth_data to specify the method argument.