Adding progress bar or percentage to tune function in R

Tuning hyperparameters is a critical step in optimizing machine learning models. In R, functions such as train from the caret package and tune from e1071 are widely used for this purpose. However, tuning can be time-consuming, especially with large datasets or complex models. Adding a progress bar or percentage completion indicator is highly beneficial because it provides feedback while the tuning runs. This article explores how to implement progress indicators when tuning models in the R Programming Language.

Why Add a Progress Indicator to Tuning in R

When tuning hyperparameters, the model is trained and validated across a grid of parameter values to find the optimal set. This involves repetitive computations, and without progress feedback, users might be left wondering about the process status. A progress bar or percentage indicator helps in the following:

  • Monitoring Progress: Knowing how much of the task is completed.
  • Estimating Completion Time: Gauging how much longer the process will take.
  • Improving User Experience: Providing real-time feedback to users.
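The examples below use the progress package, so it helps to see its basic API in isolation first. This minimal sketch (assuming the progress package is installed) drives a bar through ten steps of simulated work:

```r
library(progress)

# A bar with 10 ticks; :bar, :percent and :eta are built-in format tokens
pb <- progress_bar$new(
  format = "  working [:bar] :percent eta: :eta",
  total = 10
)

for (i in 1:10) {
  Sys.sleep(0.1)  # stand-in for one unit of real work
  pb$tick()       # advance the bar by one step
}
```

Each call to pb$tick() advances the bar by one step; when the tick count reaches total, the bar is finished. This is the mechanism we will hook into the tuning workflow below.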

Using caret Package with Progress Bar

The caret package provides functions for model training and tuning. A progress bar can be integrated into the train function, which performs the hyperparameter search, using the progress package. Here is a step-by-step implementation of adding a progress bar while tuning hyperparameters for an SVM model with the caret package.

Step 1: Load required packages and data

First, install the packages if needed, then load them along with the iris data set.

R
library(caret)     # train() and trainControl()
library(e1071)     # used by caret for some utilities
library(progress)  # progress_bar

# Sample data
data(iris)
set.seed(123)  # reproducible resampling

Step 2: Define training control with repeated cross-validation

Now we define the training control with repeated cross-validation and the grid of hyperparameter values to search.

R
# Define training control with repeated cross-validation
train_control <- trainControl(method = "repeatedcv", number = 10, repeats = 3, 
                              verboseIter = FALSE)

# Define grid of hyperparameters (caret's "svmRadial" method uses
# kernlab's sigma and C parameterization)
tune_grid <- expand.grid(
  sigma = c(0.01, 0.05, 0.1),
  C = c(1, 2, 5)
)
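expand.grid builds every combination of the supplied values, so the number of candidate models is the product of the vectors' lengths; this is also the figure the progress bar's total is based on in the next step. A quick check:

```r
tune_grid <- expand.grid(
  sigma = c(0.01, 0.05, 0.1),
  C = c(1, 2, 5)
)
nrow(tune_grid)  # 3 sigma values x 3 C values = 9 combinations
```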

Step 3: Initialize progress bar and Custom training function

Now we initialize the progress bar and define a custom training function that updates it as training proceeds.

R
# Initialize progress bar
total <- nrow(tune_grid) * 10 * 3  # Total iterations = grid size * CV folds * repeats
pb <- progress_bar$new(
  format = "  Tuning [:bar] :percent eta: :eta",
  total = total, clear = FALSE, width = 60
)

# Custom training function with progress updates. A single pb$tick()
# placed before train() would advance the bar only once, so instead we
# hook the tick into caret's summary function, which caret evaluates
# once per resample and parameter combination. caret may call it a few
# extra times (e.g. an initial check), hence the pb$finished guard.
train_with_progress <- function(..., trControl) {
  trControl$summaryFunction <- function(data, lev = NULL, model = NULL) {
    if (!pb$finished) pb$tick()
    defaultSummary(data, lev, model)
  }
  train(..., trControl = trControl)
}
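If a graphical bar is more than you need, caret also ships a built-in textual indicator: setting verboseIter = TRUE in trainControl makes train print one line per resample as training proceeds (e.g. "+ Fold01.Rep1: sigma=0.01, C=1"). A minimal sketch:

```r
library(caret)

# verboseIter = TRUE makes train() log each resample as it starts,
# which is often enough feedback for interactive use
ctrl_verbose <- trainControl(method = "repeatedcv", number = 10,
                             repeats = 3, verboseIter = TRUE)
```

Pass ctrl_verbose as the trControl argument of train to enable the logging.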

Step 4: Train model with progress updates

Now we train the model; the progress bar advances while train runs.

R
# Train model with progress updates
model <- train_with_progress(
  Species ~ .,
  data = iris,
  method = "svmRadial",
  trControl = train_control,
  tuneGrid = tune_grid
)

print(model)

Output:

Support Vector Machines with Radial Basis Function Kernel 

150 samples
  4 predictor
  3 classes: 'setosa', 'versicolor', 'virginica' 

No pre-processing
Resampling: Cross-Validated (10 fold, repeated 3 times) 
Summary of sample sizes: 135, 135, 135, 135, 135, 135, ... 
Resampling results across tuning parameters:

  sigma  C  Accuracy   Kappa    
  0.01   1  0.9022222  0.8533333
  0.01   2  0.9555556  0.9333333
  0.01   5  0.9600000  0.9400000
  0.05   1  0.9600000  0.9400000
  0.05   2  0.9666667  0.9500000
  0.05   5  0.9733333  0.9600000
  0.10   1  0.9600000  0.9400000
  0.10   2  0.9711111  0.9566667
  0.10   5  0.9622222  0.9433333

Accuracy was used to select the optimal model using the largest value.
The final values used for the model were sigma = 0.05 and C = 5.

The output describes the results of tuning a Support Vector Machine (SVM) model with a Radial Basis Function (RBF) kernel on the iris dataset using the caret package in R.

  • Cross-Validated (10 fold, repeated 3 times): The data was split into 10 folds, and the cross-validation process was repeated 3 times to ensure robust performance estimates.
  • Tuning Results: The model was tuned across different values of the hyperparameters sigma (the kernel parameter) and C (the regularization parameter).
  • Accuracy and Kappa: For each combination of sigma and C, the model’s accuracy and Kappa statistics (a measure of agreement) were reported. The best model was selected based on the highest accuracy.
  • Final values: The optimal hyperparameters found were sigma = 0.05 and C = 5.
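The same idea also works without caret: loop over the grid yourself, fit one model per row with e1071::svm (note that e1071 parameterizes the RBF kernel with gamma and cost rather than kernlab's sigma and C), and advance a base-R txtProgressBar after each fit. A hedged sketch:

```r
library(e1071)

data(iris)
set.seed(123)

# e1071's RBF parameterization: gamma and cost
grid <- expand.grid(gamma = c(0.01, 0.05, 0.1), cost = c(1, 2, 5))

# Base-R progress bar: one tick per grid row
pb <- txtProgressBar(min = 0, max = nrow(grid), style = 3)

acc <- numeric(nrow(grid))
for (i in seq_len(nrow(grid))) {
  fit <- svm(Species ~ ., data = iris, kernel = "radial",
             gamma = grid$gamma[i], cost = grid$cost[i],
             cross = 10)            # 10-fold CV inside e1071::svm
  acc[i] <- fit$tot.accuracy       # overall CV accuracy (percent)
  setTxtProgressBar(pb, i)
}
close(pb)

grid[which.max(acc), ]  # best gamma/cost combination
```

Because the loop is explicit, the bar tracks exactly one tick per parameter combination, at the cost of writing the grid search by hand.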

Conclusion

Adding a progress bar or percentage completion indicator to the tuning process in R enhances user experience and provides valuable feedback during long-running computations. Using packages like caret and e1071 along with progress, you can easily integrate progress indicators into your model tuning workflow.



