Adding progress bar or percentage to tune function in R

Tuning hyperparameters is a critical step in optimizing machine learning models. In R, functions such as train from the caret package and tune from e1071 are widely used for this purpose. However, tuning can be time-consuming, especially with large datasets or complex models. Adding a progress bar or percentage completion indicator is highly beneficial because it provides feedback while the tuning runs. This article explores how to implement progress indicators when tuning models in the R Programming Language.

Why Add a Progress Indicator to Tuning in R

When tuning hyperparameters, the model is trained and validated across a grid of parameter values to find the optimal set. This involves repetitive computations, and without progress feedback, users might be left wondering about the process status. A progress bar or percentage indicator helps in the following:

  • Monitoring Progress: Knowing how much of the task is completed.
  • Estimating Completion Time: Gauging how much longer the process will take.
  • Improving User Experience: Providing real-time feedback to users.
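The examples below use the progress package, so it helps to see its basic API in isolation first. This minimal sketch (assuming the progress package is installed) drives a bar through ten steps of simulated work:

```r
library(progress)

# A bar with 10 ticks; :bar, :percent and :eta are built-in format tokens
pb <- progress_bar$new(
  format = "  working [:bar] :percent eta: :eta",
  total = 10
)

for (i in 1:10) {
  Sys.sleep(0.1)  # stand-in for one unit of real work
  pb$tick()       # advance the bar by one step
}
```

Each call to pb$tick() advances the bar by one step; when the tick count reaches total, the bar is finished. This is the mechanism we will hook into the tuning workflow below.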

Using caret Package with Progress Bar

The caret package provides functions for model training and tuning. A progress bar can be integrated into the train function, which performs the hyperparameter search, using the progress package. Here is a step-by-step implementation of adding a progress bar while tuning hyperparameters for an SVM model with the caret package.

Step 1: Load required packages and data

First, install the packages if needed, then load them along with the iris data set.

R
library(caret)     # train() and trainControl()
library(e1071)     # used by caret for some utilities
library(progress)  # progress_bar

# Sample data
data(iris)
set.seed(123)  # reproducible resampling

Step 2: Define training control with repeated cross-validation

Now we define the training control with repeated cross-validation and the grid of hyperparameter values to search.

R
# Define training control with repeated cross-validation
train_control <- trainControl(method = "repeatedcv", number = 10, repeats = 3, 
                              verboseIter = FALSE)

# Define grid of hyperparameters (caret's "svmRadial" method uses
# kernlab's sigma and C parameterization)
tune_grid <- expand.grid(
  sigma = c(0.01, 0.05, 0.1),
  C = c(1, 2, 5)
)
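expand.grid builds every combination of the supplied values, so the number of candidate models is the product of the vectors' lengths; this is also the figure the progress bar's total is based on in the next step. A quick check:

```r
tune_grid <- expand.grid(
  sigma = c(0.01, 0.05, 0.1),
  C = c(1, 2, 5)
)
nrow(tune_grid)  # 3 sigma values x 3 C values = 9 combinations
```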

Step 3: Initialize progress bar and Custom training function

Now we initialize the progress bar and define a custom training function that updates it as training proceeds.

R
# Initialize progress bar
total <- nrow(tune_grid) * 10 * 3  # Total iterations = grid size * CV folds * repeats
pb <- progress_bar$new(
  format = "  Tuning [:bar] :percent eta: :eta",
  total = total, clear = FALSE, width = 60
)

# Custom training function with progress updates. A single pb$tick()
# placed before train() would advance the bar only once, so instead we
# hook the tick into caret's summary function, which caret evaluates
# once per resample and parameter combination. caret may call it a few
# extra times (e.g. an initial check), hence the pb$finished guard.
train_with_progress <- function(..., trControl) {
  trControl$summaryFunction <- function(data, lev = NULL, model = NULL) {
    if (!pb$finished) pb$tick()
    defaultSummary(data, lev, model)
  }
  train(..., trControl = trControl)
}
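If a graphical bar is more than you need, caret also ships a built-in textual indicator: setting verboseIter = TRUE in trainControl makes train print one line per resample as training proceeds (e.g. "+ Fold01.Rep1: sigma=0.01, C=1"). A minimal sketch:

```r
library(caret)

# verboseIter = TRUE makes train() log each resample as it starts,
# which is often enough feedback for interactive use
ctrl_verbose <- trainControl(method = "repeatedcv", number = 10,
                             repeats = 3, verboseIter = TRUE)
```

Pass ctrl_verbose as the trControl argument of train to enable the logging.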

Step 4: Train model with progress updates

Now we train the model; the progress bar advances while train runs.

R
# Train model with progress updates
model <- train_with_progress(
  Species ~ .,
  data = iris,
  method = "svmRadial",
  trControl = train_control,
  tuneGrid = tune_grid
)

print(model)

Output:

Support Vector Machines with Radial Basis Function Kernel 

150 samples
  4 predictor
  3 classes: 'setosa', 'versicolor', 'virginica' 

No pre-processing
Resampling: Cross-Validated (10 fold, repeated 3 times) 
Summary of sample sizes: 135, 135, 135, 135, 135, 135, ... 
Resampling results across tuning parameters:

  sigma  C  Accuracy   Kappa    
  0.01   1  0.9022222  0.8533333
  0.01   2  0.9555556  0.9333333
  0.01   5  0.9600000  0.9400000
  0.05   1  0.9600000  0.9400000
  0.05   2  0.9666667  0.9500000
  0.05   5  0.9733333  0.9600000
  0.10   1  0.9600000  0.9400000
  0.10   2  0.9711111  0.9566667
  0.10   5  0.9622222  0.9433333

Accuracy was used to select the optimal model using the largest value.
The final values used for the model were sigma = 0.05 and C = 5.

The output describes the results of tuning a Support Vector Machine (SVM) model with a Radial Basis Function (RBF) kernel on the iris dataset using the caret package in R.

  • Cross-Validated (10 fold, repeated 3 times): The data was split into 10 folds, and the cross-validation process was repeated 3 times to ensure robust performance estimates.
  • Tuning Results: The model was tuned across different values of the hyperparameters sigma (the kernel parameter) and C (the regularization parameter).
  • Accuracy and Kappa: For each combination of sigma and C, the model’s accuracy and Kappa statistics (a measure of agreement) were reported. The best model was selected based on the highest accuracy.
  • Final values: The optimal hyperparameters found were sigma = 0.05 and C = 5.
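The same idea also works without caret: loop over the grid yourself, fit one model per row with e1071::svm (note that e1071 parameterizes the RBF kernel with gamma and cost rather than kernlab's sigma and C), and advance a base-R txtProgressBar after each fit. A hedged sketch:

```r
library(e1071)

data(iris)
set.seed(123)

# e1071's RBF parameterization: gamma and cost
grid <- expand.grid(gamma = c(0.01, 0.05, 0.1), cost = c(1, 2, 5))

# Base-R progress bar: one tick per grid row
pb <- txtProgressBar(min = 0, max = nrow(grid), style = 3)

acc <- numeric(nrow(grid))
for (i in seq_len(nrow(grid))) {
  fit <- svm(Species ~ ., data = iris, kernel = "radial",
             gamma = grid$gamma[i], cost = grid$cost[i],
             cross = 10)            # 10-fold CV inside e1071::svm
  acc[i] <- fit$tot.accuracy       # overall CV accuracy (percent)
  setTxtProgressBar(pb, i)
}
close(pb)

grid[which.max(acc), ]  # best gamma/cost combination
```

Because the loop is explicit, the bar tracks exactly one tick per parameter combination, at the cost of writing the grid search by hand.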

Conclusion

Adding a progress bar or percentage completion indicator to the tuning process in R enhances user experience and provides valuable feedback during long-running computations. Using packages like caret and e1071 along with progress, you can easily integrate progress indicators into your model tuning workflow.



