In this article we explore what hyperparameter optimization is and how Bayesian Optimization can be used to tune the hyperparameters of machine learning models for better prediction accuracy. Before diving into the implementation, let us define hyperparameters and hyperparameter optimization.

Hyperparameters

Machine/deep learning models consist of two types of parameters: model parameters, which are learned from the data during training, and hyperparameters, which are external configuration variables that we set to control the training process. Examples of hyperparameters include the number of nodes and layers in a neural network, the learning rate, and the number of epochs. They have a major impact on the accuracy and efficiency of the trained model, so they need to be chosen carefully to get the best results. This leads us to the topic of hyperparameter optimization.

Hyperparameter Optimization

Hyperparameter optimization, or tuning, is the process of selecting optimal values for a machine learning model's hyperparameters. Its job is to find the tuple of hyperparameters that minimizes a predefined loss function (or, equivalently, maximizes accuracy) on given data. Common techniques for tuning hyperparameters include:

- Grid Search: exhaustively evaluates every combination in a predefined grid of hyperparameter values.
- Random Search: samples hyperparameter combinations at random from the search space.
- Bayesian Optimization: uses the results of previous evaluations to decide which combination to try next.
We are now going to dive deeper into what Bayesian Optimization is and how it can be used with machine learning models.

Bayesian Optimization

Bayesian Optimization is an automated optimization technique that finds optimal hyperparameters by treating the search as an optimization problem. It aims to maximize an objective function f(x), and it is particularly beneficial when f is computationally expensive to evaluate and is treated as a "black box" whose internal structure is unknown. A key feature of Bayesian Optimization is that it takes all previous evaluations into account when selecting the next set of hyperparameters. It does this through a probabilistic model that estimates the probability of an objective function's result given a set of hyperparameters:

P(score | hyperparameters)

This model is called a "surrogate" for the objective function and is written P(y | x). The Bayesian Optimization loop involves several steps:

1. Build a surrogate probability model of the objective function.
2. Find the hyperparameters that look best according to the surrogate, using an acquisition function.
3. Evaluate those hyperparameters on the true objective function.
4. Update the surrogate with the new result.
5. Repeat steps 2-4 until an iteration or time budget is exhausted.
The surrogate model begins with a prior distribution over f(x), representing initial beliefs about the objective before any data is observed. As more evaluations are conducted, the surrogate learns from the data, updating its beliefs according to Bayes' rule to form a posterior distribution. Sampling points in the search space is guided by acquisition functions, which balance exploitation and exploration: exploitation means sampling where the surrogate predicts a high objective value, while exploration means sampling at locations with high uncertainty. Popular acquisition functions include Maximum Probability of Improvement (MPI), Expected Improvement (EI), and Upper Confidence Bound (UCB); a minimal sketch of EI appears after the installation note below. Bayesian Optimization is efficient because it selects the next set of hyperparameters intelligently, reducing the number of calls made to the expensive objective function. Gaussian processes, Random Forest regression, and Tree-structured Parzen Estimators (TPE) are commonly used as surrogate models because of their effectiveness.

Hyperparameter Optimization Based on Bayesian Optimization

In this section we are going to learn how to use the BayesSearchCV class provided by the scikit-optimize library to improve the results of a Support Vector Classifier on the breast cancer dataset. Install scikit-optimize using the following command:

pip install scikit-optimize
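To make the acquisition-function idea concrete, here is a minimal sketch of Expected Improvement for a maximization problem. The function name and the xi exploration margin are illustrative choices, not part of scikit-optimize's API; in practice BayesSearchCV handles this step internally.

```python
import numpy as np
from scipy.stats import norm

def expected_improvement(mu, sigma, best_y, xi=0.01):
    """Expected Improvement at candidate points (maximization).

    mu, sigma : arrays holding the surrogate's posterior mean and std
    best_y    : best objective value observed so far
    xi        : small margin that encourages exploration (assumed default)
    """
    mu, sigma = np.asarray(mu), np.asarray(sigma)
    improvement = mu - best_y - xi
    with np.errstate(divide="ignore", invalid="ignore"):
        z = improvement / sigma
        ei = improvement * norm.cdf(z) + sigma * norm.pdf(z)
    ei[sigma == 0.0] = 0.0  # zero uncertainty means no expected improvement
    return ei
```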
Import Packages

We import numpy, pandas, train_test_split, and the breast cancer dataset (the popular Wisconsin breast cancer dataset) from the sklearn library.
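The original listing was not preserved, so the following is a sketch of the imports the steps below rely on:

```python
import time

import numpy as np
import pandas as pd
from sklearn.datasets import load_breast_cancer
from sklearn.metrics import accuracy_score, f1_score, recall_score
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
```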
Load the Dataset and Extract the Train/Test Split

Sometimes the dual coefficients or intercepts of an SVM are not finite, which can cause training to run for an indefinite amount of time. To address this, preprocessing of the data is necessary: here we scale the features so that they all have a similar range.
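A sketch of this step; the 80/20 split and random_state=42 are assumptions, as the original listing was lost:

```python
# Load the Wisconsin breast cancer dataset (binary classification)
data = load_breast_cancer()
X, y = data.data, data.target

# Hold out 20% of the samples for testing (assumed split ratio)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Standardize features; fit the scaler on the training split only
# so that no information leaks from the test set
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)
```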
Training a Machine Learning Model
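A sketch of the baseline training step; timing the fit and the exact metrics printed are assumptions based on the surrounding text:

```python
# Train a baseline SVC with the RBF kernel and default hyperparameters
start = time.time()
model = SVC(kernel="rbf")
model.fit(X_train, y_train)
elapsed = time.time() - start

train_pred = model.predict(X_train)
test_pred = model.predict(X_test)

print("Train Accuracy", accuracy_score(y_train, train_pred))
print("F1 score:", f1_score(y_test, test_pred))
print("Recall:", recall_score(y_test, test_pred))
print("Execution time: %.3fs" % elapsed)
```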
Output:

Train Accuracy 0.9912087912087912

Here we have fit the SVC model with the "rbf" kernel, obtaining a training accuracy of about 99.1% (shown above), and printed other performance metrics such as execution time, f1_score, and recall. We observe that there is still some scope for improvement.

Define the Hyperparameter Search Space

Next we specify the hyperparameters we want to optimize for the SVM. Common hyperparameters include the choice of kernel (linear, polynomial, radial basis function, etc.), the regularization parameter C, and the kernel coefficient gamma.
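A sketch of the search space using scikit-optimize's dimension classes; the exact bounds are assumptions, chosen to be consistent with the best parameters reported later:

```python
from skopt.space import Categorical, Integer, Real

# Search space for the SVC hyperparameters (bounds are assumed)
search_space = {
    "C": Real(1e-6, 1e6, prior="log-uniform"),      # regularization strength
    "gamma": Real(1e-6, 1e1, prior="log-uniform"),  # kernel coefficient
    "degree": Integer(1, 8),                        # polynomial kernel degree
    "kernel": Categorical(["linear", "poly", "rbf"]),
}
```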
Bayesian Optimization

Initialize Bayesian Optimization

We now define the Bayesian optimization process: the estimator whose score acts as the objective function, the search space, and the other necessary parameters.
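A sketch of initializing BayesSearchCV; n_iter, cv, and random_state are assumed values:

```python
from skopt import BayesSearchCV

# BayesSearchCV implements the surrogate-model loop described earlier:
# each iteration proposes one hyperparameter combination and scores it
# with cross-validation on the training data
opt = BayesSearchCV(
    estimator=SVC(),
    search_spaces=search_space,
    n_iter=32,        # number of combinations to evaluate (assumed)
    cv=5,             # 5-fold cross-validation (assumed)
    random_state=42,
)
```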
Run Bayesian Optimization
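A sketch of running the search and inspecting the results:

```python
# fit() runs the full optimization loop over the search space
opt.fit(X_train, y_train)

print("val. score:", opt.best_score_)
print("test score:", opt.score(X_test, y_test))
print("best params:", opt.best_params_)
```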
Output:

val. score: 0.9780411293133496

Here we fit the Bayesian optimization model on the train split and compare its best cross-validation score with the accuracy of the baseline model. The best set of hyperparameters happens to be: [('C', 0.3317383202555499), ('degree', 8), ('gamma', 2.8889304722800495), ('kernel', 'linear')].

Implementing SVM with the Best Hyperparameters
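A sketch of retraining the SVC with the best hyperparameters reported above:

```python
# Retrain the classifier with the best hyperparameters found by the search
best_model = SVC(
    C=0.3317383202555499,
    degree=8,
    gamma=2.8889304722800495,
    kernel="linear",
)
best_model.fit(X_train, y_train)

train_pred = best_model.predict(X_train)
print("Train Accuracy with best parameters:",
      accuracy_score(y_train, train_pred))
```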
Output:

Train Accuracy with best parameters: 0.9868131868131869