Random Forest is a powerful and versatile machine-learning algorithm capable of performing both classification and regression tasks. It operates by constructing a multitude of decision trees during training and outputting the mode of the classes (for classification) or the mean prediction (for regression) of the individual trees. In this article, we focus on using Random Forest for binary classification and on handling unknown classes in R.

What is Binary Classification?

Binary classification is a classification task whose output is one of two possible classes. It is commonly used in applications such as spam detection, disease diagnosis (predicting whether a patient has a certain disease), and sentiment analysis (positive or negative sentiment).

Setting Up Random Forest for Binary Classification in R

The following steps set up a Random Forest model for binary classification in the R programming language.

Step 1: Install and Load Necessary Libraries

First, ensure that the necessary libraries are installed and loaded in R. The primary libraries needed are randomForest and caret.
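A minimal sketch of this step is shown here. It installs each package only when it is missing; the CRAN mirror URL is an assumption and can be replaced with any mirror you prefer.

```r
# Packages named in the article.
pkgs <- c("randomForest", "caret")

# Install a package only if it is not already available.
# The mirror URL below is an assumed default, not a requirement.
for (p in pkgs) {
  if (!requireNamespace(p, quietly = TRUE)) {
    install.packages(p, repos = "https://cloud.r-project.org")
  }
}

# Attach both packages to the current session.
library(randomForest)
library(caret)
```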
Step 2: Load and Prepare the Data

For illustration, we will use the famous Iris dataset. Although it is a multi-class dataset, we adapt it for binary classification by relabeling every observation as either setosa or non-setosa.
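One way to prepare the data is sketched here. The name `irisTrain` is taken from the model call echoed in the article's output; `trainIndex`, `irisTest`, the seed, and the 80/20 split ratio are assumptions made for illustration.

```r
library(caret)  # provides createDataPartition()

data(iris)

# Collapse the three species into two classes: "setosa" and "non-setosa".
# factor() orders the levels alphabetically: "non-setosa", "setosa".
iris$Species <- factor(ifelse(iris$Species == "setosa", "setosa", "non-setosa"))

# Assumed seed, used only to make the split reproducible.
set.seed(123)

# Stratified split: about 80% of each class goes to the training set.
trainIndex <- createDataPartition(iris$Species, p = 0.8, list = FALSE)
irisTrain <- iris[trainIndex, ]
irisTest  <- iris[-trainIndex, ]

# Class counts in each partition.
table(irisTrain$Species)
table(irisTest$Species)
```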
Step 3: Train the Random Forest Model

Fit a Random Forest model to the training data.
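A sketch of the fitting step follows. The `randomForest()` call uses the same arguments that are echoed in the printed model summary (`Species ~ .`, `importance = TRUE`, `ntree = 500`); the data preparation lines, the seed, and the name `rf_model` are assumptions added so the block runs on its own.

```r
library(randomForest)
library(caret)

# Assumed preparation: binary labels and a stratified 80/20 split.
data(iris)
iris$Species <- factor(ifelse(iris$Species == "setosa", "setosa", "non-setosa"))
set.seed(123)
trainIndex <- createDataPartition(iris$Species, p = 0.8, list = FALSE)
irisTrain <- iris[trainIndex, ]

# Fit 500 trees and record variable importance measures.
rf_model <- randomForest(Species ~ ., data = irisTrain,
                         importance = TRUE, ntree = 500)

# Print the call, the out-of-bag error estimate, and the OOB confusion matrix.
print(rf_model)
```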
Output:

    Call:
     randomForest(formula = Species ~ ., data = irisTrain, importance = TRUE, ntree = 500)
                   Type of random forest: classification
                         Number of trees: 500
    No. of variables tried at each split: 2

            OOB estimate of  error rate: 0%
    Confusion matrix:
               non-setosa setosa class.error
    non-setosa         80      0           0
    setosa              0     40           0

Step 4: Predict and Evaluate the Model

Use the model to make predictions on the test set and evaluate its performance.
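The evaluation can be sketched as follows, assuming the same preparation as in the earlier steps (the seed and the names `irisTest`, `rf_model`, `predictions`, and `cm` are illustrative). Because this sketch has only two classes, its printed layout will differ from the output that follows, which already contains a third `unknown` level.

```r
library(randomForest)
library(caret)

# Assumed preparation: binary labels, stratified split, fitted model.
data(iris)
iris$Species <- factor(ifelse(iris$Species == "setosa", "setosa", "non-setosa"))
set.seed(123)
trainIndex <- createDataPartition(iris$Species, p = 0.8, list = FALSE)
irisTrain <- iris[trainIndex, ]
irisTest  <- iris[-trainIndex, ]
rf_model <- randomForest(Species ~ ., data = irisTrain,
                         importance = TRUE, ntree = 500)

# Predict a class label for every held-out row.
predictions <- predict(rf_model, newdata = irisTest)

# Compare predictions with the true labels; "setosa" is treated as the positive class.
cm <- confusionMatrix(predictions, irisTest$Species, positive = "setosa")
print(cm)
```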
Output:

    Confusion Matrix and Statistics

                Reference
    Prediction   setosa non-setosa unknown
      setosa          5          0       0
      non-setosa      0         15       0
      unknown         0          0       0

    Overall Statistics

                   Accuracy : 1
                     95% CI : (0.8316, 1)
        No Information Rate : 0.75
        P-Value [Acc > NIR] : 0.003171

                      Kappa : 1

     Mcnemar's Test P-Value : NA

    Statistics by Class:

                         Class: setosa Class: non-setosa Class: unknown
    Sensitivity                   1.00              1.00             NA
    Specificity                   1.00              1.00              1
    Pos Pred Value                1.00              1.00             NA
    Neg Pred Value                1.00              1.00             NA
    Prevalence                    0.25              0.75              0
    Detection Rate                0.25              0.75              0
    Detection Prevalence          0.25              0.75              0
    Balanced Accuracy             1.00              1.00             NA

Handling Unknown Classes

In real-world data, some observations have unknown or missing class labels, and additional steps are needed to manage and predict them. Here is one approach.

Step 5: Handle Unknown Classes in the Dataset

Suppose some observations in the dataset have their species labeled as “unknown”. These rows need to be handled before the model is trained and evaluated.
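One possible treatment is sketched here: rows labeled "unknown" are set aside, and the model is trained only on rows with a known label. The unknown labels are simulated, since Iris has none; the number of relabeled rows (10), the seed, and the names `unknown_idx`, `irisKnown`, and `irisUnknown` are all hypothetical choices for illustration.

```r
library(caret)

data(iris)
labels <- ifelse(iris$Species == "setosa", "setosa", "non-setosa")

# Simulate missing labels: relabel 10 randomly chosen rows as "unknown".
set.seed(123)
unknown_idx <- sample(nrow(iris), 10)
labels[unknown_idx] <- "unknown"
iris$Species <- factor(labels, levels = c("setosa", "non-setosa", "unknown"))

# Rows without a usable label cannot supervise training, so keep them apart.
irisKnown   <- iris[iris$Species != "unknown", ]
irisUnknown <- iris[iris$Species == "unknown", ]

# Drop the now-empty "unknown" level; randomForest() refuses empty classes.
irisKnown$Species <- droplevels(irisKnown$Species)

# Split only the labeled rows into training and test sets.
trainIndex <- createDataPartition(irisKnown$Species, p = 0.8, list = FALSE)
irisTrain <- irisKnown[trainIndex, ]
irisTest  <- irisKnown[-trainIndex, ]

table(iris$Species)
```

The rows in `irisUnknown` remain available for later scoring once a model has been fitted on `irisTrain`.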
Step 6: Modify Predictions for Unknown Classes

If a class is unknown, we may decide to label it separately or use some heuristic to handle it. Here we’ll predict normally and then check if we need to reclassify unknowns.
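As one example of such a heuristic, the sketch below predicts normally and then relabels low-confidence predictions as "unknown". The 0.7 vote-fraction threshold, the seed, and all variable names are assumptions, not values from the article, and the resulting counts need not match the output that follows.

```r
library(randomForest)
library(caret)

# Assumed preparation: binary labels, stratified split, fitted model.
data(iris)
iris$Species <- factor(ifelse(iris$Species == "setosa", "setosa", "non-setosa"))
set.seed(123)
trainIndex <- createDataPartition(iris$Species, p = 0.8, list = FALSE)
irisTrain <- iris[trainIndex, ]
irisTest  <- iris[-trainIndex, ]
rf_model <- randomForest(Species ~ ., data = irisTrain, ntree = 500)

# Predict normally, then fetch the fraction of trees voting for each class.
pred_class <- as.character(predict(rf_model, newdata = irisTest))
pred_prob  <- predict(rf_model, newdata = irisTest, type = "prob")

# Hypothetical rule: if the winning class has under 70% of the votes,
# treat the observation as "unknown" instead.
threshold <- 0.7
max_prob <- apply(pred_prob, 1, max)
pred_class[max_prob < threshold] <- "unknown"

# Give predictions and reference the same three levels before comparing them.
all_levels  <- c("setosa", "non-setosa", "unknown")
predictions <- factor(pred_class, levels = all_levels)
reference   <- factor(as.character(irisTest$Species), levels = all_levels)

confusionMatrix(predictions, reference)
```

A higher threshold rejects more predictions as "unknown"; a lower one accepts more of the model's original labels.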
Output:

    Confusion Matrix and Statistics

                Reference
    Prediction   setosa non-setosa unknown
      setosa          5          0       0
      non-setosa      0         15       0
      unknown         0          0       0

    Overall Statistics

                   Accuracy : 1
                     95% CI : (0.8316, 1)
        No Information Rate : 0.75
        P-Value [Acc > NIR] : 0.003171

                      Kappa : 1

     Mcnemar's Test P-Value : NA

    Statistics by Class:

                         Class: setosa Class: non-setosa Class: unknown
    Sensitivity                   1.00              1.00             NA
    Specificity                   1.00              1.00              1
    Pos Pred Value                1.00              1.00             NA
    Neg Pred Value                1.00              1.00             NA
    Prevalence                    0.25              0.75              0
    Detection Rate                0.25              0.75              0
    Detection Prevalence          0.25              0.75              0
    Balanced Accuracy             1.00              1.00             NA

Conclusion

Random Forest is a robust and flexible algorithm for binary classification in R. Handling unknown classes requires additional steps, such as implementing a reject option, adding an “unknown” class, or using anomaly detection. By following the steps outlined above, you can effectively build and deploy a Random Forest model for binary classification while also managing unknown classes.
Referred: https://www.geeksforgeeks.org