Classification Project: Differentiating Between Rocks and Mines

The objective of this project is to classify sonar data to differentiate between rocks and mines using machine learning techniques. Sonar data, collected through sound waves, is processed to detect underwater objects. Machine learning models can analyze this data to predict whether an object is a rock or a mine.

Dataset Description
The dataset used in this project is the Sonar dataset: Dataset Link – SonarData.
General Approach
Let’s implement this project stepwise and classify between rocks and mines.

Step 1: Exploratory Data Analysis (EDA)
Understand the structure and characteristics of the dataset.

Import Libraries and Dataset
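A minimal sketch of this step, assuming the data has been downloaded as a headerless CSV file named sonar.all-data.csv (the filename and path are assumptions; adjust them to wherever you saved the file):

```python
# Sketch: load the Sonar dataset into a pandas DataFrame.
# "sonar.all-data.csv" is an assumed filename; the file has no header row,
# so the 60 feature columns and the label column are numbered 0-60.
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

df = pd.read_csv("sonar.all-data.csv", header=None)
```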
Check the Preview of Data
Display the first few rows of the dataset to understand its structure.
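One way to preview the data, assuming the DataFrame from the previous step is named df:

```python
# Show the first five rows; column 60 holds the class label ('R' or 'M').
print("First few rows of the dataset:")
print(df.head())
```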
Output: First few rows of the dataset:
0 1 2 3 4 5 6 7 8 ... 52 53 54 55 56 57 58 59 60
0 0.0200 0.0371 0.0428 0.0207 0.0954 0.0986 0.1539 0.1601 0.3109 ... 0.0065 0.0159 0.0072 0.0167 0.0180 0.0084 0.0090 0.0032 R
1 0.0453 0.0523 0.0843 0.0689 0.1183 0.2583 0.2156 0.3481 0.3337 ... 0.0089 0.0048 0.0094 0.0191 0.0140 0.0049 0.0052 0.0044 R
2 0.0262 0.0582 0.1099 0.1083 0.0974 0.2280 0.2431 0.3771 0.5598 ... 0.0166 0.0095 0.0180 0.0244 0.0316 0.0164 0.0095 0.0078 R
3 0.0100 0.0171 0.0623 0.0205 0.0205 0.0368 0.1098 0.1276 0.0598 ... 0.0036 0.0150 0.0085 0.0073 0.0050 0.0044 0.0040 0.0117 R
4 0.0762 0.0666 0.0481 0.0394 0.0590 0.0649 0.1209 0.2467 0.3564 ... 0.0054 0.0105 0.0110 0.0015 0.0072 0.0048 0.0107 0.0094 R
[5 rows x 61 columns]

Check the Dataset Shape
Check the number of rows and columns to get an overview of the dataset’s size.
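A one-line sketch for this check:

```python
# Print (number of samples, number of columns): 60 features + 1 label column.
print("Shape of the dataset:", df.shape)
```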
Output: Shape of the dataset: (208, 61)

View the Statistical Summary
Use the describe() method to view count, mean, standard deviation, and quartile statistics for each feature.
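A sketch using pandas’ describe():

```python
# Summary statistics (count, mean, std, min, quartiles, max) per feature.
print(df.describe())
```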
Output: 0 1 2 3 4 5 ... 54 55 56 57 58
59
count 208.000000 208.000000 208.000000 208.000000 208.000000 208.000000 ... 208.000000 208.000000 208.000000 208.000000 208.000000 208.000000
mean 0.029164 0.038437 0.043832 0.053892 0.075202 0.104570 ... 0.009290 0.008222 0.007820 0.007949 0.007941 0.006507
std 0.022991 0.032960 0.038428 0.046528 0.055552 0.059105 ... 0.007088 0.005736 0.005785 0.006470 0.006181 0.005031
min 0.001500 0.000600 0.001500 0.005800 0.006700 0.010200 ... 0.000600 0.000400 0.000300 0.000300 0.000100 0.000600
25% 0.013350 0.016450 0.018950 0.024375 0.038050 0.067025 ... 0.004150 0.004400 0.003700 0.003600 0.003675 0.003100
50% 0.022800 0.030800 0.034300 0.044050 0.062500 0.092150 ... 0.007500 0.006850 0.005950 0.005800 0.006400 0.005300
75% 0.035550 0.047950 0.057950 0.064500 0.100275 0.134125 ... 0.012100 0.010575 0.010425 0.010350 0.010325 0.008525
max 0.137100 0.233900 0.305900 0.426400 0.401000 0.382300 ... 0.044700 0.039400 0.035500 0.044000 0.036400 0.043900 Step 2: Data PreparationPrepare the dataset for machine learning models. Separate Features and Target The last column is the target variable (rock or mine), and the rest are features.
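A sketch of the feature/target split, assuming the columns carry the default integer labels from the headerless CSV:

```python
# Columns 0-59 are the sonar readings (features); column 60 is the label.
X = df.drop(columns=60)
y = df[60]
```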
Encode Labels
Convert categorical labels (‘R’ for rock and ‘M’ for mine) into numerical format.
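One way to encode the labels is scikit-learn’s LabelEncoder (an assumption; mapping the strings manually would work just as well):

```python
from sklearn.preprocessing import LabelEncoder

# LabelEncoder assigns integers in alphabetical order: 'M' -> 0, 'R' -> 1.
encoder = LabelEncoder()
y = encoder.fit_transform(y)
```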
Split Data
Divide the dataset into training and testing sets to evaluate model performance.
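A sketch of the split. The confusion matrices shown later contain 42 test samples, which matches a 20% test split of the 208 rows; the random seed used originally is unknown, so the one below is an assumption:

```python
from sklearn.model_selection import train_test_split

# Hold out 20% of the data for testing (208 * 0.2 ≈ 42 samples).
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)
```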
Step 3: Model Development
Train and evaluate machine learning models to classify sonar data.
Train kNN Model
Fit kNN models with different neighbor values and record accuracy.
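A sketch of the search over neighbor values (the range 1–20 is an assumption):

```python
from sklearn.neighbors import KNeighborsClassifier

# Fit a kNN model for each candidate k and record its test accuracy.
neighbors = range(1, 21)
knn_scores = []
for k in neighbors:
    knn = KNeighborsClassifier(n_neighbors=k)
    knn.fit(X_train, y_train)
    knn_scores.append(knn.score(X_test, y_test))
```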
Plot Results
Visualize accuracy for different neighbor values to select the best k.
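A possible plot of the recorded accuracies:

```python
# Accuracy vs. number of neighbors; pick the k with the highest accuracy.
plt.plot(list(neighbors), knn_scores, marker="o")
plt.xlabel("Number of neighbors (k)")
plt.ylabel("Test accuracy")
plt.title("kNN accuracy vs. number of neighbors")
plt.show()
```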
Output: [Figure: plot of kNN accuracy for different neighbor values]

Final kNN Model
Train the kNN model with the optimal number of neighbors and make predictions.
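A sketch that refits kNN with the best-scoring k from the search above:

```python
# Select the k with the highest recorded accuracy and refit the model.
best_k = neighbors[int(np.argmax(knn_scores))]
knn = KNeighborsClassifier(n_neighbors=best_k)
knn.fit(X_train, y_train)
knn_pred = knn.predict(X_test)
```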
Logistic Regression
Fit a logistic regression model to the training data.
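A sketch of the logistic regression fit (max_iter is raised only to ensure convergence; the original hyperparameters are unknown):

```python
from sklearn.linear_model import LogisticRegression

log_reg = LogisticRegression(max_iter=1000)
log_reg.fit(X_train, y_train)
log_pred = log_reg.predict(X_test)
```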
Principal Component Analysis (PCA)
Reduce the feature dimensions using PCA and fit a logistic regression model to the reduced features.
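A sketch of the PCA pipeline; the number of components (10) is an assumption, since the article does not state it:

```python
from sklearn.decomposition import PCA

# Project the 60 features onto 10 principal components (assumed value),
# then fit logistic regression on the reduced representation.
pca = PCA(n_components=10)
X_train_pca = pca.fit_transform(X_train)
X_test_pca = pca.transform(X_test)

pca_log_reg = LogisticRegression(max_iter=1000)
pca_log_reg.fit(X_train_pca, y_train)
pca_pred = pca_log_reg.predict(X_test_pca)
```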
Support Vector Machines (SVM)
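Fit a support vector classifier to the same training data. A sketch with scikit-learn’s SVC and its default RBF kernel (the kernel and hyperparameters are assumptions):

```python
from sklearn.svm import SVC

svm = SVC(kernel="rbf")
svm.fit(X_train, y_train)
svm_pred = svm.predict(X_test)
```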
Step 4: Model Evaluation
Check the performance of the trained models.

Evaluate kNN
Compute the accuracy of the kNN model on the test set.
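A sketch using accuracy_score (the import below also covers the confusion matrices computed in the rest of this step):

```python
from sklearn.metrics import accuracy_score, confusion_matrix

print("kNN Accuracy:", accuracy_score(y_test, knn_pred))
```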
Output: kNN Accuracy: 0.8809523809523809

Confusion Matrix
Display the confusion matrix to understand prediction results.
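A sketch of the confusion matrix for the kNN predictions:

```python
print("kNN Confusion Matrix:")
print(confusion_matrix(y_test, knn_pred))
```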
Output: kNN Confusion Matrix:
[[25 1]
[ 4 12]]

Evaluate Logistic Regression
Compute the accuracy of the logistic regression model.
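A sketch, reusing accuracy_score from above:

```python
print("Logistic Regression Accuracy:", accuracy_score(y_test, log_pred))
```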
Output: Logistic Regression Accuracy: 0.7857142857142857

Confusion Matrix
Show the confusion matrix for logistic regression results.
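And its confusion matrix, following the same pattern:

```python
print("Logistic Regression Confusion Matrix:")
print(confusion_matrix(y_test, log_pred))
```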
Output: Logistic Regression Confusion Matrix:
[[19 7]
[ 2 14]]

Evaluate the PCA-based Model
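A sketch that evaluates the logistic regression model trained on the PCA-reduced features:

```python
print("PCA + Logistic Regression Accuracy:", accuracy_score(y_test, pca_pred))
print("PCA + Logistic Regression Confusion Matrix:")
print(confusion_matrix(y_test, pca_pred))
```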
Output: PCA + Logistic Regression Accuracy: 0.7619047619047619
PCA + Logistic Regression Confusion Matrix:
[[18 8]
[ 2 14]]

Evaluate the SVM Model
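A sketch that evaluates the SVM model the same way:

```python
print("SVM Accuracy:", accuracy_score(y_test, svm_pred))
print("SVM Confusion Matrix:")
print(confusion_matrix(y_test, svm_pred))
```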
Output: SVM Accuracy: 0.8571428571428571
SVM Confusion Matrix:
[[22 4]
[ 2 14]]

Conclusion
In this project, the k-Nearest Neighbors (kNN) algorithm demonstrated better performance than Logistic Regression in classifying sonar data into rocks and mines. The kNN model achieved the highest test accuracy among the evaluated models (about 88%), making it the most suitable choice for this task.