Credit card fraud is a major issue that affects both consumers and financial institutions. With the rise of online transactions, detecting fraudulent activity has become increasingly challenging. Here, we explore the process of detecting credit card fraud using machine learning techniques. We use a real-world dataset to understand the data, apply a fraud detection algorithm, and discuss the importance of security and prevention measures.

Credit Card Fraud Detection in R

Credit card fraud detection involves identifying unusual patterns in transaction data that deviate from normal behavior. This is typically approached using machine learning algorithms that classify transactions as fraudulent or legitimate. The main challenges include:
Below, we provide a step-by-step guide to implementing credit card fraud detection in the R programming language using a popular dataset and a Random Forest classifier.

Step 1: Load Libraries and Data

First, install and load the necessary libraries. We’ll use the Credit Card Fraud dataset.

Dataset Link: Credit Card Fraud
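The loading step can be sketched as follows. The original code for this step is not available, so this is a minimal sketch under assumptions: the package list (`randomForest`, `caret`), the helper name `load_transactions`, and the file path `creditcard.csv` are all hypothetical choices, not taken from the article.

```r
# Packages assumed for this walkthrough; install once if missing, e.g.
# install.packages(c("randomForest", "caret"))
required <- c("randomForest", "caret")
for (pkg in required) {
  if (requireNamespace(pkg, quietly = TRUE)) {
    library(pkg, character.only = TRUE)
  } else {
    message("Package not installed: ", pkg)
  }
}

# Hypothetical helper: read the transactions CSV into a data frame.
load_transactions <- function(path) {
  if (!file.exists(path)) {
    stop("File not found: ", path)
  }
  read.csv(path)
}

# Hypothetical local path; point this at your downloaded copy of the dataset.
csv_path <- "creditcard.csv"
if (file.exists(csv_path)) {
  data <- load_transactions(csv_path)
  print(head(data))  # first six rows of the data frame
}
```

The `file.exists()` guard only keeps the snippet from failing when the CSV has not been downloaded yet; with the file in place, `head(data)` prints the first six rows.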
Output:
Time V1 V2 V3 V4 V5 V6 V7
1 0 -1.3598071 -0.07278117 2.5363467 1.3781552 -0.33832077 0.46238778 0.23959855
2 0 1.1918571 0.26615071 0.1664801 0.4481541 0.06001765 -0.08236081 -0.07880298
3 1 -1.3583541 -1.34016307 1.7732093 0.3797796 -0.50319813 1.80049938 0.79146096
4 1 -0.9662717 -0.18522601 1.7929933 -0.8632913 -0.01030888 1.24720317 0.23760894
5 2 -1.1582331 0.87773675 1.5487178 0.4030339 -0.40719338 0.09592146 0.59294075
6 2 -0.4259659 0.96052304 1.1411093 -0.1682521 0.42098688 -0.02972755 0.47620095
V8 V9 V10 V11 V12 V13 V14
1 0.09869790 0.3637870 0.09079417 -0.5515995 -0.61780086 -0.9913898 -0.3111694
2 0.08510165 -0.2554251 -0.16697441 1.6127267 1.06523531 0.4890950 -0.1437723
3 0.24767579 -1.5146543 0.20764287 0.6245015 0.06608369 0.7172927 -0.1659459
4 0.37743587 -1.3870241 -0.05495192 -0.2264873 0.17822823 0.5077569 -0.2879237
5 -0.27053268 0.8177393 0.75307443 -0.8228429 0.53819555 1.3458516 -1.1196698
6 0.26031433 -0.5686714 -0.37140720 1.3412620 0.35989384 -0.3580907 -0.1371337
V15 V16 V17 V18 V19 V20 V21
1 1.4681770 -0.4704005 0.20797124 0.02579058 0.40399296 0.25141210 -0.018306778
2 0.6355581 0.4639170 -0.11480466 -0.18336127 -0.14578304 -0.06908314 -0.225775248
3 2.3458649 -2.8900832 1.10996938 -0.12135931 -2.26185710 0.52497973 0.247998153
4 -0.6314181 -1.0596472 -0.68409279 1.96577500 -1.23262197 -0.20803778 -0.108300452
5 0.1751211 -0.4514492 -0.23703324 -0.03819479 0.80348692 0.40854236 -0.009430697
6 0.5176168 0.4017259 -0.05813282 0.06865315 -0.03319379 0.08496767 -0.208253515
V22 V23 V24 V25 V26 V27 V28
1 0.277837576 -0.11047391 0.06692807 0.1285394 -0.1891148 0.133558377 -0.02105305
2 -0.638671953 0.10128802 -0.33984648 0.1671704 0.1258945 -0.008983099 0.01472417
3 0.771679402 0.90941226 -0.68928096 -0.3276418 -0.1390966 -0.055352794 -0.05975184
4 0.005273597 -0.19032052 -1.17557533 0.6473760 -0.2219288 0.062722849 0.06145763
5 0.798278495 -0.13745808 0.14126698 -0.2060096 0.5022922 0.219422230 0.21515315
6 -0.559824796 -0.02639767 -0.37142658 -0.2327938 0.1059148 0.253844225 0.08108026
Amount Class
1 149.62 0
2 2.69 0
3 378.66 0
4 123.50 0
5 69.99 0
6 3.67 0

Step 2: Data Preprocessing

Preprocess the data by scaling features where needed and handling missing values, if any. The split into training and testing sets follows in Step 4.
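A minimal sketch of these checks, assuming the data frame has the columns `Time`, `V1` to `V28`, `Amount`, and `Class`. So that the snippet runs on its own, it builds a small synthetic stand-in for `data`; the random values and the optional scaling of `Amount` are assumptions for illustration, not the article's original code.

```r
# Synthetic stand-in with the same column layout as the real dataset
# (assumption: replace with the data frame loaded from the CSV).
set.seed(1)
n <- 200
data <- data.frame(
  Time = seq_len(n) - 1,
  matrix(rnorm(n * 28), ncol = 28, dimnames = list(NULL, paste0("V", 1:28))),
  Amount = round(rexp(n, rate = 1 / 88), 2),
  Class = rep(c(0, 1), times = c(n - 4, 4))
)

# Total number of missing values across all columns
missing_total <- sum(is.na(data))
print(missing_total)

# Per-column summary statistics (min, quartiles, mean, max)
print(summary(data))

# Optional (assumption): standardise Amount to mean 0 and unit variance,
# kept in a separate vector so the original column stays untouched.
amount_scaled <- as.numeric(scale(data$Amount))
```

Tree-based models such as Random Forest do not strictly require scaled inputs, so the scaling line is shown only as one way to carry out the scaling mentioned in this step.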
Output:
[1] 0
Time V1 V2 V3
Min. : 0 Min. :-56.40751 Min. :-72.71573 Min. :-48.3256
1st Qu.: 54202 1st Qu.: -0.92037 1st Qu.: -0.59855 1st Qu.: -0.8904
Median : 84692 Median : 0.01811 Median : 0.06549 Median : 0.1799
Mean : 94814 Mean : 0.00000 Mean : 0.00000 Mean : 0.0000
3rd Qu.:139321 3rd Qu.: 1.31564 3rd Qu.: 0.80372 3rd Qu.: 1.0272
Max. :172792 Max. : 2.45493 Max. : 22.05773 Max. : 9.3826
V4 V5 V6 V7
Min. :-5.68317 Min. :-113.74331 Min. :-26.1605 Min. :-43.5572
1st Qu.:-0.84864 1st Qu.: -0.69160 1st Qu.: -0.7683 1st Qu.: -0.5541
Median :-0.01985 Median : -0.05434 Median : -0.2742 Median : 0.0401
Mean : 0.00000 Mean : 0.00000 Mean : 0.0000 Mean : 0.0000
3rd Qu.: 0.74334 3rd Qu.: 0.61193 3rd Qu.: 0.3986 3rd Qu.: 0.5704
Max. :16.87534 Max. : 34.80167 Max. : 73.3016 Max. :120.5895
V8 V9 V10 V11
Min. :-73.21672 Min. :-13.43407 Min. :-24.58826 Min. :-4.79747
1st Qu.: -0.20863 1st Qu.: -0.64310 1st Qu.: -0.53543 1st Qu.:-0.76249
Median : 0.02236 Median : -0.05143 Median : -0.09292 Median :-0.03276
Mean : 0.00000 Mean : 0.00000 Mean : 0.00000 Mean : 0.00000
3rd Qu.: 0.32735 3rd Qu.: 0.59714 3rd Qu.: 0.45392 3rd Qu.: 0.73959
Max. : 20.00721 Max. : 15.59500 Max. : 23.74514 Max. :12.01891
V12 V13 V14 V15
Min. :-18.6837 Min. :-5.79188 Min. :-19.2143 Min. :-4.49894
1st Qu.: -0.4056 1st Qu.:-0.64854 1st Qu.: -0.4256 1st Qu.:-0.58288
Median : 0.1400 Median :-0.01357 Median : 0.0506 Median : 0.04807
Mean : 0.0000 Mean : 0.00000 Mean : 0.0000 Mean : 0.00000
3rd Qu.: 0.6182 3rd Qu.: 0.66251 3rd Qu.: 0.4931 3rd Qu.: 0.64882
Max. : 7.8484 Max. : 7.12688 Max. : 10.5268 Max. : 8.87774
V16 V17 V18 V19
Min. :-14.12985 Min. :-25.16280 Min. :-9.498746 Min. :-7.213527
1st Qu.: -0.46804 1st Qu.: -0.48375 1st Qu.:-0.498850 1st Qu.:-0.456299
Median : 0.06641 Median : -0.06568 Median :-0.003636 Median : 0.003735
Mean : 0.00000 Mean : 0.00000 Mean : 0.000000 Mean : 0.000000
3rd Qu.: 0.52330 3rd Qu.: 0.39968 3rd Qu.: 0.500807 3rd Qu.: 0.458949
Max. : 17.31511 Max. : 9.25353 Max. : 5.041069 Max. : 5.591971
V20 V21 V22 V23
Min. :-54.49772 Min. :-34.83038 Min. :-10.933144 Min. :-44.80774
1st Qu.: -0.21172 1st Qu.: -0.22839 1st Qu.: -0.542350 1st Qu.: -0.16185
Median : -0.06248 Median : -0.02945 Median : 0.006782 Median : -0.01119
Mean : 0.00000 Mean : 0.00000 Mean : 0.000000 Mean : 0.00000
3rd Qu.: 0.13304 3rd Qu.: 0.18638 3rd Qu.: 0.528554 3rd Qu.: 0.14764
Max. : 39.42090 Max. : 27.20284 Max. : 10.503090 Max. : 22.52841
V24 V25 V26 V27
Min. :-2.83663 Min. :-10.29540 Min. :-2.60455 Min. :-22.565679
1st Qu.:-0.35459 1st Qu.: -0.31715 1st Qu.:-0.32698 1st Qu.: -0.070840
Median : 0.04098 Median : 0.01659 Median :-0.05214 Median : 0.001342
Mean : 0.00000 Mean : 0.00000 Mean : 0.00000 Mean : 0.000000
3rd Qu.: 0.43953 3rd Qu.: 0.35072 3rd Qu.: 0.24095 3rd Qu.: 0.091045
Max. : 4.58455 Max. : 7.51959 Max. : 3.51735 Max. : 31.612198
V28 Amount Class
Min. :-15.43008 Min. : 0.00 Min. :0.000000
1st Qu.: -0.05296 1st Qu.: 5.60 1st Qu.:0.000000
Median : 0.01124 Median : 22.00 Median :0.000000
Mean : 0.00000 Mean : 88.35 Mean :0.001728
3rd Qu.: 0.07828 3rd Qu.: 77.17 3rd Qu.:0.000000
Max. : 33.84781 Max. :25691.16 Max. :1.000000

Step 3: Data Preprocessing

Convert the Class column to a factor, then select the first 1,000 rows to work with a smaller dataset.
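The conversion and subsetting might look like the sketch below. It uses base R and a synthetic stand-in for `data` so that it runs on its own; the name `subset_data` and the synthetic values are assumptions.

```r
# Synthetic stand-in for the full dataset (assumption: use the real data frame instead).
set.seed(1)
n <- 1500
data <- data.frame(
  Time = seq_len(n) - 1,
  matrix(rnorm(n * 28), ncol = 28, dimnames = list(NULL, paste0("V", 1:28))),
  Amount = round(rexp(n, rate = 1 / 88), 2),
  Class = as.integer(seq_len(n) %% 250 == 0)  # a handful of synthetic "fraud" rows
)

# Treat Class as a categorical label so the model performs classification,
# and fix both levels explicitly so they survive subsetting.
data$Class <- factor(data$Class, levels = c(0, 1))

# Keep only the first 1000 rows to work with a smaller dataset.
subset_data <- data[1:1000, ]

str(subset_data$Class)
```

Setting `levels = c(0, 1)` explicitly keeps both class labels defined even if a later subset happens to contain only one of them.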
Step 4: Split Data into Training and Test Sets

Split the subset into a training set (700 rows) and a test set (300 rows).
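One way to produce a 700/300 split is a random sample of row indices, sketched below. How the original article split the rows is not known, so the random sampling, the seed value, and the use of `table()` to count classes are assumptions; the synthetic `subset_data` only makes the snippet self-contained.

```r
# Synthetic stand-in for the 1000-row subset (assumption: use the real subset instead).
set.seed(1)
n <- 1000
subset_data <- data.frame(
  Time = seq_len(n) - 1,
  matrix(rnorm(n * 28), ncol = 28, dimnames = list(NULL, paste0("V", 1:28))),
  Amount = round(rexp(n, rate = 1 / 88), 2),
  Class = factor(as.integer(seq_len(n) %% 100 == 0), levels = c(0, 1))
)

# Randomly pick 700 row indices for training; the remaining 300 form the test set.
set.seed(123)  # assumed seed, only for reproducibility
train_idx <- sample(seq_len(nrow(subset_data)), size = 700)
train <- subset_data[train_idx, ]
test  <- subset_data[-train_idx, ]

# Count how many rows of each class ended up in each set.
print(table(train$Class))
print(table(test$Class))
```

With so few fraud cases, a plain random split can leave one set without any of them, which is worth checking in the class counts before training.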
Output:
Rows: 2
Columns: 2
$ Class <fct> 0, 1
$ count <int> 698, 2
Rows: 1
Columns: 2
$ Class <fct> 0
$ count <int> 300

Step 5: Build and Evaluate the Model

Train a Random Forest classifier on the training set, then evaluate its performance on the test data using a confusion matrix and related metrics.
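A sketch of training and evaluation, assuming the `randomForest` and `caret` packages are installed. The data here is synthetic (with an artificial shift in `V1` for the "fraud" rows so there is something to learn), and the `ntree` value and seed are assumptions rather than the article's settings.

```r
library(randomForest)
library(caret)

# Synthetic stand-in for the 1000-row subset (assumption: use the real subset instead).
set.seed(42)
n <- 1000
is_fraud <- seq_len(n) %% 50 == 0
V <- matrix(rnorm(n * 28), ncol = 28, dimnames = list(NULL, paste0("V", 1:28)))
V[is_fraud, 1] <- V[is_fraud, 1] - 6  # synthetic signal only, not a property of the real data
subset_data <- data.frame(
  Time = seq_len(n) - 1,
  V,
  Amount = round(rexp(n, rate = 1 / 88), 2),
  Class = factor(as.integer(is_fraud), levels = c(0, 1))
)

# 700/300 random split
train_idx <- sample(seq_len(n), size = 700)
train <- subset_data[train_idx, ]
test  <- subset_data[-train_idx, ]

# Fit a Random Forest that predicts Class from all other columns.
rf_model <- randomForest(Class ~ ., data = train, ntree = 100)

# Predict labels for the held-out rows and compare them with the true labels.
predictions <- predict(rf_model, newdata = test)
cm <- confusionMatrix(predictions, test$Class)
print(cm)
```

By default `confusionMatrix()` treats the first factor level as the positive class, which here is `0`; passing `positive = "1"` would report sensitivity and specificity with fraud as the positive class instead.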
Output:
Confusion Matrix and Statistics
Reference
Prediction 0 1
0 300 0
1 0 0
Accuracy : 1
95% CI : (0.9878, 1)
No Information Rate : 1
P-Value [Acc > NIR] : 1
Kappa : NaN
Mcnemar's Test P-Value : NA
Sensitivity : 1
Specificity : NA
Pos Pred Value : NA
Neg Pred Value : NA
Prevalence : 1
Detection Rate : 1
Detection Prevalence : 1
Balanced Accuracy : NA
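'Positive' Class : 0

Because the test set contains no fraudulent transactions, the accuracy of 1 reflects only the correct classification of legitimate transactions, and metrics such as specificity and balanced accuracy are undefined (NA).

Step 6: Predict New Input Values

Prepare new input values, preprocess them in the same way as the training data, and make predictions.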
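Prediction on new inputs can be sketched as below, assuming the `randomForest` package is installed. The model is trained on synthetic data so the snippet runs on its own, and the two new rows use made-up illustrative values; the key point is that new inputs need the same predictor columns as the training data.

```r
library(randomForest)

# Synthetic training data (assumption: use the real training set and fitted model instead).
set.seed(42)
n <- 500
is_fraud <- seq_len(n) %% 50 == 0
V <- matrix(rnorm(n * 28), ncol = 28, dimnames = list(NULL, paste0("V", 1:28)))
V[is_fraud, 1] <- V[is_fraud, 1] - 6  # synthetic signal only
train_data <- data.frame(
  Time = seq_len(n) - 1,
  V,
  Amount = round(rexp(n, rate = 1 / 88), 2),
  Class = factor(as.integer(is_fraud), levels = c(0, 1))
)
rf_model <- randomForest(Class ~ ., data = train_data, ntree = 50)

# New inputs must carry every predictor column used in training (all but Class).
predictor_names <- setdiff(names(train_data), "Class")
new_data <- as.data.frame(
  matrix(0, nrow = 2, ncol = length(predictor_names),
         dimnames = list(NULL, predictor_names))
)
# Illustrative values for a few columns; the rest stay at 0.
new_data$Time   <- c(10, 20)
new_data$V1     <- c(-1.36, 1.19)
new_data$Amount <- c(149.62, 2.69)

# Predicted class label for each new row
new_predictions <- predict(rf_model, newdata = new_data)
print(new_predictions)
```

If the training data was transformed, for example by scaling `Amount`, the same transformation with the training set's parameters would need to be applied to `new_data` before calling `predict()`.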
Output:
1 2
0 0
Levels: 0 1
Conclusion

Credit card fraud detection is crucial for keeping financial transactions safe. Using machine learning algorithms and strong security measures, we can spot and stop fraudulent activities. This article walked through the dataset and the techniques used for detecting fraud, and highlighted the importance of security in financial transactions.
Referred: https://www.geeksforgeeks.org