Telecom Customer Churn Analysis in R

Customer churn is a central concern in the telecom industry, because retaining existing customers is as important as acquiring new ones. Telecom Customer Churn Analysis in the R Programming Language involves examining a telecom customer dataset to understand why customers leave and what can be done to retain them.

The objective of Telecom Customer Churn Analysis

Customer churn analysis helps telecom companies identify the factors that influence customer departure. By understanding these factors, companies can implement targeted interventions to retain customers. This has implications not only for the telecom sector but also for broader economic and social ecosystems. Effective churn management can lead to improved customer satisfaction, better resource allocation, and enhanced profitability. Additionally, communities benefit from stable and reliable telecom services.

Dataset Link: Telecom Customer Churn

The dataset contains columns such as customerID, gender, SeniorCitizen, Partner, Dependents, tenure, PhoneService, InternetService, and Churn, along with other customer-related information. The following steps walk through Telecom Customer Churn Analysis in the R Programming Language.

Step 1 : Load Packages and Data

First, load the required packages, read the dataset, and check the first few rows. If a package is not yet installed, install it once with install.packages() before loading it.

R
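# Load the necessary libraries
# (install any that are missing first, e.g. install.packages("tidyverse"))
library(dplyr)
library(tidyverse)
library(caret)
library(ggplot2)

# Load the "Telecom Customer Churn" dataset
# stringsAsFactors = TRUE reads text columns as factors,
# so summary() later reports the count of each category
churn_data <- read.csv("Your//path", stringsAsFactors = TRUE)
head(churn_data)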

Output:

  customerID gender SeniorCitizen Partner Dependents tenure PhoneService    MultipleLines
1 7590-VHVEG Female             0     Yes         No      1           No No phone service
2 5575-GNVDE   Male             0      No         No     34          Yes               No
3 3668-QPYBK   Male             0      No         No      2          Yes               No
4 7795-CFOCW   Male             0      No         No     45           No No phone service
5 9237-HQITU Female             0      No         No      2          Yes               No
6 9305-CDSKC Female             0      No         No      8          Yes              Yes
  InternetService OnlineSecurity OnlineBackup DeviceProtection TechSupport StreamingTV
1             DSL             No          Yes               No          No          No
2             DSL            Yes           No              Yes          No          No
3             DSL            Yes          Yes               No          No          No
4             DSL            Yes           No              Yes         Yes          No
5     Fiber optic             No           No               No          No          No
6     Fiber optic             No           No              Yes          No         Yes
  StreamingMovies       Contract PaperlessBilling             PaymentMethod MonthlyCharges
1              No Month-to-month              Yes          Electronic check          29.85
2              No       One year               No              Mailed check          56.95
3              No Month-to-month              Yes              Mailed check          53.85
4              No       One year               No Bank transfer (automatic)          42.30
5              No Month-to-month              Yes          Electronic check          70.70
6             Yes Month-to-month              Yes          Electronic check          99.65
  TotalCharges Churn
1        29.85    No
2      1889.50    No
3       108.15   Yes
4      1840.75    No
5       151.65   Yes
6       820.50   Yes

The head(churn_data) function in R displays the first six rows of the “churn_data” dataframe. This function is useful for quickly inspecting the structure and contents of the dataframe to understand what kind of data it contains.
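Alongside head(), it can help to check the type of each column, since that depends on how the file was read. The following is a minimal sketch on a small made-up data frame; toy_data and its values are hypothetical stand-ins, not rows from the real dataset.

```r
# Hypothetical three-row stand-in for churn_data (values are illustrative only)
toy_data <- data.frame(
  customerID = c("A-001", "A-002", "A-003"),
  tenure = c(1L, 34L, 2L),
  MonthlyCharges = c(29.85, 56.95, 53.85),
  Churn = c("No", "No", "Yes"),
  stringsAsFactors = TRUE
)

# str() lists every column with its type and a preview of its values
str(toy_data)

# sapply(..., class) gives a compact column-to-type lookup
col_types <- sapply(toy_data, class)
print(col_types)
```

The same two calls can be run on churn_data to confirm that numeric columns such as tenure are numeric and that categorical columns such as Churn are factors.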

Step 2 : Exploratory Data Analysis (EDA)

EDA is a process of describing and summarizing data to bring important aspects into focus for further analysis.

R
# Check missing values in each column
colSums(is.na(churn_data))

# Remove rows with missing values (the 11 rows with no TotalCharges)
churn_data <- na.omit(churn_data)

# Check the dimension of the cleaned data
dim(churn_data)

# Check total missing values
sum(is.na(churn_data))

Output:

      customerID           gender    SeniorCitizen          Partner       Dependents 
               0                0                0                0                0 
          tenure     PhoneService    MultipleLines  InternetService   OnlineSecurity 
               0                0                0                0                0 
    OnlineBackup DeviceProtection      TechSupport      StreamingTV  StreamingMovies 
               0                0                0                0                0 
        Contract PaperlessBilling    PaymentMethod   MonthlyCharges     TotalCharges 
               0                0                0                0               11 
           Churn 
               0 

[1] 7032   21

[1] 0
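The output above shows 11 missing values, all in TotalCharges, and 7032 rows after they are removed. Before dropping rows it can be useful to look at which ones are affected. A small sketch of that check follows; toy_data is a hypothetical data frame invented for illustration, and complete.cases() is used here as one possible way to list the incomplete rows.

```r
# Hypothetical data: two rows lack TotalCharges (values are illustrative only)
toy_data <- data.frame(
  tenure = c(0, 12, 24, 0),
  MonthlyCharges = c(20.0, 50.5, 80.0, 25.0),
  TotalCharges = c(NA, 606.0, 1920.0, NA)
)

# Number of missing values per column
na_per_column <- colSums(is.na(toy_data))
print(na_per_column)

# Rows that na.omit() would drop
incomplete_rows <- toy_data[!complete.cases(toy_data), ]
print(incomplete_rows)

# Drop them and report how many rows were lost
clean_data <- na.omit(toy_data)
rows_dropped <- nrow(toy_data) - nrow(clean_data)
print(rows_dropped)
```

Inspecting the incomplete rows first makes it easier to judge whether dropping them is reasonable or whether the missing values follow a pattern.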

Check the summary of the data

The `summary(churn_data)` function in R provides a concise statistical summary of each column in the `churn_data` dataframe. For numeric columns, it shows the minimum, 1st quartile, median, mean, 3rd quartile, and maximum values. For categorical columns stored as factors, it displays the frequency of each category (character columns would only report their length and type). This helps you quickly understand the distribution and key statistics of your data.

R
summary(churn_data)

Output:

      customerID      gender     SeniorCitizen    Partner    Dependents     tenure     
 0002-ORFBO:   1   Female:3483   Min.   :0.0000   No :3639   No :4933   Min.   : 1.00  
 0003-MKNFE:   1   Male  :3549   1st Qu.:0.0000   Yes:3393   Yes:2099   1st Qu.: 9.00  
 0004-TLHLJ:   1                 Median :0.0000                         Median :29.00  
 0011-IGKFF:   1                 Mean   :0.1624                         Mean   :32.42  
 0013-EXCHZ:   1                 3rd Qu.:0.0000                         3rd Qu.:55.00  
 0013-MHZWF:   1                 Max.   :1.0000                         Max.   :72.00  
 (Other)   :7026                                                                       
 PhoneService          MultipleLines     InternetService             OnlineSecurity
 No : 680     No              :3385   DSL        :2416   No                 :3497  
 Yes:6352     No phone service: 680   Fiber optic:3096   No internet service:1520  
              Yes             :2967   No         :1520   Yes                :2015  
                                                                                   
                                                                                   
                                                                                   
                                                                                   
              OnlineBackup             DeviceProtection              TechSupport  
 No                 :3087   No                 :3094    No                 :3472  
 No internet service:1520   No internet service:1520    No internet service:1520  
 Yes                :2425   Yes                :2418    Yes                :2040  
                                                                                  
                                                                                  
                                                                                  
                                                                                  
              StreamingTV              StreamingMovies           Contract   
 No                 :2809   No                 :2781   Month-to-month:3875  
 No internet service:1520   No internet service:1520   One year      :1472  
 Yes                :2703   Yes                :2731   Two year      :1685  
                                                                            
                                                                            
                                                                            
                                                                            
 PaperlessBilling                   PaymentMethod  MonthlyCharges    TotalCharges   
 No :2864         Bank transfer (automatic):1542   Min.   : 18.25   Min.   :  18.8  
 Yes:4168         Credit card (automatic)  :1521   1st Qu.: 35.59   1st Qu.: 401.4  
                  Electronic check         :2365   Median : 70.35   Median :1397.5  
                  Mailed check             :1604   Mean   : 64.80   Mean   :2283.3  
                                                   3rd Qu.: 89.86   3rd Qu.:3794.7  
                                                   Max.   :118.75   Max.   :8684.8  
                                                                                    
 Churn     
 No :5163  
 Yes:1869  

Step 3 : Data Visualization

Perform data visualization to find some important information from the data.

R
# Count the occurrences of each churn value
churn_counts <- table(churn_data$Churn)

# Convert churn_counts to a dataframe
churn_df <- as.data.frame(churn_counts)
names(churn_df) <- c("Churn", "Count")

# Create the pie chart
ggplot(churn_df, aes(x = "", y = Count, fill = Churn)) +
  geom_bar(stat = "identity", width = 1) +
  coord_polar(theta = "y") +
  geom_text(aes(label = scales::percent(Count / sum(Count))), 
            position = position_stack(vjust = 0.5)) +
  ggtitle("Churn Distribution") +
  theme_void()

Output:

[Figure: Churn Distribution pie chart]

The above code snippet creates a pie chart in R to show the distribution of churn (customer attrition) in the churn_data dataset. It counts how many customers fall into each value of the Churn column ("Yes" or "No"), converts these counts into a dataframe, and then uses ggplot2 to plot them as a pie chart with percentage labels.
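The percentages on the pie chart can also be checked numerically. As a minimal sketch, the counts below are copied from the Churn rows of the summary(churn_data) output shown earlier (No: 5163, Yes: 1869); the variable names are chosen here for illustration.

```r
# Churn counts taken from the summary(churn_data) output
churn_counts <- c(No = 5163, Yes = 1869)

# prop.table() converts counts to proportions that sum to 1
churn_share <- prop.table(churn_counts)

# Express as percentages rounded to one decimal place
churn_pct <- round(100 * churn_share, 1)
print(churn_pct)
```

With these counts, roughly 26.6% of customers churned and 73.4% stayed, so the two classes are noticeably imbalanced.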

Churn Distribution of Contract Status

Here we will visualize the Distribution of Contract Status.

R
# Create the count plot
ggplot(churn_data, aes(x = Churn, fill = Contract)) +
  geom_bar(position = "dodge") +
  labs(title = "Churn Distribution w.r.t Contract Status", x = "Churn") +
  theme_minimal()

Output:

[Figure: Churn Distribution w.r.t Contract Status]

The above code snippet creates a grouped bar plot with ggplot2. Churn status is on the x-axis and the fill color marks the contract type, so the bars show how many customers of each contract type stayed or left.
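Raw counts can be hard to compare when the contract groups differ in size, so a churn rate within each contract type may be easier to read. The sketch below shows one way to compute it with table() and prop.table(); toy_data is a hypothetical eight-row data frame, not the real dataset, so the resulting rates are illustrative only.

```r
# Hypothetical data (values are illustrative only)
toy_data <- data.frame(
  Contract = c("Month-to-month", "Month-to-month", "Month-to-month", "Month-to-month",
               "One year", "One year", "Two year", "Two year"),
  Churn = c("Yes", "Yes", "Yes", "No", "No", "Yes", "No", "No")
)

# Cross-tabulate contract type (rows) against churn status (columns)
contract_table <- table(toy_data$Contract, toy_data$Churn)

# margin = 1 normalizes each row, giving the churn rate per contract type
churn_rate <- prop.table(contract_table, margin = 1)
print(round(churn_rate, 2))
```

The same two calls can be applied to churn_data$Contract and churn_data$Churn to get the rates for the real data.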

Churn Distribution of Tenure

Now we will visualize the Churn Distribution of Tenure.

R
# Create the count plot
ggplot(churn_data, aes(x = tenure, fill = Churn)) +
  geom_bar(position = "dodge", width = 2, colour = "black") +
  labs(title = "Churn Distribution w.r.t Tenure", x = "Months", y = "Count") +
  theme_minimal()

Output:

[Figure: Churn Distribution w.r.t Tenure]

The above code snippet creates a bar plot with ggplot2 that places tenure in months on the x-axis and draws side-by-side bars for customers who churned and those who stayed, so the counts can be compared at each tenure value.
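With one bar pair per month, the tenure plot can get crowded. Grouping tenure into ranges is one way to simplify it. The following sketch uses cut() for the grouping; the break points (12, 24, 48, 72 months), the tenure_group column name, and toy_data are all assumptions made for illustration.

```r
# Hypothetical data (values are illustrative only)
toy_data <- data.frame(
  tenure = c(1, 5, 12, 13, 30, 48, 49, 72),
  Churn = c("Yes", "Yes", "No", "Yes", "No", "No", "No", "No")
)

# cut() assigns each tenure to a range; intervals are closed on the right,
# so a tenure of 12 falls in "0-12" and 13 falls in "13-24"
toy_data$tenure_group <- cut(
  toy_data$tenure,
  breaks = c(0, 12, 24, 48, 72),
  labels = c("0-12", "13-24", "25-48", "49-72")
)

# Count churned and retained customers in each tenure range
group_table <- table(toy_data$tenure_group, toy_data$Churn)
print(group_table)
```

The new tenure_group column could then replace tenure on the x-axis of the bar plot.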

Churn Distribution of Internet Services

Now we will visualize the Churn Distribution of Internet Services.

R
# Create the count plot
ggplot(churn_data, aes(x = InternetService, fill = Churn)) +
  geom_bar(position = "dodge") +
  labs(title = "Churn Distribution w.r.t Internet Services", x = "Internet Service") +
  theme_minimal()

Output:

[Figure: Churn Distribution w.r.t Internet Services]

The above code snippet creates a grouped bar plot with ggplot2 that places the internet service type on the x-axis and draws side-by-side bars for customers who churned and those who stayed.
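The count plots in this step share the same structure, so they could be wrapped in a small helper. The sketch below is one possible way to do that; plot_churn_by is a hypothetical function name introduced here, toy_data is made-up data, and the function assumes the data frame has a column named Churn.

```r
library(ggplot2)

# Hypothetical helper: grouped bar plot of churn counts for one column.
# `var` is the column name as a string; `label` is the axis and title text.
plot_churn_by <- function(data, var, label = var) {
  ggplot(data, aes(x = .data[[var]], fill = .data[["Churn"]])) +
    geom_bar(position = "dodge") +
    labs(title = paste("Churn Distribution w.r.t", label),
         x = label, fill = "Churn") +
    theme_minimal()
}

# Hypothetical data (values are illustrative only)
toy_data <- data.frame(
  InternetService = c("DSL", "DSL", "Fiber optic", "Fiber optic", "No"),
  Churn = c("No", "Yes", "Yes", "Yes", "No")
)

p <- plot_churn_by(toy_data, "InternetService", "Internet Service")
print(p)
```

The `.data[[var]]` form lets the column be passed as a string, so the same helper could be called with other column names such as "Contract" or "PaymentMethod".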

Senior Citizen Status

Identifying the number of senior citizens helps in tailoring services and promotions specifically for this segment. A bar plot can show the distribution of senior citizens versus non-senior citizens.

R
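# Count senior and non-senior customers from the data
# (SeniorCitizen is coded 0/1 in churn_data)
senior_data <- churn_data %>%
  mutate(SeniorCitizen = ifelse(SeniorCitizen == 1, "Yes", "No")) %>%
  count(SeniorCitizen, name = "Count")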

# Create bar plot
ggplot(senior_data, aes(x = SeniorCitizen, y = Count, fill = SeniorCitizen)) +
  geom_bar(stat = "identity") +
  theme_minimal() +
  labs(title = "Senior Citizen Status", x = "Senior Citizen", y = "Count") +
  scale_fill_manual(values = c("No" = "#66B3FF", "Yes" = "#FF9999"))

Output:

[Figure: Senior Citizen Status]

This bar plot displays two bars: one for non-senior citizens and one for senior citizens. The height of the bars indicates the count of customers in each category. The plot uses different colors to distinguish between senior citizens and non-senior citizens, making the comparison straightforward.

Payment Method

Understanding how customers prefer to pay for services can inform billing and payment strategy. A bar plot can visualize the distribution of different payment methods.

R
# Payment method counts, as reported by summary(churn_data)
payment_data <- data.frame(
  PaymentMethod = c("Bank transfer (automatic)", "Credit card (automatic)",
                    "Electronic check", "Mailed check"),
  Count = c(1542, 1521, 2365, 1604)
)

# Create bar plot
ggplot(payment_data, aes(x = PaymentMethod, y = Count, fill = PaymentMethod)) +
  geom_bar(stat = "identity") +
  theme_minimal() +
  labs(title = "Payment Method Distribution", x = "Payment Method", y = "Count") +
  scale_fill_brewer(palette = "Set3")

Output:

[Figure: Payment Method Distribution]

The bar plot represents the number of customers using each payment method. The plot uses different colors for each payment method, enhancing the visual distinction and making it easy to identify the most and least popular payment methods among customers.
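To put numbers on the comparison, the counts can be turned into percentage shares. A minimal sketch, using the four payment method counts from the summary(churn_data) output; the variable names are chosen here for illustration.

```r
# Payment method counts from the summary(churn_data) output
payment_counts <- c(
  "Bank transfer (automatic)" = 1542,
  "Credit card (automatic)" = 1521,
  "Electronic check" = 2365,
  "Mailed check" = 1604
)

# Share of each method as a percentage of all customers
payment_pct <- round(100 * payment_counts / sum(payment_counts), 1)
print(sort(payment_pct, decreasing = TRUE))

# Name of the most common payment method
most_common <- names(which.max(payment_counts))
print(most_common)
```

Electronic check is the most common method at about a third of customers, while the other three methods each account for roughly a fifth.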

Conclusion

By leveraging the insights from the churn analysis, telecom companies can develop targeted strategies to reduce churn, enhance customer satisfaction, and ultimately drive growth. Continuous monitoring and analysis of customer data are essential to adapting to market trends and evolving customer needs, ensuring long-term success in the competitive telecom industry.




Referred: https://www.geeksforgeeks.org

