Election Voter Turnout Visualization in R

Voter turnout is a vital metric for evaluating the health of a democracy. A democracy rests on citizens who have the right to vote and who exercise that right by casting their ballots. Understanding who votes and who doesn’t can reveal significant insights into political engagement, national interests, social inequality, and the effectiveness of a country's electoral system. In this article, we explore and visualize voter turnout data using the R programming language.

Objectives and Goals

The primary objective of this article is to visualize voter turnout data to identify trends and patterns over time and across different states in India. By doing so, we aim to understand the factors influencing voter turnout and to lay the groundwork for informed predictions about future electoral participation. In India's national election, each state is allocated a certain number of parliamentary seats, and the party or coalition that wins a majority of those seats forms the central government. Our dataset contains candidate-level voting details for each state.

Dataset Explanation

The dataset contains information about candidates in Indian national elections, with a focus on votes polled and other key details. The relevant columns are:

  • st_name: The name of the state or union territory where the election took place.
  • year: The year in which the election was held.
  • pc_no: The parliamentary constituency number.
  • pc_name: The name of the parliamentary constituency.
  • pc_type: The type of parliamentary constituency (e.g., General, Reserved).
  • cand_name: The name of the candidate who contested the election.
  • cand_sex: The gender of the candidate.
  • partyname: The full name of the political party to which the candidate belongs.
  • partyabbre: The abbreviation of the political party's name.
  • totvotpoll: The number of votes polled for the candidate (i.e., the votes that candidate received).
  • electors: The total number of registered electors in the constituency.

Dataset Link: Election Voter

Load the Dataset and Preprocess the Data

First, let’s load the data from a CSV file and clean it by removing rows with missing vote or elector counts and standardizing inconsistent party names.

R
# Load necessary libraries
library(ggplot2)
library(dplyr)
library(readr)

# Load the dataset
dataset_path <- "path/to/indianelection.csv"
voter_data <- read_csv(dataset_path)

# Display the structure and first few rows of the dataset
str(voter_data)
head(voter_data)

# Data cleaning: drop rows with missing vote or elector counts and
# standardize a variant spelling of the party name
voter_data_clean <- voter_data %>%
  filter(!is.na(totvotpoll), !is.na(electors)) %>%
  mutate(partyname = ifelse(partyname == "Indian National Congress (I)",
                            "Indian National Congress", partyname))

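# Each row is one candidate, so this value is the candidate's votes as a
# percentage of registered electors, not the overall constituency turnout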
voter_data_clean <- voter_data_clean %>%
  mutate(turnout_percentage = (totvotpoll / electors) * 100)

Output:

                    st_name year pc_no                   pc_name pc_type
1 Andaman & Nicobar Islands 1977     1 Andaman & Nicobar Islands     GEN
2 Andaman & Nicobar Islands 1977     1 Andaman & Nicobar Islands     GEN
3 Andaman & Nicobar Islands 1980     1 Andaman & Nicobar Islands     GEN
4 Andaman & Nicobar Islands 1980     1 Andaman & Nicobar Islands     GEN
5 Andaman & Nicobar Islands 1980     1 Andaman & Nicobar Islands     GEN
6 Andaman & Nicobar Islands 1980     1 Andaman & Nicobar Islands     GEN
          cand_name cand_sex                partyname partyabbre totvotpoll electors
1       K.R. Ganesh        M             Independents        IND      25168    85308
2 Manoranjan Bhakta        M Indian National Congress        INC      35400    85308
3   Ramesh Mazumdar        M             Independents        IND        109    96084
4     Alagiri Swamy        M             Independents        IND        125    96084
5       Kannu Chemy        M             Independents        IND        405    96084
6         K.N. Raju        M             Independents        IND        470    96084
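
Because each row describes one candidate, dividing totvotpoll by electors on a single row gives that candidate's votes as a share of registered electors. A constituency-level turnout figure needs the votes of all candidates in that constituency summed first. The following is a minimal sketch of that idea on a toy data frame. It assumes that electors repeats unchanged on every candidate row of a constituency and that st_name, year, and pc_no together identify a constituency. The first two rows are the 1977 rows from the output above; the "Example State" rows are made up for illustration.

```r
library(dplyr)

# Toy candidate-level rows: two real 1977 rows from the output above,
# plus two hypothetical rows for a made-up constituency
toy <- data.frame(
  st_name    = c("Andaman & Nicobar Islands", "Andaman & Nicobar Islands",
                 "Example State", "Example State"),
  year       = c(1977, 1977, 1977, 1977),
  pc_no      = c(1, 1, 1, 1),
  totvotpoll = c(25168, 35400, 300, 200),
  electors   = c(85308, 85308, 1000, 1000)
)

# Collapse to one row per constituency and election, then compute turnout
pc_turnout <- toy %>%
  group_by(st_name, year, pc_no) %>%
  summarise(
    votes_polled = sum(totvotpoll),   # votes across all candidates
    electors     = first(electors),   # assumed identical on every row
    .groups = "drop"
  ) %>%
  mutate(turnout_percentage = votes_polled / electors * 100)

print(pc_turnout)
```

The same grouping could be applied to voter_data_clean to get one turnout value per constituency and election year.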

Perform Exploratory Data Analysis (EDA)

We’ll start with some basic exploratory data analysis (EDA) to understand the data better. EDA is an approach to analyzing datasets that summarizes their main characteristics, often with visual aids, and gives a descriptive statistical overview of the data. The code below prints summary statistics for the Indian election dataset and counts the number of unique parties.

R
# Summary statistics of the dataset
summary(voter_data_clean)

# Number of unique parties
unique_parties <- voter_data_clean %>%
  select(partyname) %>%
  distinct()
nrow(unique_parties)

Output:

           st_name           year          pc_no                pc_name
 Uttar Pradesh :14791   Min.   :1977   Min.   : 1.00   Belgaum      :  567
 Bihar         : 7727   1st Qu.:1989   1st Qu.: 7.00   Nalgonda     :  563
 Maharashtra   : 6458   Median :1996   Median :18.00   East Delhi   :  434
 Tamil Nadu    : 5309   Mean   :1997   Mean   :22.31   Chandni Chowk:  344
 Andhra Pradesh: 5236   3rd Qu.:2004   3rd Qu.:33.00   Lucknow      :  319
 Madhya Pradesh: 5196   Max.   :2014   Max.   :85.00   Outer Delhi  :  319
 (Other)       :28364                                  (Other)      :70535

 pc_type              cand_name        cand_sex      partyname
    : 8070   None Of The Above:  543   F   : 3648   Length:73081
 GEN:54862   Ashok Kumar      :   87   M   :68885   Class :character
 SC : 7293   Om Prakash       :   78   NULL:  542   Mode  :character
 SC :   15   Raj Kumar        :   57   O   :    6
 ST : 2841   Ram Singh        :   54
             Rajesh Kumar     :   51
             (Other)          :72211

   partyabbre      totvotpoll        electors       turnout_percentage
 IND    :41127   Min.   :     0   Min.   :  19471   Min.   : 0.00000
 INC    : 4800   1st Qu.:   872   1st Qu.: 912985   1st Qu.: 0.07908
 BJP    : 3350   Median :  2743   Median :1099503   Median : 0.25292
 BSP    : 2624   Mean   : 49835   Mean   :1122277   Mean   : 4.83608
 SP     : 1057   3rd Qu.: 19185   3rd Qu.:1329086   3rd Qu.: 1.86813
 JD     :  943   Max.   :863358   Max.   :3368399   Max.   :68.27193
 (Other):19180

[1] 1424
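
The summary above lists an empty pc_type level, two separate "SC" entries, and a literal "NULL" value in cand_sex. The sketch below shows one way such categories might be tidied before plotting. It is an illustration on toy data, and it assumes (this is not confirmed by the source) that the second "SC" entry differs only by trailing whitespace and that empty strings and "NULL" stand for missing values.

```r
library(dplyr)

# Toy values mimicking the irregular levels seen in the summary output
toy <- data.frame(
  pc_type  = c("GEN", "SC", "SC ", "", "ST"),
  cand_sex = c("M", "F", "NULL", "O", "M"),
  stringsAsFactors = FALSE
)

toy_clean <- toy %>%
  mutate(
    pc_type  = na_if(trimws(pc_type), ""),   # strip whitespace, "" becomes NA
    cand_sex = na_if(cand_sex, "NULL")       # literal "NULL" becomes NA
  )

# Count rows per category after cleaning
print(count(toy_clean, pc_type))
print(count(toy_clean, cand_sex))
```

On the real data, it would be worth inspecting the raw values first (for example with unique()) before deciding how to recode them.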

Visualize Voter Turnout Trends

We’ll create visualizations to explore voter turnout trends over time and by party. R offers several packages that help produce effective visualizations, such as ggplot2, ggmap, leaflet, and plotly.

1. Voter Turnout Over Time

We use ggplot2, a versatile R package for creating visually appealing and customizable graphics, to plot the turnout percentage for each election year.

R
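# Aggregate data by year. Each row is one candidate and electors repeats on
# every candidate row, so first collapse to one row per constituency and
# election, then sum votes and electors across constituencies for each year.
turnout_by_year <- voter_data_clean %>%
  group_by(year, st_name, pc_no) %>%
  summarise(votes = sum(totvotpoll), electors = first(electors), .groups = "drop") %>%
  group_by(year) %>%
  summarise(total_votes = sum(votes), total_electors = sum(electors)) %>%
  mutate(turnout_percentage = (total_votes / total_electors) * 100)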

# Plot voter turnout percentage over time
ggplot(turnout_by_year, aes(x = year, y = turnout_percentage)) +
  geom_line() +
  geom_point() +
  labs(title = "Voter Turnout Percentage Over Time",
       x = "Year",
       y = "Turnout Percentage") +
  theme_minimal()

Output:

Figure: Voter Turnout Percentage Over Time (line plot; x-axis: Year, y-axis: Turnout Percentage)

The line plot shows how the aggregate turnout percentage changes from one election year to the next. Separately, the EDA output above shows that the dataset contains 1,424 distinct party names.

2. Gender Distribution of Candidates

A bar plot to visualize the gender distribution of candidates.

R
# Aggregate data by gender
gender_distribution <- voter_data_clean %>%
  group_by(cand_sex) %>%
  summarise(count = n())

# Plot gender distribution of candidates
ggplot(gender_distribution, aes(x = cand_sex, y = count, fill = cand_sex)) +
  geom_bar(stat = "identity") +
  labs(title = "Gender Distribution of Candidates",
       x = "Gender",
       y = "Number of Candidates") +
  theme_minimal() +
  scale_fill_manual(values = c("M" = "blue", "F" = "pink"))

Output:

Figure: Gender Distribution of Candidates (bar plot; x-axis: Gender, y-axis: Number of Candidates)

The bar plot of the candidates' gender distribution shows a clear majority of male candidates contesting the elections, with far fewer female candidates. This suggests a need for more women to contest elections and take part in politics.
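
To put numbers on the gap, the counts can be converted to percentage shares. The sketch below uses the cand_sex counts printed in the summary output earlier (F, M, NULL, and O) rather than re-reading the dataset; the share_pct column name is only an illustrative choice.

```r
# Counts of cand_sex taken from the summary() output shown earlier
gender_counts <- data.frame(
  cand_sex = c("F", "M", "NULL", "O"),
  count    = c(3648, 68885, 542, 6)
)

# Percentage share of each category among all candidate rows
gender_counts$share_pct <- round(
  100 * gender_counts$count / sum(gender_counts$count), 2
)

print(gender_counts)
```

With these counts, female candidates account for roughly 5% of all candidate rows, against about 94% for male candidates.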

3. Bar Plot of Total Votes Polled by Year

We will create a Bar Plot of Total Votes Polled by Year.

R
# Bar plot of total votes polled by year
votes_by_year <- voter_data_clean %>%
  group_by(year) %>%
  summarize(TotalVotes = sum(totvotpoll))

p1 <- ggplot(votes_by_year, aes(x = year, y = TotalVotes)) +
  geom_bar(stat = "identity", fill = "blue") +
  theme_minimal() +
  labs(title = "Total Votes Polled by Year", x = "Year", y = "Total Votes")

# Display plot
print(p1)

Output:

Figure: Total Votes Polled by Year (bar plot; x-axis: Year, y-axis: Total Votes)

4. Creating a Dashboard with Shiny

In R, we can create interactive dashboards with the shiny package by creating a Shiny app and maintaining the appropriate folder structure. The code below builds an interactive dashboard with two plots: the turnout percentage by parliamentary constituency for a selected year, and the turnout percentage over the years for a selected party.

R
# Shiny app for interactive dashboard
library(shiny)
ui <- fluidPage(
  titlePanel("Voter Turnout Analysis"),
  sidebarLayout(
    sidebarPanel(
      selectInput("year", "Select Year", choices = unique(voter_data_clean$year)),
      selectInput("party", "Select Party", choices = unique(voter_data_clean$partyname))
    ),
    mainPanel(
      plotOutput("turnoutPlot"),
      plotOutput("partyPlot")
    )
  )
)

server <- function(input, output) {
  filtered_data <- reactive({
    voter_data_clean %>%
      filter(year == input$year)
  })
  
  filtered_party_data <- reactive({
    voter_data_clean %>%
      filter(partyname == input$party)
  })
  
  output$turnoutPlot <- renderPlot({
    ggplot(filtered_data(), aes(x = pc_name, y = turnout_percentage)) +
      geom_bar(stat = "identity", fill = "steelblue") +
      labs(title = paste("Voter Turnout in", input$year),
           x = "Parliamentary Constituency",
           y = "Turnout Percentage") +
      theme_minimal()
  })
  
  output$partyPlot <- renderPlot({
    ggplot(filtered_party_data(), aes(x = year, y = turnout_percentage, color = pc_name)) +
      geom_line() +
      geom_point() +
      labs(title = paste("Turnout Percentage for", input$party, "Over Years"),
           x = "Year",
           y = "Turnout Percentage") +
      theme_minimal()
  })
}

shinyApp(ui = ui, server = server)
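
One way to keep the server logic easy to check is to move the data preparation into a plain function that can be run outside Shiny. The sketch below is a hypothetical helper (the name turnout_for_year is not part of the original app) that returns one turnout value per constituency for a selected year. It assumes that electors repeats on every candidate row of a constituency, and it converts the selected year to a number because selectInput() passes its value as a string.

```r
library(dplyr)

# Hypothetical helper: one turnout value per constituency for a given year
turnout_for_year <- function(data, selected_year) {
  data %>%
    filter(year == as.numeric(selected_year)) %>%   # input arrives as a string
    group_by(st_name, pc_name) %>%
    summarise(
      turnout_percentage = sum(totvotpoll) / first(electors) * 100,
      .groups = "drop"
    )
}

# Toy candidate-level data (made up for illustration)
toy <- data.frame(
  st_name    = c("A", "A", "A", "B"),
  year       = c(1977, 1977, 1980, 1977),
  pc_name    = c("X", "X", "X", "Y"),
  totvotpoll = c(300, 200, 400, 100),
  electors   = c(1000, 1000, 1000, 500)
)

res <- turnout_for_year(toy, "1977")
print(res)
```

Inside the app, a reactive could call turnout_for_year(voter_data_clean, input$year) and pass the result to ggplot().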

Conclusion

Our analysis of the Indian national election dataset reveals several trends:

  1. Voter turnout has varied over the years, with noticeable peaks and troughs.
  2. Certain parties show higher turnout percentages (votes received as a share of registered electors), which could be due to their popularity or campaign strategies.
  3. Voter turnout may continue to follow current trends, though this is subject to various factors not accounted for in this simple analysis.



Referred: https://www.geeksforgeeks.org

