How to add multiple columns to a data.frame in R?

In R, you can add multiple columns to a data.frame in several ways. This article walks through the main approaches with practical examples, first in base R and then with the dplyr package from the tidyverse collection of packages.

Understanding Data Frames in R

A data frame in R is a two-dimensional, table-like structure in which each column can hold a different type of value, such as numeric, character, or factor, while all values within a single column share one type. Data frames are the standard structure for tabular data in R, and most data manipulation operates on them.
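
As a quick illustration of columns holding different types, here is a minimal sketch. The data frame `people` and its columns are made up for this example and are not used elsewhere in the article.

```r
# Hypothetical example data frame with one column per common type
people <- data.frame(
  ID = 1:3,                             # integer
  Name = c("Alice", "Bob", "Charlie"),  # character
  Height = c(1.62, 1.80, 1.75),         # numeric (double)
  Member = c(TRUE, FALSE, TRUE),        # logical
  stringsAsFactors = FALSE              # keep Name as character in older R versions
)

# Report the class of every column
col_types <- sapply(people, class)
print(col_types)

# Dimensions: number of rows, then number of columns
print(dim(people))
```

Every column has the same length (three rows here), which is what makes the structure rectangular.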

Method 1: Using the $ Operator

You can add new columns to a data.frame by directly assigning values to new column names.

R
# Create a sample data frame
df <- data.frame(
  ID = 1:5,
  Name = c("Alice", "Bob", "Charlie", "David", "Eve")
)

# Print the original data frame
print(df)

# Add new columns
df$Age <- c(25, 30, 35, 40, 45)
df$Salary <- c(50000, 55000, 60000, 65000, 70000)

# Print the updated data frame
print(df)

Output:

  ID    Name
1  1   Alice
2  2     Bob
3  3 Charlie
4  4   David
5  5     Eve

  ID    Name Age Salary
1  1   Alice  25  50000
2  2     Bob  30  55000
3  3 Charlie  35  60000
4  4   David  40  65000
5  5     Eve  45  70000
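
The $ operator adds one column per statement. If you would rather add several columns in a single assignment, base R also accepts a character vector of new names with the `[` operator. The sketch below reuses the same sample data and assumes the list supplies one vector per new name, in the same order.

```r
# Create the sample data frame
df <- data.frame(
  ID = 1:5,
  Name = c("Alice", "Bob", "Charlie", "David", "Eve")
)

# Assign two new columns in one statement.
# The list must contain one vector per new column name, in matching order.
df[c("Age", "Salary")] <- list(
  c(25, 30, 35, 40, 45),
  c(50000, 55000, 60000, 65000, 70000)
)

print(df)
```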

Method 2: Using cbind()

The cbind() function can be used to combine multiple vectors or data frames by column.

R
# Create a sample data frame
df <- data.frame(
  ID = 1:5,
  Name = c("Alice", "Bob", "Charlie", "David", "Eve")
)

# Create new columns as data frames
new_cols <- data.frame(
  Age = c(25, 30, 35, 40, 45),
  Salary = c(50000, 55000, 60000, 65000, 70000)
)

# Add new columns using cbind()
df <- cbind(df, new_cols)

# Print the updated data frame
print(df)

Output:

  ID    Name Age Salary
1  1   Alice  25  50000
2  2     Bob  30  55000
3  3 Charlie  35  60000
4  4   David  40  65000
5  5     Eve  45  70000
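
You do not have to build a second data frame first. As a small variation, cbind() also accepts named vectors directly when the first argument is a data frame. The name `df2` is only used here to keep the original `df` unchanged.

```r
# Create the sample data frame
df <- data.frame(
  ID = 1:5,
  Name = c("Alice", "Bob", "Charlie", "David", "Eve")
)

# Pass the new columns as named vectors; the argument names become column names
df2 <- cbind(
  df,
  Age = c(25, 30, 35, 40, 45),
  Salary = c(50000, 55000, 60000, 65000, 70000)
)

print(df2)
```

Each vector should have as many elements as the data frame has rows.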

Method 3: Using within()

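The within() function evaluates expressions inside the data frame, which makes it convenient for adding or transforming columns. Note that within() appends newly created columns in the reverse of the order in which they were assigned, which is why Salary comes before Age in the output below.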

R
# Create a sample data frame
df <- data.frame(
  ID = 1:5,
  Name = c("Alice", "Bob", "Charlie", "David", "Eve")
)

# Add new columns using within()
df <- within(df, {
  Age <- c(25, 30, 35, 40, 45)
  Salary <- c(50000, 55000, 60000, 65000, 70000)
})

# Print the updated data frame
print(df)

Output:

  ID    Name Salary Age
1  1   Alice  50000  25
2  2     Bob  55000  30
3  3 Charlie  60000  35
4  4   David  65000  40
5  5     Eve  70000  45
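
Because the expressions run inside the data frame, within() is also handy for deriving columns from existing ones. The following sketch is illustrative only: the Bonus and Total columns and the 10% rate are assumptions made up for the example. It also restores a chosen column order afterwards, since within() adds new columns in reverse order of assignment.

```r
# Sample data frame that already contains a Salary column
df <- data.frame(
  ID = 1:5,
  Name = c("Alice", "Bob", "Charlie", "David", "Eve"),
  Salary = c(50000, 55000, 60000, 65000, 70000)
)

# Derive two columns from Salary.
# Bonus is a hypothetical 10% of salary; Total uses Bonus, created just above.
df <- within(df, {
  Bonus <- Salary * 0.10
  Total <- Salary + Bonus
})

# Select the columns explicitly to get the order we want
df <- df[, c("ID", "Name", "Salary", "Bonus", "Total")]

print(df)
```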

Using dplyr from the tidyverse

The dplyr package provides a consistent set of functions for manipulating data frames, which many users find more readable than the base R equivalents.

Method 1: Using mutate()

The mutate() function is used to add new variables and preserve existing ones.

R
# Install and load dplyr package if not already installed
# install.packages("dplyr")
library(dplyr)

# Create a sample data frame
df <- data.frame(
  ID = 1:5,
  Name = c("Alice", "Bob", "Charlie", "David", "Eve")
)

# Add new columns using mutate()
df <- df %>%
  mutate(
    Age = c(25, 30, 35, 40, 45),
    Salary = c(50000, 55000, 60000, 65000, 70000)
  )

# Print the updated data frame
print(df)

Output:

  ID    Name Age Salary
1  1   Alice  25  50000
2  2     Bob  30  55000
3  3 Charlie  35  60000
4  4   David  40  65000
5  5     Eve  45  70000
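
Within a single mutate() call, a column can refer to columns created earlier in the same call. The sketch below assumes dplyr is installed; the Bonus and Total columns and the 10% rate are hypothetical values chosen for illustration.

```r
library(dplyr)

# Sample data frame with a Salary column
df <- data.frame(
  ID = 1:5,
  Salary = c(50000, 55000, 60000, 65000, 70000)
)

# Add two derived columns in one call
df <- df %>%
  mutate(
    Bonus = Salary * 0.10,  # hypothetical 10% bonus
    Total = Salary + Bonus  # refers to Bonus, defined one line earlier
  )

print(df)
```

Unlike within(), mutate() keeps the new columns in the order in which they were written.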

Method 2: Using bind_cols()

The bind_cols() function combines data frames by their columns.

R
# Install and load dplyr package if not already installed
# install.packages("dplyr")
library(dplyr)

# Create a sample data frame
df <- data.frame(
  ID = 1:5,
  Name = c("Alice", "Bob", "Charlie", "David", "Eve")
)

# Create new columns as a data frame
new_cols <- data.frame(
  Age = c(25, 30, 35, 40, 45),
  Salary = c(50000, 55000, 60000, 65000, 70000)
)

# Add new columns using bind_cols()
df <- bind_cols(df, new_cols)

# Print the updated data frame
print(df)

Output:

  ID    Name Age Salary
1  1   Alice  25  50000
2  2     Bob  30  55000
3  3 Charlie  35  60000
4  4   David  40  65000
5  5     Eve  45  70000
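
bind_cols() matches rows by position rather than by a key, so the inputs need the same number of rows and the same row order. A minimal sketch of guarding against a mismatch before binding, assuming dplyr is installed, might look like this:

```r
library(dplyr)

# Sample data frame and the new columns to attach
df <- data.frame(
  ID = 1:5,
  Name = c("Alice", "Bob", "Charlie", "David", "Eve")
)
new_cols <- data.frame(
  Age = c(25, 30, 35, 40, 45),
  Salary = c(50000, 55000, 60000, 65000, 70000)
)

# Stop early with an error if the row counts differ
stopifnot(nrow(df) == nrow(new_cols))

# Rows are paired by position: row 1 with row 1, row 2 with row 2, and so on
combined <- bind_cols(df, new_cols)

print(combined)
```

If the two tables share an identifier column instead of a common row order, a join is usually the safer choice than a positional bind.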

Conclusion

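Adding multiple columns to a data.frame in R can be done in several ways, each suited to different needs and preferences. Base R provides the $ operator and the cbind() and within() functions, while the dplyr package from the tidyverse offers mutate() and bind_cols(). Which method to choose depends on your use case and coding style: direct assignment is enough for one or two columns, cbind() and bind_cols() fit when the new columns already exist as a separate data frame, and within() and mutate() are convenient when new columns are computed from existing ones.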




Referred: https://www.geeksforgeeks.org

