Pandas is an open-source data analysis and manipulation library for Python. Sometimes a column in a dataset is left without a name, for example when a header cell in a CSV file is blank. When such a dataset is read in Pandas, the column is given a name of the form Unnamed: N, where N is the column's position. This article discusses several ways to drop those unnamed columns.
Methods to drop unnamed columns in a Pandas data frame:
- Using the drop function
- Using the loc function
- While importing the data
Using the drop function:
The drop function removes the specified rows or columns by label. In this method, we drop the Unnamed column from the Pandas data frame using the drop function.
Syntax:
df.drop(df.columns[df.columns.str.contains('unnamed', case=False)], axis=1, inplace=True)
Here,
- df: It is the data frame from which you want to drop the unnamed column.
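The syntax above combines three steps: building a boolean mask over the column labels, using it to pick the labels to remove, and passing those labels to drop. A minimal sketch of those steps, assuming a small hypothetical frame in place of a real CSV file (the names mask, to_drop and cleaned are illustrative only):

```python
import pandas as pd

# Hypothetical frame standing in for a CSV that had one blank header cell
df = pd.DataFrame({
    'name': ['Arun', 'Aniket'],
    'Unnamed: 2': [9, 10],
    'fees': [9000, 12000],
})

# Step 1: boolean mask, True where the label contains "unnamed" (any case)
mask = df.columns.str.contains('unnamed', case=False)

# Step 2: indexing df.columns with the mask yields the labels to drop
to_drop = df.columns[mask]

# Step 3: drop those labels; without inplace=True a new frame is returned
# and the original df is left untouched
cleaned = df.drop(columns=to_drop)

print(list(to_drop))
print(list(cleaned.columns))
```

Leaving out inplace=True and assigning the result is an alternative to the in-place form shown in the syntax; both remove the same columns.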
In this example, we have used the student_data3.csv file, which has one Unnamed column. We then remove this column using the drop function.
Python
# Import Pandas library
import pandas as pd
# Read the CSV file
df = pd.read_csv(
'https://media.geeksforgeeks.org/wp-content/uploads/20240208164753/student_data3.csv')
# Print the data frame
print('Actual dataframe:')
print(df)
# Removing unnamed columns using drop function
df.drop(df.columns[df.columns.str.contains(
'unnamed', case=False)], axis=1, inplace=True)
# Print the data frame after removing unnamed columns
print('\nDataframe after removing unnamed columns:')
print(df)
Output:
Actual dataframe:
      name         subject  Unnamed: 2   fees  fine
0     Arun           Maths           9   9000   400
1   Aniket  Social Science          10  12000   600
2   Ishita         English          11  15000     0
3  Pranjal         Science          12  18000  1000
4  Vinayak        Computer          12  18000   500

Dataframe after removing unnamed columns:
      name         subject   fees  fine
0     Arun           Maths   9000   400
1   Aniket  Social Science  12000   600
2   Ishita         English  15000     0
3  Pranjal         Science  18000  1000
4  Vinayak        Computer  18000   500
Using the loc function:
The loc function accesses a group of rows and columns by label or by a boolean array. Here we first find all the columns whose name starts with Unnamed, and then use loc to keep every column except those.
Syntax:
df = df.loc[:, ~df.columns.str.contains('^Unnamed')]
Here,
- df: It is the data frame from which you want to drop the unnamed column.
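The pattern '^Unnamed' used here is anchored and case-sensitive, while the drop syntax earlier uses the unanchored, case-insensitive 'unnamed'. The two can select different columns. A small sketch of the difference, assuming a hypothetical frame in which unnamed_score is a deliberately named column:

```python
import pandas as pd

# Hypothetical frame: 'unnamed_score' is a real, intentionally named column
df = pd.DataFrame({
    'name': ['Arun', 'Aniket'],
    'Unnamed: 1': [0, 1],
    'unnamed_score': [7, 8],
})

# Anchored, case-sensitive: matches only labels that start with "Unnamed"
keep_anchored = ~df.columns.str.contains('^Unnamed')
anchored_result = df.loc[:, keep_anchored]
print(list(anchored_result.columns))

# Unanchored, case-insensitive: also removes 'unnamed_score'
keep_loose = ~df.columns.str.contains('unnamed', case=False)
loose_result = df.loc[:, keep_loose]
print(list(loose_result.columns))
```

The ~ operator inverts the mask, so loc keeps the columns that do not match. The anchored form is the safer choice when real column names might contain the word "unnamed".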
In this example, we use the loc function to drop the unnamed column from the Pandas data frame.
Python
# Import Pandas library
import pandas as pd
# Read the CSV file
df = pd.read_csv(
'https://media.geeksforgeeks.org/wp-content/uploads/20240208164753/student_data3.csv')
# Print the data frame
print('Actual dataframe:')
print(df)
# Removing unnamed columns using loc function
df = df.loc[:, ~df.columns.str.contains('^Unnamed')]
# Print the data frame after removing unnamed columns
print('\nDataframe after removing unnamed columns:')
print(df)
Output:
Actual dataframe:
      name         subject  Unnamed: 2   fees  fine
0     Arun           Maths           9   9000   400
1   Aniket  Social Science          10  12000   600
2   Ishita         English          11  15000     0
3  Pranjal         Science          12  18000  1000
4  Vinayak        Computer          12  18000   500

Dataframe after removing unnamed columns:
      name         subject   fees  fine
0     Arun           Maths   9000   400
1   Aniket  Social Science  12000   600
2   Ishita         English  15000     0
3  Pranjal         Science  18000  1000
4  Vinayak        Computer  18000   500
Using index_col=0:
The index_col parameter of the read_csv function explicitly specifies which column to use as the index. This method is useful if you created the dataset in Pandas and stored it in a CSV file: when importing that CSV file back into Pandas, the first column holds the saved index, and index_col=0 restores it as the index instead of reading it as an unnamed column.
Syntax:
df = pd.read_csv(csv_file, index_col=0)
Here,
- csv_file: It is the CSV file from which you want to drop unnamed columns.
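When you control how the file is written, the unnamed column can also be avoided at the source. The sketch below assumes a small hypothetical frame and uses an in-memory buffer in place of a file on disk; it compares the default to_csv output with to_csv(index=False):

```python
import io
import pandas as pd

# Hypothetical frame, similar in shape to the student data in this article
df1 = pd.DataFrame({'name': ['Arun', 'Aniket'], 'fees': [9000, 12000]})

# Default: the index is written as a first column with an empty header,
# so reading the text back produces an 'Unnamed: 0' column
csv_with_index = df1.to_csv()
cols_with_index = list(pd.read_csv(io.StringIO(csv_with_index)).columns)
print(cols_with_index)

# index=False: the index is not written, so no unnamed column appears
csv_without_index = df1.to_csv(index=False)
cols_without_index = list(pd.read_csv(io.StringIO(csv_without_index)).columns)
print(cols_without_index)
```

Writing with index=False suits frames whose index is just the default row numbers; if the index carries meaning, keeping it and reading with index_col=0 preserves it.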
Example:
In this example, we create a Pandas dataframe with three columns: name, subject and fees. When this dataframe is saved to CSV, the index is by default written as a first column with an empty header, which appears as an unnamed column when the file is read back. While importing the dataset, we remove that unnamed column by using the index_col parameter of the read_csv function.
Python
# Import Pandas library
import pandas as pd
# create DataFrame
df1 = pd.DataFrame({'name': ['Arun', 'Aniket', 'Ishits', 'Pranjal', 'Vinayak'],
'subject': ['Maths', 'Social Science', 'English', 'Science', 'Computer'],
'fees': [9000, 12000, 15000, 18000, 18000]})
# Store the data frame in a CSV file
df1.to_csv('student_data.csv')
# Read the CSV file
df = pd.read_csv('student_data.csv')
# Print the data frame
print('Actual dataframe:')
print(df)
# Read the CSV file removing unnamed columns
df = pd.read_csv('student_data.csv', index_col=0)
# Print the data frame after removing unnamed columns
print('\nDataframe after removing unnamed columns:')
print(df)
Output:
Actual dataframe:
   Unnamed: 0     name         subject   fees
0           0     Arun           Maths   9000
1           1   Aniket  Social Science  12000
2           2   Ishits         English  15000
3           3  Pranjal         Science  18000
4           4  Vinayak        Computer  18000

Dataframe after removing unnamed columns:
      name         subject   fees
0     Arun           Maths   9000
1   Aniket  Social Science  12000
2   Ishits         English  15000
3  Pranjal         Science  18000
4  Vinayak        Computer  18000

How to Drop Unnamed Column in Pandas DataFrame – FAQs

How to Delete Unnamed Columns in Pandas
Unnamed columns typically appear during data import, for example when a header cell is blank or when a saved index column is read back as ordinary data. To remove columns that pandas automatically names like "Unnamed: 0", you can use filtering with filter() or a condition on the column names:
import pandas as pd
# Sample DataFrame with unnamed columns
df = pd.DataFrame({
    'Unnamed: 0': [1, 2, 3],
    'A': [4, 5, 6],
    'Unnamed: 3': [7, 8, 9]
})
# Remove columns whose name starts with 'Unnamed'
df = df.loc[:, ~df.columns.str.contains('^Unnamed')]
print(df)
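The answer above mentions filter() but shows only the loc form. A minimal sketch of the filter() variant, assuming the same sample column names; the negative-lookahead pattern is one possible way to express "does not start with Unnamed" and is an illustration, not the only option:

```python
import pandas as pd

# Same hypothetical sample frame as in the answer above
df = pd.DataFrame({
    'Unnamed: 0': [1, 2, 3],
    'A': [4, 5, 6],
    'Unnamed: 3': [7, 8, 9]
})

# filter(regex=...) keeps the columns whose label matches the pattern;
# '^(?!Unnamed)' matches only labels that do not begin with "Unnamed"
kept = df.filter(regex=r'^(?!Unnamed)')
print(list(kept.columns))
```

filter() returns a new frame and selects columns by default, so no axis argument is needed here.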
How to Remove a Column Name from a DataFrame in Pandas
If by removing a column name you mean making it unnamed or resetting it, you can rename the column to something generic or empty:
# Renaming a column to be unnamed or less meaningful
df.rename(columns={'A': ''}, inplace=True)
print(df.columns)
How to Remove NaN Column Names in Pandas
To handle NaN column names, which can occur due to improper data loading, you can replace them with a placeholder label and then proceed with standard operations:
# Assuming some columns don't have names and are NaN
df.columns = pd.Series(df.columns).fillna('Unnamed_Column')
# Now you can rename or drop as needed
df.drop(columns=['Unnamed_Column'], inplace=True)
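If the goal is only to discard the NaN-labelled columns, the placeholder step can be skipped. A sketch under the assumption that the frame has a NaN column label (built by hand here, since no real file is involved); it uses Index.isna() to build the mask:

```python
import numpy as np
import pandas as pd

# Hypothetical frame whose second column label is NaN
df = pd.DataFrame([[1, 2], [3, 4]], columns=['A', np.nan])

# Index.isna() flags the missing labels; ~ inverts the mask so that
# loc keeps only the columns that do have a name
cleaned = df.loc[:, ~df.columns.isna()]
print(list(cleaned.columns))
```

This leaves the original frame unchanged and returns a new one holding only the named columns.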
How to Remove a Specific Column from a Pandas DataFrame
To drop a specific column by name, use the drop() method with the columns argument specifying the column name:
# Drop a specific column by name
df.drop(columns='A', inplace=True)
print(df)
How to Drop Columns in Pandas
Dropping multiple columns by name involves a similar approach, where you can list all the columns you want to remove:
# Drop multiple columns by name
df.drop(columns=['A', 'Unnamed: 3'], inplace=True)
print(df)
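drop() raises a KeyError if any listed label is missing from the frame, which can happen when a file only sometimes contains an unnamed column. A short sketch, assuming a hypothetical frame that has no 'Unnamed: 3' column, using the errors='ignore' option of drop():

```python
import pandas as pd

# Hypothetical frame that does not contain 'Unnamed: 3'
df = pd.DataFrame({'A': [4, 5, 6], 'B': [1, 2, 3]})

# errors='ignore' drops the labels that exist and silently skips the rest
cleaned = df.drop(columns=['A', 'Unnamed: 3'], errors='ignore')
print(list(cleaned.columns))
```

The trade-off is that a misspelled label is also skipped without warning, so this option fits best when the absent label is expected.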