Convert Pandas Dataframe Column To Float

Converting columns to floats is a common step when preparing a Pandas DataFrame for data analysis. Once a column holds float values, you can perform arithmetic operations on it and plot it.

In this article, we’ll look at different ways to convert a column to a float in a DataFrame.

Using DataFrame.astype()
Using pandas.to_numeric()
Handling Non-numeric Values and Missing Values
Converting Multiple Columns
Convert the entire data frame

Convert Pandas Dataframe Column To Float DataType

Importing Pandas

Python

import pandas as pd

Creating a sample dataframe

Python

# Sample DataFrame
data = {'IntegerColumn': [10, 20, 30],
        'StringColumn': ['15.3', '25.6', '35.8']}
df = pd.DataFrame(data)

Method 1: Using DataFrame.astype()

The DataFrame.astype() method casts a Pandas object to a specified dtype. Called on a single column, it converts that column from one data type to another.

Here, we take the sample DataFrame created above, which has one integer column and one string column, and convert the string column to float using astype().

Python

# Print dtypes before conversion
print("Data types before conversion:")
print(df.dtypes)

# Convert 'StringColumn' to float using astype()
df['StringColumn'] = df['StringColumn'].astype(float)

# Print dtypes after conversion
print("\nData types after conversion:")
print(df.dtypes)

# Print the DataFrame
print("\nDataFrame after conversion:")
print(df)

Output:

Data types before conversion:
IntegerColumn     int64
StringColumn     object
dtype: object

Data types after conversion:
IntegerColumn      int64
StringColumn     float64
dtype: object

DataFrame after conversion:
   IntegerColumn  StringColumn
0             10          15.3
1             20          25.6
2             30          35.8

As you can see, the data type of StringColumn changes from object to float64 after calling astype().
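astype() also accepts a dictionary that maps column names to target dtypes, which converts the listed columns in one call and leaves the others untouched. The sketch below reuses the same sample data; the variable name `converted` is only illustrative.

```python
import pandas as pd

# Same sample data as in the example above
df = pd.DataFrame({'IntegerColumn': [10, 20, 30],
                   'StringColumn': ['15.3', '25.6', '35.8']})

# Map each column to convert to its target dtype; unlisted columns keep theirs
converted = df.astype({'StringColumn': 'float64'})

print(converted.dtypes)
```

Because astype() returns a new object, the original `df` is unchanged unless the result is assigned back to it.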

Method 2: Using pandas.to_numeric()

pandas.to_numeric() is a Pandas function that converts its argument to a numeric type. Here, we convert the string column to float using to_numeric().

We print the data types of the columns before and after the conversion to show the change.

Python

# Recreate the sample DataFrame so 'StringColumn' holds strings again
data = {'IntegerColumn': [10, 20, 30],
        'StringColumn': ['15.3', '25.6', '35.8']}
df = pd.DataFrame(data)

# Print dtypes before conversion
print("Data types before conversion:")
print(df.dtypes)

# Convert 'StringColumn' to float using to_numeric()
df['StringColumn'] = pd.to_numeric(df['StringColumn'])

# Print dtypes after conversion
print("\nData types after conversion:")
print(df.dtypes)

# Print the DataFrame
print("\nDataFrame after conversion:")
print(df)

Output:

Data types before conversion:
IntegerColumn     int64
StringColumn     object
dtype: object

Data types after conversion:
IntegerColumn      int64
StringColumn     float64
dtype: object

DataFrame after conversion:
   IntegerColumn  StringColumn
0             10          15.3
1             20          25.6
2             30          35.8

As the output shows, the data type of StringColumn changes from object to float64 after calling to_numeric().
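One detail worth keeping in mind: to_numeric() picks a numeric type that fits the data, so strings that all look like whole numbers come back as an integer column rather than a float column. If a float column is required regardless, a cast with astype(float) can follow. A minimal sketch, using made-up sample values:

```python
import pandas as pd

# Illustrative data: every string is a whole number
whole_numbers = pd.Series(['1', '2', '3'])

# to_numeric() alone yields an integer dtype because no value has a decimal part
as_numeric = pd.to_numeric(whole_numbers)
print(as_numeric.dtype)

# Chaining astype(float) guarantees a float column
as_float = pd.to_numeric(whole_numbers).astype(float)
print(as_float.dtype)
```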

Handling Non-numeric Values or Missing Values while Converting the DataType to Float

We can handle non-convertible and missing values by passing errors='coerce' to pandas.to_numeric(). This parameter instructs Pandas to replace any value that cannot be converted with NaN (Not a Number).

Here, we create a DataFrame that contains a non-numeric string ('xyz') and an empty string, and apply pandas.to_numeric() with errors='coerce'. The resulting DataFrame has NaN wherever a value could not be converted.

Python

# Sample DataFrame with non-numeric and missing values
data = {'Column1': ['10.5', '20.7', '30.2', 'xyz'],
        'Column2': ['15.3', '25.6', '35.8', '']}
df = pd.DataFrame(data)

print('Original DataFrame')
print(df)

# Convert columns to float, handling errors and missing values
df['Column1'] = pd.to_numeric(df['Column1'], errors='coerce')
df['Column2'] = pd.to_numeric(df['Column2'], errors='coerce')

# Print dtypes after conversion
print("\nData types after conversion:")
print(df.dtypes)

# Print the DataFrame
print("\nDataFrame after conversion:")
print(df)

Output:

Original DataFrame
  Column1 Column2
0    10.5    15.3
1    20.7    25.6
2    30.2    35.8
3     xyz        

Data types after conversion:
Column1    float64
Column2    float64
dtype: object

DataFrame after conversion:
   Column1  Column2
0     10.5     15.3
1     20.7     25.6
2     30.2     35.8
3      NaN      NaN
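Since errors='coerce' silently turns bad values into NaN, it can be useful to see which original values were lost. One possible approach, sketched here with illustrative names (`raw`, `converted`, `failed`), is to compare the converted column against the original: a value that was present before but is NaN afterwards could not be parsed.

```python
import pandas as pd

# Illustrative column with one unparseable string and one genuinely missing value
raw = pd.Series(['10.5', '20.7', 'xyz', None], name='Column1')

converted = pd.to_numeric(raw, errors='coerce')

# Present in the original but NaN after conversion: these failed to parse
failed = raw[converted.isna() & raw.notna()]

print(converted)
print("Values that could not be converted:")
print(failed)
```

The missing value (None) is not reported, because it was already missing before the conversion.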

Converting Multiple Columns

We can convert multiple columns to float at once by selecting them with a list of column names and calling astype() on the selection. The syntax is:

df[['C1', 'C2']] = df[['C1', 'C2']].astype(float)

Here, C1 and C2 are the columns of the DataFrame to be converted.

In the example below, we create a DataFrame with two string columns and convert both to float using this syntax.

Python

# Sample DataFrame
data = {'C1': ['10.5', '20.7', '30.2'],
        'C2': ['15.3', '25.6', '35.8']}
df = pd.DataFrame(data)

# Print dtypes before conversion
print("Data types before conversion:")
print(df.dtypes)

# Convert multiple columns to float using astype()
df[['C1', 'C2']] = df[['C1', 'C2']].astype(float)

# Print dtypes after conversion
print("\nData types after conversion:")
print(df.dtypes)

# Print the DataFrame
print("\nDataFrame after conversion:")
print(df)

Output:

Data types before conversion:
C1    object
C2    object
dtype: object

Data types after conversion:
C1    float64
C2    float64
dtype: object

DataFrame after conversion:
     C1    C2
0  10.5  15.3
1  20.7  25.6
2  30.2  35.8
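When several columns may contain values that cannot be parsed, astype(float) raises an error. A common alternative is to apply pandas.to_numeric() to each selected column with errors='coerce'. The sketch below uses made-up data, including a `Label` column that is deliberately left out of the conversion.

```python
import pandas as pd

# Illustrative data: each numeric-looking column contains one bad value
df = pd.DataFrame({'C1': ['10.5', 'n/a', '30.2'],
                   'C2': ['15.3', '25.6', 'unknown'],
                   'Label': ['a', 'b', 'c']})

cols = ['C1', 'C2']

# apply() calls pd.to_numeric once per selected column and
# forwards errors='coerce' to it as a keyword argument
df[cols] = df[cols].apply(pd.to_numeric, errors='coerce')

print(df.dtypes)
print(df)
```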

Convert the entire DataFrame

We can convert every column of a DataFrame at once by calling astype() on the DataFrame itself and passing float as the data type. This works only if every column can be converted to float.

Python

# Sample DataFrame
data = {'C1': ['10.5', '20.7', '30.2'],
        'C2': ['15.3', '25.6', '35.8']}
df = pd.DataFrame(data)

# Print dtypes before conversion
print("Data types before conversion:")
print(df.dtypes)

# Convert the entire DataFrame to float
df = df.astype(float)

# Print dtypes after conversion
print("\nData types after conversion:")
print(df.dtypes)

# Print the DataFrame
print("\nDataFrame after conversion:")
print(df)

Output:

Data types before conversion:
C1    object
C2    object
dtype: object

Data types after conversion:
C1    float64
C2    float64
dtype: object

DataFrame after conversion:
     C1    C2
0  10.5  15.3
1  20.7  25.6
2  30.2  35.8
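If even one column holds text that is not a number, converting the whole DataFrame with astype(float) raises a ValueError. One way to cope, sketched below, is to try each column separately and leave alone the ones that fail. The helper `to_float_where_possible` is a hypothetical name introduced for this illustration, not a Pandas function.

```python
import pandas as pd

def to_float_where_possible(frame):
    """Return a copy of frame with every convertible column cast to float."""
    result = frame.copy()
    for name in result.columns:
        try:
            result[name] = result[name].astype(float)
        except (ValueError, TypeError):
            # Column holds values that are not numbers; keep it unchanged
            pass
    return result

# Illustrative data: 'Name' cannot be converted to float
df = pd.DataFrame({'C1': ['10.5', '20.7'],
                   'Name': ['a', 'b'],
                   'N': [1, 2]})

try:
    df.astype(float)
except ValueError as exc:
    print("Whole-frame conversion failed:", exc)

out = to_float_where_possible(df)
print(out.dtypes)
```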

Conclusion:

Converting columns to float in a Pandas DataFrame prepares the data for mathematical operations and data analysis. In this article, we discussed DataFrame.astype() and pandas.to_numeric(), how to use them to convert columns to float, and how to handle non-numeric and missing values.

Convert Pandas Dataframe Column To Float – FAQs

How to Convert a Column to Float in Pandas?

To convert a column in a pandas DataFrame to a float data type, use the astype() method:
import pandas as pd

df = pd.DataFrame({
    'A': ['1.1', '2.2', '3.3']
})

# Convert column 'A' to float
df['A'] = df['A'].astype(float)
print(df['A'])

How to Convert Pandas DataFrame Column Float to Int?

When converting a float column to an integer in pandas, decimals will be truncated. Use astype(int) or consider rounding before converting if appropriate:
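# Assuming 'A' is already a float column. Use only one of the two options:
# running both in sequence would round values that were already truncated.
df['A'] = df['A'].astype(int)  # Option 1: direct conversion, truncates the decimal
# df['A'] = df['A'].round().astype(int)  # Option 2: round first, then convert
print(df['A'])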

How to Convert Pandas Column to Integer?

You can convert a column to an integer using the astype() method. If the column contains null values or decimals, you might need to handle them first since integers cannot represent NaN:
df = pd.DataFrame({
    'B': [1.0, 2.5, None]
})

# Fill NaN with 0 (or another appropriate value) and convert to integer
df['B'] = df['B'].fillna(0).astype(int)
print(df['B'])
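If replacing missing values with 0 would distort the data, Pandas also offers a nullable integer dtype, written 'Int64' with a capital I, which can hold missing values as <NA>. The sketch below uses illustrative values and rounds first, because the cast to 'Int64' expects whole numbers.

```python
import pandas as pd

# Illustrative column with a decimal value and a missing value
df = pd.DataFrame({'B': [1.0, 2.6, None]})

# Round so every present value is a whole number, then cast to the
# nullable 'Int64' dtype, which keeps the missing entry as <NA>
df['B'] = df['B'].round().astype('Int64')
print(df['B'])
```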

What is dtype('O')?

In pandas, dtype('O') (the capital letter 'O', not zero) refers to the object type. This is the most general dtype; it is typically used for columns that contain mixed types (e.g., numbers alongside strings) or purely string data. It is akin to Python’s object type, which can store any Python object:
df = pd.DataFrame({
    'C': ['text', 1, 2.5, True]
})
print(df['C'].dtype)  # Outputs: object

How to Get Only Float Columns in Pandas?

To select only the columns in a DataFrame that are of float type, you can use the select_dtypes() method:
df = pd.DataFrame({
    'A': [1.1, 2.2, 3.3],
    'B': [4, 5, 6],
    'C': [7.1, 8.2, None]
})

# Select only float columns
float_columns = df.select_dtypes(include=['float64'])
print(float_columns)
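When only the names of the float columns are needed, for example to pass them to another function, the `.columns` attribute of the selection can be turned into a list. A short sketch with the same data; `float_column_names` is an illustrative variable name.

```python
import pandas as pd

df = pd.DataFrame({'A': [1.1, 2.2, 3.3],
                   'B': [4, 5, 6],
                   'C': [7.1, 8.2, None]})

# Keep only the float64 columns, then take their names as a plain list
float_column_names = df.select_dtypes(include=['float64']).columns.tolist()
print(float_column_names)
```

Column 'C' is included because a numeric column with a missing value is stored as float64.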

Referred: https://www.geeksforgeeks.org
