Data manipulation and analysis are common tasks in the field of data science, and two powerful libraries in Python that facilitate these tasks are NumPy and Pandas. NumPy provides support for large, multi-dimensional arrays and matrices, while Pandas offers data structures like DataFrames that make it easy to manipulate and analyze structured data. In this article, we’ll explore how to create a Pandas DataFrame from a NumPy array.
Create Dataframe from Numpy Array in Python
Below are some of the ways to convert a NumPy array to a Pandas DataFrame in Python:
- Using pd.DataFrame()
- Specifying Column Names
- Customize Row and Column Indices
- Using pd.DataFrame.from_records()
Create Numpy Array
In this example, the code below imports the NumPy and Pandas libraries, creates a 2D NumPy array named “data” holding a 3×3 matrix, and prints it.
Python
import numpy as np
import pandas as pd
# Create a NumPy array
data = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
print(data)
Output :
[[1 2 3]
 [4 5 6]
 [7 8 9]]
Method 1: pd.DataFrame()
The most straightforward method is to use the Pandas pd.DataFrame() constructor, passing the NumPy array as an argument.
Python
# Using pd.DataFrame()
df1 = pd.DataFrame(data)
# Display the DataFrame
print(df1)
Output :
   0  1  2
0  1  2  3
1  4  5  6
2  7  8  9
Specifying Column Names
You can provide column names while creating the DataFrame using the columns parameter.
Python
# Specifying Column Names
df2 = pd.DataFrame(data, columns=['col1', 'col2', 'col3'])
# Display the DataFrame
print(df2)
Output :
   col1  col2  col3
0     1     2     3
1     4     5     6
2     7     8     9
Customize Row and Column Indices
Customize both row and column indices using the index and columns parameters.
Python
# Customizing Row and Column Indices
df3 = pd.DataFrame(data, index=['row1', 'row2', 'row3'], columns=['col1', 'col2', 'col3'])
# Display the DataFrame
print(df3)
Output :
      col1  col2  col3
row1     1     2     3
row2     4     5     6
row3     7     8     9
Method 2: pd.DataFrame.from_records()
Another way is to use the pd.DataFrame.from_records() method, which is particularly useful when dealing with structured data or records.
Python
# Using pd.DataFrame.from_records()
df4 = pd.DataFrame.from_records(data, columns=['col1', 'col2', 'col3'])
# Display the DataFrame
print(df4)
Output :
   col1  col2  col3
0     1     2     3
1     4     5     6
2     7     8     9
Conclusion
Each of these approaches creates a Pandas DataFrame from a NumPy array, and the choice depends on your requirements and the nature of your data. Use pd.DataFrame() for plain arrays, add the columns and index parameters when you need labels, and use pd.DataFrame.from_records() when working with structured data or records.
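As a rough illustration of the structured-data case mentioned for pd.DataFrame.from_records(), the sketch below builds a NumPy structured array and converts it. The field names (id, name, score) and the sample values are hypothetical and chosen only for this example.

```python
import numpy as np
import pandas as pd

# Hypothetical structured array: each record has named, typed fields.
records = np.array(
    [(1, "Ana", 91.5), (2, "Ben", 78.0), (3, "Cho", 85.25)],
    dtype=[("id", "i4"), ("name", "U10"), ("score", "f8")],
)

# from_records() takes the column names from the array's field names.
df_records = pd.DataFrame.from_records(records)
print(df_records)

# Optionally use one of the fields as the row index instead of a column.
df_indexed = pd.DataFrame.from_records(records, index="id")
print(df_indexed)
```

With a plain 2D array such as the one used earlier, there are no field names, so the columns parameter has to supply the labels.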
Convert Numpy Array to Dataframe – FAQs
How to Convert NumPy Array to DataFrame in Python
To convert a NumPy array to a pandas DataFrame, you can use the DataFrame constructor provided by pandas. You can optionally specify column names:
Python
import pandas as pd
import numpy as np
# Create a NumPy array
array = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
# Convert the NumPy array to a DataFrame
df = pd.DataFrame(array, columns=['Column1', 'Column2', 'Column3'])
print(df)
How to Create a DataFrame from a NumPy Array
The process is the same as above: pass the NumPy array directly to the DataFrame constructor. If the array is multidimensional and you want specific row labels or column names, pass them through the index and columns parameters.
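The constructor handles 1D and 2D arrays directly, while arrays with more dimensions need to be reshaped first. The sketch below shows one possible way to do that; the variable names and the choice to flatten the trailing axes are assumptions made for illustration, not the only option.

```python
import numpy as np
import pandas as pd

# A 1D array becomes a single-column DataFrame.
one_d = np.array([10, 20, 30])
df_1d = pd.DataFrame(one_d, columns=["value"])
print(df_1d)

# A 3D array is not accepted as-is by the constructor.
three_d = np.arange(12).reshape(2, 3, 2)
try:
    pd.DataFrame(three_d)
except ValueError as err:
    print("3D input rejected:", err)

# One option: keep the first axis as rows and flatten the rest into columns.
two_d = three_d.reshape(three_d.shape[0], -1)  # shape (2, 6)
df_3d = pd.DataFrame(two_d)
print(df_3d)
```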
How to Convert a DataFrame to a NumPy Array
You can convert a pandas DataFrame back to a NumPy array using the .values attribute or the .to_numpy() method; the latter is recommended in newer versions of pandas:
Python
# Convert DataFrame to a NumPy array
array_from_df = df.to_numpy()
print(array_from_df)
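One detail worth keeping in mind when converting back: a NumPy array has a single dtype, so the result depends on the column types in the DataFrame. The following sketch uses two small hypothetical DataFrames (the names numeric and mixed are made up here) to show the effect, along with the dtype argument of to_numpy().

```python
import numpy as np
import pandas as pd

# Hypothetical DataFrames: one all-numeric, one mixing numbers and strings.
numeric = pd.DataFrame({"a": [1, 2], "b": [0.5, 1.5]})
mixed = pd.DataFrame({"a": [1, 2], "label": ["x", "y"]})

# Integer and float columns are upcast to a common numeric dtype.
print(numeric.to_numpy().dtype)

# Numbers and strings share no numeric dtype, so the result is object.
print(mixed.to_numpy().dtype)

# Request a specific dtype explicitly when you need one.
as_float32 = numeric.to_numpy(dtype="float32")
print(as_float32.dtype)
```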
Can a NumPy Array Contain a List?
A NumPy array can contain arbitrary Python objects, including lists, if it is created with dtype=object. However, this is generally not recommended because it defeats the purpose of NumPy’s fast array operations, which rely on all elements being of the same type and thus stored contiguously in memory. Here’s how you could, nonetheless, create such an array:
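Python
# Create a 1D NumPy array whose elements are lists.
# Passing equal-length lists to np.array(..., dtype=object) would instead build a 2D array of numbers.
array_of_lists = np.empty(2, dtype=object)
array_of_lists[0] = list(range(3))
array_of_lists[1] = list(range(3, 6))
print(array_of_lists)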
How to Convert NumPy ndarray to Array in Python
The terminology might be a bit confusing here: ndarray is the term used for NumPy arrays, and they are already “arrays” in the context of NumPy. If you’re looking to convert a NumPy array to a standard Python list (often referred to simply as “array” in other programming contexts), you can use the tolist() method:
Python
# Convert ndarray to a standard Python list
python_list = array_from_df.tolist()
print(python_list)
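A short sketch of how tolist() differs from calling the built-in list() on an array may help here. The array arr below is a made-up example.

```python
import numpy as np

arr = np.array([[1, 2, 3], [4, 5, 6]])

# tolist() converts every level: nested Python lists holding Python ints.
nested = arr.tolist()

# list() only unpacks the first axis: the items are still NumPy arrays.
shallow = list(arr)

print(type(nested[0]), type(nested[0][0]))
print(type(shallow[0]))
```

So tolist() is usually the better fit when the result has to be plain Python data, for example before serializing it.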