Attribute-Relation File Format (ARFF) is a file format developed by the Machine Learning Project at the Department of Computer Science of the University of Waikato, New Zealand. ARFF files are mostly used with WEKA (Waikato Environment for Knowledge Analysis), a collection of machine learning and data analysis tools distributed as free software under the GNU General Public License. In this article, we will see how to convert an ARFF file into a Pandas DataFrame.

Prerequisites: We will be using two modules here, pandas and SciPy. To install them, execute the following command: pip install pandas scipy

Approach 1: Using Pandas and SciPy

Step 1: After installing the required modules, we import them.
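A minimal sketch of the imports for this approach, assuming both packages are installed. The alias pd is a common convention, not a requirement.

```python
# pandas provides the DataFrame; scipy.io.arff provides the ARFF reader.
import pandas as pd
from scipy.io.arff import loadarff
```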
We will use the loadarff() function of the arff submodule of scipy.io. You can import loadarff directly at the beginning, or import only the arff submodule and call arff.loadarff() when needed.

Step 2: Download an ARFF file from the official WEKA website and keep it in the same directory as the Python file, which makes it easier to load. We will now use loadarff() to read the downloaded file and store the result in a variable.
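The loading step can be sketched as follows. To keep the example self-contained, it reads a small in-memory ARFF document instead of a downloaded file; the sample content, the attribute name kind, and the variable name raw_data are assumptions made for illustration.

```python
from io import StringIO

from scipy.io.arff import loadarff

# Hypothetical ARFF content standing in for a file downloaded from WEKA.
SAMPLE_ARFF = """@relation cpu_sample
@attribute MYCT numeric
@attribute MMIN numeric
@attribute kind {small,large}
@data
125,256,small
29,8000,large
"""

# loadarff accepts a file-like object or a path string,
# e.g. loadarff("data.arff") for a file in the working directory
# (the file name here is hypothetical).
raw_data = loadarff(StringIO(SAMPLE_ARFF))

# The result is a tuple: (NumPy record array, metadata object).
data, meta = raw_data
print(meta.names())  # attribute names in file order
print(len(data))     # number of data rows
```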
Step 3: Now we will use the DataFrame constructor of the pandas library to convert the loaded ARFF data into a Pandas DataFrame.
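A minimal sketch of the conversion, again assuming a small in-memory sample in place of the downloaded file. The names SAMPLE_ARFF, raw_data, and df are illustrative.

```python
from io import StringIO

import pandas as pd
from scipy.io.arff import loadarff

# Hypothetical ARFF content standing in for the downloaded file.
SAMPLE_ARFF = """@relation cpu_sample
@attribute MYCT numeric
@attribute MMIN numeric
@attribute kind {small,large}
@data
125,256,small
29,8000,large
"""

raw_data = loadarff(StringIO(SAMPLE_ARFF))

# raw_data[0] is the record array with the rows; raw_data[1] is the metadata.
df = pd.DataFrame(raw_data[0])

print(df.shape)
print(list(df.columns))
```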
Here we pass to DataFrame() the variable in which we stored the result of loadarff(), indexed with [0]. loadarff() returns a tuple of two items: the data as a NumPy record array, and a metadata object describing the attributes. Index [0] therefore selects the data, which is then converted into a Pandas DataFrame.

Step 4: Now we will use common pandas methods such as head() and tail() to check whether the ARFF file has been successfully converted into a DataFrame.
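The inspection step might look like the sketch below, which uses a hypothetical in-memory sample rather than the downloaded file. It also decodes the nominal column, because SciPy loads nominal attributes as byte strings; whether your file needs this depends on its attribute types.

```python
from io import StringIO

import pandas as pd
from scipy.io.arff import loadarff

# Hypothetical ARFF content standing in for the downloaded file.
SAMPLE_ARFF = """@relation cpu_sample
@attribute MYCT numeric
@attribute MMIN numeric
@attribute kind {small,large}
@data
125,256,small
29,8000,large
"""

df = pd.DataFrame(loadarff(StringIO(SAMPLE_ARFF))[0])

# Nominal values arrive as bytes (b'small'); decode them to regular strings.
df["kind"] = df["kind"].str.decode("utf-8")

print(df.head())   # first rows (up to five)
print(df.tail(1))  # last row
```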
Output: MYCT MMIN MMAX CACH CHMIN CHMAX class
Output: MYCT MMIN MMAX CACH CHMIN CHMAX class
Output: 0 125.0

Approach 2: Using liac-arff and Pandas

We can use the liac-arff module (imported as arff) alongside Pandas to load an ARFF file and convert it into a Pandas DataFrame. Install the required module first by executing the following command: pip install liac-arff

Step 1: After installing the required modules, we import them.
Step 2: After importing the required modules, we load the ARFF file and store the result in a variable. We will use the load() function of the arff module, which takes an open file object.
Here, the variable data holds the contents of the loaded ARFF file. liac-arff returns them as a dictionary with the keys description, relation, attributes, and data.

Step 3: Next, convert the data to a Pandas DataFrame.
Here, the data variable is converted to a DataFrame.

Step 4: Finally, we print the DataFrame to check that the conversion worked.
Output: MYCT MMIN MMAX CACH CHMIN CHMAX class

Conclusion

In this article, we saw two approaches for converting a file with the ARFF extension into a Pandas DataFrame. Some users may prefer the approach that uses Pandas and SciPy, whereas others may prefer the second approach. The benefit of converting an ARFF file into a Pandas DataFrame is that it opens up many ways to manipulate the information stored in the file. It also makes it easier to clean or sort the data more precisely.
Referred: https://www.geeksforgeeks.org