Scala stands for "scalable language". It was developed in 2003 by Martin Odersky. It is an object-oriented language that also supports the functional programming style. Every value in Scala is an object. It is statically typed, but unlike statically typed languages such as C, C++, or Java, it often does not require explicit type annotations, because the compiler infers most types. Type checking still happens at compile time. Static typing helps build safe systems by default. Smart built-in checks and actionable error messages, combined with thread-safe data structures and collections, prevent many tricky bugs before the program first runs. This article walks through the steps to create an SQLContext in Spark using Scala.

What is SQLContext?

The official definition in the documentation of Spark is:
The purpose of SQLContext is to bring structured data processing to Spark. Before it, Spark only had RDDs to manipulate data. An RDD is simply a collection of rows (note the absence of columns) that can be manipulated with lambda functions and other operations. SQLContext introduced objects that attach a schema (such as column names and data types) to the data, making it resemble a table in a relational database. This additional information about the data also opens the door to optimizations in data processing. The documentation further shows that SQLContext is a class introduced in version 1.0.0, providing a set of functions for creating and manipulating SchemaRDD objects. Here is the list of functions:
These APIs revolve around converting between Parquet files and SchemaRDD objects. A SchemaRDD is an RDD of Row objects with an associated schema. In addition to the standard RDD functions, SchemaRDDs can be used in relational queries, as below:
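A rough sketch of such a relational query in the Spark 1.0-era API, modeled on the SQL programming guide of that release as best recalled. The `Person` class, the object name, and the in-memory sample data are hypothetical, and the code assumes Spark 1.0.x on the classpath rather than a current release.

```scala
// Spark 1.0-era API (SchemaRDD). Assumes Spark 1.0.x on the classpath.
import org.apache.spark.SparkContext
import org.apache.spark.sql.SQLContext

// Hypothetical record type; its fields become the columns of the schema.
case class Person(name: String, age: Int)

object SchemaRDDQuery {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext("local", "schemaRDDQuery")
    val sqlContext = new SQLContext(sc)
    // Brings the implicit RDD-to-SchemaRDD conversion into scope.
    import sqlContext.createSchemaRDD

    // Hypothetical in-memory data standing in for a real source.
    val people = sc.parallelize(Seq(Person("Ann", 15), Person("Bob", 32)))
    // Register the (implicitly converted) SchemaRDD under a table name.
    people.registerAsTable("people")

    // Relational query over the registered table; the result is a SchemaRDD.
    val teenagers =
      sqlContext.sql("SELECT name FROM people WHERE age >= 13 AND age <= 19")
    // Standard RDD functions still apply to the result.
    teenagers.map(row => "Name: " + row(0)).collect().foreach(println)

    sc.stop()
  }
}
```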
The above code would not run on the latest versions of Spark because SchemaRDD is obsolete (the class was renamed to DataFrame in Spark 1.3). SQLContext itself is also no longer the recommended entry point. Since Spark 2.0, SparkSession provides a unified interface over contexts such as SQLContext, SparkContext, HiveContext, and others. Inside SparkSession, the SQLContext is still present. Also, instead of SchemaRDDs, Spark now uses Datasets and DataFrames to represent structured data.

Creating SQLContext

1. Using SparkContext

We can create an SQLContext from a SparkContext. The constructor is as follows:
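As far as the Spark Scaladoc describes it, this constructor takes a single SparkContext and has been marked deprecated since 2.0.0 in favor of `SparkSession.builder`. The sketch below restates that shape as a small function. The object and method names are hypothetical and exist only for illustration.

```scala
import org.apache.spark.SparkContext
import org.apache.spark.sql.SQLContext

// Hypothetical wrapper, used only to make the constructor's shape explicit.
object SqlContextConstructor {
  // Constructor signature on SQLContext (deprecated since 2.0.0):
  //   def this(sc: SparkContext)
  def fromSparkContext(sc: SparkContext): SQLContext = new SQLContext(sc)
}
```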
We can create a simple SparkContext object with "master" (the cluster URL) set to "local" (use only the current machine) and "appName" set to "createSQLContext". We then pass this SparkContext to the SQLContext constructor.
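A minimal sketch of that program, assuming Spark is on the classpath. The object name `CreateSQLContext` and the helper `build` are hypothetical.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.SQLContext

object CreateSQLContext {
  // Builds an SQLContext on top of a local, single-machine SparkContext.
  def build(): SQLContext = {
    val conf = new SparkConf()
      .setMaster("local")              // run on the current machine only
      .setAppName("createSQLContext")
    val sc = new SparkContext(conf)
    new SQLContext(sc)                 // deprecated constructor, kept for compatibility
  }

  def main(args: Array[String]): Unit = {
    val sqlContext = build()
    println(sqlContext)                // print the created object
    sqlContext.sparkContext.stop()     // release the underlying SparkContext
  }
}
```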
Output:

[Image: The SQLContext Object]

Explanation: As you can see above, we have created a new SQLContext object. Although this works, the method is deprecated and SQLContext has been replaced by SparkSession. SQLContext is kept in newer versions only for backward compatibility.

2. Using Existing SQLContext Object

We can also use an existing SQLContext object to create a new one. Every SQLContext provides a newSession API that creates a new object based on the same SparkContext. The API is as follows:
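Per the Spark Scaladoc, as far as recalled, `newSession` takes no arguments and returns an SQLContext that shares the same SparkContext while keeping its own SQL configuration and temporary views. The sketch below restates that shape. The wrapper object and the method name `derive` are hypothetical.

```scala
import org.apache.spark.sql.SQLContext

// Hypothetical wrapper, used only to make the method's shape explicit.
object SqlContextNewSession {
  // Method signature on SQLContext:
  //   def newSession(): SQLContext
  def derive(existing: SQLContext): SQLContext = existing.newSession()
}
```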
Below is the Scala program to implement the approach:
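A minimal sketch, assuming Spark is on the classpath. The object name `NewSessionSQLContext` and the helper `build` are hypothetical.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.SQLContext

object NewSessionSQLContext {
  // Returns the original SQLContext together with one derived via newSession().
  def build(): (SQLContext, SQLContext) = {
    val conf = new SparkConf().setMaster("local").setAppName("createSQLContext")
    val sc = new SparkContext(conf)
    val original = new SQLContext(sc)   // existing SQLContext object
    val derived = original.newSession() // new object on the same SparkContext
    (original, derived)
  }

  def main(args: Array[String]): Unit = {
    val (original, derived) = build()
    println(derived)                    // print the newly created object
    original.sparkContext.stop()
  }
}
```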
Output:

[Image: The SQLContext Object]

Explanation: As you can see above, we have created a new SQLContext object from an existing one. Although this works, SQLContext has been replaced by SparkSession and is kept in newer versions only for backward compatibility.

3. Using SparkSession

The latest way (as of version 3.5.0) is to use a SparkSession object. SparkSession consolidates the earlier contexts and provides a unified interface to all of them. We can create a SparkSession object using the builder API and then access the SQLContext object from it as follows:
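A minimal sketch, assuming Spark is on the classpath. The object name `SQLContextFromSparkSession` and the helper `buildSession` are hypothetical.

```scala
import org.apache.spark.sql.{SQLContext, SparkSession}

object SQLContextFromSparkSession {
  // Creates (or reuses) a local SparkSession through the builder API.
  def buildSession(): SparkSession =
    SparkSession.builder()
      .master("local")
      .appName("createSQLContext")
      .getOrCreate()

  def main(args: Array[String]): Unit = {
    val spark = buildSession()
    // The SQLContext is exposed as a member of the session.
    val sqlContext: SQLContext = spark.sqlContext
    println(sqlContext)
    spark.stop()
  }
}
```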
Output:

[Image: The SQLContext Object]

Explanation: As you can see, we accessed the SQLContext object from inside the SparkSession object.
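To illustrate that an SQLContext obtained this way is usable with DataFrames, here is a hedged sketch of a relational query run through it. The `Person` class, the sample rows, the view name `people`, and the helper `teenagerNames` are all hypothetical.

```scala
import org.apache.spark.sql.SparkSession

object QueryWithSqlContext {
  // Hypothetical record type; its fields become the DataFrame's columns.
  case class Person(name: String, age: Int)

  // Runs a SQL query through the session's SQLContext and returns the names.
  def teenagerNames(spark: SparkSession): Seq[String] = {
    import spark.implicits._ // enables toDF() and as[String]
    val people = Seq(Person("Ann", 15), Person("Bob", 32), Person("Cid", 13)).toDF()
    people.createOrReplaceTempView("people")
    spark.sqlContext
      .sql("SELECT name FROM people WHERE age >= 13 AND age <= 19 ORDER BY name")
      .as[String]
      .collect()
      .toSeq
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().master("local").appName("createSQLContext").getOrCreate()
    teenagerNames(spark).foreach(println)
    spark.stop()
  }
}
```

The temporary view is scoped to the session, so the query has to go through the SQLContext of the same SparkSession that registered it.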
Referred: https://www.geeksforgeeks.org