Previous USER DEFINED FUNCTIONS Next Replace values Drop Duplicate Fill Drop Null
In post we will discuss about the different kind of views and how to use to them to convert from dataframe to sql table.
Git hub link to SQL views jupyter notebook
There are four different form of views, in that two subdivisions are global and local. Global views lifetime ends with the spark application , but the local view lifetime ends with the spark session. Create views creates the sql view form of a table but if the table name already exists then it will throw an error, but create or replace temp views replaces the already existing view , so be careful when you are using the replace.
creating session and loading the data
import os from pyspark import SparkConf from pyspark.sql import SparkSession spark = SparkSession.builder.master("local").config(conf=SparkConf()).getOrCreate() # loading the data and assigning the schema. path_text_orders="file:///D://data-master/retail_db/orders" orders_text=spark.read.format("text").load(path_text_orders) orders_table=orders_text.selectExpr("cast(split(value,',') [0] as int) order_customer_id", "cast(split(value,',') [1] as date) order_date", "cast(split(value,',') [2] as int) order_id", "cast(split(value,',') [3] as string) order_status") orders_table.show(2)
To drop a view use the below commands
#dropping the global views spark.catalog.dropGlobalTempView("orders_table") #dropping the temp views spark.catalog.dropTempView("orders_table")
create global temp view
# creating the global temp view orders_table.createGlobalTempView("orders_table") spark.sql("select * from global_temp.orders_table limit 2").show()
create or replace global temp view
# replacing a global temp views orders_table.createOrReplaceGlobalTempView("orders_table") spark.sql("select * from global_temp.orders_table limit 2").show()
create temp view
# creating a temp views orders_table.createTempView("orders_table") spark.sql("select * from global_temp.orders_table limit 2").show()
create or replace temp view
# over writing a temp views orders_table.createOrReplaceTempView("orders_table") spark.sql("select * from global_temp.orders_table limit 2").show()