Create temp tables/views


createOrReplaceTempView()

What it does

It registers a DataFrame as a temporary view, which you can then reference by name in Spark SQL queries:

# Register the DataFrame
df_green.createOrReplaceTempView('trips_data')

# Now you can query it with SQL
result = spark.sql("""
    SELECT * 
    FROM trips_data 
    WHERE fare_amount > 10
""")

Types of temp tables/views:

1. Temporary View (session-scoped):

df.createOrReplaceTempView('my_table')
# Available only in current Spark session

2. Global Temporary View (application-scoped):

Key characteristics:

  • Temporary - exists only for the duration of the Spark session

  • In-memory metadata - only the view definition is stored in the session catalog; registering a view doesn't write data to disk

  • SQL interface - lets you use SQL instead of DataFrame API

  • Replaces if exists - createOrReplaceTempView() overwrites existing view with same name

Example workflow:

