spark.sql(""" CREATE TABLE IF NOT EXISTS nyc_taxi USING DELTA LOCATION '/mnt/delta/nyc-taxi' """)
spark.sql(""" CREATE OR REPLACE VIEW cleaned_taxi AS SELECT * FROM nyc_taxi WHERE passenger_count IS NOT NULL AND trip_distance > 0 """)
spark.sql(""" CREATE FUNCTION calculate_fare_per_mile(fare FLOAT, distance FLOAT) RETURNS FLOAT RETURN fare / distance """)
Tip: If 'CREATE FUNCTION' fails with an error saying the function already exists, run spark.sql(""" DROP FUNCTION <function_name> """) to delete it,
then rerun the 'CREATE FUNCTION' command.
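The drop-then-create sequence from the tip can be wrapped in a small helper so the cell can be rerun safely. This is a minimal sketch, not part of the original walkthrough: the helper name `recreate_function` and its parameters are hypothetical, and it assumes `spark` is an existing SparkSession. It uses `DROP FUNCTION IF EXISTS`, so the drop is a no-op when the function is not there yet.

```python
def recreate_function(spark, name, signature_and_body):
    """Drop the SQL function `name` if it exists, then create it again.

    `recreate_function` is a hypothetical helper; `spark` is assumed to be
    any object with a SparkSession-style .sql(str) method.
    """
    # IF EXISTS avoids an error when the function has not been created yet.
    spark.sql(f"DROP FUNCTION IF EXISTS {name}")
    # Recreate the function from the supplied signature and body.
    spark.sql(f"CREATE FUNCTION {name}{signature_and_body}")


# Usage with an existing SparkSession named `spark`:
# recreate_function(
#     spark,
#     "calculate_fare_per_mile",
#     "(fare FLOAT, distance FLOAT) RETURNS FLOAT RETURN fare / distance",
# )
```

Keeping the drop and the create together means rerunning the notebook does not depend on whether an earlier run already registered the function.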
spark.sql(""" SELECT *, calculate_fare_per_mile(total_amount, trip_distance) AS fare_per_mile FROM cleaned_taxi """).write.mode("overwrite").format("delta").save("/mnt/delta/transformed-taxi")
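To see what the view and the function do together, here is a plain-Python sketch of the same logic on a few rows. The sample rows are made up for illustration and the helper names `is_clean` and `add_fare_per_mile` are hypothetical; no Spark is needed to run it. It mirrors the `cleaned_taxi` filter (non-null passenger count, positive trip distance) and the `fare / distance` calculation, which shows why the division is safe: rows with a zero or missing distance never reach it.

```python
# Hypothetical sample rows; the values are made up for illustration only.
rows = [
    {"passenger_count": 1, "trip_distance": 2.0, "total_amount": 10.0},
    {"passenger_count": None, "trip_distance": 3.0, "total_amount": 12.0},
    {"passenger_count": 2, "trip_distance": 0.0, "total_amount": 5.0},
]


def is_clean(row):
    """Mirror the cleaned_taxi filter: passenger_count IS NOT NULL AND trip_distance > 0."""
    distance = row["trip_distance"]
    # In SQL a NULL distance makes `trip_distance > 0` NULL, so the row is dropped.
    return row["passenger_count"] is not None and distance is not None and distance > 0


def add_fare_per_mile(row):
    """Mirror calculate_fare_per_mile(total_amount, trip_distance)."""
    return {**row, "fare_per_mile": row["total_amount"] / row["trip_distance"]}


# Only the first row passes the filter, so no division by zero can occur.
transformed = [add_fare_per_mile(r) for r in rows if is_clean(r)]
print(transformed)
```

Only the first row survives the filter, and its `fare_per_mile` is 10.0 / 2.0 = 5.0.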



