Big Data With Spark and Hadoop
EET736
Mini Project
PREPARED BY
2023761495
Final Project: Data Analysis Using Spark
This final project is similar to the Practice Project you completed, but this time no hints or
solutions are provided. You will create a DataFrame by loading data from a CSV file, then apply
transformations and actions to it using Spark SQL. To do this, complete the following
tasks:
Prerequisites
1. This lab assignment uses Python and Spark (PySpark), so it is essential to make sure
that the following libraries are installed in your lab environment or
within Skills Network (SN) Labs.
Download the CSV data.
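As a rough illustration of this step and the DataFrame creation described in the introduction, here is a minimal sketch in Python. The helper names download_csv and load_csv_as_dataframe, the view name "records", and the application name are hypothetical and not part of the assignment. The sketch also assumes the CSV has a header row and that PySpark is already installed.

```python
from pathlib import Path
from urllib.request import urlretrieve


def download_csv(url, dest):
    """Download the CSV at `url` to `dest`; skip the download if `dest` already exists."""
    target = Path(dest)
    if not target.exists():
        urlretrieve(url, str(target))
    return target


def load_csv_as_dataframe(csv_path, view_name="records"):
    """Load a CSV into a Spark DataFrame and register it as a temp view for Spark SQL.

    Assumption: the file has a header row; column types are inferred by Spark.
    """
    # Imported lazily so that download_csv still works where PySpark is missing.
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("Final Project").getOrCreate()
    df = spark.read.csv(str(csv_path), header=True, inferSchema=True)
    # The temp view lets later tasks query the data with spark.sql(...).
    df.createOrReplaceTempView(view_name)
    return spark, df
```

A call might look like spark, df = load_csv_as_dataframe(download_csv(DATA_URL, "data.csv")), where DATA_URL is a placeholder for the link given in the course materials. After that, a query such as spark.sql("SELECT COUNT(*) FROM records") would run against the registered view.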