PySpark Notes

Databricks provides a lakehouse ecosystem for data storage and processing using Apache Spark. The ecosystem includes DataFrames, which allow users to transform and manipulate data using operations like filtering, grouping, joining, and aggregating. This module covers how to use these DataFrame transformations and aggregation functions to write Spark code for different use cases.

Databricks Ecosystem

Introduction
In this section, we're going to cover the Databricks Ecosystem, along with a Spark Overview. We're going to discuss the Lakehouse, an exciting new paradigm in data storage and processing.

Our first lesson is DataFrame Transformation Methods and Operations.


Apache Spark Programming - Transformations

Aggregation Functions
Introduction

This module is all about transforming DataFrames. We're going to cover ways to manipulate data via the DataFrame API so that by the end of this module you'll be writing Spark code for a variety of use cases.
DateTime
