Create PySpark dataframe from nested dictionary

Last Updated: 17 Jun, 2021
Author: sravankumar_171fa07058

In this article, we discuss how to create a PySpark DataFrame from a nested dictionary, using the createDataFrame() method of the SparkSession.

The input is a dictionary of dictionaries: each outer key identifies one record, and each inner dictionary holds that record's fields. Calling items() on the nested dictionary yields each outer key k together with its inner dictionary v, and each pair is unpacked into a Row:

    [Row(**{'': k, **v}) for k, v in data.items()]

Here the outer key is stored under an empty-string field name, and the keys of the inner dictionary become the remaining fields. Both examples then select only the inner fields, so the empty-named column does not appear in the output.

Example 1: Python program to create college data from a dictionary whose values are nested dictionaries of address details

    # import SparkSession and Row from the pyspark.sql module
    from pyspark.sql import Row, SparkSession

    # create a SparkSession and give the app a name
    spark = SparkSession.builder.appName('sparkdf').getOrCreate()

    # nested dictionary: outer key = student label, inner dict = fields
    data = {
        'student_1': {
            'student id': 7058,
            'country': 'India',
            'state': 'AP',
            'district': 'Guntur'
        },
        'student_2': {
            'student id': 7059,
            'country': 'Srilanka',
            'state': 'X',
            'district': 'Y'
        }
    }

    # build one Row per outer key; the outer key goes under the '' field
    rowdata = [Row(**{'': k, **v}) for k, v in data.items()]

    # create the PySpark DataFrame and keep only the inner fields
    final = spark.createDataFrame(rowdata).select(
        'student id', 'country', 'state', 'district')

    # display the PySpark DataFrame
    final.show()

Output:

    +----------+--------+-----+--------+
    |student id| country|state|district|
    +----------+--------+-----+--------+
    |      7058|   India|   AP|  Guntur|
    |      7059|Srilanka|    X|       Y|
    +----------+--------+-----+--------+

Example 2: Python program to create a DataFrame from nested dictionaries with 3 keys (3 columns)

    # import SparkSession and Row from the pyspark.sql module
    from pyspark.sql import Row, SparkSession

    # create a SparkSession and give the app a name
    spark = SparkSession.builder.appName('sparkdf').getOrCreate()

    # nested dictionary: outer key = student label, inner dict = fields
    data = {
        'student_1': {
            'student id': 7058,
            'country': 'India',
            'state': 'AP'
        },
        'student_2': {
            'student id': 7059,
            'country': 'Srilanka',
            'state': 'X'
        }
    }

    # build one Row per outer key; the outer key goes under the '' field
    rowdata = [Row(**{'': k, **v}) for k, v in data.items()]

    # create the PySpark DataFrame and keep only the inner fields
    final = spark.createDataFrame(rowdata).select(
        'student id', 'country', 'state')

    # display the PySpark DataFrame
    final.show()

Output:

    +----------+--------+-----+
    |student id| country|state|
    +----------+--------+-----+
    |      7058|   India|   AP|
    |      7059|Srilanka|    X|
    +----------+--------+-----+
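The flattening step that both examples rely on can also be tried in plain Python, without a Spark session. The sketch below is only an illustration: the helper name flatten_records, its key_field parameter, and the column name 'student' are assumptions introduced here and do not come from the article. It stores the outer key under a readable name instead of the empty string, so the column could be kept rather than dropped.

```python
def flatten_records(nested, key_field="student"):
    """Turn {outer_key: {field: value, ...}} into a list of flat dicts.

    flatten_records and key_field are hypothetical names used for
    illustration only.
    """
    rows = []
    for outer_key, fields in nested.items():
        # The outer key goes first; the inner fields follow in their
        # original order. If an inner dict already has a key equal to
        # key_field, the inner value overwrites the outer key.
        rows.append({key_field: outer_key, **fields})
    return rows


# Same nested dictionary as in Example 2.
data = {
    "student_1": {"student id": 7058, "country": "India", "state": "AP"},
    "student_2": {"student id": 7059, "country": "Srilanka", "state": "X"},
}

rows = flatten_records(data)
for row in rows:
    print(row)
```

Each resulting dictionary can be unpacked into a Row with Row(**row), exactly as the article does, before being handed to spark.createDataFrame(). Choosing a non-empty name for the outer key is a design choice of this sketch, not something the article prescribes.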