How to create a PySpark dataframe from multiple lists?
Last Updated : 30 May, 2021

In this article, we will discuss how to create a PySpark dataframe from multiple lists.

Approach:

Create the data as multiple lists and put the column names in another list. Then zip the data lists together:

zip(list1, list2, ..., listn)

Pass the zipped data to the spark.createDataFrame() method:

dataframe = spark.createDataFrame(data, columns)

Example 1: Python program to create two lists and create the dataframe using these two lists

Python3

# importing module
import pyspark

# importing SparkSession from the pyspark.sql module
from pyspark.sql import SparkSession

# creating a SparkSession and giving an app name
spark = SparkSession.builder.appName('sparkdf').getOrCreate()

# two lists of college data with three elements each
data = [1, 2, 3]
data1 = ["sravan", "bobby", "ojaswi"]

# specify column names
columns = ['ID', 'NAME']

# creating a dataframe by zipping the two lists
dataframe = spark.createDataFrame(zip(data, data1), columns)

# show the dataframe
dataframe.show()

Output:

+--+------+
|ID|  NAME|
+--+------+
| 1|sravan|
| 2| bobby|
| 3|ojaswi|
+--+------+

Example 2: Python program to create four lists and create the dataframe

Python3

# importing module
import pyspark

# importing SparkSession from the pyspark.sql module
from pyspark.sql import SparkSession

# creating a SparkSession and giving an app name
spark = SparkSession.builder.appName('sparkdf').getOrCreate()

# four lists of college data with three elements each
data = [1, 2, 3]
data1 = ["sravan", "bobby", "ojaswi"]
data2 = ["iit-k", "iit-mumbai", "vignan university"]
data3 = ["AP", "TS", "UP"]

# specify column names
columns = ['ID', 'NAME', 'COLLEGE', 'ADDRESS']

# creating a dataframe by zipping the four lists
dataframe = spark.createDataFrame(
    zip(data, data1, data2, data3), columns)

# show the dataframe
dataframe.show()

Output:

+--+------+-----------------+-------+
|ID|  NAME|          COLLEGE|ADDRESS|
+--+------+-----------------+-------+
| 1|sravan|            iit-k|     AP|
| 2| bobby|       iit-mumbai|     TS|
| 3|ojaswi|vignan university|     UP|
+--+------+-----------------+-------+
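One caveat worth knowing about this approach: zip() stops at the shortest list, so if one column list is shorter than the others, the extra rows are silently dropped before they ever reach spark.createDataFrame(). A minimal sketch of the row tuples zip produces (the shape createDataFrame consumes), using a hypothetical shortened names list and itertools.zip_longest as an alternative that pads missing values with None (which becomes null in the DataFrame):

```python
# Sketch: how zip() turns column lists into row tuples for createDataFrame().
# The names list here is deliberately one element short (an assumed example).
from itertools import zip_longest

ids = [1, 2, 3]
names = ["sravan", "bobby"]  # one element short

# zip() truncates to the shorter list: the row for id 3 is lost
rows = list(zip(ids, names))
print(rows)  # [(1, 'sravan'), (2, 'bobby')]

# zip_longest() keeps every row, padding gaps with None
padded = list(zip_longest(ids, names))
print(padded)  # [(1, 'sravan'), (2, 'bobby'), (3, None)]
```

Either list of tuples can then be passed to spark.createDataFrame(rows, columns); with zip_longest, the missing entries show up as null in the resulting column.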
Author: sravankumar_171fa07058