How to drop multiple column names given in a list from PySpark DataFrame?

Last Updated: 17 Jun, 2021

In this article, we are going to drop multiple columns, given as a list of column names, from a PySpark DataFrame in Python. For this, we will use the drop() function. It returns a new DataFrame without the specified columns; the DataFrame it is called on is not modified.

Syntax: dataframe.drop(*['column 1', 'column 2', 'column n'])

Where,
dataframe is the input dataframe
column names are the columns to drop, passed as a list; the * unpacks the list into separate arguments

Python code to create student dataframe with three columns:

Python3

# importing sparksession from pyspark.sql module
from pyspark.sql import SparkSession

# creating sparksession and giving an app name
spark = SparkSession.builder.appName('sparkdf').getOrCreate()

# list of students data
data = [["1", "sravan", "vignan"],
        ["2", "ojaswi", "vvit"],
        ["3", "rohith", "vvit"],
        ["4", "sridevi", "vignan"],
        ["1", "sravan", "vignan"],
        ["5", "gnanesh", "iit"]]

# specify column names
columns = ['student ID', 'student NAME', 'college']

# creating a dataframe from the lists of data
dataframe = spark.createDataFrame(data, columns)

print("Actual data in dataframe")

# show dataframe
dataframe.show()

Output:

Actual data in dataframe
+----------+------------+-------+
|student ID|student NAME|college|
+----------+------------+-------+
|         1|      sravan| vignan|
|         2|      ojaswi|   vvit|
|         3|      rohith|   vvit|
|         4|     sridevi| vignan|
|         1|      sravan| vignan|
|         5|     gnanesh|    iit|
+----------+------------+-------+

Example 1: Program to drop multiple columns whose names are given in a list.

Python3

# avoid naming the variable "list", which would shadow the Python built-in
columns_to_drop = ['student NAME', 'college']

# drop two columns; the result is stored separately so that
# dataframe keeps all three columns for the next examples
result = dataframe.drop(*columns_to_drop)
result.show()

Output:

+----------+
|student ID|
+----------+
|         1|
|         2|
|         3|
|         4|
|         1|
|         5|
+----------+

Example 2: Program to drop a single column whose name is given in a list.
Python3

columns_to_drop = ['college']

# drop one column; this assumes dataframe still holds all three columns
result = dataframe.drop(*columns_to_drop)
result.show()

Output:

+----------+------------+
|student ID|student NAME|
+----------+------------+
|         1|      sravan|
|         2|      ojaswi|
|         3|      rohith|
|         4|     sridevi|
|         1|      sravan|
|         5|     gnanesh|
+----------+------------+

Example 3: Drop all columns, with every column name given in a list.

Python3

columns_to_drop = ['student ID', 'student NAME', 'college']

# drop all columns in dataframe
result = dataframe.drop(*columns_to_drop)
result.show()

Output:

++
||
++
||
||
||
||
||
||
++

The result has no columns left, so show() prints only empty row markers, one per row.

Author: gottumukkalabobby
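
One detail the examples above do not cover: the PySpark documentation describes drop() as a no-op for column names that are not in the schema, so a misspelled name in the list is silently ignored rather than reported. The following is a minimal sketch of a check that can be run before dropping. Note that split_drop_list is a hypothetical helper written for this illustration, not part of PySpark, and it works on plain Python lists, so it assumes you pass it the value of dataframe.columns.

```python
def split_drop_list(existing_columns, columns_to_drop):
    """Hypothetical helper: separate names that exist from names that do not."""
    existing = set(existing_columns)
    present = [name for name in columns_to_drop if name in existing]
    missing = [name for name in columns_to_drop if name not in existing]
    return present, missing


# Column names of the student dataframe used in this article;
# with a real dataframe this list would come from dataframe.columns.
columns = ['student ID', 'student NAME', 'college']

# 'colege' is a deliberate typo to show what the check catches.
present, missing = split_drop_list(columns, ['student NAME', 'colege'])

print("will be dropped:", present)
print("not in dataframe:", missing)
```

After inspecting missing, the remaining names can be passed on as dataframe.drop(*present). The comparison here is an exact, case-sensitive string match, which is a simplification; Spark's own name resolution may be more lenient, for example about letter case, depending on configuration.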