PySpark Fundamentals
4. Data Transformation
5. Data Profiling
7. Data Import/Export
9. Spark SQL
df.createOrReplaceTempView("table")
from pyspark.sql.functions import expr
df.withColumn("new_column", expr("SQL_expression"))
● Creating a GraphFrame:
from graphframes import GraphFrame
g = GraphFrame(vertices_df, edges_df)
● Running Graph Algorithms: results = g.pageRank(resetProbability=0.15, maxIter=10)
● Subgraphs and Motif Finding: g.find("(a)-[e]->(b); (b)-[e2]->(a)")
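The motif pattern above can be hard to read at first, so here is a minimal plain-Python sketch of what "(a)-[e]->(b); (b)-[e2]->(a)" selects: pairs of vertices joined by an edge in each direction. The edge list and the helper name find_mutual_edges are made up for illustration and are not part of GraphFrames; the sketch only mirrors the matching logic and does not need Spark.

```python
# Plain-Python illustration of the motif "(a)-[e]->(b); (b)-[e2]->(a)".
# The edge list below is hypothetical sample data, given as (src, dst) pairs.
edges = [
    ("alice", "bob"),
    ("bob", "alice"),
    ("bob", "carol"),
    ("carol", "dave"),
    ("dave", "carol"),
]


def find_mutual_edges(edge_list):
    """Return (a, b) pairs for which both a -> b and b -> a exist."""
    edge_set = set(edge_list)
    matches = []
    for src, dst in edge_list:        # candidate edge e:  (a)-[e]->(b)
        if (dst, src) in edge_set:    # required edge e2:  (b)-[e2]->(a)
            matches.append((src, dst))
    return matches


mutual = find_mutual_edges(edges)
print(mutual)
```

Each mutual pair shows up twice, once per direction, because either vertex can play the role of a. With GraphFrames, a post-hoc filter on the result such as g.find("(a)-[e]->(b); (b)-[e2]->(a)").filter("a.id < b.id") is one way to keep a single row per pair, assuming the vertex ids are comparable.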
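from pyspark.sql import SparkSession
spark = SparkSession.builder.config("spark.executor.memory", "2g").getOrCreate()  # executor memory must be set when the session is created, not on a running session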
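from pyspark.ml.feature import VectorAssembler
VectorAssembler(inputCols=["column"], outputCol="vector_col").transform(df)  # Vectors.dense() takes numeric values, not a column name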