
Spark Optimizations & Deployment

Summary: this document discusses Spark optimizations, including narrow and wide transformations, RDD persistence, co-partitioning, and traffic minimization. Narrow transformations involve local computations, while wide transformations require data exchange between nodes. It explains how persisting RDDs in memory or on disk avoids recomputing them after a failure, and how storage levels and replication improve fault tolerance. Co-partitioning the inputs of a join and checkpointing to external storage such as HDFS enhance performance and reliability.


Spark optimizations & deployment
Big Data – 08/09/2021
Stéphane Vialle & Gianluca Quercini

1. Wide and Narrow transformations
2. Optimizations
3. Page Rank example
4. Deployment on clusters & clouds

Wide and Narrow transformations

Narrow transformations
• Local computations applied to each partition block
  → no communication between processes (or nodes)
  → only local dependencies (between parent & child RDDs)
• Examples: Map(), Filter(), Union()

• In case of a sequence of narrow transformations:
  possible pipelining inside one step
  Ex: a Map() step followed by a Filter() step can be fused into a single Map(); Filter() step

• In case of failure:
  → recompute only the damaged partition blocks
  → recompute/reload only their parent blocks, following the lineage
(Source: Stack Overflow)

Wide and Narrow transformations

Wide transformations
• Computations requiring data from all parent RDD blocks
  → many communications between processes (and nodes) (shuffle & sort)
  → non-local dependencies (between parent & child RDDs)
• Examples: groupByKey(), reduceByKey()

• In case of a sequence of transformations:
  → no pipelining of transformations
  → a wide transformation must be totally achieved before entering the next transformation

• In case of failure:
  → recompute the damaged partition blocks
  → recompute/reload all blocks of the parent RDDs
  Ex: a reduceByKey followed by a filter
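To make the distinction concrete, here is a minimal hedged sketch (assuming an existing SparkContext sc and an invented input path): the first three transformations are narrow and can be pipelined into one stage, while reduceByKey is wide and forces a shuffle.

  val words = sc.textFile("hdfs:///data/text")         // assumed input path
                .flatMap(line => line.split("\\s+"))   // narrow
                .filter(w => w.nonEmpty)               // narrow
                .map(w => (w, 1))                      // narrow
  val counts = words.reduceByKey(_ + _)                // wide: shuffle, new stage
  counts.take(5)                                       // action: triggers the job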


Wide and Narrow transformations

Avoiding wide transformations with co-partitioning
• With identical partitioning of inputs:
  wide transformation → narrow transformation
  Ex: a join with inputs not co-partitioned is wide;
      a join with co-partitioned inputs (using the same partition map) becomes narrow
• Benefits:
  • less expensive communications
  • possible pipelining
  • less expensive fault tolerance
→ Control the RDD partitioning and force co-partitioning

2. Optimizations: RDD Persistence

Optimizations: persistence

Persistence of the RDD
RDDs are stored:
• in the memory space of the Spark Executors
• or on the local disk of the node, when the memory space of the Executor is full

By default: an old RDD is removed when memory space is required
(Least Recently Used policy)
→ An old RDD has to be re-computed (using its lineage) when needed again
→ Spark allows making an RDD « persistent » to avoid re-computing it
(Source: Stack Overflow)

Persistence of the RDD to improve Spark application performance
The Spark application developer has to add instructions to force RDD storage, and to force RDD forgetting:

  myRDD.persist(StorageLevel)   // or myRDD.cache()
  …                             // Transformations and Actions
  myRDD.unpersist()

Available storage levels:
• MEMORY_ONLY          : in Spark Executor memory space
• MEMORY_ONLY_SER      : + serializing the RDD data
• MEMORY_AND_DISK      : spills to local disk when no memory space is left
• MEMORY_AND_DISK_SER  : + serializing the RDD data in memory
• DISK_ONLY            : always on disk (and serialized)

The RDD is saved in the Spark executor memory/disk space
→ limited to the Spark session
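A minimal hedged sketch of this pattern (RDD name, data and path are invented; StorageLevel comes from org.apache.spark.storage):

  import org.apache.spark.storage.StorageLevel

  val logs   = sc.textFile("hdfs:///logs")               // assumed input path
  val errors = logs.filter(_.contains("ERROR"))          // transformation (lazy)

  errors.persist(StorageLevel.MEMORY_AND_DISK)           // keep it after the first computation
  println(errors.count())                                // 1st action: computes and caches 'errors'
  println(errors.filter(_.contains("disk")).count())     // 2nd action: reuses the cached blocks
  errors.unpersist()                                     // free executor memory/disk space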

Optimizations: persistence

Persistence of the RDD to improve fault tolerance
To face short-term failures: the Spark application developer can force RDD storage with replication in the local memory/disk of several Spark Executors:

  myRDD.persist(StorageLevel.MEMORY_AND_DISK_SER_2)
  …                             // Transformations and Actions
  myRDD.unpersist()

To face serious failures: the Spark application developer can checkpoint the RDD outside of the Spark data space, on HDFS or S3 or…:

  myRDD.sparkContext.setCheckpointDir(directory)
  myRDD.checkpoint()
  …                             // Transformations and Actions

→ Longer, but secure!

2. Optimizations: RDD Co-partitioning
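For illustration, a small hedged sketch combining both mechanisms (someRDD and the checkpoint directory are hypothetical):

  import org.apache.spark.storage.StorageLevel

  // Replicated storage: survives the loss of one executor
  val ranks = someRDD.persist(StorageLevel.MEMORY_AND_DISK_SER_2)

  // Checkpoint: truncates the lineage and writes the data outside Spark (e.g. on HDFS)
  sc.setCheckpointDir("hdfs:///spark-checkpoints")   // assumed directory
  ranks.checkpoint()
  ranks.count()   // an action is needed to actually materialize the checkpoint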


Optimizations: RDD co-partitioning

5 main internal properties of a RDD:
• A list of partition blocks                        getPartitions()
• A function for computing each partition block     compute(…)
• A list of dependencies on other RDDs: the parent RDDs and the transformations to apply
                                                    getDependencies()
  → used to compute and re-compute the RDD when a failure happens
Optionally:
• A Partitioner for key-value RDDs: metadata specifying the RDD partitioning
                                                    partitioner()
  → used to control the RDD partitioning, to achieve co-partitioning…
• A list of nodes where each partition block can be accessed faster due to data locality
                                                    getPreferredLocations(…)
  → used to improve data locality with HDFS & YARN…

Specify a « partitioner »

  val rdd2 = rdd1
    .partitionBy(new HashPartitioner(100))
    .persist()

Creates a new RDD (rdd2):
• partitioned according to the hash partitioner strategy
• into 100 partition blocks, distributed over the Spark Executors
→ Redistributes the RDD (rdd1 → rdd2): a WIDE (expensive) transformation
• Do not keep the original partitioning (rdd1) in memory / on disk
• Keep the new partitioning (rdd2) in memory / on disk,
  to avoid repeating a WIDE transformation when rdd2 is re-used
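A hedged sketch of co-partitioning two pair RDDs before a join (rddA, rddB and the partition count are invented for illustration):

  import org.apache.spark.HashPartitioner

  val part = new HashPartitioner(100)

  // Both RDDs get the SAME partitioner and are persisted with that layout
  val a = rddA.partitionBy(part).persist()   // one wide repartitioning, done once
  val b = rddB.partitionBy(part).persist()

  // Because a and b are co-partitioned, the join does not need a full shuffle:
  // matching keys are already in the same partition block.
  val joined = a.join(b)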

Optimizations: RDD co-partitioning

Specify a « partitioner »

  val rdd2 = rdd1
    .partitionBy(new HashPartitioner(100))
    .persist()

Partitioners:
• Hash partitioner:
  keys Key0, Key0+100, Key0+200… end up on the same partition block (same Spark Executor)
• Range partitioner:
  keys in [Key-min ; Key-max] end up on the same partition block (same Spark Executor)
• Custom partitioner (develop your own partitioner):
  Ex: Key = URL, hash partitioned,
  BUT: hash only the domain name of the URL
  → all pages of the same domain end up on the same Spark Executor,
    because they are frequently linked (see the sketch below)

Avoid repetitive WIDE transformations on large data sets
• Scenario: A.join(B) is repeated, with the same partitioner used on the same set of keys
  → each repeated join is a wide transformation on both inputs
• Make ONE wide op (one time): re-partition A into A' with an explicit partitioner
  → avoid many wide ops: the repeated A'.join(B) becomes narrow on the A' side (wide only on B)
• An explicit partitioning « propagates » to the transformation result
• Replace wide ops by narrow ops
• Do not re-partition a RDD that is used only once!
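A hedged sketch of such a domain-based custom partitioner (the class and helper names are invented; only the partitioning idea comes from the slide):

  import java.net.URI
  import org.apache.spark.Partitioner

  // Hash only the domain of the URL, so that all pages of one domain
  // land in the same partition block (and thus on the same executor).
  class DomainPartitioner(val numParts: Int) extends Partitioner {
    override def numPartitions: Int = numParts
    override def getPartition(key: Any): Int = {
      val domain = new URI(key.toString).getHost          // e.g. "www.example.org"
      val h = if (domain == null) 0 else domain.hashCode
      ((h % numParts) + numParts) % numParts              // non-negative partition id
    }
    // Two partitioners are "equal" if they send every key to the same block
    override def equals(other: Any): Boolean = other match {
      case p: DomainPartitioner => p.numParts == numParts
      case _ => false
    }
  }

  // Usage (pagesByUrl is a hypothetical RDD[(String, Page)]):
  // val partitioned = pagesByUrl.partitionBy(new DomainPartitioner(100)).persist()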

Optimizations: RDD co-partitioning

Use the same partitioner to avoid repeating a wide op.:
• A is re-partitioned once into A' (wide, one time)
• B is created directly with the right partitioning (same partitioner, same set of keys)
• The repeated A'.join(B) is then narrow

Co-partitioning PageRank with a partitioner (see further):

  val links = ……   // previous code
  val links1 = links.partitionBy(new HashPartitioner(100)).persist()
  var ranks = links1.mapValues(v => 1.0)

  for (i <- 1 to iters) {
    val contribs =
      links1.join(ranks)
            .flatMap{ case (url, (urlLinks, rank)) =>
              urlLinks.map(dest => (dest, rank/urlLinks.size)) }
    ranks = contribs.reduceByKey(_ + _).mapValues(0.15 + 0.85 * _)
  }

• Initial links and ranks are co-partitioned
• The repeated join is Narrow (links1 side) – Wide (ranks side)
• The repeated mapValues is Narrow: it respects the reduceByKey partitioning
• Pb: flatMap{… urlLinks.map(…) } can change the partitioning?!


2. Optimizations: RDD controlled distribution

Optimization: RDD distribution

Create and distribute a RDD
• By default: the level of parallelism is set by the nb of partition blocks of the input RDD
• When the input is an in-memory collection (list, array…), it needs to be parallelized:

  val theData = List(("a",1), ("b",2), ("c",3),……)
  sc.parallelize(theData).theTransformation(…)

  Or:
  val theData = List(1,2,3,……).par
  theData.theTransformation(…)

→ Spark adopts a distribution adapted to the cluster…
  … but it can be tuned

Optimization: RDD distribution

Control of the RDD distribution
• Most transformations support an extra parameter to control the distribution (and the parallelism)
• Example:

  Default parallelism:
  val theData = List(("a",1), ("b",2), ("c",3),……)
  sc.parallelize(theData).reduceByKey((x,y) => x+y)

  Tuned parallelism:
  val theData = List(("a",1), ("b",2), ("c",3),……)
  sc.parallelize(theData).reduceByKey((x,y) => x+y, 8)
  → 8 partition blocks imposed for the result of the reduceByKey

2. Optimizations: Traffic minimization
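As a brief illustrative sketch (sample data invented), both the initial distribution and the result of a wide transformation can be tuned:

  val theData = List(("a",1), ("b",2), ("c",3), ("a",4), ("b",5))

  // Impose 4 partition blocks when parallelizing the in-memory collection…
  val rdd = sc.parallelize(theData, 4)

  // …and impose 8 partition blocks for the result of the wide transformation.
  val reduced = rdd.reduceByKey((x, y) => x + y, 8)

  println(rdd.getNumPartitions)       // 4
  println(reduced.getNumPartitions)   // 8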

Optimization: traffic minimization

RDD redistribution:   rdd : {(1, 2), (3, 3), (3, 4)}
Scala: rdd.groupByKey()  →  rdd: {(1, [2]), (3, [3, 4])}
Groups the values associated to the same key
→ Moves almost all input data
→ Huge traffic in the shuffle step!!
groupByKey will be time consuming:
• no computation time…
• … but huge traffic on the network of the cluster/cloud
• the equivalent of ((x,y) => x+y) on grouped values only concatenates:
  1 list + 1 list → 1 longer list

RDD reduction:   rdd : {(1, 2), (3, 3), (3, 4)}
Scala: rdd.reduceByKey((x,y) => x+y)  →  rdd: {(1, 2), (3, 7)}
Reduces the values associated to the same key
((x,y) => x+y): 1 int + 1 int → 1 int, applied before and after the shuffle
→ Limited traffic in the shuffle step

→ Optimize computations and communications in a Spark program (TD-1)
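A small hedged comparison sketch (data invented) of the two approaches for a per-key sum:

  val rdd = sc.parallelize(Seq((1, 2), (3, 3), (3, 4)))

  // groupByKey: every value crosses the network, then the groups are reduced
  val sums1 = rdd.groupByKey().mapValues(vs => vs.sum)      // {(1,2), (3,7)}

  // reduceByKey: values are pre-reduced inside each partition block
  // before the shuffle, so far less traffic for the same result
  val sums2 = rdd.reduceByKey((x, y) => x + y)              // {(1,2), (3,7)}

  println(sums1.collect().toList)
  println(sums2.collect().toList)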

4
08/09/2021

Optimization: traffic minimization

RDD reduction with different input and reduced datatypes:

  Scala: rdd.aggregateByKey(init_acc)(
           …,   // mergeValueAccumulator fct
           …    // mergeAccumulators fct
         )

  Scala: rdd.combineByKey(
           …,   // createAccumulator fct
           …,   // mergeValueAccumulator fct
           …    // mergeAccumulators fct
         )

Both pre-reduce the values inside each partition block before the shuffle,
then merge the accumulators after the shuffle.

2. Optimizations: Maintaining parallelism
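A hedged concrete sketch of aggregateByKey, computing a (sum, count) accumulator per key from Int values (names and data are invented):

  val marks = sc.parallelize(Seq(("julie", 12), ("marc", 10), ("julie", 15)))

  // Accumulator type (Int, Int) = (sum, count): different from the input value type Int
  val sumCount = marks.aggregateByKey((0, 0))(
    (acc, v) => (acc._1 + v, acc._2 + 1),          // mergeValueAccumulator: inside a block
    (a, b)   => (a._1 + b._1, a._2 + b._2)         // mergeAccumulators: after the shuffle
  )

  val averages = sumCount.mapValues { case (s, c) => s.toDouble / c }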

Optimization: maintaining parallelism

Computing an average value per key in parallel
theMarks: {("julie", 12), ("marc", 10), ("albert", 19), ("julie", 15), ("albert", 15),…}

• Solution 1: mapValues + reduceByKey + collectAsMap + foreach

  val theSums = theMarks
    .mapValues(v => (v, 1))
    .reduceByKey((vc1, vc2) => (vc1._1 + vc2._1,
                                vc1._2 + vc2._2))
    .collectAsMap()   // Returns a ‘Map’ datastructure
                      // ACTION → breaks parallelism! Bad performance!
  theSums.foreach(
    kvc => println(kvc._1 + " has average:" +
                   kvc._2._1/kvc._2._2.toDouble))   // Sequential computing!

• Solution 2: combineByKey + collectAsMap + foreach

  val theSums = theMarks
    .combineByKey(
      // createCombiner function
      (valueWithNewKey) => (valueWithNewKey, 1),
      // mergeValue function (inside a partition block)
      (acc:(Int, Int), v) => (acc._1 + v, acc._2 + 1),   // explicit types: type inference needs some help!
      // mergeCombiners function (after shuffle comm.)
      (acc1:(Int, Int), acc2:(Int, Int)) =>
        (acc1._1 + acc2._1, acc1._2 + acc2._2))
    .collectAsMap()   // Still bad performance! (breaks parallelism)
  theSums.foreach(
    kvc => println(kvc._1 + " has average:" +
                   kvc._2._1/kvc._2._2.toDouble))   // Still sequential!

Optimization: maintaining parallelism

Computing an average value per key in parallel
theMarks: {("julie", 12), ("marc", 10), ("albert", 19), ("julie", 15), ("albert", 15),…}

• Solution 2 (improved): combineByKey + map + collectAsMap + foreach

  val theSums = theMarks
    .combineByKey(
      // createCombiner function
      (valueWithNewKey) => (valueWithNewKey, 1),
      // mergeValue function (inside a partition block)
      (acc:(Int, Int), v) => (acc._1 + v, acc._2 + 1),
      // mergeCombiners function (after shuffle comm.)
      (acc1:(Int, Int), acc2:(Int, Int)) =>
        (acc1._1 + acc2._1, acc1._2 + acc2._2))
    .map{ case (k, vc) => (k, vc._1/vc._2.toDouble) }
    // Transformation: computes in parallel and returns a RDD

  theSums.collectAsMap().foreach(
    kv => println(kv._1 + " has average:" + kv._2))
    // Action: at the end (just to print)

3. Page Rank example


PageRank with Spark

PageRank objectives
Compute the probability of arriving at a web page when randomly clicking on web links…
• If a URL is referenced by many other URLs then its rank increases
  (because being referenced means that it is important – ex: URL 1)
• If an important URL (like URL 1) references other URLs (like URL 4),
  this will increase the destination's ranking
(Figure: a small web graph with url 1 … url 4; url 1 is an important URL referenced by many
pages, and url 4's rank increases because it is referenced by the important url 1)

PageRank principles
• Simplified algorithm:
  PR(u) = Σ_{v ∈ B(u)} PR(v) / L(v)
  where:
  • B(u): the set containing all pages linking to page u
  • PR(x): PageRank of page x
  • L(v): the number of outbound links of page v
  • PR(v)/L(v): contribution of page v to the rank of page u
• Initialize the PR of each page with an equi-probability
• Iterate k times: compute the PR of each page

PageRank with Spark

PageRank principles
• The damping factor: the probability that a user continues to click is a damping factor d
  PR(u) = (1 − d)/N + d · Σ_{v ∈ B(u)} PR(v) / L(v)
  • N: nb of documents in the collection
  • Usually: d = 0.85
  • The sum of all PR is 1
• Variant:
  PR(u) = (1 − d) + d · Σ_{v ∈ B(u)} PR(v) / L(v)
  • Usually: d = 0.85
  • The sum of all PR is Npages

PageRank first step in Spark (Scala)

  // read text file into Dataset[String] -> RDD1
  val lines = spark.read.textFile(args(0)).rdd
  val pairs = lines.map{ s =>
    // Splits a line into an array of
    // 2 elements according to space(s)
    val parts = s.split("\\s+")
    // create the pair <url, url>
    // for each line in the file
    (parts(0), parts(1))
  }
  // RDD1 <string, string> -> RDD2 <string, iterable>
  val links = pairs.distinct().groupByKey().cache()

Input file (one link per line):      links RDD:
  "url 4  url 3"                       url 4 → [url 3, url 1]
  "url 4  url 1"                       url 3 → [url 2, url 1]
  "url 2  url 1"                       url 2 → [url 1]
  "url 1  url 4"                       url 1 → [url 4]
  "url 3  url 2"
  "url 3  url 1"

PageRank with Spark

PageRank second step in Spark (Scala)
Initialization with 1/N equi-probability (N = 4 pages in the example):

  // links <key, Iter> RDD  →  ranks <key, 1.0/Npages> RDD
  var ranks = links.mapValues(v => 1.0/4.0)

• links.mapValues(…) is an immutable RDD
• var ranks is a mutable variable:
  var ranks = RDD1
  ranks = RDD2
  « ranks » is re-associated to a new RDD;
  RDD1 is forgotten… and will be removed from memory

Other strategy (used with the variant where the sum of all PR is Npages):

  // links <key, Iter> RDD  →  ranks <key, one> RDD
  var ranks = links.mapValues(v => 1.0)

  links RDD:                  ranks RDD:
  url 4 → [url 3, url 1]      url 4 → 1.0
  url 3 → [url 2, url 1]      url 3 → 1.0
  url 2 → [url 1]             url 2 → 1.0
  url 1 → [url 4]             url 1 → 1.0

PageRank third step in Spark (Scala)

  for (i <- 1 to iters) {
    val contribs =
      links.join(ranks)
           .flatMap{ case (url, (urlLinks, rank)) =>
             urlLinks.map(dest => (dest, rank/urlLinks.size)) }
    ranks = contribs.reduceByKey(_ + _)
                    .mapValues(0.15 + 0.85 * _)
  }

Data flow of one iteration (with initial ranks = 1.0):
• .join (output links & ranks):
  url 4 → ([url 3, url 1], 1.0); url 3 → ([url 2, url 1], 1.0); url 2 → ([url 1], 1.0); url 1 → ([url 4], 1.0)
• .flatMap (individual input contributions, contribs RDD):
  (url 3, 0.5), (url 1, 0.5), (url 2, 0.5), (url 1, 0.5), (url 1, 1.0), (url 4, 1.0)
• .reduceByKey (cumulated input contributions):
  url 1 → 2.0, url 2 → 0.5, url 3 → 0.5, url 4 → 1.0
• .mapValues (with damping factor, new ranks RDD):
  url 1 → 1.85, url 2 → 0.575, url 3 → 0.575, url 4 → 1.0


PageRank with Spark

PageRank third step in Spark (Scala)
• Spark & Scala allow a short/compact implementation of the PageRank algorithm
• Each RDD remains in-memory from one iteration to the next one

  val lines = spark.read.textFile(args(0)).rdd
  val pairs = lines.map{ s =>
    val parts = s.split("\\s+")
    (parts(0), parts(1)) }
  val links = pairs.distinct().groupByKey().cache()

  var ranks = links.mapValues(v => 1.0)

  for (i <- 1 to iters) {
    val contribs =
      links.join(ranks)
           .flatMap{ case (url, (urlLinks, rank)) =>
             urlLinks.map(dest => (dest, rank/urlLinks.size)) }
    ranks = contribs.reduceByKey(_ + _).mapValues(0.15 + 0.85 * _)
  }

PageRank third step in Spark (Scala): optimized with partitioner

  val links = ……   // previous code
  val links1 = links.partitionBy(new HashPartitioner(100)).persist()
  var ranks = links1.mapValues(v => 1.0)

  for (i <- 1 to iters) {
    val contribs =
      links1.join(ranks)
            .flatMap{ case (url, (urlLinks, rank)) =>
              urlLinks.map(dest => (dest, rank/urlLinks.size)) }
    ranks = contribs.reduceByKey(_ + _).mapValues(0.15 + 0.85 * _)
  }

• Initial links and ranks are co-partitioned
• The repeated join is Narrow (links1 side) – Wide (ranks side)
• The repeated mapValues is Narrow: it respects the reduceByKey partitioning
• Pb: flatMap{… urlLinks.map(…) } can change the partitioning?!

4. Deployment on clusters & clouds: Task DAG execution

Task DAG execution
• A RDD is a dataset distributed among the Spark compute nodes
• Transformations are lazy operations: they are saved and executed later
• Actions trigger the execution of the saved sequence of transformations

A job is a sequence of RDD transformations (map, mapValues, reduceByKey, …),
ended by an action that returns a result.

A Spark application is a set of jobs run sequentially or in parallel
→ a DAG of tasks (see the sketch below)
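A minimal hedged sketch of one job (input path and data format are invented): nothing is computed until the final action runs.

  val lines   = sc.textFile("hdfs:///data/access.log")      // transformation (lazy)
  val perHost = lines.map(l => (l.split("\\s+")(0), 1))     // transformation (lazy)
  val counts  = perHost.reduceByKey(_ + _)                  // transformation (lazy)

  // The action below triggers the whole job: Spark builds the DAG of tasks
  // for the three transformations above and schedules it on the executors.
  val top = counts.take(10)                                 // action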

Task DAG execution

The Spark application driver controls the application run
• It creates the Spark context
• It analyses the Spark program
• It creates a DAG of tasks for each job
• It optimizes the DAG
  − pipelining narrow transformations
  − identifying the tasks that can be run in parallel
• It schedules the DAG of tasks on the available worker nodes (the Spark Executors)
  in order to maximize parallelism (and to reduce the execution time)

Spark job trace: on 10 Spark executors, with a 3GB input file

  DAGScheduler: Submitting 24 missing tasks from ShuffleMapStage 0 ...
  TaskSchedulerImpl: Adding task set 0.0 with 24 tasks
  ...
  TaskSetManager: Starting task 1.0 in stage 0.0 (TID 1, 172.20.10.14, executor 0, partition 1, ...)
  TaskSetManager: Starting task 2.0 in stage 0.0 (TID 2, 172.20.10.11, executor 7, partition 2, ...)
  ...
  TaskSetManager: Starting task 10.0 in stage 0.0 (TID 10, 172.20.10.11, executor 7, partition 10, ...)
      → submitting the 10 first tasks on the 10 Spark executor processes
  TaskSetManager: Finished task 2.0 in stage 0.0 (TID 2) in 18274 ms … (executor 7) (1/24)
  TaskSetManager: Starting task 11.0 in stage 0.0 (TID 11, 172.20.10.7, executor 8, partition 11, ...)
  TaskSetManager: Finished task 8.0 in stage 0.0 (TID 8) in 18459 ms … (executor 8) (2/24)
      → submitting a new task when a previous one has finished
  ...
  TaskSchedulerImpl: Removed TaskSet 0.0, whose tasks have all completed, from pool
      → end of task graph execution


Task DAG execution

Execution time as a function of the number of Spark executors
Ex. of a Spark application run:
• from 1 up to 15 executors
• with 1 executor per node
(Figure: execution time in seconds, log scale from 32 to 512, vs number of nodes from 1 to 16)
→ Good overall decrease, but plateaus appear!
→ Probable load balancing problem…
Ex: a graph of 4 parallel tasks:
• on 1 node: time T
• on 2 nodes: T/2
• on 3 nodes: still T/2 → a plateau appears

4. Deployment on clusters & clouds: Spark execution on clusters
(using the Spark cluster manager in standalone mode, YARN, Mesos, then an example of Spark execution on cloud)

Using the Spark Master as cluster manager (standalone mode)

  spark-submit --master spark://node:port … myApp

(Architecture: the Spark Master is the Cluster Manager; it drives a set of cluster worker nodes)

Spark cluster configuration:
• Add the list of cluster worker nodes in the Spark Master config.
• Specify the maximum amount of memory per Spark Executor:
    spark-submit --executor-memory XX …
• Specify the total amount of CPU cores used to process one Spark application
  (through all its Spark executors):
    spark-submit --total-executor-cores YY …

• Default config:
  − (only) 1GB / Spark Executor
  − unlimited nb of CPU cores per application execution
  − the Spark Master creates one Executor on every Worker node to process each job…
• You can limit the total nb of cores per job
• You can concentrate the cores into few multi-core Executors
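For example, a plausible standalone submission combining these options (host, port and resource values are invented):

  spark-submit --master spark://master-node:7077 \
               --executor-memory 4G \
               --total-executor-cores 16 \
               myApp.jar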

Using the Spark Master as cluster manager (standalone mode)

  spark-submit --master spark://node:port … myApp

Spark cluster configuration:
• Default config:
  − (only) 1GB / Spark Executor
  − unlimited nb of CPU cores per application execution
  − the Spark Master creates one Executor on every Worker node to process each job:
    one multi-core Executor per node, invading all the cores!
• You can limit the total nb of cores per job
• You can concentrate the cores into few multi-core Executors

Client deployment mode:
• The Spark application Driver runs on the client machine, next to the Spark Master (Cluster Manager):
  − DAG builder
  − DAG scheduler-optimizer
  − Task scheduler
• The Spark executors run on the cluster worker nodes
→ Interactive control of the application: development mode


Using the Spark Master as cluster manager (standalone mode)

  spark-submit --master spark://node:port … myApp

Cluster deployment mode:
• The Spark application Driver (DAG builder, DAG scheduler-optimizer, Task scheduler)
  runs inside the cluster, next to the Spark Master (Cluster Manager)
• The laptop connection can be turned off: production mode

Data locality:
• The cluster worker nodes should be the HDFS Data Nodes, storing the initial RDD values
  or the new generated (and saved) RDDs
  → will improve the global data-computation locality
• When using HDFS: the Hadoop data nodes should be re-used as worker nodes for the Spark Executors

Using the Spark Master as cluster manager (standalone mode)

  spark-submit --master spark://node:port … myApp

The cluster worker nodes should be the Data Nodes, storing the initial RDD values or the
new generated (and saved) RDDs.
BUT, when using the Spark Master as Cluster Manager:
→ there is no way to localize the Spark Executors on the data nodes hosting the right RDD blocks!

(Architecture, cluster deployment mode: the Spark Master / Cluster Manager and the Spark
application Driver run inside the cluster, the HDFS Name Node manages the Hadoop Data Nodes,
and the Spark executors run on the worker nodes.)

Using the Spark Master as cluster manager (standalone mode)

  spark-submit --master spark://node:port … myApp

Strengths and weaknesses of the standalone mode:
• Nothing more to install (included in Spark)
• Easy to configure
• Can run different jobs concurrently
• Cannot share the cluster with non-Spark applications
• Cannot launch Executors on the data nodes hosting the input data
• Limited scheduling mechanism (unique queue)

4. Deployment on clusters & clouds: Using YARN as cluster manager


Using YARN as cluster manager

  export HADOOP_CONF_DIR=${HADOOP_HOME}/conf
  spark-submit --master yarn … myApp

(Architecture: the YARN Resource Manager drives cluster worker nodes that are also Hadoop
Data Nodes, managed by the HDFS Name Node.)

Spark cluster configuration:
• Add an env. variable defining the path to the Hadoop conf directory
• Specify the maximum amount of memory per Spark Executor
• Specify the amount of CPU cores used per Spark executor:
    spark-submit --executor-cores YY …
• Specify the nb of Spark Executors per job: --num-executors

• By default:
  − (only) 1GB / Spark Executor
  − (only) 1 CPU core per Spark Executor
  − (only) 2 Spark Executors per job
• Usually better with few large Executors (RAM & nb of cores)…
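For example, a plausible YARN submission using these options (all resource values are invented):

  export HADOOP_CONF_DIR=${HADOOP_HOME}/conf
  spark-submit --master yarn \
               --num-executors 10 \
               --executor-cores 4 \
               --executor-memory 8G \
               myApp.jar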

Using YARN as cluster manager

  export HADOOP_CONF_DIR=${HADOOP_HOME}/conf
  spark-submit --master yarn … myApp

Spark cluster configuration:
• Link the Spark RDD meta-data « preferred locations » to the HDFS meta-data about the
  « localization of the input file blocks »:

  val sc = new SparkContext(sparkConf,
    InputFormatInfo.computePreferredLocations(
      Seq(new InputFormatInfo(conf,
        classOf[org.apache.hadoop.mapred.TextInputFormat], hdfspath ))…

Client deployment mode:
• The YARN Resource Manager launches an App. Master on the cluster, acting as an Executor launcher
• The Spark Driver (DAG builder, DAG scheduler-optimizer, Task scheduler) and the Spark Context
  construction stay on the client machine
• The HDFS Name Node serves the input data blocks

Using YARN as cluster manager

  export HADOOP_CONF_DIR=${HADOOP_HOME}/conf
  spark-submit --master yarn … myApp

Client deployment mode:
• The YARN Resource Manager starts an App. Master on the cluster: an « Executor » launcher
• The Spark Driver (DAG builder, DAG scheduler-optimizer, Task scheduler) runs on the client machine
• The Spark executors run on the cluster worker nodes, close to the HDFS Name Node and Data Nodes

Cluster deployment mode:
• The App. Master hosts the Spark Driver inside the cluster
  (DAG builder, DAG scheduler-optimizer, Task scheduler)
• The Spark executors run on the cluster worker nodes


Using YARN as cluster manager

  export HADOOP_CONF_DIR=${HADOOP_HOME}/conf
  spark-submit --master yarn … myApp

YARN vs standalone Spark Master:
• Usually available on Hadoop/HDFS clusters
• Allows running Spark and other kinds of applications on HDFS
  (better to share a Hadoop cluster)
• Advanced application scheduling mechanisms
  (multiple queues, managing priorities…)
• Improvement of the data-computation locality… but is it critical?
  − Spark reads/writes only the input/output RDDs from disk/HDFS
  − Spark keeps the intermediate RDDs in-memory
  − With cheap disks: disk-I/O time > network time
  → Better to deploy many Executors on unloaded nodes?

4. Deployment on clusters & clouds: Using Mesos as cluster manager

Using MESOS as cluster manager

  spark-submit --master mesos://node:port … myApp

(Architecture: the Mesos Master is the Cluster Manager; the cluster worker nodes are also
Hadoop Data Nodes, managed by the HDFS Name Node.)

Mesos is a generic cluster manager
• Supporting to run both:
  − short-term distributed computations
  − long-term services (like web services)
• Compatible with HDFS

Using MESOS as cluster manager

  spark-submit --master mesos://node:port … myApp

Spark cluster configuration:
• Specify the maximum amount of memory per Spark Executor:
    spark-submit --executor-memory XX …
• Specify the total amount of CPU cores used to process one Spark application
  (through all its Spark executors):
    spark-submit --total-executor-cores YY …
• Default config:
  − create few Executors with the max nb of cores (like standalone)…
  − use all available cores to process each job        …in 2019

Client deployment mode (with just Mesos):
• No Application Master
• No input-data / Executor locality
• The Spark Driver (DAG builder, DAG scheduler-optimizer, Task scheduler) runs on the client
  machine; the Spark executors run on the cluster worker nodes


Using MESOS as cluster manager

  spark-submit --master mesos://node:port … myApp

Cluster deployment mode:
• The Spark Driver (DAG builder, DAG scheduler-optimizer, Task scheduler) runs inside the
  cluster, next to the Mesos Master (Cluster Manager)

• Coarse-grained mode: the number of cores allocated to each Spark Executor is set at launch
  time, and cannot be changed
• Fine-grained mode: the number of cores associated to an Executor can change dynamically,
  as a function of the number of concurrent jobs and of the load of each executor (a Mesos specificity!)
  → Better solution/mechanism to support many shell interpreters
  → But latency can increase (the Spark Streaming lib can be disturbed)

4. Deployment on clusters & clouds: Example of Spark execution on cloud

Using Amazon Elastic Compute Cloud « EC2 »

  spark-ec2 … -s <#nb of slave nodes>
              -t <type of slave nodes>
              launch MyCluster-1

→ Deploys a standalone Spark Master and its worker nodes on EC2 (cluster « MyCluster-1 »)

Using Amazon Elastic Compute Cloud « EC2 »

  spark-ec2 … -s <#nb of slave nodes>
              -t <type of slave nodes>
              launch MyCluster-1

  spark-ec2 … -s <#nb of slave nodes>
              -t <type of slave nodes>
              launch MyCluster-2

• Each launch command deploys an independent cluster (MyCluster-1, MyCluster-2, …) with its
  own standalone Spark Master, HDFS Name Node, Spark application Driver (DAG builder,
  DAG scheduler-optimizer, Task scheduler) and Spark executors


Using Amazon Elastic Compute Cloud « EC2 »

Typical session:

  spark-ec2 … launch MyCluster-1
  spark-ec2 destroy MyCluster-2               (remove the second, unused cluster)
  spark-ec2 get-master MyCluster-1            → returns the MasterNode address
  scp … myApp.jar root@MasterNode
  spark-ec2 … login MyCluster-1
  spark-submit --master spark://node:port … myApp
  …
  spark-ec2 destroy MyCluster-1

Using Amazon Elastic Compute Cloud « EC2 »

  spark-ec2 … launch MyCluster-1
  spark-ec2 get-master MyCluster-1            → returns the MasterNode address
  scp … myApp.jar root@MasterNode
  spark-ec2 … login MyCluster-1
  spark-submit --master spark://node:port … myApp

  spark-ec2 stop MyCluster-1                  → stop billing
  spark-ec2 … start MyCluster-1               → restart billing
  spark-ec2 destroy MyCluster-1

Start by learning to deploy HDFS and Spark architectures.
Then, learn to deploy these architectures in a CLOUD…
… or use a « Spark Cluster service »: ready to use in a CLOUD!

Learn to minimize the cost (€) of a Spark cluster:
• Allocate the right number of nodes
• Stop the cluster when you do not use it, and re-start it later

Choose to allocate reliable or preemptible machines:
• Reliable machines during all the session (standard)
• Preemptible machines (5× less expensive!)
  → require to tolerate losing some tasks, or to checkpoint…

Spark optimizations & deployment
