Spark Architecture
Mr. Virendra 1
Spark 2018
Job
|----- > CPU time
|----- > I/O time
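The split of a job's time into CPU time and I/O time can be made concrete with a small measurement. This is a minimal sketch using only the Python standard library, not Spark itself; the function name `run_job`, its parameters, and the use of `time.sleep` as a stand-in for I/O waiting are assumptions made for illustration.

```python
import time


def run_job(n=200_000, io_wait=0.05):
    """Run a toy job and report its CPU time and total elapsed time.

    Hypothetical helper: the loop stands in for computation and
    time.sleep stands in for waiting on disk or network I/O.
    """
    wall_start = time.perf_counter()   # elapsed (wall-clock) time
    cpu_start = time.process_time()    # CPU time used by this process

    total = sum(i * i for i in range(n))   # CPU-bound part
    time.sleep(io_wait)                    # simulated I/O wait, uses almost no CPU

    cpu_seconds = time.process_time() - cpu_start
    wall_seconds = time.perf_counter() - wall_start
    return total, cpu_seconds, wall_seconds


total, cpu_seconds, wall_seconds = run_job()
print(f"CPU time:     {cpu_seconds:.3f} s")
print(f"Elapsed time: {wall_seconds:.3f} s")
print(f"Waiting time: {max(wall_seconds - cpu_seconds, 0.0):.3f} s (roughly the I/O share)")
```

The gap between elapsed time and CPU time approximates the time the job spent waiting rather than computing, which is the part that faster storage or in-memory data can reduce.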
Driver Program
o The main executable program from which Spark
operations are launched.
o Controls and coordinates all operations.
o The driver program is the main class: it defines RDDs
and executes parallel operations on them across the cluster.
Spark Context
o The driver accesses Spark functionality through a
SparkContext object, which represents a connection to the
computing cluster.
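The relationship between the driver program, the SparkContext and an RDD can be sketched in PySpark as below. This is an illustrative sketch, not code from the slides: the application name `driver-sketch`, the `local[2]` master and the helper names `square`, `sum_of_squares` and `main` are assumptions. PySpark is imported inside `main` only so that the helper functions can be read and loaded on a machine without Spark installed.

```python
def square(x):
    """Plain function that the driver ships to the executors."""
    return x * x


def sum_of_squares(sc, numbers, num_slices=4):
    """Runs in the driver. `sc` is an existing pyspark.SparkContext."""
    rdd = sc.parallelize(numbers, num_slices)  # define an RDD split into partitions
    return rdd.map(square).sum()               # parallel map, then an action that returns to the driver


def main():
    # Assumption: PySpark is installed; "local[2]" runs Spark in-process with 2 threads.
    from pyspark import SparkConf, SparkContext

    conf = SparkConf().setAppName("driver-sketch").setMaster("local[2]")
    sc = SparkContext(conf=conf)   # the driver's connection to the cluster
    try:
        print(sum_of_squares(sc, range(1, 6)))
    finally:
        sc.stop()                  # release the connection when the driver finishes
```

Calling `main()` on a machine with PySpark installed prints 55, the sum of the squares of 1 to 5. On a real cluster the master URL would point at the cluster manager instead of `local[2]`.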
Spark Framework
Spark Engine:
Management:
Storage:
o Spark can read from and write to HDFS, S3, RDBMS, the local file system, NoSQL stores, etc.
Library:
o Spark SQL gives an easy way to read and write data.
o MLlib supports many machine learning algorithms.
o Spark supports graph analysis with the help of
GraphX.
o Spark supports real-time streaming with the help of
the Spark Streaming library.
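As an illustration of reading and writing data with Spark SQL across different storage systems, here is a minimal PySpark sketch. The paths in `EXAMPLE_PATHS` (host name, bucket name, file names) are hypothetical placeholders, and `csv_to_parquet` is a helper name invented for this example; `spark` is assumed to be an existing `SparkSession`.

```python
# Hypothetical example locations; only the URI scheme changes between storage systems.
EXAMPLE_PATHS = {
    "hdfs": "hdfs://namenode:8020/data/events.csv",
    "s3": "s3a://example-bucket/data/events.csv",
    "local": "file:///tmp/events.csv",
}


def csv_to_parquet(spark, src, dst):
    """Read a CSV file into a DataFrame and write it back out as Parquet.

    `spark` is an existing pyspark.sql.SparkSession.
    `src` and `dst` are storage URIs such as those in EXAMPLE_PATHS.
    """
    df = spark.read.csv(src, header=True, inferSchema=True)  # first row is the header, column types are guessed
    df.write.mode("overwrite").parquet(dst)                  # replace any existing output at dst
    return df
```

A call such as `csv_to_parquet(spark, EXAMPLE_PATHS["local"], "file:///tmp/events.parquet")` would use the same code for HDFS or S3 by changing only the path, provided the matching connector is available on the cluster.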
Programming
o Spark supports several languages: Scala, Python, R
and Java.
o Scala, Python and R have interactive shells.
o Spark works with many external tools through Spark SQL, for example over JDBC.