The core components of a Spark application are:

* Driver
  * When we submit a Spark application in cluster mode using spark-submit, the Cluster Resource Manager is contacted to start the Application Master; on YARN, the Driver runs inside the Application Master.
  * It also converts user code into a logical plan (DAG) and then into a physical plan.
* Application Master
  * The Driver asks the Application Master for executors to run the user code; the Application Master negotiates with the Resource Manager for the resources to host these executors.
* Spark Context
  * The Driver creates a Spark Context for each application. The Spark Context is the main entry point to Spark functionality.
* Executors
  * Executors are processes on the worker nodes whose job is to execute the assigned tasks.

The Spark execution model can be described in three phases:

* Logical Plan
  * User code is converted into a series of steps that are executed only when an action is performed.
  * The logical plan is a DAG describing how Spark will execute all the transformations.
* Physical Plan
  * The logical plan is converted into a physical plan using the Catalyst and Tungsten optimisation techniques.
  * A few of the methods the Catalyst optimiser uses while choosing the best physical plan:
    * Removing repeated operations (e.g. adding the same two numbers again for every row).
    * Predicate pushdown: pushing filters as close as possible to the data sources.
    * Column pruning: selecting only the needed columns.
  * Tungsten: generates optimised code from the query plan produced by the Catalyst optimiser, for execution on the actual cluster.
* Execution
  * The physical plan is converted into a number of stages, and each stage into tasks.
  * The Driver requests resources from the Cluster Manager. The Cluster Manager allocates containers and launches executors on all the allocated containers, and the Driver assigns tasks to run on those executors.
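
The idea that transformations are only recorded in a plan and run when an action is performed, with one task per partition, can be sketched in plain Python. This is a toy model and not Spark's API: `LazyDataset`, its `plan` list and `_run_task` are hypothetical names made up for illustration, and real Spark additionally optimises the plan and distributes the tasks across executors.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class LazyDataset:
    """Toy stand-in for a Spark dataset (hypothetical, not a Spark class)."""
    partitions: List[List[int]]
    plan: List[tuple] = field(default_factory=list)  # recorded transformations

    def map(self, fn):
        # Transformation: only recorded in the plan, nothing runs yet.
        return LazyDataset(self.partitions, self.plan + [("map", fn)])

    def filter(self, pred):
        # Transformation: only recorded in the plan, nothing runs yet.
        return LazyDataset(self.partitions, self.plan + [("filter", pred)])

    def _run_task(self, partition):
        # One task applies the whole recorded plan to a single partition.
        rows = partition
        for kind, fn in self.plan:
            if kind == "map":
                rows = [fn(r) for r in rows]
            else:
                rows = [r for r in rows if fn(r)]
        return rows

    def collect(self):
        # Action: triggers execution, one task per partition.
        out = []
        for partition in self.partitions:
            out.extend(self._run_task(partition))
        return out


ds = LazyDataset([[1, 2, 3], [4, 5, 6]])
planned = ds.map(lambda x: x * 10).filter(lambda x: x > 20)
print(len(planned.plan))   # 2 steps recorded, nothing executed so far
print(planned.collect())   # [30, 40, 50, 60]
```

Each call to `map` or `filter` returns a new object with a longer plan, so the original dataset is left unchanged, which loosely mirrors how a chain of transformations builds up a DAG before any work is done.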
Monday, March 23, 2020
Spark Core components and Execution Model