The logical plan is essentially the lineage: it records which transformations must be applied, and in what order, to build an RDD. Whenever an action is called, this lineage is converted into a DAG, which is basically the physical plan.
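To make the lineage idea concrete, here is a rough pure-Python sketch. It is not Spark code: `ToyRDD` and its methods are hypothetical names made up for illustration. It only shows the behaviour described above, where transformations are recorded and nothing runs until an action (`collect` here) is called.

```python
class ToyRDD:
    """Toy stand-in for an RDD (hypothetical, not the Spark API).

    Each instance stores its parent and a transformation instead of data.
    """

    def __init__(self, source=None, parent=None, op_name="parallelize", fn=None):
        self.source = source    # only set on the root of the lineage
        self.parent = parent    # the RDD this one was derived from
        self.op_name = op_name  # name of the recorded transformation
        self.fn = fn            # function applied to the parent's rows

    def map(self, f):
        # Transformation: record the step, do not run it.
        return ToyRDD(parent=self, op_name="map",
                      fn=lambda rows: [f(x) for x in rows])

    def filter(self, pred):
        # Transformation: record the step, do not run it.
        return ToyRDD(parent=self, op_name="filter",
                      fn=lambda rows: [x for x in rows if pred(x)])

    def lineage(self):
        # Walk back through the parents to list the recorded steps in order.
        names = []
        node = self
        while node is not None:
            names.append(node.op_name)
            node = node.parent
        return list(reversed(names))

    def collect(self):
        # Action: only now is the recorded chain actually evaluated.
        if self.parent is None:
            return list(self.source)
        return self.fn(self.parent.collect())


seen = []  # records every element the map function touches


def double(x):
    seen.append(x)
    return x * 2


rdd = ToyRDD(source=[1, 2, 3, 4]).map(double).filter(lambda x: x > 4)
print(rdd.lineage())  # ['parallelize', 'map', 'filter']
print(seen)           # [] because no action has been called yet
print(rdd.collect())  # [6, 8]
```

In real Spark the recorded chain is what the scheduler works from once an action fires; this toy version just replays it recursively.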
It is a great video and helps a lot, Shreya. Two things would be interesting to know: 1. Why is a complete stage skipped? 2. A shuffle should create a different stage, but in the snapshot of the DAG it is in the same stage.
Thanks for sharing this valuable information in the simplest way. It saved a lot of time reading the Spark tutorial. Sincere thanks to you. I have one question: where can we see or generate this DAG visualization in Spark? We want to see the plan so that we can analyse our Spark program. Please let us know. 🙏 Thank you
I'm a bit confused... I just watched the Spark Summit videos. They state that in the physical planning stage Spark applies a 'Strategy' (a function that converts a logical plan tree to a physical plan tree)... Where does the DAG come into play?
The DAG is created once a Spark job is submitted. Think of the vertices as RDDs and the edges as the operations. The DAG scheduler's job is to convert the DAG into stages and tasks that perform the actual execution.
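A minimal sketch of that stage-splitting idea, in plain Python rather than Spark itself. The names `WIDE_OPS`, `split_into_stages` and `tasks_for` are hypothetical, the pipeline is a simple linear chain rather than a general DAG, and the task count assumes every stage has the same number of partitions, which is a simplification.

```python
# Operations assumed here to need a shuffle (wide dependencies).
# This is an illustrative subset, not an exhaustive list.
WIDE_OPS = {"reduceByKey", "groupByKey", "sortByKey", "repartition", "distinct"}


def split_into_stages(ops):
    """Cut a linear chain of operations into stages at shuffle boundaries.

    Narrow operations are pipelined into the current stage; a wide
    operation starts a new one.
    """
    stages = [[]]
    for op in ops:
        if op in WIDE_OPS and stages[-1]:
            stages.append([])
        stages[-1].append(op)
    return stages


def tasks_for(stages, num_partitions):
    """One task per (stage, partition) pair, assuming equal partition counts."""
    return [(stage_id, partition)
            for stage_id in range(len(stages))
            for partition in range(num_partitions)]


pipeline = ["textFile", "flatMap", "map", "reduceByKey", "mapValues"]
stages = split_into_stages(pipeline)
print(stages)
# [['textFile', 'flatMap', 'map'], ['reduceByKey', 'mapValues']]
print(len(tasks_for(stages, num_partitions=2)))  # 4
```

The point of the sketch is only the boundary rule: everything up to the shuffle is pipelined together in one stage, and the shuffle starts the next one.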