A job is a unit of work that an executor runs on a cluster. A query is first analyzed and then split into stages according to the transformations it contains. Each stage is in turn split into multiple jobs, which can run in parallel and be pipelined for efficiency.
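
The breakdown described above can be illustrated with a small toy model. This is only a sketch under stated assumptions, not how any particular engine implements it: the names `Job`, `plan_stage`, and `run_query` are hypothetical, each transformation is assumed to form its own stage, each stage is assumed to produce one job per data partition, and a local thread pool stands in for the executors of a cluster. Pipelining is not modeled; stages simply run one after another.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass
from typing import Callable, List

Partition = List[int]
Transformation = Callable[[Partition], Partition]


@dataclass
class Job:
    """Hypothetical unit of work: one transformation applied to one partition."""
    stage_index: int
    partition_index: int
    func: Transformation
    data: Partition

    def run(self) -> Partition:
        return self.func(self.data)


def plan_stage(stage_index: int, func: Transformation,
               partitions: List[Partition]) -> List[Job]:
    # Assumption: a stage is split into one job per partition.
    return [Job(stage_index, i, func, part) for i, part in enumerate(partitions)]


def run_query(transformations: List[Transformation],
              partitions: List[Partition], workers: int = 4) -> List[Partition]:
    # The thread pool is a stand-in for executors on a cluster.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Assumption: each transformation in the query becomes one stage.
        for stage_index, func in enumerate(transformations):
            jobs = plan_stage(stage_index, func, partitions)
            # Jobs within a stage are independent, so they run in parallel.
            # pool.map returns results in the same order as the jobs.
            partitions = list(pool.map(Job.run, jobs))
    return partitions


query = [
    lambda part: [x * 2 for x in part],           # stage 0: double each value
    lambda part: [x for x in part if x % 3 == 0],  # stage 1: keep multiples of 3
]
result = run_query(query, [[1, 2, 3], [4, 5, 6], [7, 8, 9]])
print(result)  # [[6], [12], [18]]
```

In this sketch the output of one stage becomes the input partitions of the next, which is why the stages are processed in order while the jobs inside each stage are handed to the pool together.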