
Distributed Job Scheduler Design Deep Dive with Google SWE! | Systems Design Interview Question 25 

Jordan has no life
40K subscribers
26K views

For my next trick I'll show you all how to make a blow job scheduler
00:18 Introduction
01:22 Functional Requirements
02:33 Capacity Estimates
03:21 API Design
04:09 Database Schema
05:18 Architectural Overview

Science

Published: 8 Jul 2024

Comments: 91
@shivanshjagga255 • A year ago
4:43 DB schema: [jobID, s3URL, status, retryTimestamp]; status is an ENUM (NOT_STARTED / STARTED / CLAIMED / DONE).
7:00 Querying the DB: ACID compliance; indexing should be done on the timestamp. Query: select the tasks that are NOT_STARTED where timestamp < current_time.
8:50 Failure during a job run: MQ failure, node failure.
9:44 New query: select the tasks that are NOT_STARTED where timestamp < current_time, AND the tasks that are STARTED where timestamp + enqueuing time + heartbeat < current_time.
10:46 Messaging queue choice.
12:14 Claim service / DB + ZooKeeper. ZooKeeper checks whether a node is down or not; then we can write in the metadata DB that it's a retryable error.
14:54 A node dies, comes back up, and tries the job again = 2 nodes trying the job → distributed lock.
Ending note: how to schedule jobs at a fixed rate (WEEKLY / MONTHLY). The task-runner service itself writes to the DB the next time the task should be run; e.g. for a BI-WEEKLY schedule, it adds the next time it has to run.
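The schema and the 9:44 ready-jobs query summarized in this comment can be sketched as follows. This is a minimal sketch, not the video's implementation: the table and column names, and the grace/heartbeat values, are assumptions.

```python
import sqlite3

# Hypothetical schema following the comment: [jobID, s3URL, status, retryTimestamp]
db = sqlite3.connect(":memory:")
db.execute("""
    CREATE TABLE jobs (
        job_id        TEXT PRIMARY KEY,
        s3_url        TEXT,
        status        TEXT CHECK (status IN ('NOT_STARTED','CLAIMED','STARTED','DONE')),
        run_timestamp REAL    -- when the job is due to run
    )""")

ENQUEUE_GRACE = 60.0   # assumed allowance for enqueuing time (seconds)
HEARTBEAT = 30.0       # assumed heartbeat interval (seconds)

def due_jobs(now):
    """The 9:44 query: due NOT_STARTED jobs, plus STARTED jobs whose
    timeout (run time + enqueuing time + heartbeat) has elapsed."""
    return db.execute(
        """SELECT job_id FROM jobs
           WHERE (status = 'NOT_STARTED' AND run_timestamp < ?)
              OR (status = 'STARTED' AND run_timestamp + ? + ? < ?)""",
        (now, ENQUEUE_GRACE, HEARTBEAT, now)).fetchall()

now = 1000.0
db.executemany("INSERT INTO jobs VALUES (?,?,?,?)", [
    ("a", "s3://jobs/a", "NOT_STARTED", 900.0),   # due
    ("b", "s3://jobs/b", "NOT_STARTED", 1100.0),  # not yet due
    ("c", "s3://jobs/c", "STARTED",     800.0),   # timed out, retry it
    ("d", "s3://jobs/d", "DONE",        700.0),   # finished
])
print([j for (j,) in due_jobs(now)])  # -> ['a', 'c']
```

An index on `run_timestamp` (as the comment notes) keeps this poll cheap as the table grows.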
@shivujagga • A year ago
18:25 The whole flow:
1. Client uploads a job → it goes to S3 and gets stored in the DB with its schedule.
2. The enqueue service (1 machine) polls the DB every minute for all due jobs, using the query mentioned at 9:44.
3. It batches and sends the jobs to the MQ.
4. The MQ sends them to multiple workers, which send heartbeats to ZooKeeper. (ZooKeeper is used for distributed locking of jobs being run.)
5. The worker updates the STATUS according to whether the job completed.
One question that isn't addressed though, @Jordan has no life: what if the worker completes the job but fails just before updating the job's STATUS to COMPLETED in the DB?
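Steps 2-3 of the flow above (poll, claim, batch onto the queue) can be sketched like this, with a plain list standing in for the real message queue; all names and the batch size are assumptions, not from the video:

```python
import sqlite3

# The enqueue service polls for due jobs, marks them CLAIMED in the same
# transaction, and batches them onto a queue, so a second poll (or a
# second enqueuer) does not re-send the same jobs.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE jobs (job_id TEXT PRIMARY KEY, status TEXT, run_ts REAL)")
db.executemany("INSERT INTO jobs VALUES (?,?,?)",
               [("a", "NOT_STARTED", 900.0), ("b", "NOT_STARTED", 950.0),
                ("c", "NOT_STARTED", 2000.0)])
message_queue = []  # stand-in for SQS / RabbitMQ

def enqueue_due_jobs(now, batch_size=100):
    with db:  # one transaction: claim and enqueue atomically
        batch = db.execute(
            "SELECT job_id FROM jobs WHERE status='NOT_STARTED' AND run_ts < ? LIMIT ?",
            (now, batch_size)).fetchall()
        ids = [j for (j,) in batch]
        db.executemany("UPDATE jobs SET status='CLAIMED' WHERE job_id=?",
                       [(j,) for j in ids])
        message_queue.extend(ids)

enqueue_due_jobs(1000.0)
enqueue_due_jobs(1000.0)  # second poll: nothing re-sent, jobs are CLAIMED
print(message_queue)  # -> ['a', 'b']
```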
@shrutimistry2086 • A year ago
Some visual diagrams would be very useful, to better follow along with your explanation
@jordanhasnolife5163 • A year ago
Yep, I'm trying to be better about visualizations in my new series
@pawandeepb5967 • A year ago
Very nice videos, awesome work!
@xmnemonic • A year ago
easy to listen to and follow, thanks for making this
@geekwithabs • 16 days ago
At this point, based on that rad intro, I have to ask: Have you considered a role in Hollywood? 😉
@jordanhasnolife5163 • 16 days ago
They send me linkedin DMs sometimes asking me to be an underwear model
@cambriandot3665 • A year ago
12:05 Job run: Distributed locks, heartbeats, retries, fencing tokens 15:30 More than once runs 16:28 Recurring jobs
@shivujagga • A year ago
So helpful!!!
@zy3394 • 27 days ago
Why does ZooKeeper have no arrow outwards? Shouldn't it be notifying the DB of task status changes, like updating the task status to complete / not complete, etc.?
@jordanhasnolife5163 • 25 days ago
Probably just because I forgot to include it in the diagram.
@silentsword9518 • A year ago
This video on job schedulers is by far the best I've come across on YouTube. Thank you for creating it! I have a question though: it seems a lot of effort goes into ensuring "exactly-once" semantics here, by doing retries and having ZooKeeper as well as the claim service. Would that work be eased a bit if we used Kafka? My understanding is that Kafka has better support for "exactly-once" and also uses ZooKeeper internally.
@jordanhasnolife5163 • A year ago
Yeah definitely, I think though that maybe for the sake of the interview it's worth breaking that down
@zhonglin5985 • 2 months ago
How does the job claim service communicate with ZooKeeper? Does it poll ZooKeeper once in a while, get all the running jobs' statuses, and then update our JobStatusTable?
@jordanhasnolife5163 • 2 months ago
You can put something called a "watch" in zookeeper which will notify you when it changes
@prateekaggarwal3305 • A year ago
Hi Jordan, how often will job schedules be polled from the DB: every second, every minute? Do we also need to define an SLA for picking a job from the table?
@jordanhasnolife5163 • A year ago
I think that's based on the SLA like you said, personally I think something like every 10 seconds is probably reasonable
@akshay-kumar-007 • A year ago
Can you elaborate on how the SLA would work in this scenario for scheduling a job?
@allo1579 • A year ago
Hey Jordan! I didn't get why we need a lock here. If we enqueue a task into SQS, only one consumer will pick it up anyway (I think SQS takes care of concurrency here), and for the duration of the execution we can hide the task in the queue. Also, what happens to a task in the queue? Does the worker remove it from the queue, or make it invisible for the duration of execution?
@jordanhasnolife5163 • A year ago
Locks are important because tasks may be put in the queue again if the system thinks a task failed to execute (e.g. a timeout is exceeded). Yes, once a task is removed from a queue it won't be delivered again; however, like I mentioned, it could be re-enqueued if we mistakenly think it has failed.
@allo1579 • A year ago
@@jordanhasnolife5163 Oh, that makes sense! And what about a task in the queue? A task can take very long to execute, so I assume making it invisible in the queue is not really an option? Does the executor remove it from the queue? In that case, if it dies, who re-queues the task?
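The re-enqueue hazard discussed in this thread is what fencing tokens (mentioned in the 12:05 chapter notes) guard against. A toy sketch, not the video's implementation, with all class and method names hypothetical: the lock service hands out a monotonically increasing token with each grant, and the status store rejects writes carrying an older token than the newest it has seen.

```python
class LockService:
    """Hands out a strictly increasing fencing token per lock grant."""
    def __init__(self):
        self.token = 0
    def acquire(self, job_id):
        self.token += 1
        return self.token

class StatusStore:
    """Rejects writes whose token is older than the newest seen per job."""
    def __init__(self):
        self.status = {}
        self.highest_token = {}
    def write(self, job_id, token, status):
        if token < self.highest_token.get(job_id, 0):
            return False  # stale worker (e.g. a paused node waking up): reject
        self.highest_token[job_id] = token
        self.status[job_id] = status
        return True

locks, store = LockService(), StatusStore()
t1 = locks.acquire("job-42")   # worker 1 claims, then stalls (GC pause, netsplit)
t2 = locks.acquire("job-42")   # job re-enqueued, worker 2 claims a newer token
store.write("job-42", t2, "DONE")       # accepted
store.write("job-42", t1, "STARTED")    # zombie write: rejected
print(store.status["job-42"])  # -> DONE
```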
@aritraroy3493 • A year ago
I didn't know you were chill like that either 😫
@jordanhasnolife5163 • A year ago
Listen bro if you didn't know I'm pretty damn chill
@aritraroy3493 • A year ago
@@jordanhasnolife5163 Left the freezer open 😨
@user-ke8bx3nw6o • A year ago
Hey @jordan, thanks for the great video on the scheduler design. I have a small query: what will happen if we run multiple consumers for the service that polls data from the DB and pushes it to the queue? For scalability we may need to run multiple consumers, and there is a probability that jobs will get duplicated in the queue.
@jordanhasnolife5163 • A year ago
If our database uses transactions we wouldn't have to worry about this, each consumer could just mark a row as "being uploaded to queue" before they attempt to upload it and other consumers won't touch it if that happens
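A minimal sketch of the "mark a row before uploading" idea (names hypothetical; in PostgreSQL a similar effect is often achieved with `SELECT ... FOR UPDATE SKIP LOCKED`): a conditional UPDATE only succeeds for the consumer that flips the row first, so duplicate enqueuers skip rows another consumer already grabbed.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE jobs (job_id TEXT PRIMARY KEY, status TEXT)")
db.execute("INSERT INTO jobs VALUES ('a', 'NOT_STARTED')")

def try_claim(job_id):
    # Only transitions NOT_STARTED -> ENQUEUEING; a row someone else
    # already claimed matches zero rows, so rowcount tells us who won.
    cur = db.execute(
        "UPDATE jobs SET status='ENQUEUEING' "
        "WHERE job_id=? AND status='NOT_STARTED'", (job_id,))
    db.commit()
    return cur.rowcount == 1

first = try_claim("a")   # consumer 1 wins the row
second = try_claim("a")  # consumer 2 sees 0 rows updated and moves on
print(first, second)  # -> True False
```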
@rajatbansal112 • A year ago
I think the data schema can be better. We can have a job table which contains jobId, name, cron expression, etc., and another table, job_execution, which maintains every execution of a job.
@jordanhasnolife5163 • A year ago
Seems reasonable to me
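The two-table split suggested above might look like this. A sketch only: all table and column names are assumptions. `jobs` holds the schedule definition, `job_executions` holds one row per run, so recurring jobs keep a full history of past executions.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE jobs (
        job_id    TEXT PRIMARY KEY,
        name      TEXT,
        cron_expr TEXT            -- e.g. '0 0 * * *' for nightly
    );
    CREATE TABLE job_executions (
        execution_id TEXT PRIMARY KEY,   -- e.g. a UUID per run
        job_id       TEXT REFERENCES jobs(job_id),
        scheduled_ts REAL,
        status       TEXT DEFAULT 'NOT_STARTED'
    );
""")
db.execute("INSERT INTO jobs VALUES ('j1', 'nightly-report', '0 0 * * *')")
db.executemany("INSERT INTO job_executions VALUES (?,?,?,?)", [
    ("e1", "j1", 1000.0, "DONE"),
    ("e2", "j1", 2000.0, "NOT_STARTED"),
])
history = db.execute(
    "SELECT execution_id, status FROM job_executions "
    "WHERE job_id='j1' ORDER BY scheduled_ts").fetchall()
print(history)
```

This also answers the later question about recurring-job status: each run gets its own execution row and ID, and the client queries the execution history rather than a single mutable status field.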
@sanampreet3045 • A year ago
Great video! Just a small question: when a consumer node dies (stops sending heartbeats), how do we mark the job status as failed? Is ZooKeeper holding info on which consumer node is running which job ID?
@jordanhasnolife5163 • A year ago
Yes because to start claiming a job a consumer must grab the corresponding job lock in zookeeper.
@user-vv8fw4fj5i • 9 months ago
It looks to me like there's overlapping work between the job claim service and ZooKeeper; can ZooKeeper also do the job the claim service does?
@jordanhasnolife5163 • 9 months ago
Assuming that you mean the distributed locking part, then yeah I think so
@julianosanm • 6 months ago
How would we differentiate if the job timed out or is just taking long to execute? How can we prevent it from running twice or even indefinitely? Would it make more sense to use a log based queue and let it take care of retries?
@jordanhasnolife5163 • 6 months ago
To be honest, the challenging part of distributed computing is that you can never truly know. Networks aren't perfect and so nothing is certain, jobs can complete years after in theory. But, as long as you set a reasonable timeout, and make your jobs idempotent, it's ok! Using a log based queue is totally fine too, but it would still have to use timeouts somewhere
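The "make your jobs idempotent" point can be illustrated with a toy compare-and-set status table (all names hypothetical, not from the video). It also addresses the earlier question about a worker dying right before marking a job COMPLETED: the retried run converges to the same end state instead of producing a second effect.

```python
class JobStatusTable:
    """Toy status table with an idempotent DONE transition."""
    def __init__(self):
        self.status = {}
    def mark_done(self, job_id):
        if self.status.get(job_id) == "DONE":
            return False  # already done: duplicate delivery is a no-op
        self.status[job_id] = "DONE"
        return True

table = JobStatusTable()

def run_job(job_id, side_effects):
    # The job's real work must itself be idempotent (e.g. writing its
    # output to a deterministic key) for a retry to be safe.
    side_effects[job_id] = "output"
    return table.mark_done(job_id)

effects = {}
run_job("j1", effects)  # first run: worker dies before anyone sees DONE
run_job("j1", effects)  # re-run after timeout: same state, no double effect
print(table.status["j1"], effects)
```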
@desltiny2884 • 9 months ago
FIRST 30 SECONDS HAHAHA THE BEST
@tavvayaswanth • A year ago
If we maintain some state in the database, like submitted, queued, running, success, failed, we don't need any distributed lock on a job. Your enqueuing service would only poll for jobs in the submitted state, jobs running for too long, and failed ones, and all of this can be done at the serializable isolation level in MySQL, since we opted for it in the first place.
@jordanhasnolife5163 • A year ago
While I agree that the majority of the time, this ought to work, ACID properties aren't enough because our SQL database could go down, and unless it has strong consistency (not recommended for performance reasons/network partition tolerance), it may be possible that a claimed job may not seem claimed in the database replicas. Ultimately, we will need some sort of consensus here.
@tavvayaswanth • A year ago
@@jordanhasnolife5163 Agreed on the database-could-go-down part, but this is where many master-slave systems (HBase, for example) use consensus to elect the right master, and hence we get strong consistency. Theoretically both of our solutions have to use consensus anyway; it's just that you have a separate distributed lock service. Got it. By the way, your videos are great. Way to go!
@rishindrasharma7278 • A year ago
7:04 nice job ;)
@champboy • A year ago
What if these jobs had different priorities and we had to change the priority of a job at any point? (Mainly concerned about the priority changing while it's in the queue.) For longer-running jobs, staying in the old priority queue might not be an option.
@jordanhasnolife5163 • A year ago
A bit confused here when you say the queue. We could index our SQL table by priority, or we could shard multiple tables with priority. Once it's in the queue it's going to be run more or less - perhaps you could do some weird type of in memory heap but that seems a bit extra
@ArifSiddiquee • 5 months ago
Thanks for the excellent video. I have a couple of questions. How are job IDs created? Are they globally unique? When a recurring job gets another entry in the metadata DB, does it get a different ID? How does a client get the status of recurring jobs? Should there be a different DB to store the statuses of previous runs?
@jordanhasnolife5163 • 5 months ago
Yeah I think just creating a particular job run with a UUID is fine. Somebody else in the comments here suggested using a "JobExecutions" table which tracks the status of completed jobs as opposed to scheduled ones, I think that would work nicely here.
@valty3727 • A year ago
6:10 what is it about the message queue that doesn't allow us to get any information about the job other than 'run' or 'not run'? admittedly my knowledge of message queues is kind of shaky but couldn't we configure a log-based message broker to give us info other than 'run' or 'not run'? also if you want another video idea, system design of a doordash/grubhub type app would be pretty cool!
@jordanhasnolife5163 • A year ago
I'm a bit confused what you mean here - we're just placing the jobs themselves in the message queues. We keep track of the status of each job in a database so that we can request the status from a variety of other components. Sure, a message broker knows which jobs were sent to consumers, but that doesn't mean they were run successfully, and the message broker has no way of knowing this. As for the DoorDash point, I'd just check out my design of Uber, they're basically the same :)
@valty3727 • A year ago
@@jordanhasnolife5163 got it, thanks!
@abhishekmishraji11 • A year ago
Hey Jordan, can you please make a video on collaborative editing tools like CoderPad, Google Docs, Google Sheets? Actually, I guess CoderPad would be a superset of Google Docs, so you can choose CoderPad over Google Docs while designing. Thanks!
@jordanhasnolife5163 • A year ago
Did that already
@abhishekmishraji11 • A year ago
@@jordanhasnolife5163 Thanks!
@vitaliizinchenko556 • 3 months ago
Thank you for the content. One question: what if we want to schedule jobs based on job’s resource consumption requirements and availability of resources on workers. How would you change your design?
@jordanhasnolife5163 • 3 months ago
I think that the message broker could itself maintain some internal state (or have consumers go through a proxy) which keeps track of how many jobs each has run and perhaps their hardware capabilities (maybe stored in zookeeper). Essentially a load balancer lol.
@niranjhankantharaj6161 • A year ago
Thanks for the great video! If ZooKeeper stops receiving heartbeats, "we can go ahead and update the metadata db". Curious: who would update the metadata DB? Is it (a) ZooKeeper itself that updates the metadata DB? If so, is it feasible, given ZooKeeper's capabilities, for us to add such custom logic? Or (b) does ZooKeeper perform failover, creating another worker node and having it restart the job? Also, since ZooKeeper helps the claim service acquire distributed locks using fencing tokens, why do we still need the ACID properties of a SQL DB? Why not use NoSQL for the metadata DB?
@jordanhasnolife5163 • A year ago
Fair point! I think a couple of servers that poll zookeeper for outages and restart their jobs would do it
@niranjhankantharaj6161 • A year ago
@@jordanhasnolife5163 Any example design or literature that shows this approach (polling ZooKeeper for outages and implementing custom failover logic)? I believe this is very critical, and if left unaddressed it leaves fault tolerance unsolved.
@niranjhankantharaj6161 • A year ago
Looks like Apache Curator has some "recipes" that can be used when persistent nodes fail, which could be used here. Also, Curator can be used as a ZooKeeper client to acquire distributed locks.
@jordanhasnolife5163 • A year ago
@@niranjhankantharaj6161 I'll do a better job addressing this in the remake. You have many options though - for example a cron job on the status table that eventually sets job status back to "not started" after a certain amount of time that the job has yet to be completed. It's certainly not trivial, but it's not overly complex either
@jordanhasnolife5163 • A year ago
@@niranjhankantharaj6161 Good to know, I'll take a look into curator!
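The cron-style sweep described in this thread (resetting jobs stuck in STARTED back to NOT_STARTED so the next poll retries them) might be sketched as follows; the timeout value and schema are assumptions.

```python
import sqlite3

JOB_TIMEOUT = 300.0  # seconds a job may sit in STARTED before we give up

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE jobs (job_id TEXT PRIMARY KEY, status TEXT, started_ts REAL)")
db.executemany("INSERT INTO jobs VALUES (?,?,?)", [
    ("a", "STARTED", 100.0),   # stuck past the timeout: will be reset
    ("b", "STARTED", 900.0),   # still within its timeout: untouched
    ("c", "DONE",    100.0),   # finished: untouched
])

def sweep_stale_jobs(now):
    """Periodic sweep: flip timed-out STARTED jobs back to NOT_STARTED."""
    with db:
        db.execute(
            "UPDATE jobs SET status='NOT_STARTED', started_ts=NULL "
            "WHERE status='STARTED' AND started_ts < ?", (now - JOB_TIMEOUT,))

sweep_stale_jobs(1000.0)
print(db.execute("SELECT job_id, status FROM jobs ORDER BY job_id").fetchall())
```

Combined with idempotent jobs, this gives at-least-once retries without needing ZooKeeper to write to the DB directly.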
@jayshah5695 • A year ago
Are there any open-source or commercial examples that solve this problem? It helps to understand the problem better.
@jordanhasnolife5163 • A year ago
Look up Dropbox ATF
@user-se9zv8hq9r • A year ago
song? in b4 darude - sandstorm
@jordanhasnolife5163 • A year ago
Lol it's some no copyright edm bs I gotta go find it haha
@wil2200 • A month ago
Solid side job (id =14)
@zhonglin5985 • 2 months ago
At ru-vid.com/video/%D0%B2%D0%B8%D0%B4%D0%B5%D0%BE-WTxG5880EH8.html, why long polling instead of regular polling?
@jordanhasnolife5163 • 2 months ago
You end up putting a lot of load on your system that you may not necessarily need to.
@dind7926 • 11 months ago
Hey Jordan, great video as always. I have a couple of questions:
- Instead of using an enqueuing service which polls jobs every minute, could we instead just add an event stream on the DB and do the filtering within the stream, where we only look at the jobs that need to be run?
- Not sure I got the argument for using an in-memory queue; could you add more context on why we decided to do that instead of a log-based queue?
@jordanhasnolife5163 • 11 months ago
1) We could, but that's effectively just polling and I think defeats the purpose of using the stream. 2) We don't care about the order in which jobs are run and want to maximize throughput, so an in-memory queue with many consumers is more useful to us than a log-based queue with a single consumer per partition.
@erythsea • A year ago
that intro tho
@nikkinic112 • A year ago
Why MySQL for the job scheduler? Why not NoSQL?
@jordanhasnolife5163 • A year ago
We need transactions in our db table or else we could have write conflicts on a single node and jobs will get lost
@tunepa4418 • A year ago
Good intro lol
@jordanhasnolife5163 • A year ago
Why thank you, it was certainly out there
@akarshgajbhiye1289 • 4 months ago
Jordan is clearly a man of culture.
@Lantos1618 • A year ago
jordan make a discord channel baka >,
@jordanhasnolife5163 • A year ago
Definitely something I'm considering, I'm stretched a little too thin to be on there consistently atm so will let you know if I change my mind!
@andreystolbovsky • A year ago
We don’t care about order of the jobs and we want an in-memory broker, so let’s pick Kafka. Wat. Wat a strange statement in otherwise interesting video.
@jordanhasnolife5163 • A year ago
Probably a misstatement on my part - meant SQS or RabbitMQ
@jordanhasnolife5163 • A year ago
Actually, it seems at 11:10 I said not to use Kafka
@andreystolbovsky • A year ago
Listened to that again - you’re right, I’m wrong. I felt it!
@user-se9zv8hq9r • A year ago
love the farting part. are you going to start selling your farts anytime soon?
@jordanhasnolife5163 • A year ago
Should I make a Patreon or an only fans?
@mnchester • A year ago
Only Farts
@jordanhasnolife5163 • A year ago
@@mnchester brb building that
@justicedoesntexist1919 • 8 months ago
How crass is this man? Such people pass the Googleyness round and get into Google? Do people really like to work with people of such questionable character?
@jordanhasnolife5163 • 8 months ago
Nope they all hate me! I'm literally incapable of cursing during the interview round!
@justicedoesntexist1919 • 4 months ago
@@jordanhasnolife5163 So basically, the interview process at Google is broken and there are false positives all the time. Got it!
@utkarshgupta2909 • A year ago
Jordan, don't you think we should have a queue between the job submission service and the SQL DB?
@jordanhasnolife5163 • A year ago
I don't think it's necessary since a job submission is just adding one row to the database.
@utkarshgupta2909 • A year ago
@@jordanhasnolife5163 At what scale should we have a queue there? I mean, at what transactions per second does the SQL DB need a queue?
@jordanhasnolife5163 • A year ago
@@utkarshgupta2909 Can't speak to exact TPS, but I think a good rule of thumb for a queue is when something that is being uploaded needs to be sent to multiple places or there is a lot of processing that eventually has to be done on it