Really interesting series, but I would like to see the interviewer engage more or even challenge the interviewee. Most of the time you just agree with the decisions/design. Overall, great job.
@IGotAnOffer-Engineering Amazing interview, but I agree with @SmartCoder89. The interviewer should ask why the candidate decided to use a certain technology, as I have been asked in interviews, and so on.
I recently had a Twitter system design interview, and I'm sorry to say that this is not a good mock. In a real interview, the interviewer asks the candidate challenging questions, interrupts, and steers you in a direction that is relevant to how the specific company sees the role. This leaves you much less time to cover some of the items shown in this interview, so your time management needs to be much more precise. In addition, a good portion of the interviews for staff level are with 2 interviewers, which adds to the challenge.
Oftentimes it will be shorter than this; for me it was 30 minutes. Just clarifying the assignment took 10 minutes, so you don't need to go into detail about everything. I don't think the point of this is to mirror a real-time interview, but for us to be able to extract information. If you watch 10 of these videos, the chance that you will be able to answer a question in a way that makes it look like you know what you're talking about will increase greatly. And let's be honest, on the job you won't be designing the system anyway, so that is the main point.
I worked at Twitter and this design is hot garbage. He doesn't even talk about microservice architectures or service discovery, and just glosses over components. What are the responsibilities of the "twitter processor", for example? He just plops down components and makes them magical do-alls. He goes into data flow and completely loses context. He also implicitly makes it appear that the entire thing stays in memory.
@drunken_moose Interesting. I have never done a system design interview before, and I have one coming up next week. Good to hear this might've been hot garbage, so that I don't spend time on something not very useful. I agree this does seem a lot like just plopping down components.
The fastest way to learn is to watch someone do it. Seeing this senior engineer go through system design from scratch is really inspiring. Thanks for sharing.
IMHO this is lacking important things:
- A message bus, and something in front of it (that is what I would use). In the real world you would want various services, and a service-oriented architecture is more scalable and adaptable. You could potentially have multiple sources of entry, e.g. Twitter used to, and may still, allow SMS-created tweets, and that opens a can of worms because of the message bus and the various services. He uses a message bus, but just for media.
- Websockets (and pub/sub for that matter).
- I do not understand the choice of Cassandra. I would think a time series database would work better for fetching tweets if you want them ordered by time. If it was a service-oriented architecture, you could even do things like have a service for determining a timeline for a given user, or groups of users.
- I didn't catch any mention of a CDN.
- This might be extreme, but I would think of rate limiting and security as well.
- Batching updates is something I would think to do, and at least a mention of minimizing deltas would be interesting.
- Also, with such massive load, I would think Protobufs as a protocol would work really well and would be worth mentioning. Storing keys is expensive at scale.
Still a great interview. I just had some notes, and feel a little concerned that I don't see websockets anywhere in the video or comments.
Thanks for sharing! But today I had a system design interview where I was asked to design Twitter, and it was much more difficult, because: 1. The interviewer kept interrupting me and asking lots of questions (for example, why use Cassandra, what type of replication should we use, etc.). 2. The diagram had to follow some kind of notation, C4 for example. 3. I had to calculate the numbers (you know, RPS, capacity, etc.) and work them into the diagram (how much CPU, RAM, and HDD/SSD there should be, how not to kill databases or queue managers with too many connections).
Great content. I wish the interviewer would challenge the interviewee about why he made the decisions he made. I think the explanation of why he chose each component is very important. Thanks a lot for your content :)
You should really challenge the interviewed person more, because without it the whole interview feels absolutely unnatural and you look like you have no idea what you are talking about. I'm sure that you are proficient at system design, though. You just have to show it.
The candidate is a bit weak. I feel like you should grill him a bit to see if he knows what he's talking about or is just regurgitating from a sys design book.
This is not bad, but barely passable for L5 and certainly not L6+ material. There are a lot of holes. Redis Pub/Sub, for instance, is a very fragile part of the design. Also, it would be very hard to quickly get the people that a user follows. There was a bunch of hand-wavy stuff: if we're partitioning by Tweet ID, why does it matter that the Tweet ID is ordered? If we're partitioning by the user and then by the Tweet ID, then each tweet will still go to a different server. What's the purpose of it? I mean, there are some big holes. We did the capacity planning, and what purpose did it serve? What did it help with? Was it just a waste of time? The more I think about it, an L5 hire is hard; actually, maybe L4.
@IGotAnOffer at 38:12, where it is proposed that every tweet will be posted into Kafka: going back to the calculations, the velocity in q1 would be 6,000 messages/sec, as that is the number of tweets produced per second. Since you are fetching the followers in the consumer, which is 200 per user, there are roughly 6,000 QPS on the user follower database, which fetches all the followers. This consumer then publishes 200 messages, one for each follower, into the second queue, which will be 6,000 * 200 = 1.2 million messages/sec. All other services, in this case Redis, would receive as many writes. This is an important issue if we also consider bandwidth. A secondary approach could be to batch these.
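For anyone who wants to play with the numbers in the comment above, here is a rough back-of-the-envelope sketch in Python. The function name `fanout_rates` and the `batch_size` parameter are made up for illustration, and it assumes every tweet fans out to the same average follower count, which real traffic will not do.

```python
import math


def fanout_rates(tweets_per_sec, avg_followers, batch_size=1):
    """Estimate per-second load for fan-out on write.

    Hypothetical helper: assumes a uniform follower count per user and
    that one message in the second queue carries up to `batch_size`
    follower IDs.
    """
    # One follower lookup per tweet consumed from the first queue.
    follower_lookups = tweets_per_sec
    # Messages published to the second queue, after batching followers.
    msgs_per_tweet = math.ceil(avg_followers / batch_size)
    second_queue_msgs = tweets_per_sec * msgs_per_tweet
    # Timeline entries written downstream (e.g. to the cache), unbatched.
    timeline_entries = tweets_per_sec * avg_followers
    return {
        "follower_lookups_per_sec": follower_lookups,
        "second_queue_msgs_per_sec": second_queue_msgs,
        "timeline_entries_per_sec": timeline_entries,
    }


# Numbers taken from the comment: 6,000 tweets/sec, 200 followers per user.
unbatched = fanout_rates(6000, 200)
batched = fanout_rates(6000, 200, batch_size=50)
print(unbatched)
print(batched)
```

With the numbers from the comment, the unbatched case gives 1,200,000 messages/sec on the second queue. Packing 50 follower IDs into each message would cut that to 24,000 messages/sec, although the number of timeline entries to write stays the same.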
@crushingtecheducation Thanks for this video. I have a couple of questions: 1) For tweets, why are you going with a NoSQL DB? Tweet ID, user ID, and tweet content (text) can all be stored in a relational DB, right? Media can be stored in a NoSQL or blob DB, with a reference in the relational DB, probably as part of the content itself. I thought that is how you described it in the initial field design. 2) Why Cassandra and wide rows, and what does that mean? Tweets have a fixed "text" length, which translates to a fixed data size, right?
@crushingtecheducation - can you explain how Kafka works here, especially for the timeline? If all the tweets are going into Kafka, what would the processor (consumer) do with them? Are you going to have consumers for each user, or for active users, in Kafka, and prepare a data set and store it in the cache? Is that the purpose of Kafka + the timeline processor?
2 inserts allow us to better scale the database, since we can have 2 independent key-value tables/databases: one for follower => followee and the other for followee => follower. If we don't do it, we have to use an index on user_id (follower or followee), which is not optimal for billions of records.
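A minimal sketch of the two-table idea from the comment above, in Python, using in-memory dicts to stand in for the two key-value stores. The class and method names (`FollowGraph`, `followers_of`, and so on) are hypothetical, and a real system would also have to keep the two writes consistent with each other, which this sketch ignores.

```python
from collections import defaultdict


class FollowGraph:
    """Two independent key-value tables, modelled here as in-memory dicts."""

    def __init__(self):
        # Table 1: follower => set of followees ("who do I follow?").
        self.following = defaultdict(set)
        # Table 2: followee => set of followers ("who follows me?").
        self.followers = defaultdict(set)

    def follow(self, follower, followee):
        # The "2 inserts": one write per table, each keyed for its own lookup.
        self.following[follower].add(followee)
        self.followers[followee].add(follower)

    def unfollow(self, follower, followee):
        self.following[follower].discard(followee)
        self.followers[followee].discard(follower)

    def followees_of(self, user_id):
        # Single key lookup, no secondary index needed.
        return set(self.following.get(user_id, ()))

    def followers_of(self, user_id):
        # Single key lookup in the reverse table.
        return set(self.followers.get(user_id, ()))


graph = FollowGraph()
graph.follow("alice", "bob")
graph.follow("carol", "bob")
print(graph.followers_of("bob"))
print(graph.followees_of("alice"))
```

Each question ("who follows bob?" and "who does alice follow?") becomes a lookup by primary key in its own table, at the cost of writing every follow twice.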
I think this presentation is more for senior engineer interviews. For new graduate system design interviews, I think this design is indeed overkill lol