System Design Interview - Top K Problem (Heavy Hitters) 

System Design Interview
356K views

Please check out my other video courses here: www.systemdesignthinking.com
Topics mentioned in the video:
- Stream and batch processing data pipelines.
- Count-min sketch data structure.
- MapReduce paradigm.
- Various applications of the top k problem solution (Google/Twitter/YouTube trends, popular products, volatile stocks, DDoS attack prevention).
Merge N sorted lists problem: leetcode.com/problems/merge-k...
Inspired by the following interview questions:
Amazon (www.careercup.com/question?id...)
Facebook (www.careercup.com/question?id...)
Google (www.careercup.com/question?id...)
LinkedIn (www.careercup.com/question?id...)
Twitter (www.careercup.com/question?id...)
Yahoo (www.careercup.com/question?id...)

Science

Published: 26 Jun 2019

Comments: 646
@DickWu1111 3 years ago
Jesus christ this guy's material is amazing... and each video is so compact. He basically never wastes a single word....
@antonfeng1434 2 years ago
I have to pause or rewind constantly, and watch every video twice to digest it.
@amitdubey9201 2 years ago
@@antonfeng1434 me too
@xordux7 2 years ago
@@antonfeng1434 Same here
@jasonl412 3 months ago
@@xordux7 Same here
@saurabhmaurya94 4 years ago
A summary of questions and answers from the comments below.

1. Can we use hash maps, but flush their content (after converting it to a heap) to storage every few seconds, instead of using a count-min sketch (CMS)?
For small scale it is totally fine to use hash maps. When the scale grows, hash maps may become too big (use a lot of memory). To prevent this we can partition the data, so that only a subset of all the data comes to a Fast Processor service host. But this complicates the architecture. The beauty of CMS is that it consumes a limited (predefined) amount of memory and there is no need to partition the data. The drawback of CMS is that it calculates counts approximately. Tradeoffs, tradeoffs...

2. How do we store the count-min sketch and the heap in a database? How do we design the table schema?
A heap is just a one-dimensional array. A count-min sketch is a two-dimensional array. Both can therefore be easily serialized into a byte array, using either a language-native serialization API or a well-regarded serialization framework (Protobuf, Thrift, Avro), and we can store them in that form in the database.

3. Count-min sketch saves memory, but we still need n log k time to get the top k, right?
Correct. It is n log k (for the heap) + k log k (for sorting the final list). n is typically much larger than k, so n log k is the dominant term.

4. If the count-min sketch is only used for a 1-minute count, why wouldn't we use a hash table directly? After all, the size of the data set won't grow infinitely.
For small to medium scale, the hash table solution may work just fine. But keep in mind that if we try to create a service that needs to find top K lists for many different scenarios, there may be many such hash tables, and it will not scale well. For example: top K lists for the most liked/disliked videos, the most watched (based on time) videos, the most commented videos, the videos with the highest number of exceptions during opening, etc. Similar statistics may be calculated at the channel level, per country/region, and so on.
Long story short, there may be many different top K lists we need to calculate with our service.

5. How do we merge two one-hour top k lists to obtain the top k for two hours?
We need to sum up the values for the same identifiers. In other words, we sum up the views for the same videos from both lists, and take the top K of the merged list (either by sorting or by using a heap). [This won't necessarily be a 100% accurate result, though.]

6. How does count-min sketch work with the different scenarios you mentioned, such as most liked/disliked videos? Do we need to build multiple sketches? Do we need a designated hash for each of these categories? Either way, they need more memory, just like a hash table.
Correct. We need a separate sketch for each event type we count: video views, likes, dislikes, comment submissions, etc.

7. Regarding the slow path, I am confused by the data partitioner. Can we remove the first Distributed Messaging System and the data partitioner? The API Gateway would send messages directly to the second Distributed Messaging System based on its partitions. For example, the API Gateway would send all B messages to partition 1, all A messages to partition 2, and all C messages to partition 3. Why do we need the first Distributed Messaging System and the data partitioner? If we use Kafka as the Distributed Messaging System, we can just create a topic for a set of message types.
At large scale (e.g. YouTube scale), the API Gateway cluster will be processing a lot of requests. I assume these are thousands or even tens of thousands of CPU-heavy machines, with the main goal of serving video content and doing as little of "other" things as possible. On such machines we usually want to avoid any heavy aggregation or logic. The simplest thing we can do is to batch video view requests together, without doing any aggregation at all: create a single message that contains something like {A = 1, B = 1, C = 1} and send it for further processing.
In the option you mentioned we would still need to aggregate on the API Gateway side. We cannot afford to send a separate message to the second DMS for each video view request, due to the high scale. I mean we cannot have three messages like {A = 1}, {B = 1}, {C = 1}. As mentioned in the video, we want to decrease the request rate at every next stage.

8. I have a question regarding the fast path. It seems like you store the aggregated count-min sketch in the storage system, but is that enough to calculate the top k? I felt like we would need to have a list of the websites and maintain a size-k heap somewhere to figure out the top k.
You are correct. We always keep two data structures in the Fast Processor: a count-min sketch and a heap. We use the count-min sketch to count, while the heap stores the top-k list. In the Storage service we may also keep both, or the heap only. But the heap is always present.

9. So in summary, we still need to store the keys. Count-min sketch achieves savings by not having to maintain counts for keys individually. When one has to find the top k elements, one has to iterate through every single key and use the count-min sketch to find the top k elements. Is this understanding accurate?
We need to store the keys, but only K of them (or a bit more), not all. When a key comes in, we do the following:
- Add it to the count-min sketch.
- Get the key's count from the count-min sketch.
- Check if the current key is in the heap. If it is present in the heap, we update its count value there. If it is not present in the heap, we check whether the heap is already full. If it is not full, we add this key to the heap. If the heap is full, we take the minimal heap element and compare its value with the current key's count. At this point we may remove the minimal element and add the current key (if the current key's count > the minimal element's value).
This way we only keep a predefined number of keys. This guarantees that we never exceed the memory budget, as both the count-min sketch and the heap have a limited size.
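The per-key procedure in answer 9 can be sketched in Python as follows. This is a minimal illustration, not the video's implementation: the class names, the `width`/`depth` defaults, and the use of salted SHA-256 as the per-row hash functions are all assumptions. For brevity, a plain dict with a linear scan for the minimum stands in for the heap, so eviction costs O(k) here rather than O(log k).

```python
import hashlib


class CountMinSketch:
    """Fixed-size 2D counter array; estimates may overcount but never undercount."""

    def __init__(self, width=1024, depth=4):
        self.width = width
        self.depth = depth
        self.rows = [[0] * width for _ in range(depth)]

    def _index(self, key, row):
        # One hash function per row, derived by salting the key with the row number.
        digest = hashlib.sha256(f"{row}:{key}".encode()).digest()
        return int.from_bytes(digest[:8], "big") % self.width

    def add(self, key):
        for row in range(self.depth):
            self.rows[row][self._index(key, row)] += 1

    def estimate(self, key):
        # The minimum across rows is the least collision-inflated counter.
        return min(self.rows[row][self._index(key, row)] for row in range(self.depth))


class TopKTracker:
    """Keeps at most k keys next to the sketch (dict used as a stand-in for a heap)."""

    def __init__(self, k, sketch):
        self.k = k
        self.sketch = sketch
        self.counts = {}  # key -> latest estimated count, never more than k entries

    def offer(self, key):
        self.sketch.add(key)
        count = self.sketch.estimate(key)
        if key in self.counts or len(self.counts) < self.k:
            self.counts[key] = count  # update in place, or fill a free slot
            return
        # Full: evict the current minimum only if the new key beats it.
        smallest = min(self.counts, key=self.counts.get)
        if count > self.counts[smallest]:
            del self.counts[smallest]
            self.counts[key] = count

    def top(self):
        return sorted(self.counts.items(), key=lambda kv: (-kv[1], kv[0]))


tracker = TopKTracker(k=2, sketch=CountMinSketch())
for video_id in ["A"] * 5 + ["B"] * 3 + ["C"] + ["D"] * 4:
    tracker.offer(video_id)
print(tracker.top())
```

Memory stays bounded by `width * depth` counters plus `k` tracked keys, regardless of how many distinct keys arrive, which is the property the answer relies on.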
Video Notes by Hemant Sethi: tinyurl.com/qqkp274
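Answer 5 in the summary above (merging two one-hour top-k lists) can be sketched like this. The function name `merge_top_k` and the sample view counts are made up for illustration.

```python
import heapq
from collections import Counter


def merge_top_k(list_a, list_b, k):
    """Sum counts for identical keys across two (key, count) lists, then keep the top k."""
    totals = Counter()
    for key, count in list_a + list_b:
        totals[key] += count
    # nlargest returns the k biggest entries, already sorted in descending order.
    return heapq.nlargest(k, totals.items(), key=lambda kv: kv[1])


hour_1 = [("A", 100), ("B", 90), ("C", 80)]
hour_2 = [("D", 120), ("A", 70), ("E", 60)]
print(merge_top_k(hour_1, hour_2, 3))  # [('A', 170), ('D', 120), ('B', 90)]
```

The example also shows why the merged result is only approximate: if video C had 59 views in the second hour, it would be missing from `hour_2` (its top 3 ends at 60), yet its true two-hour total of 139 would beat D's 120.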
@SystemDesignInterview 4 years ago
Hi Saurabh. This is amazing! Thank you for collecting all these questions and answers in one place. I would like to find time to do something like this for other videos as well. I have pinned this comment to be at the top. Thank you once again!
@ananths5905 4 years ago
Thanks a lot for this Saurabh!
@saiprajeeth 4 years ago
Need more people like you. Thank you
@saurabhmaurya94 3 years ago
@@atibhiagrawal6460 glad it's helpful! Might be worth posting a link to your notes in a standalone comment too so that everyone can see it
@atibhiagrawal6460 3 years ago
@@saurabhmaurya94 That is good idea ! Thank you :D
@sumedhabirajdar 19 days ago
Most system design interview answers contains high level design decisions. These videos explains how the data flows. Which is something I needed.
@alexbordon8886 2 years ago
Your accent is hard to understand initially, but now I fall in love with you accent.
@anderswb 2 years ago
These are by far the best videos on system design for interviews. Thanks a lot for taking the time to make and publish these!
@arvind30 3 years ago
One of the best system design channel ive come across! great job! I particularly liked how you were able to describe a fundamental pattern that can be applied in multiple scenarios
@FrequencyModulator 3 years ago
This is the best explanation on system design I've ever seen. Thanks Mikhail, that helps A LOT!
@tusharahuja205 3 years ago
Very clear solution and something that can actually be used in an interview! Please keep making more of these.
@pulkitb4Mv 3 years ago
I love Mikhail's content, the video is so interactive that it looks like he is talking to you and he knows what is going inside your head :)
@supremepancakes4388 4 years ago
I wish all sys interview tutorials are like yours, with so much information precisely and carefully explained in a clear manner, with diff trade offs and topics to discuss interviewers along the way! Thank you so much
@manasdalai3934 3 years ago
This is one of the best system design content I have came across. Thanks a lot.
@VageeshUdupa 2 years ago
Thank you very much!! I had gone over all your videos multiple times to understand it well. I had 2 interviews with FAANG in the last week and was offered a job in both! I have to say a lot of the credit goes to you!
@Gukslaven 3 years ago
These are the best videos on system design I've seen, thanks so much!
@prateek_jesingh 5 months ago
This is one of the best system design videos on this topic I have come across. Thanks & keep up the great work, Mikhail!
@dc5 3 years ago
Awesome videos Mikhail... thanks a lot for sharing! That last part showing other problems with similar solutions was the cherry on top.
@gameonline6769 3 years ago
Thanks Mikhail. I can bet..this is the best channel on YouTube. Just binge watch all the videos from this channel and you will learn so much.
@NikhilSharad 3 years ago
You're amazing, by far the most detailed and deeply analysed solution I've seen on any design channel. Please never stop making videos.
@MrAithal29688 8 months ago
All videos in this channel are the best on YT in this category even to this date. You can find many other channels which may give similar data divided into more than 5 videos with a lot of fluff. Mikhael's video touches upon every important part without beating around the bush and also gives great pointers in identifying what the interviewer may be looking for. Kudos to all the videos in this channel !
@warnercooler4488 2 years ago
The amount of info you have covered here is amazing! Thank you so much!
@HieuNguyen-ty7vw 4 years ago
The best system design answer I have seen on YouTube. Thank you!
@SystemDesignInterview 4 years ago
Thank you, Hugh, for the feedback.
@souravmojumder5563 5 years ago
This s one of the best system design video I came across in long time .. keep up the good work !
@SystemDesignInterview 5 years ago
Thank you, Sourav. Appreciate the feedback.
@abbasraza4991 4 years ago
Excellent video! Has depth and breadth that isn’t seen elsewhere. Keep it up!
@SystemDesignInterview 4 years ago
Appreciate the feedback, Abbas! Thanks.
@sevenb1t 3 years ago
I had an interview step with AWS a couple of days ago and they asked me exactly this question. Thank you for your videos.
@NoWarForever 3 years ago
The best that I have seen so far!
@reprogram_myself 3 years ago
and huge thank you for all your videos! They are the best I could find on system design!
@admiringrubin2910 2 years ago
couldn't solve this problem in an interview. found this gem of a video a month after. will get them next time!
@joyshu6264 4 years ago
Hands down the best system design videos so far !! and I have watched lots of the system design videos. Love how you start from simple and work all the way to complex structure and how it can applies to different situations.
@SystemDesignInterview 4 years ago
You are too kind to me, Joy! Thank you for the feedback!
@HarkiratSaluja 3 years ago
How can someone even downvote this? This is just so amazing. Have not learnt so much in 30 minutes in my whole life.
@pradyumnakadapa5665 3 years ago
Nicely structured ! covering both depth and breadth of the concepts as much as possible.
@itdepends5906 1 year ago
THIS GUY is SO COOL. Who else feel that when he's speaking, explaining difficult concepts in the most concise way possible - and also touching on what we really need to hear about?!
@atabhatti6010 1 year ago
Excellent video. A key thing that you did at the end (and is very useful IMHO) is that you identified many other interview questions that are really the same problem in disguise. That is very good thinking that we all probably need to learn and develop. I encourage you to do that in your other design solutions as well. Thank you for another excellent video.
@terigopula 3 years ago
Your content is PURE GOLD. Hats off! :)
@andreytamelo1183 3 years ago
Misha, Loved the structure as well as depth and breadth of the topics you touched on!
@chuka231d8 5 months ago
This is the most tech intense 30min video I've ever seen :) Thank you!
@jianlyu700 1 month ago
OMG, this is still the best system design video i've ever seen. it's not only for interview, but also for actual system solution design.
@soulysouly7253 7 months ago
I'm devastated. I just got out of a last round interview, it was my first time ever being asked a system design question. I used this channel, among others, to study, and this video is the ONLY video I didn't have time to watch. My interview question was exactly this, word for word. I made up a functional and relatively scalable solution on the fly, and the interview felt conversational + it lasted 10 minutes more than it should have, so I think I did alright, but I still struggled a lot in the begining and needed some help. Life is cruel sometimes.
@sachin_getsgoin 2 years ago
Awesome video. Discussion of various approach (with code snippet) and the drawback is the highlight. Thanks a lot!
@graysonchao9767 1 year ago
Great work. I am a senior engineer at a big tech company and I'm still learning a lot from your videos.
@balajipattabhiraman 2 years ago
As luck would have it i had a similar question for make or break round in google and I nailed it since I watched it several times over before the interview. Got a L6 role offered at Google. Thanks for making my dream come true.
@fredylg 2 years ago
Sr your videos are gold, I got no interview but it’s rare to find architecture so well explained, thanks
@rahulsahay19 1 year ago
Awesome. Simply awesome. You killed it completely!
@boombasach 9 months ago
Among all the materials I have seen in youtube, this is really the top one. Keep up the good work and thanks for sharing
@anshulgolu123 3 years ago
Please do more of them as your videos are very good from a content perspective :) Extremely informative ...
@biaozhang1643 1 year ago
Wow! This is the best system design review video I've ever seen.
@prashub8707 2 years ago
So far I am loving it. Keeps me glued to ur channel. Fantastic job I must say
@stefanlyew8821 5 years ago
one of the best technical discussions I have seen
@SystemDesignInterview 5 years ago
Thanks, Stefan. Appreciate the feedback!
@artemgrygor476 5 years ago
Thank you for such a detailed explanation. Awesome as usual!
@SystemDesignInterview 5 years ago
Thank you @Memfis for providing consistent feedback!
@priyankamishra5704 2 years ago
Great stuff!! Thanks a ton for such in depth explanation of these concepts and correlation.
@fasolplanetarium 2 years ago
PLEASE come back and make videos again. There's no resource quite like this channel.
@sabaamanollahi5901 1 year ago
I wish I could give this video a thousand likes instead of just 1 !!! these contents are fantastic!!!
@harishshankar682 4 years ago
Cant thank you enough for your efforts in sharing such a high quality content for us!
@SystemDesignInterview 4 years ago
Hi Harish. Thanks!
@lysialee2897 3 years ago
i feel bad that im not paying for this video! the quality is beyond amazing
@aman1893_ 2 years ago
You shouldn't feel bad. With this much knowledge, he must be getting atleast $500k+ on his current job. And by now he must be looking beyond money and must be looking for making meaningful contribution to the society.
@ST-pq4dx 1 month ago
He is staff at stripe, 1M plus easy. He is just sharing his knowledge
@algorithmimplementer415 4 years ago
Amazing video. Thank you! The way you structured it is commendable.
@SystemDesignInterview 4 years ago
Thank you, Algorithm Implementer. Glad to hear that!
@ahanjura 4 years ago
The system design video to beat. PERIOD!!!
@SystemDesignInterview 4 years ago
Thank you, Anubhav!
@drakezen 5 years ago
Bonus on mentioning using Spark and Kafka as I was thinking that during the video. Great stuff as usual!
@SystemDesignInterview 5 years ago
Thank you, @Collected Reader. Glad to see you again!
@soubhagyasriguddum4983 4 years ago
this channel has the best System design explanations ... thank you so much and keep up the good work!!
@SystemDesignInterview 4 years ago
Thank you for the feedback, Soubhagyasri! Glad you like the channel!
@diptikaushik8250 3 years ago
This is just incredible! Please do publish more videos.
@shw4083 1 year ago
This is really the best tutorial, and I hope there is article like this content!
@ujjaldas8805 2 years ago
Thanks for all the efforts you put here to describe. This is a great material.
@cnanavat 4 years ago
Excellent explanation ! I really appreciate your work!
@SystemDesignInterview 4 years ago
Appreciate the feedback! Thanks.
@sshks10 3 years ago
Very clean explanation, which is rare nowadays, why did you stop ? It would be nice to see your new videos , good luck man!
@SharanyaVRaju 3 years ago
I agree. Can you please continue doing this?
@freeman-uq8xr 3 years ago
PLEASE MAKE MORE VIDEOS. WE WILL PAY FOR IT (ADD JOIN BUTTON)!
@coolgoose8555 4 years ago
Ohhhh why I did not find this channel before.... The way you approach the problem and take it forward it make it so easy else the realm of system design concepts are huge.... We need more videos like this.... This is design pattern of system design.... Good Job!!!!
@SystemDesignInterview 4 years ago
Glad to have you aboard, coolgoose8555! Thank you for the feedback!
@kumarc4853 2 years ago
Thank you very much Sir, excellent demonstration of coherent design thinking. I feel more equipped than ever to solve system design problems.
@harjos78 3 years ago
This is pure Gem!.. Take a bow ....
@DinkarGahoi 4 years ago
This is by far the best content I have found on the System Design. I am addicted to this content. Keep up the good work, waiting for more videos .. :)
@SystemDesignInterview 4 years ago
Glad you enjoy it, Dinkar! Sure, more videos to come. I feel very busy these days. But I try to use whatever time is left to work on more content.
@yodali7999 3 years ago
super helpful and pretty on point. appreciate the video.
@datbui5863 3 years ago
Great video! Nice work!! Thank you!
@radsimu 1 year ago
I think it is admirable that you explained all the inner workings. In a real interview you can probably skip the single host solution with the heap, that's good for an explanation on youtube. What I think is more valuable is to also propose some actual technologies for the various components to make it clear that you are not proposing building this from scratch. I'm surprised that Kafka Streams was not mentioned. Also for the long path, it is worth discussing the option to store the raw or pre-aggregated requests in an OLAP db like Redshift. The olap can do the top k efficiently for you with a simple sql query (all the map reduce magic will be handled under the hood), can act as main storage, and will also make you flexible to other analytics queries. Integrates directly with various dashboarding products and one rarely wants to do just top k.
@datbui5863 3 years ago
Please, make more videos! Absolutely amazing explanation!!!!!!!!!!!
@ravitiwari2160 3 years ago
Hey, Thank you so much all your knowledge sharing. I am able to perform very nice in all my interviews. Keep up the good work. More power to you. Keep rocking!!!
@lolista 3 years ago
watched other videos before this.. so liking this before starting...
@SahilSharma-ug8xk 2 years ago
HE explains so well
@karthikmucheli7930 4 years ago
OMG. I love these videos. Thank you so much for creating these. Please write a book or open a course, it may fund you to focus much time on very helpful content like this. I am very happy today.
@SystemDesignInterview 4 years ago
Appreciate your feedback, Karthik!
@balajipattabhiraman 2 years ago
Phenomenal. We do something very similar with hot and cold path in microsoft. Instead of countmin sketch we use hyperloglog
@jayeshborgaonkar9166 3 years ago
this is really good stuff, keep up the good work thanks
@saurabhchoudhary9260 5 years ago
Awesome and detailed explanation. Hats off
@SystemDesignInterview 5 years ago
Thank you, Saurabh.
@deathbombs 3 years ago
19:05 slow path
22:00 faster than map reduce but more accurate than countmin
22:43 fast path
25:38 Data partitioner is basically kafka that reads message(logs, processed logs w counts,etc..) and stores them to topics
@TheHinduRakshak 3 years ago
awesome content! learnt a lot, many thanks !
@VinayHPTP 1 year ago
Thanks a lot for detailed explanation. much appreciated! ❤
@andrepinto7895 2 years ago
It is not enough to send only the count-min sketch matrix to storage; you also need to send a list of all the event types that were processed, otherwise you have no way of moving from the matrix data to the actual values (before hashing). The only advantage over the map solution is that you don't need to keep all of it in memory at once; you can stream it from disk as you go, for example. Calculating the min for each key is O(H), where H is the number of hash functions, and you need to do that for all E event types, so O(E*H). Then you use the priority queue to get the top K, O(E*log(K)), so the total time complexity is O(E*H + E*log(K)) = O(E*(H + log(K))).
@xiaopeiyi 1 year ago
Well, you are right. But I think the video is more about one of a general design for a single event type. Then we can start from here based on the functional requirement.
@AshishGupta-jx1ys 4 years ago
Amazing Video and in detail great explanation. Thanks a lot for creating this in-depth video. Please keep creating more awesome stuff.
@SystemDesignInterview 4 years ago
Thank you, Ashish, for the feedback!
@arunmagar4836 2 years ago
These videos are gem for System design noob like me.
@niosocket1 4 years ago
This is just amazing! Thanks a lot!
@SystemDesignInterview 4 years ago
Thank you for the feedback, Sergey! Glad you liked!
@niosocket1 4 years ago
So funny, found this channel yesterday and watched this video and been asked pretty much same question at my interview at LinkedIn today. Thanks a lot.
@SystemDesignInterview 4 years ago
Funny, indeed )) This world is so small )) Thanks for sharing!
@niosocket1 4 years ago
Actually got an offer from Amazon, LinkedIn, Roku and probably Google as well. A lot of it because of this channel. Can’t recommend it enough! Thanks again!
@HieuNguyen-ty7vw 4 years ago
I was asked this same question at my interview last Friday and found out your video today :( Didn't nail it though, hope I can do better next time. Thank you Mikhail, hope you can spend time to create more video like this.
@SystemDesignInterview 4 years ago
Wow, Sergey. You rock! And thank you for the praise.
@SystemDesignInterview 4 years ago
Time will come, Hugh. Just keep pushing!
@TheRedbeardster 3 years ago
Thank you, Misha!
@crescentcompe8289 2 years ago
amazing, i was like wtf you talking about at the beginning. It all makes sense now after the data retrieval part.
@xiaolanli7985 2 years ago
This guy is amazing!!!
@oopsywtf 4 years ago
Please upload more content ! Awesome cntent for the viewers!! Great Great stuff
@SystemDesignInterview 4 years ago
Thank you for the feedback, oopsywtf. More videos to come.
@SatishKumar-jb9qm 3 years ago
Thank you for making this video. It was very helpful. It will be great if you can post more such videos.
@xiaopeiyi 1 year ago
It's really helpful. I already watched each videos so many times, I learned a lot. Initially, I was so frustraded with the accent(I am not native Eng speaker either). But now I am okay watching it without CC.
@user-uj8rr7nn9q 5 years ago
That's awesome learning material! I hope you can keep publishing new video about system design
@SystemDesignInterview 5 years ago
Glad you liked. Thanks for sharing the feedback!
@TheGhumanz 2 years ago
You explained it very well, thank you!
@amitrastogi1405 8 months ago
Great explanation. Thanks!
@alexanderyakovlev9311 4 years ago
This is yet another great System Design video in this channel! I have two thoughts that might help improve the solution: 1. The question of "top K frequent elements" does not require us to sort those top K elements, thus we can use "Quick Select" algorithm merely to find the kth element. The point is after we find the kth element using Quick Select, the array is partitioned such that the top K elements are in the first K positions (but not sorted). This gives the answer in log(n) time, which is a reduction from nlog(k); 2. When you really have a huge amount of data and counts to handle, why not partition the data simply using round-robin for each key? This way, each partition contains (about) the same data so we only need to calculate the result from one partition only. With this approach, we may consider all other partitions 'virtual' or imaginary (without actually using server nodes) so we save the design cost. What do you think?
@SystemDesignInterview 4 years ago
Hi Alexander. Thank you for the feedback and great questions! Here are some of my thoughts:

- Quick Select has O(n) best- and average-case time complexity, and O(n*n) in the worst case. You are correct that it may still be a bit faster on a fixed-size list of size n. But I cannot say the same for streaming data, when new events keep coming and we need to calculate/update the top K list every time a new event arrives. A heap guarantees log(k) time complexity per update. Running Quick Select on an already partially sorted array should take around the same time, but I cannot say what the guaranteed worst-case complexity is in this case.

- I believe that when you say round-robin you mean hash-based, right? So that events for the same video always go to the same partition. A "classic" round-robin means "choose the next one in a sequence of machines", which may mean that events for the same video go to different partitions. So, if you mean hash-based, you are correct, we can use this approach. Two notes, though:
a. Hash-based partitioning may lead to the "hot partitions" problem. I mention this in the video and talk about it in a bit more detail in the latest (step by step interview guide) video.
b. When we use a count-min sketch, we do not need to partition the data at all. Partitioning is needed to guarantee that only a limited amount of data is routed to a particular machine. But because both a count-min sketch and a heap use limited memory (regardless of how much data is coming), partitioning is not required. This is true for the fast path only, where we calculate approximate results. To calculate accurate results we need to partition.

Please keep sharing your thoughts!
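The round-robin versus hash-based distinction in this reply can be illustrated with a small Python sketch. The function names, the choice of SHA-256 as the routing hash, and the three-partition setup are assumptions made for the example, not details from the video.

```python
import hashlib
from itertools import cycle


def hash_partition(key, num_partitions):
    """Stable hash-based routing: the same key always maps to the same partition."""
    digest = hashlib.sha256(key.encode()).digest()
    return int.from_bytes(digest[:8], "big") % num_partitions


def route(events, num_partitions, strategy):
    """Return {partition: [events]} using 'hash' or 'round_robin' routing."""
    partitions = {p: [] for p in range(num_partitions)}
    next_partition = cycle(range(num_partitions))
    for video_id in events:
        if strategy == "hash":
            target = hash_partition(video_id, num_partitions)
        else:
            target = next(next_partition)  # classic round-robin ignores the key
        partitions[target].append(video_id)
    return partitions


def partitions_seeing(key, partitions):
    """List the partitions that received at least one event for this key."""
    return sorted(p for p, events in partitions.items() if key in events)


events = ["A", "B", "A", "C", "A", "B"]
print(partitions_seeing("A", route(events, 3, "hash")))         # a single partition
print(partitions_seeing("A", route(events, 3, "round_robin")))  # [0, 1, 2]
```

With round-robin, the views for video A are spread over all three partitions, so no single partition can compute A's exact count. With hash-based routing one partition sees all of A's events, at the cost of a possible hot partition when A is very popular.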
@HeckscherHH 11 months ago
I think your great coverage of the topic show how you really know it and understand it compared to other guys who just share what they read last night. Thank you
@snehasishroy39 3 years ago
Awesome video. Well explained. Waiting for your next video :) Please upload soon.
@abysswatcher4907 2 years ago
For people wondering why the heap complexity is O(n log(k)) for single-host top k: we do a simple optimization and pop the least frequent item whenever the heap size exceeds k, so we have n operations, each taking on the order of log(k) time.
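A minimal Python sketch of the bounded-heap optimization described in this comment, assuming exact counts are first collected in a hash map (the function name `top_k` and the sample events are illustrative):

```python
import heapq
from collections import Counter


def top_k(events, k):
    """Exact top k via a hash map of counts plus a min-heap capped at k entries."""
    counts = Counter(events)  # exact counts; memory grows with the number of distinct keys
    heap = []                 # min-heap of (count, key), never larger than k + 1
    for key, count in counts.items():
        heapq.heappush(heap, (count, key))  # O(log k), since the heap is capped
        if len(heap) > k:
            heapq.heappop(heap)             # drop the least frequent candidate
    return sorted(heap, reverse=True)       # final k log k sort, most frequent first


events = "A B C A A D C A B C".split()
print(top_k(events, 2))  # [(4, 'A'), (3, 'C')]
```

Because the heap never holds more than k + 1 entries, each push or pop costs O(log k), which gives O(n log k) over n distinct keys instead of the O(n log n) of a full sort.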
@nikhilkumarsaraf3290 3 years ago
All your videos are really amazing. I hope you would post it more often.
@SystemDesignInterview 3 years ago
Thank you, Nikhil. I will surely come back with more regular video postings.
@natarajaneelakanta353 5 years ago
This is Terrific stuff, keep these coming.
@SystemDesignInterview 5 years ago
Thank you for sharing your feedback, Nataraja.
@natarajaneelakanta353 5 years ago
@@SystemDesignInterview at 13:53, how did min val for A become 4 ?
@SystemDesignInterview 5 years ago
My mistake. It should be 3, of course. Thanks for pointing out.
@natarajaneelakanta353 5 years ago
@@SystemDesignInterview I am sorry, I didn't mean to point mistake, i was just inquisitive. You have done a tremendous job (i don't have a better word) in explaining these so beautifully. I keep looking into this every week !
@SystemDesignInterview 5 years ago
Thank you, Nataraja, for the kind words.
@mohitnagpal8025 4 years ago
I have seen lot of system design videos but this content's quality is way above rest. Really appreciate the effort. Please keep posting new topics. Or you can pick top k heavy hitters system design problem requests from comments :)
@SystemDesignInterview 4 years ago
Thank you for the feedback Mohit! Much appreciated.