Тёмный
No video :(

System Design : Design a service like TinyUrl 

Tushar Roy - Coding Made Simple
Подписаться 243 тыс.
Просмотров 571 тыс.
50% 1

Опубликовано:

 

28 авг 2024

Поделиться:

Ссылка:

Скачать:

Готовим ссылку...

Добавить в:

Мой плейлист
Посмотреть позже
Комментарии : 489   
@himanshuverma31
@himanshuverma31 7 лет назад
We can optimise the generation of the URL part in your service layer by delegating the key generation part to multiple instances of offline services(KeyGenerationService). Each service will generate the keys in the particular range and give the pre-generated tiny urls to servers in need. Servers can also keep some unused url in their local caches to speed up the process more. Their will be some race conditions which could be taken care by designing the service properly to take care of interaction between different threads in the services. All in all, it is a great video, kudos to the efforts you put to come up with such content !!!
@tusharroy2525
@tusharroy2525 7 лет назад
its a good idea of pre-gnerating of tiny url. Thanks for bring this up. Hopefully other viewers will read this.
@rockylovesall
@rockylovesall 7 лет назад
Awesome approach!, I think race condition we can handle via some synchronization approach.
@yasharshahi
@yasharshahi 7 лет назад
+Tushar Roy - Coding Made Simple thanks for the great video. you can pin a comment if you want your viewers to see it.
@RussellWu
@RussellWu 5 лет назад
The pro of using a range is because of its compact size to be stored and operated on. Pre-generated keys will take up a lot more space and increase a lot more roundtrip time between internal services while calculating a tinyurl from a integer is relatively cheap. So I think it's only worth it if the calculation is expensive, like blockchain addresses.
@renjithr7676
@renjithr7676 4 года назад
in this approach you should first check actual url already present, insert only if not.
@arnabsaha876
@arnabsaha876 4 года назад
I appreciate your efforts in making system design tutorial videos. My algorithms for tackling this question: -6 characters long TinyURL: a. Use MD5/SHA256 on Base64 encoding with 6 letters long key (This was mentioned by you also, I uplifted it a bit) b. Use Key Generation Service for 6 digit keys already pre-computed - Key DB of ~500 GB size + Replication of this to avoid Single Point Of Failure + Lock system to avoid concurrency issue + Use Memcache to speed up -> Fetch key from DB and Use HTTP 302 redirect to browser, in case of success else return 404 to the user.
@ilanchen6583
@ilanchen6583 5 лет назад
I think your system design videos are top notch and helped me tremendously. I usually spend my working hours writing code and not worrying about the design videos, but thanks to you I was able to bridge the gap of missing knowledge
@SheldonZam
@SheldonZam 7 лет назад
Tushar! I'm so grateful for what you're doing here on RU-vid! I've literally spent hours watching your videos on dynamic programming this last week to study for an algorithms module of mine. Even almost got a bit of your accent!
@vitaminb4869
@vitaminb4869 7 лет назад
People (multiple people) spend days/weeks/months designing a scaleable system. Then comes some hot shot interviewer with all the right answers he found earlier from his google search, and then asks you to design a tinyurl system and expects you to spit it out in 24 minutes? Also consider the fact that his company has a product that is being used by only a few users, that a Windows 98 PC could happily support. Interviewers really need to get off their fucking clouds and stop searching for unicorns.
@tusharroy2525
@tusharroy2525 7 лет назад
lol
@ripplecutter233
@ripplecutter233 6 лет назад
Vitamin B goddamn tech interviews. (begrudgingly goes back to studying some obscure algorithm that some brilliant mind took ages to figure out)
@sohineebasak4644
@sohineebasak4644 5 лет назад
couldnt agree more
@gopala5334
@gopala5334 5 лет назад
lol, exactly said. I was asked to design the same in 30 mins, that too over telephonic discussion. Great video, thanks to Tushar Roy.
@borhadesumit58
@borhadesumit58 4 года назад
I’ve exactly same view. People/scientists invest days, months or even years to come to solution and some stupid people expect to answer in 30 or 60 mins.
@michanestorowicz6577
@michanestorowicz6577 3 года назад
Again, very good content. I really like that you're talking fast and not waste time on things that are easily found on the internet (definitions etc.). Very nice to watch.
@navdeepdahiya4637
@navdeepdahiya4637 7 лет назад
Dear Tushar, it is great to see that you are back in action. Thanks a lot for your efforts. I truly appreciate your hard work. Could you please share your knowledge on design problems. Thanks a lot.
@sagardafle
@sagardafle 7 лет назад
Navdeep Dahiya Kya baat hai
@tusharroy2525
@tusharroy2525 7 лет назад
I m trying. Started with tiny url and more to come.
@rohitkumarkhatri2203
@rohitkumarkhatri2203 7 лет назад
You are awesome..... Plz keep it up with ur design knowledge.....
@UMAKANTJENABCE
@UMAKANTJENABCE 7 лет назад
Please continue to do your good work of design problems I am waiting for your next coming videos on this eagerly.
@edithau8919
@edithau8919 7 лет назад
Hi Tushar - Thanks for the awesome video! You asked how other people would design the url service and here’s my 2 cents: I would start the design with a stateless API service and write thru in-memory cache, backed by a relational database. Collisions are detected at the db level (eg. a unique primary key on the shorten url) and corrected by hashing the original url + a prefix. The regional issue you mentioned could potentially be mitigated with a combo of load balancer routing configuration and in-memory cache at each service instance. In the case of GET request burst, it is likely that the instances will have the URL cached. I would imagine this starter design is good for couple thousands URL per sec. An AWS RDS with SSD could handle up to 10K IOPS according to their doc. The in-memory cache size could also be adjusted for read performance. To scale up this design for higher volume, I probably would try the distributed cache with write-behind before considering Zookeeper. In general, I like to stay stateless if possible :) I created a simple tiny url implementation in case anyone is interested. github.com/edithau/simple-tiny-url
@tusharroy2525
@tusharroy2525 7 лет назад
+Edith Au thanks for your insights
@renurani3311
@renurani3311 5 лет назад
For people who are confused why do we need 43 bits: 2**42 = 4398046511104 = 4.3 * 10 ** 12 = 4.3 Trillion. So, we only need 42 bits to represent 4.3 trillion numbers But these numbers also include negative numbers and we only need positive numbers (0 - 3.8 Trillion). So we will 43 bits and keep our MSB as 0 to make it a positive number.
@vipuljain9550
@vipuljain9550 5 лет назад
However, Tushar says that 5 bits can be randomized, rather he should mention it as 4 bits, please correct me if am wrong.
@renurani3311
@renurani3311 5 лет назад
@@vipuljain9550 Yes, that should be 4 as he needs first 7 bits to represent number 0 - 63 (MSB is always 0 because number is positive)
@lugesot
@lugesot 4 года назад
I just want to say, you are so gogerous.
@vishalmahavratayajula9658
@vishalmahavratayajula9658 4 года назад
Thank you. This saved my time
@TheProCion
@TheProCion 4 года назад
I don't think that this is correct. Just assume that the number is unsigned integer.
@soumyarooproy
@soumyarooproy 6 лет назад
@Tushar, thanks for the video. 12:15-13:55 is a very convoluted way of conveying that log2(62**7) bits are needed encode 62**7 values. Also, that evaluates to 42 and not 43. An oversight, I presume...
@theanigos
@theanigos 3 года назад
This guy and this video helped a millions.
@erytorth9789
@erytorth9789 Год назад
Thank you for the video! That's the most organized / structured content (about app layer) I've seen for that problem.
@revanthkumar1183
@revanthkumar1183 6 лет назад
I think it is supposed to be 42 bits not 43. If you think about it, your language has 62 chars and to uniquely identify each char you need 6 bits. because 2^6 is 64. Then you can map 000000 to a, 000001 to b and so on till 9. So in a 7 character string each character will need 6 bits and 7*6 is 42. So you will have to save 42 bits in your DB and when you get it back from your DB, you have to use the same mapper and get back the characters. Hope that clears out the doubt around 42 and 43 bits.
@kingofwebguru
@kingofwebguru 2 года назад
@Revanth Kumar Thanks for the clarification. I was thinking why it is an odd number (43).
@nrjdalal
@nrjdalal 9 месяцев назад
I have built one too rdt-li an open source free url shortener with built in analytics, check it out
@nitinagrawal6637
@nitinagrawal6637 5 лет назад
Appreciate such videos which give different/alternative ways towards solutions & one must go through various such views before finalizing on the solution. But seriously, only a king of fools can ask such questions & expect a good logical answers in those 10-15 minutes. I myself have got such questions with the expectations to say those fancy words of design patterns & algorithms. Man, such interviewers themselves do google the questions/answers for the interviews & they themselves don’t know to solve the trivial problems in their own projects. Many issues with interviewers’ mentality & processes but for such videos, these really help a person to think differently & find the solutions. Appreciate Tushar for providing such informative videos.
@rishabhdaim
@rishabhdaim 7 лет назад
Hello, tushar, I would really like to thank you for your videos. recently, I was able to crack an interview by watching your videos. Especially your videos on Dynamic Programming are treat to eyes. Thanks again.
@tusharroy2525
@tusharroy2525 7 лет назад
+Rishabh Daim great. Happy videos helped.
@LuveenWadhwani
@LuveenWadhwani 2 года назад
Ommggggg Tushar's first ever design video 😍😍😍
@visintel
@visintel 4 года назад
This is a great video for people who are trying to learn system design. Not so great for people prepping for an interview
@rish2cool
@rish2cool 6 лет назад
Great articulation & simple enough for non CS folks to understand. Thanks Tushar!
@muktarali2842
@muktarali2842 7 лет назад
Very nice explanation. Please share more design/ system design topics. Thanks a lot.
@tusharroy2525
@tusharroy2525 7 лет назад
definitely
@deepamohan3157
@deepamohan3157 7 лет назад
when can we see more of them?
@zackhellripper
@zackhellripper 5 лет назад
Curious why you choose 43 bits, as you only actually need 42 bits to represent 3.5 trillion (max for 42 bits is 4.398 trillion). Is there a use for the extra bit I'm not seeing?
@rohanabhutkar
@rohanabhutkar 5 лет назад
Yes also since we have 7 char tiny url, and each char is 6 bits in base64, 42 bits is enough even if we use base64 encoding (a-z, A-Z, 0-9, _ , . )?
@adityasahai9020
@adityasahai9020 5 лет назад
@@rohanabhutkar he is using base62 because there are 62 unique chars.
@renurani3311
@renurani3311 5 лет назад
2*42 = 4398046511104 = 4.3 * 10 * 12 = 4.3 Trillion. So, we only need 42 bits to represent 4.3 trillion numbers But these numbers also include negative numbers and we only need positive numbers (0 - 3.8 Trillion). So we will 43 bits and keep our MSB as 0 to make it a positive number.
@Shogoeu
@Shogoeu 4 года назад
@@renurani3311 or we can use unsigned data type and save 1 bit :D
@zhongyuanzhang5611
@zhongyuanzhang5611 4 года назад
@@renurani3311 ddkkokoi Iojkoiw e seeders (know((in(.m Mi
@gkcs
@gkcs 6 лет назад
That's was enlightening, Tushar. What do you think about probabilistic data structures to generate the tinyUrl? Like the MD5 hash you mentioned, I think we could use Bloom Filters to tackle the problem too.
@gkcs
@gkcs 6 лет назад
Haha, yes I have become interested in bloom filters recently :)
@ryc8889
@ryc8889 6 лет назад
this was actually one of the things i thought of as well. I think a md5 hash might be too slow for an efficient bloom filter depending on the performance requirements but maybe it can be combined with the CDN idea. i was reading about bloom filters on wiki and there was an example about how akamai uses bloom filters in their webservers en.wikipedia.org/wiki/Bloom_filter#Examples
@thakur8711
@thakur8711 6 лет назад
Gaurav Sen, thanks for bringing up this point. I agree with you that we can use bloom filter to generate tinyURL. But there is a caveat. As far as I understand, bloom filter can help you to check if the hashOfBigURL(tinyURL) is already existing on the persistence layer or not. It will help us finding the existence of the tinyURL faster instead of going to the disk & finding it there. The remains part is how would you generate the hashOfBigURL(tinyURL) that remains not clear. Tushar have demonstrated those options to generate. Please correct me if there is any invalid statement I made.
@AmreshTripathi
@AmreshTripathi 3 года назад
My my we have a design principle celebrity here
@kavyashree513
@kavyashree513 2 года назад
0
@SudiptaKarmakar87
@SudiptaKarmakar87 7 лет назад
Tushar, I thoroughly enjoy your design analysis. One suggestion though, it would be beneficial for the audience if, in the description, you directly linked any reference to the dependencies - concepts, articles etc (example, Apache Zookeeper) that you talk about in the video. Thanks and keep being awesome.
@tusharroy2525
@tusharroy2525 7 лет назад
+Sudipta Karmakar I will add that
@daspradeep
@daspradeep 6 лет назад
the best i have seen on this is to pre-generate the short urls and keep them ready. Have relational DB with failover (two or more DB servers) to hand over these keys (relational allows to implement locking). Hash Partition the read shards with consistent hashing. You have the full solution without any single point of failure or hot-spots.
@vib7777
@vib7777 4 года назад
Counter based approach is not optimal. There are multiples challenges like maintaining the locking in a worker thread. Another thing these days deployment of worker node done multiple times in a day and every time worker will lose the counter and It will have to ask a new range from zookeeper.
@melarryify
@melarryify 6 лет назад
Instead of base 62 we can use base 64.Just need to add 2 more characters. Underscore and dash(_-) can be added to 62 characters we have. This makes its easier to convert a MD5 output to a 7 character printable string as you can convert blocks of 6 bits directly to base64 character. Even this RU-vid link uses underscore to represent url of this video !!!
@ahmer9800
@ahmer9800 5 лет назад
i really appreciate how indepth your design analysis was! thank you
@BestURLShortenerBioPageQRCode
@BestURLShortenerBioPageQRCode 11 месяцев назад
Thanks Tushar, Glad to have you here.
@kingofwebguru
@kingofwebguru 2 года назад
@Tushar, thanks for the video. Please keep up the hard work.
@sberfield
@sberfield 4 года назад
Thank you for this. I have been going through examples and lessons and this made it all click for me.
@jamess5330
@jamess5330 Год назад
Impressive! It is also super helpful to practice with FAANG engineers at Meetapro through mock interviews
@StanislavKozlovsk
@StanislavKozlovsk 7 лет назад
Thank you so much for the video, I am really excited and glad to have you back! Keep on killing it!
@joanazurundu5377
@joanazurundu5377 Год назад
I really learnt alot on this system design, good job
@doreentyy1936
@doreentyy1936 3 года назад
Very concise abd clear video. Easy to follow.
@yaminil3653
@yaminil3653 6 лет назад
Wonderful, Tushar! Lot of details ! So much passion! I'm becoming your fan !!
@tusharroy2525
@tusharroy2525 6 лет назад
+Yamini L thanks
@ksankitha
@ksankitha 7 лет назад
I was so so looking forward to your amazing videos ! keep it coming please .
@tusharroy2525
@tusharroy2525 7 лет назад
I will
@manimaaji
@manimaaji 6 лет назад
Hi Tushar, Awesome video and thanks for the same. The only minor observation I have is that 3.5 Trillion translates to a 2^42=4,398,046,511, 104 i.e. 4.3 Trillion. So we should be using 42 bits instead of 43 bits from the 128 Bit MD5 Hash ? 43 bits translates to 2^43 = 8,796,093,022,208 i.e. 8.7 Trillion.
@osrax
@osrax 5 лет назад
this is correct, I observed that as well. log(2) of 62^7 = 42
@biswajitsingh8790
@biswajitsingh8790 7 лет назад
Hurray ! The legend is back!! more design questions please. 😍😍
@alekseyt4029
@alekseyt4029 7 лет назад
To support ranges we created service that just give your process some range (x to x+100000) and increment counter in DB. It is guarantee the everyone is completely unique and that service is not heavily loaded
@tusharroy2525
@tusharroy2525 7 лет назад
+Aleksey Telyshev fair enough. That should work
@lavida6188
@lavida6188 6 лет назад
Very well explained, Tushar. Liked your presentation and the length of the video.
@akshaysuman8168
@akshaysuman8168 6 лет назад
Hi Tushar so good to see you back . Please upload more videos on system design
@tusharroy2525
@tusharroy2525 6 лет назад
+Akshay Suman doing it
@AbnerLi
@AbnerLi 7 лет назад
Thank you so much Tushar, your videos are always helpful! We really appreciate your work!!
@humanintheloop
@humanintheloop 3 года назад
Around 9:14 Tushar describes a solution that first puts (short, long) to the database and then reads to verify nobody wrote after it. Doesn't the put corrupt the previous value that someone else left there (and wanted it as the value in the database)?
@gsb22
@gsb22 3 года назад
no, it is insert if not present.
@praveensoni5823
@praveensoni5823 19 дней назад
@@gsb22 I don't think so.. "insert if not present" is the second option suggested by @tusharroy2525 not in the third option
@vivekshah6452
@vivekshah6452 6 лет назад
Thanks Tushar your explanation was really simple and it did help me a lot to understand the topic well.
@RandomShowerThoughts
@RandomShowerThoughts Год назад
I haven't seen anyone explain how to implement the b62 hash properly. If you pass in 1 to the typical base62 function, you will get a single character in certain cases
@sc1444
@sc1444 6 лет назад
@Tushar Roy - Thanks for such helpful videos. Please keep doing more...
@sweetmoment8196
@sweetmoment8196 4 года назад
I really liked the idea behind the zookeeper
@jonahren2649
@jonahren2649 5 лет назад
I love your videos, it's so elegant, clean and beautiful
@mtareen
@mtareen 6 лет назад
Instead of covering how we will design starting from what we will ask the interviewer to creating high level design and then scaling that solution and so on and so forth, the presenter is going to detailed into how we will generate the tiny url etc. I would recommend looking at this for a good example of how to design tiny url www.hiredintech.com/classrooms/system-design/lesson/52
@abhikk4064
@abhikk4064 4 года назад
thanks
@vanshthukral5477
@vanshthukral5477 9 месяцев назад
Thanks for creating such amazing content!
@Cosciug1234
@Cosciug1234 7 лет назад
Like video a lot, hoping to see more service design videos in the future
@MrSachintelalwar
@MrSachintelalwar 7 лет назад
Hello Tushar, Thank you for the great explanation and making it very simple to understand the problem. Waiting for more designing problems. :)
@snigdhakasimahanthi3963
@snigdhakasimahanthi3963 5 лет назад
Awesome video Tushar. I don't understand how Approach 3 would work when you put(tinyURL, longURL), get(tinyURL) and check if this is equal to the longURL we have. If there was a duplicate in the DB, won't that get overwritten when I put(tinyURL, longURL) and won't get fetch the new longURL Value?
@weihan4573
@weihan4573 2 года назад
Indeed I have same understanding. The RestAPI put will Update if the key already exist no? So not only this Approach 3 will never end up in "fail and try again" scenario, it will even cause issue as if client try to Get the key, he will get the updated longURL which is not what he wanted.
@nabhavlogs371
@nabhavlogs371 5 лет назад
Sir with 42 bits, we can generate a total combinations of 4*10^12 and with seven bits about 3*10^12. So why did we take 43 bits and not 42 bits?
@wasabinator
@wasabinator 7 лет назад
Great to see design interviews. Welcome back :)
@boomer_money
@boomer_money 4 года назад
11:17 so let's say you make your GET request on a newly generated md5 tiny url and the first result's long url doesn't match the long url you just entered (i.e. that tiny url already exists). How do you generate a new tiny url such that if the same link is entered again you'll be able to find the same one? Bit shift and repeat?
@ruhinapatel6530
@ruhinapatel6530 5 лет назад
Amazing video and explanation.. u got one more subscriber
@yeniaraserol
@yeniaraserol 6 лет назад
The best system design video ever!
@tusharroy2525
@tusharroy2525 6 лет назад
+erol yeniaras thanks
@tongzhu8999
@tongzhu8999 7 лет назад
great guy is back! more system design video please.
@VeganCheeseburger
@VeganCheeseburger 6 лет назад
Brilliant video! Well explained and perfectly paced.
@SuperKako17
@SuperKako17 4 года назад
What is the advantage of the 3rd approach versus the 2nd? I fail to imagine a scenario in which it would be better suited than the second one, which I find elegantly simple.
@NiravPatel1989
@NiravPatel1989 7 лет назад
toooo goood. I think even the interviewer will also have to refer this as well. Hope to see more. :)
@sin78000000
@sin78000000 2 года назад
Perfect concise Solution. Thanks for it :)
@jineshdhruv6151
@jineshdhruv6151 6 лет назад
Thank you for awesome explanation. Please post more videos for design questions.
@iceman4154
@iceman4154 6 лет назад
I have actually built a system similar to this some years ago for company located in Germany. I created a conversion/encoder-decoder algorithm that took a unique database integer ID and encoded it to a minimum length string (minimum 3 characters). It used a 64 character alphabet (a-z, A-Z, 0-9, -, _). Using this algorithm it could handle up to roughly 112 quadrillion unique combinations with a 10 digit shortened code. Also, to avoid hard-coding any shortened values or storing them in the persistence layer, it all generated using a unique/primary key index integer value. To decentralize or distribute it, I would use a Casandra type setup where the values are partitioned by unique ID per instances. Everything else would be exactly the same as you have shown here. Great video. Thank you for your time and effort.
@tusharroy2525
@tusharroy2525 6 лет назад
Great to know about your personal experience. How did you get unique database integerID? Was it from C* sequence number? How much qps your system was doing?
@iceman4154
@iceman4154 6 лет назад
The system itself never grew into anything major. The investors moved onto other projects before it could be tested at any real scale. The part of the system I built was agnostic to the persistence layer. At that time, I was a somewhat new developer and some of the more senior guys were handling that end of things. I basically just came up with the encoder-decoder algorithm and the architecture behind how the experience went. The senior guys handled the distributed systems and persistence layer of the project. Now if I was to build the system, I would do things differently in a more micro-service oriented way. I would probably have a service for generating unique IDs, a service for converting IDs to shortened character and visa versa, and a service for the handling the redirection/landing page functionality. This would allow me to scale each service as needed, independently but I also am not forced to rely on the shortened values being stored in the persistence layer. I would also use Docker and Docker Swam mode to manage things which was not around back then.
@iceman4154
@iceman4154 6 лет назад
This was before any of the nice tools and frameworks were around. We built the system using PHP, using a home-brewed/in-house built web app framework and jQuery for the front-end UI experience, to give you an idea how long ago it was. Probably coming up around close to a decade ago, now.
@tusharroy2525
@tusharroy2525 6 лет назад
Thanks for the update.
@sergeantsaviour808
@sergeantsaviour808 3 года назад
try your best. continue working hard.
@rajmilan2933
@rajmilan2933 7 лет назад
Amazing !! You are like my guru dude learnt so much from you.
@warnercooler4488
@warnercooler4488 3 года назад
Thank you so much for this amazing content!
@achyutakrishna
@achyutakrishna 6 лет назад
Dear Tushar, great topics and great details. One video on sms and email enrollment design pattern. Design options and best practices.
@markusiak1
@markusiak1 7 лет назад
It was very nice explanation! :) I've listened with pleasure.
@tusharroy2525
@tusharroy2525 7 лет назад
Thanks
@aeon327
@aeon327 6 лет назад
Great video! Very informative and helpful information.
@veryconfuseduser
@veryconfuseduser Год назад
Would be better if you could go into more detail about assigning the ranges. I guess to experts already familiar with the strategy it would be easy to understand but for someone who's learning of this the first time it's hard to wrap one's head around it
@HarshitKumar-dj4ev
@HarshitKumar-dj4ev Месяц назад
Exactly. He just glossed over MOST of the details. I highly appreciate his work but for me who is learning about this for the first time, I'm just lost as to how do these actually work behind the high level design
@liuhongqian
@liuhongqian 4 года назад
I remember timestamp is 64 bites(8 bytes). "The internal representation of a timestamp is a string of 7 - 13 bytes. Each byte consists of 2 packed decimal digits. The first 4 bytes represent the date, the next 3 bytes the time, and the last 0 - 6 bytes the fractions of a second"
@saam6348
@saam6348 2 года назад
With year 2038 problem, a 32-bit can represent Unix time. But it will end after the completion of 2,147,483,647 (2^31 - 1) seconds from the beginning (00:00:00 1 January 1970), i.e., on 19 January, 2038 03:14:08 GMT.
@liuhongqian
@liuhongqian 2 года назад
@@saam6348 I would be more than 70 years old by then LOL
@bcjitu
@bcjitu 3 года назад
why only 43 bits ? Based on which logic ?
@DipeshKumarYadavDKY
@DipeshKumarYadavDKY 7 лет назад
Great AWesome Bro to see you back.
@justjaks
@justjaks 7 лет назад
Tushar. I really missed you. Hope you can shed sometime in your busy schedule and create more videos like this
@maazrehman8723
@maazrehman8723 4 года назад
Why not just use auto-incremented id of a url in database to save these records and decode them in runtime ?
@anantsaksena1989
@anantsaksena1989 3 года назад
Hi, In the MD5 hash approach: How can we be sure that for a 7 char length short url , we need to take 43 bits out of 128 bits of the MD5 hash of long url, convert that to decimal and then to base 62. And once we convert it to base 62 it will give a 7 char length short url. What is the mathematics behind the 43 out of 128 bits approach? MY doubt is why just 43 ? Thanks.
@rajeshg3570
@rajeshg3570 2 года назад
Hi @Anant Saksena, Here is the mathematical formula for this - log(2) of 62^7 = 42 .... which means, based on the no of URLs to be generated per sec, we have assumed that if we take base 64 encoding, then you can generate 62^7 URLs ( URL length would be 7 chars). Since MD5 generates binary value, the equivalent number is 2^42. Hence the no of bits required for this calculation is approximately 42
@zishanshaikh3068
@zishanshaikh3068 4 года назад
Great work.. please make more design videos
@HiRyanChen
@HiRyanChen 6 лет назад
What happens if in the actual interview you say, "This is 62^7, and this is about 3.5 trillion"
@tusharroy2525
@tusharroy2525 6 лет назад
Lol. I dunno. Interviewer will think you are well prepared or great mathematician
@raghureddy3237
@raghureddy3237 6 лет назад
you should say it is 2^(7log62).
@otifelix
@otifelix 6 лет назад
Your videos are super educative thanks
@deepakpatra1411
@deepakpatra1411 6 лет назад
Hi, Tushar I have a question for your 3 technique 1.if you are putting the newly generated value in DB and then in 2nd step getting the value against it verify against your long URL(it should override the value at same key and result is always same you will lose the old value ). will not it match always?
@joshuamorning8250
@joshuamorning8250 2 года назад
I have the same concern as you. I think the third approach should have some strategies to prevent this situation. Otherwise, the old value will be wrritten.
@rajatpawar9465
@rajatpawar9465 7 лет назад
Hey Tushar, thank you very much for sharing your knowledge. I'd like to request a discussion on how media (images & videos) is managed by big companies (youtube, imgur, etc) and how it is organized. Thanks!
@83rossb
@83rossb 6 лет назад
look up content delivery networks (CDNs) and consistent hashing
@rplusgdj
@rplusgdj 6 лет назад
16:36 , can we reduce collisions by considering timestamp along with milliseconds?
@Himanshu-ed3mf
@Himanshu-ed3mf 4 года назад
but that would increase the bits needed for the timestamp. from 32 to 48.
@nixer65
@nixer65 6 месяцев назад
You might want to consider a ring based election algorithm to partition the key space. This would allow you to add more hosts as you scale out.
@sushantsharmapec
@sushantsharmapec 7 лет назад
Great Video. Can you talk about the retrieval part? Specially given a long url, see if we already have a url for it and give that back.
@kant.shashi
@kant.shashi 7 лет назад
thanks boss ..you helped me understand how to approach such problems ..
@tusharroy2525
@tusharroy2525 7 лет назад
welcome
@zzzzzzzzzzz6
@zzzzzzzzzzz6 2 года назад
In a sense using an RDBMS is very similar to the counter from a limitations standpoint -- they scale vertically not horizontally, so you'd have the same single point of failure issue.
@angmathew4377
@angmathew4377 6 лет назад
great man a good way to unwrapped complexity.
@HariYT09
@HariYT09 6 лет назад
Hi Tushar that's a detailed and nice video. Thanks so much. Can you please post a video on 1.design of distributed cache of strings to reduce server hits. 2. Design of a memory allocator+garbage collector
@andrew5407
@andrew5407 6 лет назад
I wish I could like this video twice!
@chenboxie6623
@chenboxie6623 7 лет назад
Thank you Tushar, vary helpful. Hoping you can post more design question videos. Thanks.
@saurabhchoudhary9260
@saurabhchoudhary9260 6 лет назад
We miss u!!!. Please come up with more design videos.
@tusharroy2525
@tusharroy2525 6 лет назад
+Saurabh Choudhary added 2 more videos in case you did not check them yet.
@khalidelgazzar
@khalidelgazzar 4 года назад
Super helpful video. Thank you
@halum3
@halum3 7 лет назад
Awesome, learned some good design approach. Thanks
@tusharroy2525
@tusharroy2525 7 лет назад
+Sajjad Hossain welcome
@Abhimalviya
@Abhimalviya 7 лет назад
Thanks for coming back sir
@vinayak.raghuvamshi
@vinayak.raghuvamshi 3 года назад
Nice video. The MD5 was not the best idea IMHO especially given that you are using only the first 43 bits, increasing the probability of collisions. Some interviewers will get all fussy about this choice. I would rather have the candidates stick to the counter method and maybe dive deeper into how we can design a distributed counter.
@TZCoder
@TZCoder 3 года назад
Just pre generate all the possible urls and insert them in the DB then when a user wants a URL pick an available one at random and give it to them, simple, never a colission. You can even pre-generate as required according to the number used up everyday if you don't want all 3.5 bill in the database taking up space. Stick any used URL in redis so no need to hit DB when accessing the URL.
@AmreshTripathi
@AmreshTripathi 3 года назад
thanks for this, I was asked this in interview
@michaeldang8189
@michaeldang8189 5 лет назад
For the first 43bit MD5 solution, what do you do if you encounter a collision with 2 different long URLs, i.e., there are 2 totally different long URLs that has the same first 43bit MD5 sum.
@pllx2
@pllx2 5 лет назад
Perhaps we could pluck some parts of the counter solution, e.g., take 30 bits of the MD5 + the 16 latter bits of the timestamp (since first 16 are much slower to change) + 7 bits randomly generated or by counter?
@user-jd2dj4gd2w
@user-jd2dj4gd2w 4 года назад
we can use 6 bits to express 62 characers since 2^6 > 62. So for 7 characters long url, 7 * 6 = 42 bits are enough
Далее
Can You Bend This Bar?
01:00
Просмотров 4,5 млн
Woman = best friend🤣
00:31
Просмотров 2,8 млн
20 System Design Concepts Explained in 10 Minutes
11:41
System Design: Design a URL Shortener like TinyURL
16:00
What is an API and how do you design it? 🗒️✅
15:26
How to implement TinyURL (System Design Interview)
17:45
Design a Payment System - System Design Interview
31:40
Can You Bend This Bar?
01:00
Просмотров 4,5 млн