22: Recommendation Engine (YouTube, TikTok) | Systems Design Interview Questions With Ex-Google SWE 

Jordan has no life
47K subscribers
8K views

Published: Sep 6, 2024

Comments: 45
@siddharthgupta6162 3 months ago
Hey Jordan, I recently joined Google as an SSE and I wanted to express my sincere gratitude for your system design videos, especially the ones comparing multiple solutions. Those comparisons were exactly what the interviewers were looking for in my feedback.
@jordanhasnolife5163 3 months ago
Legend!! Congrats man and enjoy the new role!
@WaxPaxler 4 months ago
Congrats on the sponsor bro! Keep up the good work
@JLJConglomeration 4 months ago
Incredible ad read 😂
@brunoalfred 4 months ago
Just found your channel a few days ago, and I'm watching most of your previous videos. Liked the one on message brokers.
@sharad20073024 4 months ago
Can you please share your slides as well? It would be really helpful.
@jordanhasnolife5163 4 months ago
Planning on doing this in bulk after finishing my current series, this will be in the next 1-3 months.
@Anonymous-ym6st 11 days ago
At 19:00, what do we mean by "add as an index entry"? Do we keep vector1 as the index key and v2:v3:v4 (the nearby vectors) as a column, or is v2:v3:v4 itself an index entry? (I know little about vector DBs, but I'm trying to understand: is each vector represented as a geohash, and can it be indexed as a single vector?)
@jordanhasnolife5163 8 days ago
When I say create an index, I just mean create a database table mapping each entityId to its closest entity IDs.
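A minimal sketch of the table described in this reply, assuming cosine similarity as the distance measure and a brute-force scan over a tiny in-memory set of embeddings. The function names, the sample IDs `v1` to `v4`, and the choice of `k` are illustrative assumptions, not details from the video.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def build_neighbor_index(embeddings, k):
    """Map each entity ID to the IDs of its k most similar entities."""
    index = {}
    for entity_id, vec in embeddings.items():
        scored = [
            (cosine_similarity(vec, other_vec), other_id)
            for other_id, other_vec in embeddings.items()
            if other_id != entity_id
        ]
        scored.sort(reverse=True)  # most similar first
        index[entity_id] = [other_id for _, other_id in scored[:k]]
    return index

# Hypothetical 2-dimensional embeddings, for illustration only.
embeddings = {
    "v1": [1.0, 0.0],
    "v2": [0.9, 0.1],
    "v3": [0.0, 1.0],
    "v4": [0.8, 0.3],
}
neighbor_index = build_neighbor_index(embeddings, k=2)
print(neighbor_index["v1"])
```

Each row of the resulting dictionary plays the role of one table row: the key is the entity ID and the value is the precomputed list of its closest neighbors. A real system would likely replace the brute-force scan with an approximate nearest-neighbor search.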
@alphabeta644 1 month ago
Is my understanding correct that there will be as many Bloom filters in the recommendation service as there are users that connect to it? Secondly, as I keep watching more and more videos, my specific Bloom filter would fill up within days or maybe months. How does our system deal with a Bloom filter filling up all of its slots because of the plethora of videos I might have seen over months?
@jordanhasnolife5163 1 month ago
We can have as many or as few as we want since they're just an approximation. We'd have to experiment in practice. Eventually, you just clear it, and let it get filled back up again :)
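A rough sketch of the "clear it and let it fill back up" idea, assuming one small Bloom filter per user that resets itself once a chosen fraction of its bits is set. The class name `WatchedFilter`, the bit count, the hash count, and the fill threshold are all hypothetical values picked for illustration.

```python
import hashlib

class WatchedFilter:
    """Per-user Bloom filter of watched video IDs that resets when too full."""

    def __init__(self, num_bits=1024, num_hashes=3, max_fill_ratio=0.5):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.max_fill_ratio = max_fill_ratio
        self.bits = bytearray(num_bits)  # one byte per bit, for simplicity
        self.set_count = 0

    def _positions(self, video_id):
        # Derive num_hashes bit positions by salting SHA-256 with a seed.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{video_id}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, video_id):
        # Reset before inserting if the filter is already too saturated.
        if self.set_count / self.num_bits >= self.max_fill_ratio:
            self.clear()
        for pos in self._positions(video_id):
            if not self.bits[pos]:
                self.bits[pos] = 1
                self.set_count += 1

    def probably_seen(self, video_id):
        # True may be a false positive; False is always correct.
        return all(self.bits[pos] for pos in self._positions(video_id))

    def clear(self):
        self.bits = bytearray(self.num_bits)
        self.set_count = 0

watched = WatchedFilter()
watched.add("video-123")
print(watched.probably_seen("video-123"))
```

The cost of clearing is that recently watched videos may be recommended again for a while, which fits the reply's point that the filter is only an approximation.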
@Anonymous-ym6st 11 days ago
Quick question on the native solution: why use Parquet here?
@jordanhasnolife5163 8 days ago
It's pretty nice for column compression, if we can get any, and I do believe the files should be immutable once written. Do you have a different proposal?
@zhonglin5985 4 months ago
TIL what embedding is! Congrats on the sponsor BTW!!!
@bhaveshupadhyay6657 2 months ago
My boi getting sponsors. Well deserved. To the moon 🚀
@marksun6420 1 month ago
Great video! I did try to digest and understand what you are talking about :) Still got one question: why won't sharding the vector database by the vector hash result in a hot partition problem, the same way sharding the neighbor index by the vector hash will?
@jordanhasnolife5163 1 month ago
I think that in theory it could, but most of the additions to the vector database are being done asynchronously in the background, so we have more flexibility to temporarily stop all writes and rebalance as needed. We'd want to shard in a similar fashion though, where vectors in close proximity are near one another.
@hakapuu 2 months ago
Great video! I see that you used a heap for new entries into the closest neighbor index. Isn't insertion time into a heap the same O(log n) as it would be in a DB index that uses B+ trees? I do understand that in the index we might need to replace multiple rows, whereas with a heap that won't happen. Is that the optimization here? Trying to understand how this speeds things up.
@jordanhasnolife5163 2 months ago
The optimization is that this is in memory
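A small sketch of what an in-memory heap for closest neighbors could look like, assuming we keep only the k most similar candidates per entity in a bounded min-heap. `TopKNeighbors` and its methods are hypothetical names, and the similarity scores are made up.

```python
import heapq

class TopKNeighbors:
    """Keeps the k most similar neighbors seen so far, entirely in memory."""

    def __init__(self, k):
        self.k = k
        self._heap = []  # min-heap: the root is the weakest neighbor we keep

    def offer(self, similarity, entity_id):
        item = (similarity, entity_id)
        if len(self._heap) < self.k:
            heapq.heappush(self._heap, item)     # O(log k)
        elif item > self._heap[0]:
            heapq.heapreplace(self._heap, item)  # evict the weakest, O(log k)

    def best_first(self):
        return [entity_id for _, entity_id in sorted(self._heap, reverse=True)]

top = TopKNeighbors(k=2)
for similarity, entity_id in [(0.2, "a"), (0.9, "b"), (0.5, "c"), (0.1, "d")]:
    top.offer(similarity, entity_id)
print(top.best_first())
```

The asymptotic cost per insert is logarithmic in both cases, as the question notes; the difference in this sketch is that every operation touches process memory only, with no disk-backed index pages to update.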
@jamesliu551 18 days ago
Dude getting a sponsor, leaving all us dumb ass behind
@aforty1 3 months ago
Thank you for these videos!
@priteshacharya 4 months ago
Great video Jordan! Learned a lot from this video. One question on the Recommendation Service -> Neighbor index flow at 41:18. Since we are sharding the neighbor index by entity_id, the recommendation service, in case of a cache miss, has to scatter and gather, right? Entities 12, 13, and 62 (the examples in the slide) could be in different partitions.
@jordanhasnolife5163 4 months ago
They would have to fetch the neighbors for their last x watched videos. So for each of those x videos, all of its neighbors will be on the same node, but otherwise we may have to hit up to x different partitions.
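To make the fan-out concrete, here is a sketch that groups a user's last x watched entity IDs by partition, assuming a simple hash-mod placement. `NUM_PARTITIONS`, `partition_for`, and `plan_fetches` are hypothetical, and the actual placement scheme in the design may differ.

```python
import zlib
from collections import defaultdict

NUM_PARTITIONS = 4  # assumed partition count, for illustration only

def partition_for(entity_id):
    """Assumed placement rule: stable hash of the entity ID, mod partition count."""
    return zlib.crc32(str(entity_id).encode()) % NUM_PARTITIONS

def plan_fetches(last_watched_ids):
    """Group entity IDs by partition so each partition is queried at most once."""
    plan = defaultdict(list)
    for entity_id in last_watched_ids:
        plan[partition_for(entity_id)].append(entity_id)
    return dict(plan)

plan = plan_fetches([12, 13, 62])
for partition, entity_ids in plan.items():
    print(f"partition {partition}: fetch neighbors of {entity_ids}")
```

With x watched videos the plan never contains more than x partitions, and each request returns the complete neighbor list for its entities because a given entity's neighbors live on a single node.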
@adw6579 4 months ago
Hey Jordan. What books would you recommend I read? I have already finished DDIA.
@jordanhasnolife5163 4 months ago
Hey! I'd probably start reading some white papers! As for which ones, there are like 10-20 tools on LinkedIn who only post links of other people's content on their pages, hopefully one of them is decent
@RaviMenu 4 months ago
Great video Jordan! Can you do one for an ACID-based system like a digital wallet or a bank? Or maybe a combination of both, like bank to wallet and wallet to wallet?
@jordanhasnolife5163 4 months ago
Where do you see the challenge here? At least to me, this initially just feels like you'll need ACID databases, or two phase commit when making a transaction between two accounts on different partitions.
@OptimizingLiving 4 months ago
Hey Jordan, great content. Thank you for making these videos in depth. QQ: do you think we can use graph databases such as neo4j instead of the neighbor index for faster reads?
@jordanhasnolife5163 4 months ago
I think that you could, but consider this - for every vector (which is an arbitrary set of points), you'd need to create an edge to other vectors, so that you can traverse the graph. How do you decide which ones to do that for? Even then, let's imagine you could - you'd still have to run a breadth first search to find the closest vectors. I'd think that pre-caching your answers here will just about always be the fastest option.
@vipulspartacus7771 4 months ago
Hi Jordan, can you please share your iPad notes? Maybe they are not perfect, but they would serve as some sort of reference for revision.
@jordanhasnolife5163 4 months ago
Planning on doing this in bulk after finishing my current series, this will be in the next 1-3 months.
@vipulspartacus7771 3 months ago
@jordanhasnolife5163 Hi, I don't mean to rush you, but I have some important interviews in the coming weeks and having your notes will really help me prep better. Can you share them in any form? I understand there can be mistakes or typos in them, but I want to be able to quickly revise all the overarching concepts and designs.
@jordanhasnolife5163 3 months ago
@vipulspartacus7771 Hi Vipul - understand your rush here. It will take me a few hours to properly export everything, which is the reason for the delay; I haven't sat down and done it. Additionally, once I do, I'd like to publicize that a bit, as I hope that they can help me build my following if we're being fully transparent here. My original slides contain all of the same information.
@vipulspartacus7771 3 months ago
@jordanhasnolife5163 Sure Jordan, I understand, I look forward to it. Once again, really appreciate the content.
@midicine2114 3 months ago
Techlead catching well deserved strays
@sandeepreddy6295 4 months ago
Hey Jordan, why an 'In-memory-broker' instead of a broker like Kafka?
@jordanhasnolife5163 4 months ago
Hey! For the sake of this video, it probably doesn't have to be, but I'd say check out my video on how to design YouTube.
@htm332 3 months ago
Does this design account for the popularity or trendiness of a given entity? For example, if a random video from an unknown creator suddenly becomes extremely popular (happens a lot on TikTok), it should be recommended, whereas an hour earlier it was unpopular and irrelevant and thus should not have been recommended.
@jordanhasnolife5163 3 months ago
It does not, and good point! I think for something like this you'd want to see the Top K video, and basically keep a cache of which videos are "trending" in the last x hours to apply a score boost.
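A minimal sketch of the score boost mentioned in this reply, assuming the trending cache is just a set of video IDs and the boost is a fixed multiplier. The function name, the multiplier, and the sample scores are illustrative assumptions.

```python
def rank_with_trending_boost(candidates, trending_ids, boost=1.5):
    """Re-rank (video_id, score) pairs, multiplying trending videos' scores by boost."""
    rescored = []
    for video_id, score in candidates:
        if video_id in trending_ids:
            score *= boost
        rescored.append((video_id, score))
    rescored.sort(key=lambda pair: pair[1], reverse=True)
    return rescored

# Hypothetical candidates from the neighbor index and a hypothetical trending cache.
candidates = [("a", 0.8), ("b", 0.6), ("c", 0.5)]
trending_ids = {"b"}
ranked = rank_with_trending_boost(candidates, trending_ids)
print([video_id for video_id, _ in ranked])
```

In this example the trending video `b` moves ahead of `a`. The trending set itself would come from a separate Top K computation over the last x hours, which is outside this sketch.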
@ava9xx3js9j 3 months ago
Hey Jordan! Are you looking to adopt by any chance? Jk. Love your content ❤ Are you a full-time YouTuber now?
@jordanhasnolife5163 3 months ago
Nope, still working, for better or for worse lol. Happy to adopt - you any good at cooking?
@jordanhasnolife5163 4 months ago
To try everything Brilliant has to offer, free, for a full 30 days, visit brilliant.org/Jordanhasnolife/ . You'll also get 20% off an annual premium subscription.
@user-mz9gf8ux8u 2 months ago
agree tech lead is a sham that shafted his followers (as a millionaire)
@jordanhasnolife5163 2 months ago
I might do it too (as a non millionaire)
@user-mz9gf8ux8u 2 months ago
@jordanhasnolife5163 yes jordan responded to me! it would be an honor to be your victim, sempai