Design an Autocomplete System | System Design

Code with Irtiza

8K views

Published: Oct 1, 2024

Comments: 35

@ManojReddy22 4 months ago

How do we handle frequency in the case of a trie? I want to return only the top 10 most frequent suggestions.
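
One possible way to approach the top-10 question is sketched below in Python. It is not taken from the video: the `TrieNode` and `Trie` classes, the `frequency` field, and the `top_k` method are all hypothetical names, and the sketch assumes the whole trie fits in memory. Each node that ends a full query keeps a count, and a lookup walks the subtree under the prefix and keeps the k most frequent queries.

```python
import heapq
from dataclasses import dataclass, field


@dataclass
class TrieNode:
    # Hypothetical node layout: child nodes keyed by character.
    children: dict = field(default_factory=dict)
    # Greater than 0 only when a full query ends at this node.
    frequency: int = 0


class Trie:
    def __init__(self):
        self.root = TrieNode()

    def insert(self, query, count=1):
        """Add a query, or increase its count if it already exists."""
        node = self.root
        for ch in query:
            node = node.children.setdefault(ch, TrieNode())
        node.frequency += count

    def top_k(self, prefix, k=10):
        """Return up to k queries starting with prefix, most frequent first."""
        node = self.root
        for ch in prefix:
            node = node.children.get(ch)
            if node is None:
                return []
        # Walk the subtree under the prefix and collect (frequency, query) pairs.
        found = []
        stack = [(node, prefix)]
        while stack:
            current, text = stack.pop()
            if current.frequency > 0:
                found.append((current.frequency, text))
            for ch, child in current.children.items():
                stack.append((child, text + ch))
        # Highest frequency first; ties are broken alphabetically.
        best = heapq.nsmallest(k, found, key=lambda pair: (-pair[0], pair[1]))
        return [text for _, text in best]


trie = Trie()
for query, count in [("canada", 9), ("can", 7), ("cat", 5), ("cambodia", 2)]:
    trie.insert(query, count)
print(trie.top_k("ca", k=3))  # ['canada', 'can', 'cat']
```

Walking the subtree on every lookup gets expensive for short prefixes, which is why precomputing the top results per prefix is a common alternative.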

@ketheric 1 year ago

If you're going to store the data in Cassandra, you don't need a trie...

@irtizahafiz 11 months ago

I don't think that's what I meant in the video. I was trying to give an idea of how Tries can be used.

@luh-lat 2 months ago

Okay, there are several gaps. When dealing with large-scale data, you'll likely need to partition your tries. If you partition your tries, how do you partition them? If you partition them by a hash of the full search queries they represent, it becomes tricky to determine from a prefix which partition to go to, because the prefix's hash could send you to a totally different partition.
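
A common alternative to hashing the full query is to route by a range over the leading characters, so that a prefix and every query extending it land on the same partition. The Python sketch below is only an illustration under assumptions: the `BOUNDARIES` split points and the `shard_for` function are hypothetical, and real split points would be chosen from the observed query distribution.

```python
from bisect import bisect_right

# Hypothetical split points: shard 0 holds keys before "g",
# shard 1 holds "g" up to (not including) "n", shard 2 holds "n" up to "t",
# and shard 3 holds "t" and everything after it.
BOUNDARIES = ["g", "n", "t"]


def shard_for(text):
    """Pick a shard from the first character only.

    A prefix and any full query that extends it share the same first
    character, so they always route to the same shard.
    """
    key = text[:1].lower()
    return bisect_right(BOUNDARIES, key)


print(shard_for("ca"), shard_for("canada"))  # 0 0
print(shard_for("sys"), shard_for("system design"))  # 2 2
```

Routing on a single character can leave shards unevenly loaded; using the first two or three characters as the routing key is one way to get finer-grained ranges, at the cost of needing a fallback for prefixes shorter than the key.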

@nimageofmine 1 year ago

How do you handle auto-suggest with multiple tokens, e.g. "how does auto suggest work in google"?

@LuluHou 1 year ago

Great design, thanks a lot! Just one question that I hope you can help with. Generating the trie is kind of distributed, right? For example, a-bi, bj-cid..., supposing the prefixes are sorted. Each sub-trie contributes to the hashmap construction, and it is possible that lower-level sub-tries overlap on a higher-level prefix key. A heap could then be used to filter out the top k by frequency. Is my understanding right?

@irtizahafiz 11 months ago

Hi! Unfortunately, I don't know too much about tries. It's been a while since I took Data Structures, haha. Hopefully someone else can help answer here.

@fancyAlex1993 4 months ago

So I have two questions. 1. If we are going with the trie data structure, then there is no need for the table at 3:23, right? We just build the trie and store it in the DB. From there, we can serve the data either from the DB or from a cache, right? 2. Once a week, we can run a cron job that executes a Spark job, which takes whatever is in our logs, updates our trie, and writes it back to our DB. Is my understanding correct?
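
The periodic update described in the second question can be pictured, at toy scale, as merging new log counts into the existing query counts before rebuilding the trie. The Python sketch below is a single-machine stand-in for the batch job, not Spark code: the `merge_log_counts` function and the one-query-per-line log format are assumptions made for illustration.

```python
from collections import Counter


def merge_log_counts(existing_counts, log_lines):
    """Return updated query counts after folding in a batch of log lines.

    Assumes each log line holds exactly one search query (hypothetical format).
    """
    updated = Counter(existing_counts)
    for line in log_lines:
        query = line.strip().lower()
        if query:  # skip blank lines
            updated[query] += 1
    return updated


existing = {"cat": 5, "canada": 9}
new_logs = ["cat\n", "Canada\n", "\n", "cambodia\n"]
print(merge_log_counts(existing, new_logs))
```

The merged counts would then feed whatever step rebuilds the trie or the per-prefix suggestion lists.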

@avanaresident 10 months ago

Very nicely explained, thanks. Just a question on the choice of data store: why Cassandra, as it's not even a key-value store?

@nishpatwa007 2 years ago

I did not understand the part about how the Trie is stored in Cassandra. Can you throw some light on that?

@irtizahafiz 2 years ago

Hi! Sure. So as part of the optimization, for every Trie node, we are storing all the leaf nodes you could reach from that node. Let’s say we have a node with prefix “ca”. Some leaf nodes you could reach from this node are: canada, cat, can, cambodia, etc. So in the Trie node “ca” we can store all those leaf nodes. And then we can store that Trie node in Cassandra with the key being “ca”. Now, if you want to find all the leaf nodes for the node “ca”, instead of traversing the Trie, you can just query Cassandra for primary key “ca”.
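
The idea of keying stored completions by prefix can be sketched in Python as follows. This is a minimal illustration under assumptions, not the video's implementation: a plain `dict` stands in for the Cassandra table, `build_prefix_table` is a hypothetical helper, and the query counts are made up. The table is built straight from query counts, and each prefix keeps only its k most frequent completions.

```python
from collections import defaultdict


def build_prefix_table(query_counts, k=3):
    """Map every prefix to its k most frequent full queries."""
    candidates = defaultdict(list)
    for query, count in query_counts.items():
        # Register the query under each of its prefixes, e.g. "c", "ca", "cat".
        for end in range(1, len(query) + 1):
            candidates[query[:end]].append((count, query))
    table = {}
    for prefix, pairs in candidates.items():
        # Highest count first; ties are broken alphabetically.
        pairs.sort(key=lambda pair: (-pair[0], pair[1]))
        table[prefix] = [query for _, query in pairs[:k]]
    return table


# Made-up counts for illustration only.
query_counts = {"canada": 9, "can": 7, "cat": 5, "cambodia": 2}
prefix_table = build_prefix_table(query_counts)

# A lookup is a single key read, with no tree traversal at query time.
print(prefix_table["ca"])   # ['canada', 'can', 'cat']
print(prefix_table["cam"])  # ['cambodia']
```

The trade-off is storage: every query is duplicated under each of its prefixes, which is why capping the list at k entries per prefix matters.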

@석상주 2 years ago

@irtizahafiz In order to parse the Trie, don't you have to build the Trie first? How do you build it if it doesn't fit in main memory?

@stylishskater92 1 year ago

@석상주 I would like to know this as well.

@flameex1708 2 years ago

Great explanation. Keep up the good work big dawg

@irtizahafiz 2 years ago

Appreciate it!

@gowthamannachiappan706 5 months ago

Thanks for the awesome video. Should we use a graph database such as Neo4j to store the Trie data structure?

@irtizahafiz 5 months ago

Not too familiar with Neo4j unfortunately, but I hope someone else can help you out here.

@deepjyotkaurbindra 1 year ago

This was beautifully explained, thanks!

@bulgakovwork2022 2 years ago

Sorry, but how will the hashmap be structured? Is the key a letter, and is the value a list of words or a list of next letters?

@irtizahafiz 1 year ago

The key will be the prefix, and the value will be a list of possible search terms in some order.

@leamon9024 1 year ago

Hi, thanks for sharing. Is it possible to use Elasticsearch here instead of Cassandra?

@irtizahafiz 11 months ago

Yes. There are multiple ways you can implement an autocomplete feature. This is just one of them, and also at a very high level.

@turtleoctopusdog3792 10 months ago

Freaking awesome

@mayureshsohani9913 2 months ago

Much helpful

@SupriyaBaruQA 5 months ago

Amazing

@irtizahafiz 5 months ago

Thank you! Cheers!

@조바이든-r6r 2 years ago

damn good man

@amitrastogi1405 11 months ago

Very well explained. Thanks!

@irtizahafiz 10 months ago

Thank you for watching!

@VijayNarayanan25 2 years ago

Thank you! Best explanation that I could find.

@irtizahafiz 2 years ago

Glad you found it helpful! : ) Stay tuned for more of these.

@TheodoreTaylor-mq8yi 7 months ago

2 times 4:17 @irtizahafiz

@Analytics4u 2 months ago

What is this crap?

@wellaide 1 year ago

Thanks for sharing. However, I do feel there are more nuances than simply using a trie structure. For example, how can each user's personal interests be reflected? What if some users start with an uncommon query that isn't stored in the database but whose intent can be clearly predicted?

@irtizahafiz 10 months ago

Yes definitely! This is just a high level idea.