Chat App | WhatsApp | Facebook Messenger | System Design

ByteMonk

44K subscribers

101K views

Published: 22 Oct 2024

Comments: 114

@jwbonnett 1 year ago

I would treat a 2-person chat as a group; that way you can easily add users to that same chat without converting it to a group. Messenger has a character limit of 20,000, which is 156.25Kb, and the average message will be far less than that, maybe around the 400-char mark. That is 3.125Kb, so a total average of 585.93GB per day.

@ByteMonk 1 year ago

Great suggestion!

@kalahari8295 1 year ago

Interface segregation principle

@_plsubscribe_ 4 months ago

So, like Discord.
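A minimal sketch of the suggestion in this thread, where a 1:1 chat is just a conversation that happens to have two members. The `Conversation` class and its method names are hypothetical and are not taken from the video.

```python
from dataclasses import dataclass, field


@dataclass
class Conversation:
    """A chat of any size; a 1:1 chat is a conversation with two members."""
    conversation_id: str
    member_ids: set[str] = field(default_factory=set)

    def add_member(self, user_id: str) -> None:
        # Adding a third person needs no "convert 1:1 to group" step.
        self.member_ids.add(user_id)

    def recipients(self, sender_id: str) -> set[str]:
        # Everyone in the conversation except the sender gets the message.
        if sender_id not in self.member_ids:
            raise ValueError(f"{sender_id} is not in {self.conversation_id}")
        return self.member_ids - {sender_id}


chat = Conversation("c1", {"alice", "bob"})
one_to_one = chat.recipients("alice")   # only bob
chat.add_member("carol")
after_add = chat.recipients("alice")    # bob and carol
```

With this shape the delivery path does not need to know whether a chat started out as 1:1 or as a group.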

@rr6403 4 months ago

@jwbonnett How do you get the 3.125Kb number? If the average is around the 400-char mark, it will be 0.4Kb per message.

@shivansh901 1 month ago

He may have incorporated the message size of photos as well. But it still won't be 75PB 😂, it'd be 75TB instead.
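The numbers in this thread can be reproduced if "Kb" is read as kibibits (1024 bits) with one byte per character. That reading is an assumption, not something the original comment states. The sketch below shows both readings side by side, using the 1.5 billion messages per day that the 585.93 figure implies.

```python
AVG_CHARS = 400            # average message length suggested in the thread
MAX_CHARS = 20_000         # Messenger character limit quoted in the thread
MESSAGES_PER_DAY = 1.5e9   # daily message count implied by the 585.93 figure


def kibibits(chars: int) -> float:
    # Assumes 1 byte (8 bits) per character and 1024 bits per "Kb".
    return chars * 8 / 1024


def kilobytes(chars: int) -> float:
    # Assumes 1 byte per character and 1000 bytes per kilobyte.
    return chars / 1000


max_kibit = kibibits(MAX_CHARS)   # 156.25, matches "156.25Kb"
avg_kibit = kibibits(AVG_CHARS)   # 3.125, matches "3.125Kb"
avg_kb = kilobytes(AVG_CHARS)     # 0.4, matches the reply's figure

# Reproduces 585.93: kibibits -> kibibytes (/8), then divided by 10^6.
mixed_units_gb = MESSAGES_PER_DAY * avg_kibit / 8 / 1e6

# The same volume in plain decimal units: 400 bytes per message.
decimal_gb = MESSAGES_PER_DAY * AVG_CHARS / 1e9
```

Either way, text-only traffic lands in the hundreds of gigabytes per day, far below the petabyte range.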

@SAURABHGUPTA_DINOSAUR 1 year ago

I swear nobody else on YouTube has explained this, and in such a well-presented manner, wow!

@jarodmorris4408 9 months ago

This is a fantastic overview. There are many more questions to answer now, but this gives a great summary of what to expect and how to think about the service and all its parts.

@lance3401 8 months ago

I always think we have to draw out all our projects before anything else; planning is the most important aspect of development, along with product specifications. A great video, my friend. This is the first one I have seen, and I will watch all the other videos. I love system design as a way to become a better software developer.

@araj6920 6 months ago

It is right to use a socket for sending messages rather than an HTTP connection, which is stateless, but a heavy payload may break the socket connection. Even if we use streams or batches to send heavy payloads, it would be preferable to use a REST API for sending messages, with a socket alongside it to notify users in real time.

@KavitaChawla-q3n 10 months ago

Just a few thoughts; I wanted to understand a few points: 1. We can have a cache to hold the connection information for all WebSocket and user connections. 2. Why can the WebSocket manager not be used to hold the WebSocket-to-user connection data for 1:1 communication? 3. For unsent messages, can we have a queue so the service can stream them asynchronously?

@govindagarwal2389 7 months ago

Fantastic video. Finally I understand this topic. One question: why use the session service for the messaging service and the WebSocket manager for group messaging? The session info only says which service or system a user is connected to, and the WebSocket manager can only tell which WebSocket a user is connected to, but don't we need both for group messaging and single-user messaging? In any case, should we not need the same info for both group and single-user messaging? It would be great if you could help out.

@ByteMonk 7 months ago

Thank you! If I understand your question correctly: for single-user messaging, the Session Service is typically sufficient because you're dealing with a direct exchange between two users. The Session Service keeps track of who is logged in and facilitates the communication between them. For group messaging, while WebSocket managers are indeed used for the real-time communication aspect, you may still need the Session Service to handle authentication, authorization, and potentially other user-specific details. However, once the users are authenticated and authorized, the WebSocket manager can take over to manage the real-time exchange of messages among multiple users efficiently. In short, the Session Service handles user authentication and individual user sessions, while the WebSocket manager handles real-time message exchange, especially for group messaging, where broadcasting messages to multiple users is essential.

@InvincibleMan99 1 month ago

@ByteMonk Yes, well explained. Session Service: an auth service that manages authentication in a load-balanced environment. WebSocket manager: manages WebSocket connections and the transfer of data. But I think that could be managed by the messaging service itself?

@Adamskyization 1 month ago

This is the best explanation I could find online, and I see your other videos are pretty good too! Thank you!

@kejalkalyani1263 1 year ago

Great Video ... But, 1.5 Billion * 50 KB = 75 TB and not 75 PB

@koeber99 9 months ago

lol..... where did you get 1.5 billion from? 500M*3 = 1.5B .. 500M*30 is 15B!!

@vivekpaliwal1876 1 year ago

Please also make videos on the components used in system design, like WebSocket, what a NoSQL DB is, SQL DB, S3, Cassandra, Hadoop, Apache Spark, replicas, clusters, CDN, Redis cache, etc. We only need basic info, because these are common to every system design.

@ByteMonk 1 year ago

Please check out the description for the "System Design Interview Basics Playlist"; almost all of these topics are covered there and more are coming. Enjoy :)

@AnkitaNallana 4 months ago

This was succinct and actually dealt with the whole application (with the important features) really well! Thanks so much!!

@henrygengiti7861 8 months ago

How can one justify storing that huge volume of data in a relational DB? This can lead to many questions. I would suggest storing user metadata in an RDBMS but messages in a document store like MongoDB, which can scale better horizontally.

@nicolastreat3477 1 year ago

Thanks for making this video. The glaring issue I see with this design is in the back of the envelope estimation. Any good interviewer is going to stop you as soon as you say 75 PB / day. Unless you’re building the next Google cloud, this amount of storage is completely unreasonable, especially for a chat app. Estimations are not there to just impress the interviewer, they exist to influence the design. The right move would be to say, this amount of storage is likely unreasonable for a system such as this and thus, we will not be storing user messages in the cloud.

@ByteMonk 1 year ago

Thanks for the comment; what you mentioned is definitely an approach to consider. Most of the MAANG interviewers we know don't even care about those numbers, but it's always considered a good thing to show that the candidate is thinking along those lines. There are, however, occasions when interviewers tend to dive deep into the tech requirements, but that should come from the interviewer (not the candidate). So if a candidate says 75 PB/day, the interviewer can stop them and say they are not looking for WhatsApp/Messenger scale; on the other hand, if the candidate suggests or assumes that massive storage is not needed, the interviewer can still stop them and clarify that they are looking for high scale. Either way, it's not necessarily a bad thing to have that conversation. It is, however, better to refrain from spending too much time here.

@mandyvarel 1 year ago

@ByteMonk Would you please review your calculation? I think it's 75 TB instead of PB. Hence, it would be much more reasonable.

@JoshSmeda 1 year ago

@mandyvarel Yes, it's supposed to be TB.

@ByteMonk 1 year ago

@JoshSmeda Thanks for the correction, and sorry for the delayed response 🙏

@debasish2332 6 months ago

I bet I haven't heard anyone explain this better than you.

@InvincibleMan99 1 month ago

Excellent video, but so much is said in a very short time. If it had been explained in a bit more detail, it would have been awesome.

@rohitk2497 5 months ago

What is the difference between the session service and the WebSocket manager? Why does the group service use the WebSocket manager to figure out where to direct messages to reach the right users, while the messaging service gets this info from the session service?

@sejalkacharia3132 1 month ago

I have the same question. @ByteMonk, can you please explain this?

@amark7422 1 year ago

2:38 500M * 30 = 15,000M = 15B

@arashkoushkebaghi1432 13 days ago

This took me a long time to figure out. Fix the video at least. It's supposed to be educational, with correct material...

@frankiewong2215 9 months ago

Thank you so much for your content. Just a question: if a mobile device is not in use for a long time, or your messaging application has been killed by Android, then the WebSocket connection will fail. How can we prevent this for a messaging application?

@rahul6839 1 month ago

Excellent video. One doubt: why do we have a message queue like Kafka for group messages alone? What is the reason behind it?

@nknidhi321 8 months ago

Awesome

@jadeedstoresupport8916 1 year ago

Correction at 2:40: 500M x 30 = 15,000M = 15B

@kebman 1 month ago

E2E is not enough. You need Dynamic DHT as well. Every user is an ephemeral relay-node. You can't look this up, because it hasn't been made yet.

@Anton-zb9dc 1 year ago

Hey! How are you turning Excalidraw files into such a beautiful presentation? Have you found your own approach? Could you share any resources for creating videos that look similar to yours?

@ByteMonk 1 year ago

Thank you. It is time-consuming work to maintain the quality of what I want to say and sync it up with the visuals. I combine Photoshop, FCP, and Adobe products.

@Anton-zb9dc 1 year ago

@ByteMonk, looks awesome! I really enjoy open creators who respond in the comment section! Keep your content up!

@aungthura396 10 months ago

To implement this in Django, do I need to treat each service as a Django app? Or should I create separate projects for services like the messaging service or the group service?

@hackwithharsha 1 year ago

Thank you… You have a load balancer between the user machine and the WebSocket handler machine… How does the load balancer handle bi-directional connections like WebSockets?

@ByteMonk 1 year ago

Great question. As you probably know by now, the difference from a regular HTTP connection is that the WebSocket connection is meant to stay open. With WebSockets, the client asks to upgrade and your server replies with 101 Switching Protocols, confirming the switch to a WebSocket connection. So an HTTP connection becomes WS, and HTTPS becomes WSS. Modern load balancers can handle this upgrade from HTTP to WebSocket automatically, and once that happens, messages travel back and forth through a WebSocket tunnel. Here is an example with AWS ALB: aws.amazon.com/blogs/compute/using-websockets-and-load-balancers-part-two/

@hackwithharsha 1 year ago

@ByteMonk Got it, thank you!!
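To make the 101 Switching Protocols step in this thread concrete, here is a small sketch of the server side of the WebSocket opening handshake as defined in RFC 6455. The function names are illustrative; the GUID constant and the SHA-1 plus Base64 derivation come from the RFC. A load balancer that supports WebSockets forwards this exchange and then keeps the connection open.

```python
import base64
import hashlib

# Fixed GUID defined by RFC 6455 for computing Sec-WebSocket-Accept.
WEBSOCKET_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"


def websocket_accept(sec_websocket_key: str) -> str:
    """Derive the Sec-WebSocket-Accept value from the client's key."""
    digest = hashlib.sha1(
        (sec_websocket_key + WEBSOCKET_GUID).encode("ascii")
    ).digest()
    return base64.b64encode(digest).decode("ascii")


def handshake_response(sec_websocket_key: str) -> str:
    """Build the HTTP response that completes the upgrade."""
    return (
        "HTTP/1.1 101 Switching Protocols\r\n"
        "Upgrade: websocket\r\n"
        "Connection: Upgrade\r\n"
        f"Sec-WebSocket-Accept: {websocket_accept(sec_websocket_key)}\r\n"
        "\r\n"
    )


# Sample key taken from the RFC 6455 handshake example.
accept = websocket_accept("dGhlIHNhbXBsZSBub25jZQ==")
response = handshake_response("dGhlIHNhbXBsZSBub25jZQ==")
```

After this response the same TCP connection carries WebSocket frames in both directions, which is why the load balancer has to keep routing it to the same backend.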

@kishoreKumar-ee1eg 1 month ago

Can we consider it a multi-room chat project?

@samirpathak8580 1 year ago

Why do we need both the session service and the WebSocket manager? Can't the WebSocket manager locate the recipient for a one-on-one chat?

@mandyvarel 1 year ago

Great video! Thank you! A question on the Relay service: would you consider using a queue instead of a database for the unsent messages?

@ByteMonk 1 year ago

Thank you. To your question: it depends on how you are using "messages". Queues are typically used for loose coupling or fan-out, but you can use them here too, especially if you can show that it is operationally cost-effective. Thanks for the suggestion.

@koeber99 9 months ago

Wow, the numbers are definitely off for capacity planning!!! 1) 500M*3 = 1.5B .. 500M*30 is 15B!! 2) 1.5B * 50KB is 75 TB. 3) The average size of a photo is 1-2 MB and videos are definitely bigger, so 50KB is an underestimate for a message!!!!

@ByteMonk 9 months ago

Thank you for noticing and commenting here; it's definitely way off! That will, however, impact the autoscaling. I wouldn't change the system architecture much, except for treating a 2-person chat as a group, as pointed out by another user whose comment I have pinned. This video was posted a while back; we will make sure to cross-check the calculations before posting. Thanks again! 🙏
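The corrections in this thread can be checked with a few lines of arithmetic. The inputs (500M daily users, 30 messages per user, 50 KB per message) are the figures quoted in the comments; decimal units (1 KB = 1000 bytes) are an assumption made here for simplicity.

```python
DAILY_USERS = 500_000_000
MESSAGES_PER_USER = 30
AVG_MESSAGE_BYTES = 50_000   # 50 KB, decimal units assumed

# 500M * 30 = 15 billion messages per day, not 1.5 billion.
messages_per_day = DAILY_USERS * MESSAGES_PER_USER

# Daily storage with the corrected message count.
corrected_tb = messages_per_day * AVG_MESSAGE_BYTES / 1e12

# Even the 1.5 billion figure multiplies out to terabytes, not petabytes.
quoted_count_tb = 1_500_000_000 * AVG_MESSAGE_BYTES / 1e12
quoted_count_pb = quoted_count_tb / 1000
```

So 1.5B messages at 50 KB is 75 TB (0.075 PB), and the corrected 15B messages is 750 TB per day, still under a petabyte.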

@sunilk9760 4 months ago

We can use Firebase Realtime Database.

@m_t_t_ 1 year ago

What technology would you recommend for inter-service communication? For example, should the message service communicate with the session service using RPC, a private API, or something else?

@jwbonnett 1 year ago

A message bus. If you use other technologies like HTTP, gRPC, etc., you have to pass requests in a chained fashion, and that can cause huge problems. A message bus has natural load balancing and resilience to services being down.

@PadamAgrawal 1 year ago

Short and crystal clear

@deadlyecho 1 year ago

I am confused between the session service and the WebSocket manager. I thought that the session service is used to keep track of the user-to-server (socket) handler mappings. So, why do we need the socket manager?

@ByteMonk 1 year ago

The session service keeps track of the servers that users are connected to. A server can have multiple ports for WebSockets and (lightweight) WebSocket handlers, which need to be managed by a WebSocket manager. For further clarity, please check out my video on WebSockets in the system design playlist in the description above. Thanks.

@deadlyecho 1 year ago

@ByteMonk Thanks 😊 and keep up the good work 👏
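One way to read ByteMonk's reply in this thread is as a two-level lookup: the session service answers "which server is this user on?", and a per-server WebSocket manager answers "which handler on this server holds that user's connection?". The sketch below is a hypothetical in-memory illustration of that reading; the class and method names are not from the video.

```python
from typing import Callable, Optional


class SessionService:
    """Hypothetical global map: user_id -> server_id."""

    def __init__(self) -> None:
        self._server_by_user: dict[str, str] = {}

    def register(self, user_id: str, server_id: str) -> None:
        self._server_by_user[user_id] = server_id

    def lookup(self, user_id: str) -> Optional[str]:
        return self._server_by_user.get(user_id)


class WebSocketManager:
    """Hypothetical per-server map: user_id -> connection handler."""

    def __init__(self, server_id: str) -> None:
        self.server_id = server_id
        self._handlers: dict[str, Callable[[str], None]] = {}

    def attach(self, user_id: str, handler: Callable[[str], None]) -> None:
        self._handlers[user_id] = handler

    def deliver(self, user_id: str, message: str) -> bool:
        handler = self._handlers.get(user_id)
        if handler is None:
            return False
        handler(message)  # stands in for writing to the open WebSocket
        return True


def route(sessions: SessionService, managers: dict[str, WebSocketManager],
          recipient: str, message: str) -> bool:
    server_id = sessions.lookup(recipient)
    if server_id is None:
        return False  # offline: an unsent-message store would take over here
    return managers[server_id].deliver(recipient, message)


sessions = SessionService()
managers = {"s1": WebSocketManager("s1"), "s2": WebSocketManager("s2")}
bob_inbox: list[str] = []
sessions.register("bob", "s2")
managers["s2"].attach("bob", bob_inbox.append)
delivered = route(sessions, managers, "bob", "hi bob")
offline = route(sessions, managers, "carol", "hi carol")
```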

@eitannakash 1 year ago

Thanks for the great video. Can you explain why 50 KB for each message? Isn't that too big?

@ByteMonk 1 year ago

Yes, these are not hard/actual numbers, but given that a message can occasionally be a JPG or video file, 50 KB is assumed as the average. Your interviewer may stop and ask you, and you can adjust your capacity accordingly. No negative points as long as you are not way out of range; in fact, this can be a healthy discussion with your interviewer if they insist.

@PramodThakur4u 11 months ago

I have one query. If a user is offline, the message goes to the relay service, is stored in a queue, and is sent when the user comes online. But let's say the relay service is busy, the user is online, and they receive a new message from a different user. Meanwhile, the relay service processes the message from the queue and sends it to the user. If this happens, the order of the messages is not preserved. How do we prevent this?
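The video does not answer this question. One common approach, shown here purely as an assumption, is to have the server stamp each message in a conversation with an increasing sequence number when it accepts it, so the client can restore order no matter which path (live delivery or relay queue) a message arrives on. The `OrderedInbox` class is hypothetical.

```python
import heapq


class OrderedInbox:
    """Client-side reorder buffer keyed by a server-assigned sequence number."""

    def __init__(self, next_seq: int = 1) -> None:
        self._next_seq = next_seq
        self._pending: list[tuple[int, str]] = []  # min-heap of (seq, text)

    def receive(self, seq: int, text: str) -> list[str]:
        """Accept one message; return the messages that are now in order."""
        if seq >= self._next_seq:
            heapq.heappush(self._pending, (seq, text))
        ready: list[str] = []
        while self._pending and self._pending[0][0] <= self._next_seq:
            pending_seq, pending_text = heapq.heappop(self._pending)
            if pending_seq == self._next_seq:
                ready.append(pending_text)
                self._next_seq += 1
            # A smaller sequence number is a duplicate and is dropped.
        return ready


inbox = OrderedInbox()
first = inbox.receive(2, "sent while online")    # arrives early, held back
second = inbox.receive(1, "from relay queue")    # gap filled, both released
duplicate = inbox.receive(1, "from relay queue")
```

Across different conversations a per-conversation counter does not define an order, so a client would typically fall back to the server-side timestamp there.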

@fancyAlex1993 5 months ago

How do you get the server ID in the session service? Is it from the WebSocket?

@wil9861 1 year ago

What tool are you using for the architecture diagram?

@dhoneybeekingdom7889 1 year ago

In another comment they said they used FCP and Adobe products.

@ByteMonk 1 year ago

That's correct.

@satish-pokala 1 year ago

Great video!!

@mohamadbt4055 10 months ago

nice

@himanshubhandari7222 8 months ago

How will the messaging service send the message to a specific handler/server?

@VivekSingh-wr9vb 5 months ago

Is there any scope for a cache?

@dragonzhao433 1 year ago

It looks like we only talked about the data schema for the remote side, not the local one? How should we store and retrieve messages locally? And how do we ensure consistency between local and remote data?

@ByteMonk 1 year ago

Great question! To store and retrieve messages locally, you can use a local database on the client side. The most common options for this include SQLite, Realm, or Core Data. When it comes to ensuring consistency between local data and remote data, one approach is to implement a synchronization mechanism that keeps the data in sync between the client and server. Here are a few ways you could approach this: 1. Real-time synchronization: use WebSockets or push notifications to notify the client of new messages. This way, as soon as a new message arrives on the server, it can be immediately pushed to the client. This approach keeps the local data up to date. 2. Periodic synchronization: periodically check for new messages on the server and update the local database accordingly. This approach requires the client to poll the server for new messages. It can be less efficient than real-time synchronization, but it can be useful if you don't need real-time updates. 3. Conflict resolution: in case of conflicts between local and remote data, you need to define a conflict resolution strategy. For example, if a message was deleted on the server but not on the client, you need to decide which version of the message to keep. One approach could be to always favor the server-side data, or to prompt the user to choose which version to keep. This could be a follow-up question in an interview setting. I wouldn't expect a detailed design there; this would be my response.
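A minimal sketch of the periodic-sync and "server wins" ideas from the reply above, using Python's built-in `sqlite3` as the local store. The table layout, the shape of the change records, and the function names are assumptions made for illustration; fetching the changes from a real server is left out.

```python
import sqlite3


def open_local_store() -> sqlite3.Connection:
    """Create an in-memory local message store (a file path would persist it)."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE messages ("
        " id INTEGER PRIMARY KEY,"
        " body TEXT NOT NULL,"
        " deleted INTEGER NOT NULL DEFAULT 0)"
    )
    return db


def apply_server_changes(db: sqlite3.Connection, changes: list[dict]) -> None:
    """Apply a batch pulled from the server; the server copy always wins."""
    with db:  # one transaction per sync batch
        db.executemany(
            "INSERT OR REPLACE INTO messages (id, body, deleted) "
            "VALUES (:id, :body, :deleted)",
            changes,
        )


def visible_messages(db: sqlite3.Connection) -> list[tuple[int, str]]:
    rows = db.execute(
        "SELECT id, body FROM messages WHERE deleted = 0 ORDER BY id"
    )
    return rows.fetchall()


db = open_local_store()
# First poll: one message arrives.
apply_server_changes(db, [{"id": 1, "body": "hi", "deleted": 0}])
# Second poll: the server reports message 1 as deleted and adds message 2.
apply_server_changes(db, [
    {"id": 1, "body": "hi", "deleted": 1},
    {"id": 2, "body": "are you there?", "deleted": 0},
])
shown = visible_messages(db)
```

Real-time sync would call the same `apply_server_changes` from a WebSocket or push handler instead of from a polling timer.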

@dragonzhao433 1 year ago

@ByteMonk Thanks for your reply. One more question: we only store messages on the client, is that correct? I ask because I didn't see a table for storing messages on the remote side in the video.

@mukeshp5081 1 year ago

2:50 Sir, why are you dividing it by 3600? (What does 3600 mean?)

@ByteMonk 1 year ago

There are 24 hours in a day, 60 minutes in an hour, and 60 seconds in a minute. 3600 is the number of seconds in an hour. We want to know how many messages need to be processed per second, so I am dividing by 24 * 3600 = 86400 (the total number of seconds in a day).

@jadeedstoresupport8916 1 year ago

Considering it's an interview environment, we could have approximated the total number of seconds in a day as 100,000. That way we would avoid unduly taxing ourselves with precise calculations in an interview setting.
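For reference, the exact and rounded divisions from this thread side by side. The daily message count uses the corrected 15 billion figure from other comments (500M users times 30 messages), which is an assumption about which input to plug in.

```python
SECONDS_PER_DAY = 24 * 3600      # 24 hours * 3600 seconds per hour = 86400
ROUGH_SECONDS_PER_DAY = 100_000  # interview-friendly approximation

messages_per_day = 500_000_000 * 30   # corrected figure: 15 billion

exact_rate = messages_per_day / SECONDS_PER_DAY        # about 173,611 per second
rough_rate = messages_per_day / ROUGH_SECONDS_PER_DAY  # 150,000 per second
```

The shortcut gives the same order of magnitude, which is usually all a back-of-the-envelope estimate needs.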

@derrickmugerwagiles1809 1 year ago

I loved it ❤❤

@piyushjaiswal8993 1 year ago

Great Video! Clear and to the point explanation.

@ashishbhardwaj9760 1 year ago

I presented this same system design in an interview and the interviewer was impressed. Thanks, @ByteMonk!

@ByteMonk 1 year ago

Awesome!

@manoelramon8283 1 year ago

What do you use to create these animations? This video is awesome.

@ByteMonk 1 year ago

Thank you, I am using FCP and Adobe products.

@manoelramon8283 1 year ago

Can we say the WebSocket manager can be ZooKeeper?

@ByteMonk 1 year ago

A WebSocket manager provides a way for the server to push data to the client in real time and enables bidirectional communication between the two endpoints; it is typically used in web applications that require real-time updates or messaging. ZooKeeper, on the other hand, is a distributed coordination service that provides a centralized repository for storing configuration information and synchronization data across a distributed system. I would prefer the WebSocket manager terminology.

@m_t_t_ 1 year ago

Did you forget to make it end-to-end encrypted? I think you will need more API endpoints to implement the Signal algorithm.

@ByteMonk 1 year ago

I left it out intentionally. Implementing end-to-end encryption is a complex task, and it's crucial to thoroughly test and review your implementation. Implementing end-to-end encryption in a chat application similar to WhatsApp involves using a cryptographic protocol such as the Signal Protocol, which is known for its strong security and privacy features. To implement the Signal Protocol or a similar algorithm, you'll need to consider: 1. Implementing the cryptographic primitives: you'll need to include libraries or implement cryptographic functions for key generation, key exchange, symmetric encryption, and decryption. 2. Managing user keys: develop a system to generate and securely store the encryption keys for each user. These keys should be protected using strong encryption and access controls. 3. Implementing key exchange. 4. Session management. 5. Encrypting and decrypting messages. Regarding API endpoints, while the Signal Protocol itself doesn't necessarily require additional API endpoints, you may need to develop APIs for managing user registration, key exchange, and message transmission. These APIs would handle tasks such as user authentication, key storage/retrieval, and message delivery. The specific endpoints required will depend on your application's architecture and requirements.

@m_t_t_ 1 year ago

@ByteMonk That's a brilliant explanation. I implemented the Signal algorithm using no libraries for my high school computer science project, but didn't have time to fully implement it (QR codes to compare identity keys, group messaging). I'm not proud of it, though, because I didn't even know about transactions back then, so it can be broken easily; it's not distributed, it was implemented using pure TCP sockets with an API built on top of that, and it was designed to be monolithic.

@ByteMonk 1 year ago

@m_t_t_ While your high school project may not have been perfect, it's commendable that you tackled such a challenging task. Take this experience as an opportunity to learn and improve your skills in building secure and scalable systems. Recognizing the limitations of your initial implementation is an important step towards improvement. Here are a few suggestions on how you could enhance your implementation: 1. Use secure communication channels: instead of relying solely on TCP sockets, consider implementing your solution using secure communication protocols like TLS or HTTPS. 2. Implement secure key exchange: in addition to QR codes, consider incorporating a secure key exchange mechanism such as the Diffie-Hellman key exchange protocol. 3. Group messaging: extend your implementation to support secure group messaging. This involves managing group memberships, handling encryption and decryption for multiple participants, and ensuring secure message distribution within the group. 4. Consider a distributed architecture: to enhance scalability and reliability, consider moving away from a monolithic architecture and explore a distributed system design. This could involve using technologies like distributed databases, load balancing, and fault-tolerant infrastructure. Check out my full system design playlist in the description and share it with your friends if you found this helpful :)
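To illustrate the Diffie-Hellman idea mentioned in this thread, here is a toy sketch using only the Python standard library. The group parameters (`P`, `G`) and helper names are chosen for illustration and are far too weak for real use; a production system would rely on a vetted library and an elliptic-curve exchange rather than this code.

```python
import hashlib
import secrets

# Toy parameters: 2**127 - 1 is prime, but much too small to be secure.
P = 2**127 - 1
G = 3


def generate_keypair() -> tuple[int, int]:
    """Return (private, public) where public = G^private mod P."""
    private = secrets.randbelow(P - 3) + 2   # uniform in [2, P - 2]
    public = pow(G, private, P)
    return private, public


def shared_key(own_private: int, peer_public: int) -> bytes:
    """Derive a 32-byte key from the shared secret peer_public^own_private mod P."""
    secret = pow(peer_public, own_private, P)
    return hashlib.sha256(secret.to_bytes(16, "big")).digest()


# Each side keeps its private value and sends only the public value.
alice_private, alice_public = generate_keypair()
bob_private, bob_public = generate_keypair()

alice_key = shared_key(alice_private, bob_public)
bob_key = shared_key(bob_private, alice_public)
```

Both sides end up with the same key without it ever crossing the network. Comparing identity keys (for example via QR codes) addresses a separate problem: confirming that the public value really came from the intended peer.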

@Test-hq4jq 11 months ago

Can someone help me understand the start phase: having WebSocket handlers for each user, and the use of the session service? Are all users on separate servers? What is the architecture for the WebSocket connection? Please point to any other available resources if possible, thanks.

@arturodejongh4641 8 months ago

I don't think he meant every user is on a separate server. There are N users and M servers. Every server can handle one or more WebSocket connections, one per user.

@Glashutte1900 1 year ago

The biggest issue we have is when the app is in a killed state, so messages won't be delivered. Then we say, let's use FCM for that. May I ask, is there any way we can create our own system like FCM? Let's say we don't trust Google with data handling and want to create our own service. I will be looking forward to your reply.

@ByteMonk 1 year ago

It is possible to create your own messaging system similar to FCM (Firebase Cloud Messaging) for handling message delivery in a scenario where the app is in a "kill state." However, building your own messaging system can be complex and resource-intensive, so it's important to carefully consider the trade-offs involved. You would need to set up a reliable and scalable infrastructure to handle the messaging service. This would involve managing servers, databases, network infrastructure, and other components necessary for message storage and delivery. You would also need to implement a message queuing system to handle the delivery of messages. When a message is sent while the app is in a "kill state," it can be stored in a message queue for later retrieval and delivery to the intended recipient.

@Glashutte1900 1 year ago

@ByteMonk Thanks for the reply, I am very grateful. We did try that, but the biggest problem we faced was that we were unable to wake the app on Android and iOS.

@ByteMonk 1 year ago

@Glashutte1900 Waking the app on Android and iOS when it is in a "kill state" can be challenging due to the operating systems' restrictions and limitations. Both platforms have specific guidelines and restrictions on how apps can be woken up or run in the background. In Android, you can use background services to perform certain tasks even when the app is not actively running. However, starting from Android 8.0 (API level 26), there are limitations on background execution to improve battery life and performance. Background services may be subject to restrictions and may be terminated by the system if deemed necessary. iOS provides specific background modes that allow apps to perform certain tasks in the background. However, Apple imposes restrictions on the use of background modes to preserve battery life and protect user privacy. You need to carefully review and comply with Apple's guidelines for background execution.

@Glashutte1900 1 year ago

@ByteMonk You are absolutely right 👍 The only solution, it seems, is to build something similar to FCM, but again, FCM is a Google product and we don't know what facilities they already have that we cannot access.