► What should I test next? ► AWS is expensive - Infra Support Fund: buymeacoffee.com/antonputra ► Benchmarks: ru-vid.com/group/PLiMWaCMwGJXmcDLvMQeORJ-j_jayKaLVn&si=p-UOaVM_6_SFx52H
Great video! Could you do a comparison between Go and Swift? They're both compiled languages that can be used for backend, but Go is super minimalistic, while Swift feels more like Rust with its rich feature set. Would love to see how they stack up against each other.
My company uses RabbitMQ as our communication backbone for an IoT-type deployment right now, so this is actually super interesting to see. Also, I might have a PR: it looks like you're instantiating a connection for every RabbitMQ consumer, when Rabbit generally prefers that you use only one connection and multiple channels to dole out multiple logical connections to the broker. There might be performance to gain or some wasted CPU resources there, depending on how many individual consumers you're actually constructing.
I don't know about RabbitMQ streams, but the catch with things like Redis streams and even NATS is that you can process things out of order on the same partition or message subject if you have multiple consumers, which makes it a non-starter for a lot of projects. As always, thanks for making these videos :)
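A minimal Go sketch of the ordering point above, for anyone who wants to see it concretely. It assumes the usual workaround: route every message with the same key to the same partition and give each partition exactly one consumer. `Message`, `partitionFor`, and `processInOrder` are made-up names for illustration, and the FNV hash is just a convenient choice, not what any particular broker uses.

```go
package main

import (
	"fmt"
	"hash/fnv"
	"sync"
)

// Message is a hypothetical event; Key groups events that must stay in order.
type Message struct {
	Key string
	Seq int
}

// partitionFor maps a key to one of n partitions with an FNV-1a hash
// (an illustrative choice, not the hash any particular broker uses).
func partitionFor(key string, n int) int {
	h := fnv.New32a()
	h.Write([]byte(key))
	return int(h.Sum32() % uint32(n))
}

// processInOrder fans messages out to one goroutine per partition and
// returns, per key, the sequence numbers in the order they were handled.
func processInOrder(msgs []Message, partitions int) map[string][]int {
	chans := make([]chan Message, partitions)
	seen := make(map[string][]int)
	var mu sync.Mutex
	var wg sync.WaitGroup

	for i := range chans {
		chans[i] = make(chan Message, len(msgs))
		wg.Add(1)
		// Exactly one consumer per partition: it reads its channel in FIFO order.
		go func(in <-chan Message) {
			defer wg.Done()
			for m := range in {
				mu.Lock()
				seen[m.Key] = append(seen[m.Key], m.Seq)
				mu.Unlock()
			}
		}(chans[i])
	}

	// The same key always lands on the same partition.
	for _, m := range msgs {
		chans[partitionFor(m.Key, partitions)] <- m
	}
	for _, c := range chans {
		close(c)
	}
	wg.Wait()
	return seen
}

func main() {
	msgs := []Message{{"acct-1", 1}, {"acct-2", 1}, {"acct-1", 2}, {"acct-2", 2}, {"acct-1", 3}}
	fmt.Println(processInOrder(msgs, 4))
}
```

Order across different keys is still not guaranteed here, only order within a key. If you attach a second consumer to the same partition channel, the per-key guarantee goes away too, which is the failure mode described above.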
Thank you for the videos, very useful. Would love to see some kind of WebSocket benchmark for number of connections and throughput with Go, Rust, JS, and Erlang/Elixir.
It would also be excellent to see a comparison using production best practices: a replication factor of 3 and a min ISR of 2. Not sure if RabbitMQ streams have an analogue, but it would be really interesting if so.
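For reference, a rough Go sketch of what that setting buys you on the Kafka side. It assumes producers use acks=all and that every live replica is in sync, which is a simplification. `acceptsWrites` is a made-up helper for illustration, not a Kafka API.

```go
package main

import "fmt"

// acceptsWrites reports whether a partition still accepts acks=all writes
// when brokersDown of its replicas are offline. It assumes every live
// replica is in sync, which is a simplification.
func acceptsWrites(replicationFactor, minISR, brokersDown int) bool {
	inSync := replicationFactor - brokersDown
	return inSync > 0 && inSync >= minISR
}

func main() {
	for down := 0; down <= 3; down++ {
		fmt.Printf("replicas down=%d writable=%v\n", down, acceptsWrites(3, 2, down))
	}
}
```

With a replication factor of 3 and min.insync.replicas=2, one replica can be down and writes still go through; with two down, acks=all producers are rejected rather than risking data on a single copy.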
It would be interesting to see how these two cope with slow consumers or consumer outages. One of the advantages of Kafka or Redpanda is the ability to accommodate differences in speed of processing between producer and consumer.
What you have in Kafka, which is quite essential for many due to security, is the append-only and immutable logs; the fact that they are stored to disk also retains the documents even in case of a crash. So for a banking system etc. it is very important that you know the last transaction and that it isn't lost. RabbitMQ is more for less serious workloads, maybe in a web application backend, but nothing I would use for anything that needs security.
This channel keeps getting better and better! Kinda my breakfast companion at this time. A bit curious, do you have Indonesian or Southeast Asian parents? Asking because of the "Putra" last name.
So, I do feel your comparison and wording are a bit misleading. Since both can be clustered, it's not a true apples-to-apples comparison. We do know Kafka can push more due to the way it is designed. Rabbit can handle 50k msgs per second on a single node; I've tested and seen that. But it also depends on the node specs. However, when you cluster, which is what most people would do (HA, reliability and scalability), then we can see a really good test. I do get having the single nodes, but that should really be said, since it's not Rabbit vs Kafka, it's a single Rabbit node and a single Kafka node vs each other, and both are designed to be clustered.
IME, RabbitMQ is a better choice when your messages are spread over a large number of queues/topics to which many distinct clients subscribe. It can't beat Kafka on throughput on a single queue on a single node, but IME Kafka can, in principle, manage many topics, yet doesn't like it: it may unexpectedly crash nodes. If you scale your RabbitMQ cluster dynamically, based on node load, you can run a humongous number of queues shared by an equally humongous number of clients with zero problems. Then again, my experience with both is a bit dated.
Did you bind the stream into a large-queue setup or something else? I mean the kind of topology configuration that could affect the test. It seems weird, as we easily handled a 1kk (1M) message workload on RabbitMQ with a binary protocol client, and there was room to get more. All the results are reasonable and expected (Kafka definitely has maturity in stream processing), but the numbers should be higher.
Not sure this tells you very much. Kafka and RabbitMQ guarantees are very different. Sure, if you are OK to lose data by holding it in memory, go for RabbitMQ. If you can't, then RabbitMQ will be a dumb choice. While it is interesting to see what happens when CPU usage reaches 100%, it's not a safe place to be.
Could you compare redwood and Kafka and all the others and give them points on specific metrics (like throughput, from 0 to 10), and also a general winner when we add up all the points from all metrics? This could be an interesting form of benchmark.
@AntonPutra Of course. What I meant was that Kafka in cluster mode can perform much better than RabbitMQ in cluster mode. You're right, it's a different case.
Comparing performance to decide which one is better is not really suitable, I think. It makes more sense for a solution architect or developer to decide which broker fits the specific project requirements, and, once you have chosen one, to know at what point you need to scale and to estimate the additional cost.
Nice comparison. But now I'm really curious about how the topic(s) are created in Kafka. I found num.partitions=1 in the Kafka config; if a topic is created without an explicit number of partitions, it basically means a single thread for the producer/consumer.
I have a topic and a single partition per producer/consumer pair just for the test - github.com/antonputra/tutorials/blob/main/lessons/218/client/kafka/producer.go#L31
Now you've got me wanting to write a message broker :/ I'm guessing there are a ton of tricks to improve stability and throughput which would be fun to explore.
Why do you test RabbitMQ with the option to keep messages in memory while Kafka writes to disk? That is not fair. Can you test RabbitMQ (Amazon MQ in AWS) configured in lazy mode (written to disk)?
@AntonPutra As far as I know, a Kafka cluster is active-active, meaning load and data are distributed between its nodes according to topic partitions, and each partition has a different leader in the cluster. A RabbitMQ cluster is meant for HA (active-passive): there is a replica of the master node, so if something bad happens to the master node the replica takes its place. (Maybe I'm wrong, but if I'm correct then a Kafka cluster is way stronger than a RabbitMQ cluster.)
Unfortunately this test is flawed: it does not show what the use case is. 99% of the time you'll use the two for different use cases, not for performance itself. If you use Kafka/RabbitMQ purely for performance, simply sending/receiving, then you are probably doing it wrong. As a result, this is essentially comparing apples to oranges. RabbitMQ has super streams, introduced not very long ago, which are close to Kafka in how they work, but it is still not Kafka: it does not have Kafka's guarantees and replication of data (replication is random and can't be changed by anything or by key). Moreover, Rabbit and Kafka can both be tuned for performance, but still, the use case is king here, and you will probably sacrifice some capabilities that you need.
The requests per second ramp too quickly. I think that once you hit 100% CPU and the latency is building up, all you're doing is increasing the size of an internal queue. You can see that most clearly with RabbitMQ, where the RAM consumed just shoots through the roof: your requests per second are being fired off without consideration of the response. Once you are sending more requests than can be answered in 1 second, the completed requests per second should stop going up, even if you are sending more.
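To put numbers on that, here is a toy Go simulation of an open-loop load test. It is only a sketch under made-up assumptions: the send rate ramps by a fixed step every second, the broker finishes at most a fixed number of requests per second, and nothing is ever dropped. `Tick` and `simulate` are hypothetical names, not part of the benchmark code.

```go
package main

import "fmt"

// Tick records one simulated second of an open-loop load test.
type Tick struct {
	Sent      int // requests fired this second
	Completed int // requests the broker finished this second
	Backlog   int // requests still queued at the end of the second
}

// simulate ramps the send rate by step every second against a broker
// that can finish at most capacity requests per second.
func simulate(seconds, step, capacity int) []Tick {
	ticks := make([]Tick, 0, seconds)
	backlog := 0
	for s := 1; s <= seconds; s++ {
		sent := s * step
		backlog += sent

		// The broker drains the queue, but never faster than its capacity.
		done := backlog
		if done > capacity {
			done = capacity
		}
		backlog -= done

		ticks = append(ticks, Tick{Sent: sent, Completed: done, Backlog: backlog})
	}
	return ticks
}

func main() {
	for i, t := range simulate(6, 1000, 3000) {
		fmt.Printf("t=%ds sent=%d completed=%d backlog=%d\n", i+1, t.Sent, t.Completed, t.Backlog)
	}
}
```

With a step of 1000 and a capacity of 3000, completed requests flatten at 3000 per second from the third second on, while the backlog grows to 1000, 3000, and then 6000. That growing backlog is the internal queue eating memory.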
Kafka is architected to push the messages to disk pretty much instantly and attempts to use the kernel's page cache instead of a lot of internal caching. The result is that it can ingest an insane amount of data per node. The CPU in this test comes more from the consumers, so as the CPU maxes out you can see that latency increases. The other significant design choice that helps it continue to operate at higher volumes is that both producer and consumer batch messages by default. So a producer isn't ack'ing single messages, but instead it's ack'ing a batch of messages. So even at max CPU with higher latency, throughput keeps going up because the batch sizes continue to climb.
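A small Go sketch of why batching helps, assuming one acknowledgement per batch as described above. `batch` is a made-up helper, not the Kafka client's API, and real producers also cap batches by bytes and linger time, which this ignores.

```go
package main

import "fmt"

// batch splits msgs into groups of at most size messages, the way a
// batching producer would group records before sending them.
func batch(msgs []string, size int) [][]string {
	if size < 1 {
		size = 1
	}
	var out [][]string
	for len(msgs) > 0 {
		n := size
		if n > len(msgs) {
			n = len(msgs)
		}
		out = append(out, msgs[:n])
		msgs = msgs[n:]
	}
	return out
}

func main() {
	msgs := make([]string, 10000)
	for _, size := range []int{1, 100, 1000} {
		// One ack per batch, so the ack count is the number of batches.
		fmt.Printf("batch size %d -> %d acks\n", size, len(batch(msgs, size)))
	}
}
```

For 10,000 messages that is 10,000 round trips at batch size 1, 100 at batch size 100, and 10 at batch size 1000, so the per-message overhead shrinks as batches grow under load.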
No one should use either of these if they care about performance, and since you don't care about performance you'll probably use Kafka anyway. Those who want performance will use Aeron UDP (multicast/broadcast) for pub/sub and persist in/out queues with Archive or Chronicle Queue... and if you do, it's not even comparable to Kafka: Chronicle alone was 750x faster and had consistent latency, and I would guess Aeron is even better.