We are seeing increasing numbers of enterprise projects where data is produced, consumed, analyzed, and reacted to in real-time. In this way, the technology becomes aware of what’s going on inside and around it—making pragmatic, tactical decisions on its own. We see this being played out in transportation, telephony, healthcare, security and law enforcement, finance, manufacturing, and in most sectors of every industry.
Prior to this evolution, insights were derived from the data long after the event that produced it had passed. Now we can use technology to capture, analyze, and take action based on what is going on in the moment.
This category of data is known by several names: streaming, messaging, live feeds, real-time, and event-driven. In the streaming data and message queuing technology space, there are a number of popular technologies in use, including Apache Kafka and Apache Pulsar™.
In January, DataStax, known for its commercial support, software, and cloud database-as-a-service for Apache Cassandra™, launched a new line of business for data streaming called Luna Streaming. DataStax Luna Streaming is a subscription service based on open-source Apache Pulsar. In April, DataStax launched a private beta for streaming Pulsar as a service to target data engineers, software engineers, and enterprise architects.
We recently ran a performance test comparing Luna Streaming (Pulsar) and Kafka clusters with Kubernetes. We wanted to see if the inherent architectural benefits of Pulsar (tiered storage, decoupled compute and storage, multitenancy) enabled an efficient architecture that yields tangible performance benefits in real-world scenarios.
We deployed a Kubernetes cluster onto Amazon Web Services EC2 instances and used the OpenMessaging Benchmark (OMB) test harness to conduct our evaluation. We worked with the Confluent fork of the OpenMessaging Benchmark on GitHub. We used the same instance types for the Kafka brokers and for the co-located Pulsar brokers and BookKeeper nodes, so that both platforms could take advantage of the two large (2.5TB), fast, locally attached NVMe solid-state drives.
For Kafka, we spanned the persistent volume storage across both disks. For Pulsar, we created persistent volumes and used one of the local drives for the BookKeeper ledger and the other for the ranges. For the BookKeeper journal, we provisioned a 100GB gp3 AWS Elastic Block Store (EBS) volume with 4,000 IOPS and 1,000 MB/s throughput. Other than this storage configuration, we performed no specific tuning of either platform, preferring to go with their “out-of-the-box” configurations as deployed via their respective Docker images and Helm charts.
Our performance testing revealed Luna Streaming had a higher average throughput in all the OMB testing workloads we performed. In terms of broker node equivalence, we found:
3 Luna Streaming nodes ≈ 5 Kafka nodes
6 Luna Streaming nodes ≈ 8 Kafka nodes
9 Luna Streaming nodes ≈ 14 Kafka nodes
We assumed simple linear growth of an enterprise’s streaming data needs over a three-year period—a “small” cluster (3x Luna Streaming or 5x Kafka) in Year 1, a “medium” in Year 2 (6x Luna Streaming or 8x Kafka), and a “large” (9x Luna Streaming or 14x Kafka) in Year 3. Using the node equivalences found in our testing above, this would result in a 33% savings in infrastructure costs by using Luna Streaming instead of Kafka.
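The 33% figure can be reproduced with a quick back-of-the-envelope calculation. The sketch below is illustrative only and is not taken from the report: it assumes every node costs the same per year (both platforms ran on the same instance types) and ignores any other cost differences, such as the extra EBS journal volume. The names LUNA_NODES, KAFKA_NODES, and savings_fraction are hypothetical.

```python
# Node counts per year, taken from the broker node equivalences above.
LUNA_NODES = {"year 1": 3, "year 2": 6, "year 3": 9}
KAFKA_NODES = {"year 1": 5, "year 2": 8, "year 3": 14}


def savings_fraction(baseline_nodes, alternative_nodes):
    """Fraction of node-years saved by the alternative.

    Assumption: each node costs the same per year on both platforms,
    so cost is proportional to the total number of node-years.
    """
    baseline_total = sum(baseline_nodes.values())
    alternative_total = sum(alternative_nodes.values())
    return 1 - alternative_total / baseline_total


if __name__ == "__main__":
    saving = savings_fraction(KAFKA_NODES, LUNA_NODES)
    print(f"Kafka node-years: {sum(KAFKA_NODES.values())}")          # 27
    print(f"Luna Streaming node-years: {sum(LUNA_NODES.values())}")  # 18
    print(f"Infrastructure savings: {saving:.0%}")                   # 33%
```

Over the three years, Kafka needs 27 node-years against 18 for Luna Streaming, which is where a savings of roughly one third comes from under these assumptions.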
In a second cost scenario, focused on “peak period” workloads, we found savings of around 50%, depending on the percentage of time the peak periods last.
For our third cost scenario, we focused on projects that may have significant complexity but limited raw throughput requirements, resulting in an organizational environment that mandates a high number of topics and partitions to handle the wide range of needs across the entire enterprise. In this case, we found infrastructure savings of 75% using Luna Streaming over Kafka.
You can download the report, with a complete description of the tests and implications of the results, here.