Multiple Producers and Multiple Consumers in a Kafka Topic: A Beginner’s Guide: Part 3

Manishankar Jaiswal
4 min read · Sep 10, 2024


Apache Kafka is a powerful distributed streaming platform that allows multiple producers and consumers to interact with data in real-time. This makes Kafka ideal for use cases where you need to process large amounts of data efficiently and in parallel. In this blog post, we’ll explore how Kafka enables multiple producers and multiple consumers to work seamlessly with a single topic.

Multiple Producers and Multiple Consumers in a Kafka Topic

We’ll start by understanding the basic concepts, and then dive into how to implement multiple producers and consumers in Kafka using command-line tools. Whether you’re new to Kafka or have some experience, this guide will help you get started with this powerful messaging system.

Understanding Kafka’s Architecture

What is a Kafka Topic?

A Kafka topic is a logical channel where data is published by producers and consumed by consumers. Think of a topic as a category or feed name to which records (messages) are sent by producers. Kafka stores data in topics, and consumers subscribe to these topics to read the data.

Partitions in a Kafka Topic

Each Kafka topic is divided into partitions. Partitions allow Kafka to scale horizontally by distributing the data across multiple brokers. This ensures that Kafka can handle large amounts of data by spreading the load.

Each partition in a topic is an ordered, immutable sequence of records. Producers write data to partitions, and consumers read data from them. By dividing a topic into multiple partitions, Kafka allows multiple producers and consumers to operate in parallel.
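To make the partition idea concrete, here is a small illustrative sketch of how a key-based partitioner maps records to partitions. Kafka's actual default partitioner uses a murmur2 hash of the key bytes; this sketch substitutes a simple CRC32 hash purely for clarity, but the principle is the same: the same key always lands on the same partition, which preserves per-key ordering.

```python
# Illustrative sketch (not Kafka's real murmur2 partitioner): map a
# record key to one of a fixed number of partitions.
import zlib

NUM_PARTITIONS = 3

def pick_partition(key: str, num_partitions: int = NUM_PARTITIONS) -> int:
    # crc32 is deterministic across runs, unlike Python's built-in hash()
    return zlib.crc32(key.encode("utf-8")) % num_partitions

# Records sharing a key always go to the same partition.
for key in ["order-1", "order-2", "order-1"]:
    print(key, "-> partition", pick_partition(key))
```

Because the mapping is a pure function of the key, two records with key `order-1` are guaranteed to end up on the same partition, no matter which producer sent them.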

Producers and Consumers

  • Producers: Producers are clients that send data to Kafka topics. Multiple producers can write to the same topic, which allows for flexibility and scalability in data ingestion.
  • Consumers: Consumers are clients that read data from Kafka topics. Multiple consumers can read from the same topic, and they can be grouped together in a consumer group to enable parallel data processing.
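A Kafka partition behaves much like an append-only list: any producer can append to the end, and each record gets an offset. The following toy sketch (an in-memory stand-in, not real Kafka client code) shows several producers writing to one shared topic log.

```python
# Toy model of a single-partition topic: an append-only log that
# multiple producers write to. Offsets are just list positions.
topic_log: list[str] = []

def produce(producer_name: str, message: str) -> int:
    """Append a record to the shared log and return its offset."""
    topic_log.append(f"{producer_name}: {message}")
    return len(topic_log) - 1

# Three producers write to the same topic.
for name in ["producer-1", "producer-2", "producer-3"]:
    offset = produce(name, "hello")
    print(name, "wrote at offset", offset)
```

The key point: producers do not coordinate with each other. Each `produce` call simply appends, and the log's order reflects arrival order.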

Now that we have a basic understanding of Kafka’s architecture, let’s dive into how to set up multiple producers and consumers in a Kafka topic.

Why Multiple Producers and Consumers?

In real-world applications, you often need to handle large volumes of data from different sources. For example, in a financial application, you might have multiple producers sending stock market data to a Kafka topic. On the other end, you could have multiple consumers processing that data in parallel for different purposes, such as analytics, trading decisions, and data storage.

Using multiple producers and consumers allows you to:

  • Scale: Handle large volumes of data by distributing the workload.
  • Increase throughput: Process data faster by running tasks in parallel.
  • Ensure reliability: If one producer or consumer fails, others can continue operating without interruption.

Setting Up Multiple Producers and Consumers in Kafka

Let’s go through the steps to set up multiple producers and consumers for a Kafka topic.

Step 1: Start Kafka and ZooKeeper

First, ensure that Kafka and ZooKeeper are running. You can start them with the following commands (assuming you have Kafka installed):

Start ZooKeeper:

bin/zookeeper-server-start.sh config/zookeeper.properties

Start the Kafka broker:

bin/kafka-server-start.sh config/server.properties

Step 2: Create a Kafka Topic with Multiple Partitions

To enable multiple producers and consumers to work in parallel, create a topic with multiple partitions:

bin/kafka-topics.sh --create --topic my-multi-topic --bootstrap-server localhost:9092 --partitions 3 --replication-factor 1

Here, we create a topic named my-multi-topic with 3 partitions.

Step 3: Start Multiple Producers

You can run multiple producers in parallel to send data to the same topic. Open multiple terminal windows and run the following command in each window:

bin/kafka-console-producer.sh --topic my-multi-topic --bootstrap-server localhost:9092

Now, each producer can send messages to the my-multi-topic topic. For example:

Producer 1:

>Message from Producer 1

Producer 2:

>Message from Producer 2

Producer 3:

>Message from Producer 3

Each of these producers will send messages to the my-multi-topic topic, and Kafka will distribute the messages across the 3 partitions.
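How does Kafka choose a partition when the console producer sends messages without keys? Older clients round-robin keyless records across partitions, while newer clients use a batch-oriented "sticky" strategy. The sketch below models the simple round-robin case to show how three keyless messages spread across three partitions.

```python
# Illustrative sketch of round-robin assignment for keyless messages.
# (Newer Kafka clients actually use a "sticky" per-batch strategy; this
# models the simpler classic round-robin behavior.)
from itertools import cycle

NUM_PARTITIONS = 3
partitions = {p: [] for p in range(NUM_PARTITIONS)}
next_partition = cycle(range(NUM_PARTITIONS))

messages = [
    "Message from Producer 1",
    "Message from Producer 2",
    "Message from Producer 3",
]
for msg in messages:
    partitions[next(next_partition)].append(msg)

print(partitions)
```

Within each partition, arrival order is preserved; across partitions, Kafka makes no ordering guarantee, which is why per-key ordering requires keyed messages.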

Step 4: Start Multiple Consumers

Now, let’s start multiple consumers, each of which will read messages independently from the Kafka topic. Since each console consumer runs in its own auto-generated consumer group rather than sharing one, every consumer will read all messages from all partitions.

Open multiple terminal windows and run the following command in each window:

bin/kafka-console-consumer.sh --topic my-multi-topic --bootstrap-server localhost:9092 --from-beginning

The --from-beginning flag ensures that the consumer reads all messages from the start of the topic.

How Independent Consumers Work

In this setup, each consumer reads all messages from all partitions. This means:

  • No Partition Ownership: Unlike consumers that share a group, each consumer reads from every partition of the topic rather than being assigned a subset.
  • Duplicate Processing: Because each consumer reads the full stream independently, every message is processed by every consumer.

For example:

Consumer 1:

>Message from Producer 1
>Message from Producer 2
>Message from Producer 3

Consumer 2:

>Message from Producer 1
>Message from Producer 2
>Message from Producer 3

This setup is useful when you need all consumers to process all messages, such as in scenarios where each consumer sends data to a different system or performs a different task on the same data.
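The behavior above can be sketched with a toy model: each independent consumer keeps its own offset into the same log, so all of them see every message. This mirrors running console consumers without a shared group, each starting from the beginning.

```python
# Toy model: independent consumers over one shared partition log.
# Each consumer tracks its own offset, so every consumer reads every
# message -- like console consumers run without a shared group.
log = [
    "Message from Producer 1",
    "Message from Producer 2",
    "Message from Producer 3",
]

class IndependentConsumer:
    def __init__(self, name: str):
        self.name = name
        self.offset = 0  # each consumer starts "from the beginning"

    def poll(self) -> list[str]:
        records = log[self.offset:]
        self.offset = len(log)  # advance past everything just read
        return records

c1 = IndependentConsumer("consumer-1")
c2 = IndependentConsumer("consumer-2")
print(c1.name, c1.poll())  # all three messages
print(c2.name, c2.poll())  # the same three messages, independently
```

Contrast this with a consumer group, where partitions are divided among members and each message is delivered to only one consumer in the group.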

Step 5: Monitor the Consumers

You can monitor the logs in each consumer terminal to ensure that they are reading all the messages from the topic. Since each consumer is independent, they should all display the same messages.

Conclusion

In this guide, we demonstrated how to set up multiple producers and multiple consumers in a Kafka topic without using consumer groups. This configuration is beneficial when you need all consumers to process the entire stream of data independently. Whether you’re building a system for data replication, logging, or real-time analytics, Kafka’s flexibility with producers and consumers allows you to handle data efficiently.

If you have any questions or run into issues, feel free to leave a comment below. Happy streaming with Kafka!
