2. Key Concepts
1. Event (Message):
o A unit of data written to Kafka (like a log entry or JSON record).
o Example: "User A purchased item X."
2. Producer:
o Sends events (messages) into Kafka topics.
3. Consumer:
o Reads events from Kafka topics.
4. Topics:
o A named category in Kafka where messages are stored.
o Example: A topic called purchases stores all purchase events.
5. Partitions:
o Each topic is split into partitions for scalability.
o Messages in partitions are ordered.
6. Offset:
o A unique number identifying the position of a message in a partition.
7. Broker:
o A Kafka server that stores data and serves client requests.
o Kafka is a cluster of multiple brokers.
8. ZooKeeper:
o Coordinates the Kafka cluster by managing metadata, leader elections, etc.
o Not required in newer versions: Kafka 2.8+ can run in KRaft mode instead, and Kafka 4.0 removes ZooKeeper entirely.
9. Consumer Group:
o A group of consumers that work together to consume messages from a topic.
10. Kafka Connect:
o A tool for moving data between Kafka and external systems.
3. Kafka Architecture
1. Producers:
o Write messages to a topic.
2. Topics and Partitions:
o Topics are split into partitions to handle large-scale data.
o Messages in a partition are immutable and ordered.
3. Consumers:
o Read messages from topics in a pull-based model.
4. Brokers:
o Kafka servers that store topic data and respond to client requests.
5. Replication:
o Kafka replicates topic partitions across brokers to ensure fault tolerance.
6. ZooKeeper:
o Maintains cluster state and handles leader election.
o (Optional in Kafka 2.8+)
4. Key Functionalities
1. Publish-Subscribe:
o Kafka uses a topic-based publish-subscribe model.
2. Fault Tolerance:
o Data is replicated across brokers to ensure reliability.
3. Durability:
o Kafka stores data on disk, making it highly durable.
4. Scalability:
o Kafka handles large-scale data by scaling brokers and partitions.
5. Real-Time Processing:
o Kafka processes messages in near real-time.
5. Kafka Components in Detail
1. Installation and Setup
a) Create the Kafka service file (e.g., /etc/systemd/system/kafka.service):
[Unit]
Description=Apache Kafka Server
After=network.target zookeeper.service
[Service]
User=kafka
Group=kafka
ExecStart=/path/to/kafka/bin/kafka-server-start.sh /path/to/kafka/config/server.properties
ExecStop=/path/to/kafka/bin/kafka-server-stop.sh
Restart=on-failure
[Install]
WantedBy=multi-user.target
b) Enable Kafka Service:
sudo systemctl enable kafka
c) Start Kafka Service:
sudo systemctl start kafka
d) Check Kafka Service Status:
sudo systemctl status kafka
a) Create the ZooKeeper service file (e.g., /etc/systemd/system/zookeeper.service):
[Unit]
Description=Apache ZooKeeper Server
After=network.target
[Service]
Type=forking
User=zookeeper
Group=zookeeper
ExecStart=/path/to/zookeeper/bin/zkServer.sh start
ExecStop=/path/to/zookeeper/bin/zkServer.sh stop
Restart=on-failure
[Install]
WantedBy=multi-user.target
b) Enable ZooKeeper Service:
sudo systemctl enable zookeeper
c) Start ZooKeeper Service:
sudo systemctl start zookeeper
d) Check ZooKeeper Service Status:
sudo systemctl status zookeeper
2. Topics
Definition: A topic is like a folder where Kafka stores messages. Producers write to topics, and consumers
read from topics.
• Example:
o Topic name: orders
o Messages:
▪ { "order_id": 1, "user": "John", "amount": 250 }
▪ { "order_id": 2, "user": "Alice", "amount": 500 }
3. Partitions
Definition: Each topic is divided into partitions for scalability and parallelism.
• Key Points:
o Messages in a partition are ordered.
o Partitions are distributed across Kafka brokers.
o A topic can have multiple partitions.
Example:
• Topic: orders with 3 partitions.
• Partition 0: { "order_id": 1 }
• Partition 1: { "order_id": 2 }
• Partition 2: { "order_id": 3 }
Messages are assigned to partitions based on:
1. Key hashing: partition = hash(key) % number_of_partitions (Kafka hashes the key bytes with murmur2; see the sketch below).
2. Round-robin (if no key is provided).
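For non-null keys, the default partitioner hashes the key bytes with murmur2 (not Java's hashCode). A minimal sketch of that assignment logic, with an illustrative key and partition count:
java
// org.apache.kafka.common.utils.Utils ships with kafka-clients
byte[] keyBytes = "order-1".getBytes(StandardCharsets.UTF_8);
int numPartitions = 3;
int partition = Utils.toPositive(Utils.murmur2(keyBytes)) % numPartitions;
System.out.println("Key maps to partition " + partition);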
4. Producers
Definition: Producers send events (messages) to Kafka topics.
• Example: A payment gateway produces messages:
o Topic: transactions
o Messages:
▪ { "transaction_id": 101, "status": "success" }
▪ { "transaction_id": 102, "status": "failed" }
How Producers Work:
• Use Kafka’s Producer API.
• Specify:
o Topic: Where the message should go.
o Key (optional): Determines the partition.
o Value: The actual message.
Example Code (Java):
java
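// Minimal producer sketch; the broker address is an assumption, and the
// topic/payload follow the transactions example above.
Properties props = new Properties();
props.put("bootstrap.servers", "localhost:9092");
props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");
KafkaProducer<String, String> producer = new KafkaProducer<>(props);
// The key ("101") determines the partition; the value is the message payload.
producer.send(new ProducerRecord<>("transactions", "101",
        "{ \"transaction_id\": 101, \"status\": \"success\" }"));
producer.close();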
5. Consumers
Definition: Consumers read messages from Kafka topics.
• Example:
o Consumer reads from the orders topic:
▪ { "order_id": 1 }
▪ { "order_id": 2 }
How Consumers Work:
• Use Kafka’s Consumer API.
• Specify:
o Topic: Which topic to read from.
o Group ID: Groups consumers for parallel processing.
o Offset: Controls where to start reading:
▪ earliest (start from the beginning).
▪ latest (start from new messages).
Example Code (Java):
java
// Assumes a broker on localhost:9092 and the orders topic from the example above.
Properties props = new Properties();
props.put("bootstrap.servers", "localhost:9092");
props.put("group.id", "order-group");
props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props);
consumer.subscribe(Collections.singletonList("orders"));
while (true) {
    ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(100));
    for (ConsumerRecord<String, String> record : records) {
        System.out.println("Consumed record: " + record.value());
    }
}
6. Offset
Definition: The position of a message in a partition.
• Example:
o Partition 0 contains:
▪ Offset 0: { "order_id": 1 }
▪ Offset 1: { "order_id": 2 }
Consumers use offsets to track what they’ve read:
• Offset 0 → Read.
• Offset 1 → Next to be read.
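Offsets can also be set explicitly. A sketch that starts reading partition 0 of orders at offset 1 (consumer configured as in the earlier example):
java
TopicPartition p0 = new TopicPartition("orders", 0);
consumer.assign(Collections.singletonList(p0));
consumer.seek(p0, 1);  // the next poll() returns records from offset 1 onward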
7. Consumer Groups
Definition: A set of consumers sharing the workload.
• Example:
o Topic: orders with 3 partitions.
o Consumer Group: order-group with 3 consumers.
▪ Consumer 1 → Reads Partition 0.
▪ Consumer 2 → Reads Partition 1.
▪ Consumer 3 → Reads Partition 2.
If a consumer fails, its partitions are reassigned to the remaining consumers in the group (a rebalance). Consumers in excess of the partition count sit idle.
8. Kafka Brokers
Definition: Kafka servers that store and manage messages.
• Cluster Example:
o Broker 1: Stores Partition 0 of orders.
o Broker 2: Stores Partition 1 of orders.
o Broker 3: Stores Partition 2 of orders.
If one broker goes down, replicas (copies) on other brokers ensure data availability.
9. Replication
Definition: Each partition is copied (replicated) to other brokers for fault tolerance.
• Example:
o Topic: orders with 2 partitions.
o Replication Factor: 2.
o Partition 0:
▪ Leader: Broker 1.
▪ Replica: Broker 2.
o Partition 1:
▪ Leader: Broker 2.
▪ Replica: Broker 3.
The leader handles all reads and writes; follower replicas stay in sync and one of them takes over as leader if the current leader fails.
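Leaders and replicas can be inspected programmatically. A sketch reusing an Admin client as in the earlier topic-creation example (allTopicNames() requires a recent kafka-clients):
java
TopicDescription desc = admin.describeTopics(Collections.singletonList("orders"))
        .allTopicNames().get().get("orders");
for (TopicPartitionInfo p : desc.partitions()) {
    System.out.println("Partition " + p.partition()
            + ": leader=" + p.leader().id() + ", replicas=" + p.replicas());
}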
10. ZooKeeper
Definition: Manages Kafka cluster metadata, leader election, and configurations.
• Tasks:
o Keeps track of brokers.
o Handles partition leader election.
6. Kafka Connect
Definition: Kafka Connect moves data between Kafka and external systems using source connectors (which pull data into Kafka) and sink connectors (which push data out).
Source Connector Example (File):
json
{
"name": "file-source-connector",
"config": {
"connector.class": "FileStreamSource",
"tasks.max": "1",
"file": "/tmp/input.txt",
"topic": "file-topic"
}
}
Sink Connector Example (Database):
json
{
"name": "db-sink-connector",
"config": {
"connector.class": "JDBC",
"connection.url": "jdbc:mysql://localhost:3306/mydb",
"connection.user": "root",
"connection.password": "password",
"topics": "db-topic"
}
}
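Either connector definition is registered by POSTing the JSON to the Kafka Connect REST API (by default on port 8083 at the /connectors path of a Connect worker).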
7. Security in Kafka
1. Encryption:
o TLS (SSL): Secures communication between Kafka clients and brokers.
o Key configurations:
▪ ssl.keystore.location
▪ ssl.truststore.location
2. Authentication:
o SASL (Simple Authentication and Security Layer):
▪ Mechanisms: PLAIN, SCRAM, GSSAPI (Kerberos).
▪ Configurations for SASL/PLAIN:
properties
sasl.mechanism=PLAIN
security.protocol=SASL_SSL
sasl.jaas.config=org.apache.kafka.common.security.plain.PlainLoginModule required username="user" password="pass";
3. Authorization:
o ACLs (Access Control Lists):
▪ Define permissions for topics, consumer groups, etc.
▪ Example (grant a user read access to a topic):
bash
kafka-acls.sh --bootstrap-server localhost:9092 --add --allow-principal User:alice --operation Read --topic orders
8. Reliability and Delivery Guarantees
1. Dead Letter Queues (Kafka Connect):
o Route records that repeatedly fail processing to a separate topic:
properties
errors.deadletterqueue.topic.name=my-dlq
errors.deadletterqueue.context.headers.enable=true
2. Idempotence and Exactly-Once Semantics (EOS):
o Ensure messages are not duplicated during retries.
o Idempotent producer:
properties
enable.idempotence=true
o Transactions:
java
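// Requires transactional.id in the producer config (which also enables idempotence).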
producer.initTransactions();
producer.beginTransaction();
producer.send(record);
producer.commitTransaction();
3. Cluster Balancing:
o Tools: Cruise Control for automatic rebalancing and monitoring.
9. Multi-Tenancy
• Namespace Management:
o Use prefixes or separate clusters for tenants.
o Example: Tenant-specific topics (tenantA.orders, tenantB.orders).
• Quota Management:
o Limit producer and consumer throughput:
properties
quota.producer.default=1000000
quota.consumer.default=2000000
• Storage Management:
o Segment size and retention settings:
properties
log.segment.bytes=1073741824
log.retention.hours=168
Practice Questions with Explanations
1. Kafka Topics and Partitions
Question 1:
You have a topic user-logins with 5 partitions and a replication factor of 3. What happens if one broker goes
offline?
• A) All partitions become unavailable.
• B) All partitions will have reduced replication.
• C) Only some partitions will become unavailable.
• D) The cluster will stop accepting writes.
Answer:
• B) All partitions will have reduced replication.
Explanation: With a replication factor of 3, every partition has replicas on three brokers, so each partition loses exactly one replica when a broker goes offline. Leadership fails over to a surviving in-sync replica, so all partitions remain available, but with reduced replication until the broker is restored or replicas are reassigned.
2. Schema Registry
Question 2:
You register a schema for a topic using Schema Registry. What happens if a producer sends a message that
does not match the schema?
• A) The message is rejected.
• B) The message is accepted but logged as a warning.
• C) The consumer fails to deserialize the message.
• D) The message is dropped silently.
Answer:
• A) The message is rejected.
Explanation: The schema-aware serializer validates data against the registered schema on the producer side; if the data does not conform, serialization fails and the message is rejected before it is written to the topic.
Question 3:
Which compatibility modes does Confluent Schema Registry support?
• A) Backward, Forward, Full.
• B) Strict, Relaxed, None.
• C) Additive, Non-Additive, Full.
• D) None of the above.
Answer:
• A) Backward, Forward, Full.
Explanation: Schema Registry supports compatibility modes to ensure smooth schema evolution.
Examples:
o Backward compatibility: consumers using the new schema can read data written with the old schema.
o Forward compatibility: consumers using the old schema can read data written with the new schema.
o Full compatibility: both backward and forward compatibility are maintained.
3. Kafka Connect
Question 4:
You configure a Kafka Connect source connector for a database. However, you notice duplicate messages in
the Kafka topic. What is the likely cause?
• A) The connector tasks are set too high.
• B) The source database has duplicate records.
• C) Offset management is not properly configured.
• D) The topic has too many partitions.
Answer:
• C) Offset management is not properly configured.
Explanation: Kafka Connect periodically commits source offsets to track what it has already read. Its default delivery guarantee is at-least-once, so if offset management fails (e.g., the connector restarts before offsets are committed, or is misconfigured), already-processed records are re-read and duplicates appear in the topic.
Question 5:
What property determines the maximum number of parallel tasks in Kafka Connect?
• A) tasks.parallel
• B) connect.tasks
• C) tasks.max
• D) max.tasks
Answer:
• C) tasks.max.
Explanation: tasks.max defines the number of parallel tasks that a connector can spawn, enabling
parallelism and scalability.
4. Kafka Streams
Question 6:
You are using Kafka Streams to aggregate order totals per user. Which Kafka Streams concept should you
use to store the state of the aggregation?
• A) KStream
• B) GlobalKTable
• C) KTable
• D) Processor API
Answer:
• C) KTable.
Explanation: A KTable is used for aggregations and maintains the latest state of a keyed record. In
this case, it stores the total order amounts per user.
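A minimal sketch of such an aggregation, given a StreamsBuilder named builder (the topic names and plain-numeric value format are assumptions):
java
KTable<String, Double> totalsPerUser = builder
        .stream("orders", Consumed.with(Serdes.String(), Serdes.String()))
        .mapValues(Double::parseDouble)                      // value = order amount
        .groupByKey(Grouped.with(Serdes.String(), Serdes.Double()))
        .reduce(Double::sum);                                // running total per user key
totalsPerUser.toStream().to("order-totals", Produced.with(Serdes.String(), Serdes.Double()));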
Question 7:
Write a Kafka Streams application to filter out transactions below $100 and write valid transactions to a new
topic.
Answer (Java):
java
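// Sketch answer; assumes String-serialized records where the value is a
// plain numeric amount, and illustrative topic names.
Properties props = new Properties();
props.put(StreamsConfig.APPLICATION_ID_CONFIG, "transaction-filter");
props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
props.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass());
props.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, Serdes.String().getClass());

StreamsBuilder builder = new StreamsBuilder();
KStream<String, String> transactions = builder.stream("transactions");
transactions
        .filter((key, value) -> Double.parseDouble(value) >= 100.0)  // keep >= $100
        .to("valid-transactions");

new KafkaStreams(builder.build(), props).start();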
6. Security
Question 10:
You are setting up TLS encryption for Kafka. Which configuration is required on the broker?
• A) ssl.enabled=true
• B) security.protocol=SSL
• C) ssl.keystore.location
• D) Both B and C.
Answer:
• D) Both B and C.
Explanation: For TLS encryption, you must specify the security protocol as SSL and provide the keystore
and truststore locations.
Question 12:
You need to replicate data between two Kafka clusters in different regions. Which tool should you use?
• A) Kafka Streams
• B) Confluent Replicator
• C) MirrorMaker 2
• D) Both B and C.
Answer:
• D) Both B and C.
Explanation:
• Confluent Replicator: Ideal for Confluent Platform users with schema registry support.
• MirrorMaker 2: Open-source solution for geo-replication.
Mock Scenario
Scenario:
You are managing a Kafka cluster with the following requirements:
1. Messages must be replicated across 3 brokers with fault tolerance.
2. Consumer lag should be minimized.
3. Integrate with a database for CDC (Change Data Capture).
Questions:
1. What replication factor should you configure for the topic?
o Answer: 3.
2. How do you monitor and reduce consumer lag?
o Answer: Monitor consumer lag (e.g., with kafka-consumer-groups.sh --describe --group <group>, or via consumer-lag metrics), and ensure consumers are evenly distributed across partitions; add consumers (up to the partition count) if lag persists.
3. Which Kafka Connect plugin is ideal for CDC?
o Answer: Debezium.