
Introduction to Kafka

(and why you care)


Richard Nikula
VP, Product Development and Support
Nastel Technologies, Inc.

Copyright © Nastel Technologies, Inc. 2017 MQ Technical Conference v2.0.1.7


Introduction
 Richard Nikula
 VP of Product Development and Support

 Involved in “MQ” since early 90’s
 Primarily at the technology layer
 Various certifications

 About Nastel Technologies


 Founded in 1994
 Middleware-centric Application Performance Management software supplier
 Core competency: Messaging Middleware, Java Application Servers, ESBs and other
SOA technologies



About this session
 Apache Kafka is showing up everywhere and is likely already being used
today somewhere in your organization. In this session we will cover the
fundamentals of Kafka. The basics of producers, consumers and message
processing will be explained, along with several examples including
clustered configuration. We will also look at several typical use cases.

 This session is targeted at technical resources familiar with IBM MQ. It
compares Kafka to IBM MQ-based messaging to help you prepare for when your
expertise is needed in a hybrid IBM MQ/IIB/Kafka environment.

 This session is not an exhaustive tutorial on Kafka and only touches on
programming concepts.



BACKGROUND



Time Travel

1993



Your workstation

Operating system: PC DOS 4.01
CPU: Intel 80386SX @ 16 MHz
Memory: 2 MB ~ 6 MB

Display: 7 color, reverse video and blink, graphical display capable



Your Enterprise



Your Network



Modern consumer electronics

Text Messaging and PDA functions become available on cell phones



The Birth of MQ
 MQ solved many of the problems that existed in this environment

1. A consistent API across a number of disparate operating systems

2. Ability to interact between applications using different architectures
(EBCDIC/ASCII - Big/Little Endian)

3. Asynchronous support for systems and applications that ran at different
speeds

4. Independent operations for systems that lost network connectivity

5. Ability to recover from regular system outages

6. Many more…

7. Publish / subscribe & Clustering came later



Your Environment today

Other Legacy
Systems



Internet of Things



Kafka History and reference
 Developed by LinkedIn

 Wanted a system that was not restricted by the past and exploited
technologies commonly available

 Key requirements
 High speed
 Fault tolerant
 Infinitely scalable
 Distributed access

 Became open source in 2011 under Apache

 https://kafka.apache.org

 In the rest of this presentation, we will introduce Kafka concepts which will
demonstrate how it achieves these objectives



BASIC CONCEPTS



Basic constructs
 Producer

 Consumer

 Cluster

 Broker

 Streams

 Connectors

[Diagram: Producers, Connectors, and Stream Processors around a Kafka Cluster,
with Consumers reading from it]



More constructs
 Topics

❖ As you would expect, a unique string to which producers produce
“messages” and to which subscribers subscribe

 Logs

❖ The basic construct of Kafka. Messages are persisted as a sequence of
items to logs (just as MQ captures transaction logs).

❖ Logs consist of one or more immutable sequences of “messages”



Putting it Together in a Simple Example

[Diagram: Producer1 appends messages to the log for topic “Green”, shown as
offsets 0 to 9; Consumer1 reads the log at its own offset]



INTRODUCTION BY EXAMPLE



Simple Example - Producer

 Simple console-based producer example putting to topic Green



Simple Example – Consumer 1

 A sample console consumer reads the messages



Simple Example – Consumer 2

 Now we start a 2nd consumer for Green

 Did you expect this result? Probably not if you are thinking of IBM MQ,
which delivers a publication only to subscriptions that exist when it is
published

o Unlike messages on a queue, messages in Kafka do not go away simply because
a consumer has read them (more later)
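
The behaviour above can be sketched with a toy, in-memory log. This is not the Kafka client API; the class name ToyTopicLog and its methods are made up for illustration, and the sketch assumes the second consumer asks to read from the beginning of the topic. Reading never removes anything, so both consumers see the same messages.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of a single-partition topic log (hypothetical, not the Kafka API). */
final class ToyTopicLog {
    private final List<String> messages = new ArrayList<>();

    /** Appends a message to the end of the log and returns its offset. */
    long append(String message) {
        messages.add(message);
        return messages.size() - 1;
    }

    /** Returns a copy of all messages from the given offset onward; the log is unchanged. */
    List<String> readFrom(long offset) {
        return new ArrayList<>(messages.subList((int) offset, messages.size()));
    }

    public static void main(String[] args) {
        ToyTopicLog green = new ToyTopicLog();
        green.append("one");
        green.append("two");
        green.append("three");

        // Consumer 1 reads everything from offset 0.
        System.out.println("consumer1: " + green.readFrom(0));
        // Consumer 2 starts later, also from offset 0, and still sees every message.
        System.out.println("consumer2: " + green.readFrom(0));
    }
}
```

Each consumer only needs to remember its own offset; the log itself keeps no record of who has read what.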



Topics and Logs Revisited
[Diagram: topics “Green”, “Red”, and “Blue”, each with its own log of numbered
offsets, its own producer, and its own consumer]



EXPANDED CONCEPTS



Partitions
[Diagram: topic “Green” split into three partitions, each a separate log with
its own offsets; Producer1 writes to them and Consumer1 reads from them]

 Partitions divide the topic storage across multiple logs



Distributed Partitions

[Diagram: six partition logs, each with its own offsets, with Producer1 writing
and Consumer1 reading]

 Partitions divide the topic storage across multiple logs



Partition Assignment
❖ Default Partitioner

 If a partition is specified, use it

 Else if a key is specified, use the hash of the key

 Otherwise, round robin assignment

❖ Custom Partitioner

 Assign based on various criteria
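
The default partitioner's decision order can be sketched as follows. This is a simplified stand-in, not Kafka's own partitioner: the class ToyPartitioner is hypothetical, and String.hashCode() is used here only as an assumed placeholder for whatever hash the real client applies to the serialized key.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Toy partition chooser following the three rules above (hypothetical, not the Kafka API). */
final class ToyPartitioner {
    private final AtomicInteger roundRobin = new AtomicInteger();

    /** explicitPartition and key may each be null. */
    int choosePartition(Integer explicitPartition, String key, int numPartitions) {
        if (explicitPartition != null) {
            return explicitPartition;                    // rule 1: caller chose the partition
        }
        if (key != null) {
            // rule 2: hash of the key; String.hashCode() is an assumption for this sketch
            return Math.floorMod(key.hashCode(), numPartitions);
        }
        // rule 3: no partition and no key, so rotate through the partitions
        return Math.floorMod(roundRobin.getAndIncrement(), numPartitions);
    }

    public static void main(String[] args) {
        ToyPartitioner partitioner = new ToyPartitioner();
        System.out.println(partitioner.choosePartition(2, "ignored", 3));
        System.out.println(partitioner.choosePartition(null, "KeyToUse", 3));
        System.out.println(partitioner.choosePartition(null, null, 3));
    }
}
```

Because the same key always hashes to the same partition, all messages for one key stay in order within that partition. A custom partitioner would replace this method with its own criteria.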



Replicated Partitions
 Replication Factor determines the number of copies of each partition
created

 Failure of (replication-factor – 1) nodes does not result in loss of data

 Sufficient server instances are required to provide segregation of
instances (in this example, 3 partitions on 3 brokers leaves limited
room for failover)

[Diagram: three partitions, each replicated on all three brokers]
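
The "(replication-factor – 1)" rule can be checked with a small model. The placement scheme below (replica r of partition p goes to broker (p + r) mod brokers) is an assumption chosen for simplicity, not Kafka's actual assignment algorithm, and ToyReplicaPlacement is a hypothetical name.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Toy replica placement and failure check (hypothetical, not the Kafka API). */
final class ToyReplicaPlacement {
    /** Returns, for each partition, the broker ids holding its replicas. */
    static List<List<Integer>> assign(int partitions, int replicationFactor, int brokers) {
        if (replicationFactor > brokers) {
            throw new IllegalArgumentException("replication factor exceeds broker count");
        }
        List<List<Integer>> assignment = new ArrayList<>();
        for (int p = 0; p < partitions; p++) {
            List<Integer> replicas = new ArrayList<>();
            for (int r = 0; r < replicationFactor; r++) {
                replicas.add((p + r) % brokers);   // assumed simple rotation
            }
            assignment.add(replicas);
        }
        return assignment;
    }

    /** True if every partition keeps at least one replica on a surviving broker. */
    static boolean allPartitionsSurvive(List<List<Integer>> assignment, Set<Integer> failed) {
        for (List<Integer> replicas : assignment) {
            if (failed.containsAll(replicas)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        List<List<Integer>> rf2 = assign(3, 2, 3);
        System.out.println("placement: " + rf2);
        System.out.println("one broker down:  "
                + allPartitionsSurvive(rf2, new HashSet<>(Arrays.asList(0))));
        System.out.println("two brokers down: "
                + allPartitionsSurvive(rf2, new HashSet<>(Arrays.asList(0, 1))));
    }
}
```

With a replication factor of 2, losing one broker loses no data, but losing two brokers can remove both copies of a partition.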



Leaders and Followers
 All producer/consumer communication is via the Leader

 Followers get updates from Leader

[Diagram: replicated partitions across the brokers; Producer1 and Consumer1
connect to the leader replica of a partition]



Replication Failover
 Since the leader on Green instance 1 failed, a new leader for Green has to
be elected.

 Lost replicas also need to be reestablished (which is why 3 servers were
inadequate)

[Diagram: one broker marked failed (X); Producer1 and Consumer1 connect to
replicas on the remaining brokers]
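
A minimal sketch of the failover step, assuming the new leader is simply the first in-sync replica whose broker is still alive. Kafka's real election involves more state than this; the class name ToyLeaderElection and the broker ids are made up for illustration.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Toy leader election over a list of in-sync replicas (hypothetical, not the Kafka API). */
final class ToyLeaderElection {
    /** Returns the broker id of the new leader, or -1 if no in-sync replica is alive. */
    static int electLeader(List<Integer> inSyncReplicas, Set<Integer> liveBrokers) {
        for (int broker : inSyncReplicas) {
            if (liveBrokers.contains(broker)) {
                return broker;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        List<Integer> greenReplicas = Arrays.asList(1, 2, 3);      // broker 1 was the leader
        Set<Integer> alive = new HashSet<>(Arrays.asList(2, 3));   // broker 1 has failed
        System.out.println("new leader for Green: " + electLeader(greenReplicas, alive));
    }
}
```

After the election the partition has one replica fewer than its replication factor, which is why a spare broker is needed to reestablish the lost copy.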



Topic Describe
 Create a topic with 3 partitions and a replication factor of 2

[Screenshot: topic describe output, annotated with: 3 partitions; Leaders;
Replicas (leader and followers); In Sync Replicas]



Behind the Scenes
 How is it possible to coordinate all of this across the Kafka brokers?

 Apache Zookeeper

 A distributed coordination service

 Coordinates the state of Kafka for the brokers, publishers and consumers



SCALABILITY



Scalability options
 Increasing consumers and producers

 Consumer groups

 Adding Partitions

 Message Key partitioning

 Adding Topics

 Streaming processes



Consumer Groups

[Diagram: producers write to three partitions; Consumer1 and Consumer2, in one
consumer group, divide the partitions between them]

 Up to 1 consumer per partition in the group



Consumer Groups

[Diagram: three partitions read by a consumer group of four consumers,
Consumer1 to Consumer4]
 Excess consumers have nothing to process
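
The two consumer group slides can be reproduced with a toy assignment. The round-robin rule below is an assumption for illustration; Kafka's own assignors may distribute partitions differently. The class name ToyGroupAssignment is hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Toy partition-to-consumer assignment within one group (hypothetical, not the Kafka API). */
final class ToyGroupAssignment {
    /** Gives each partition to exactly one consumer; consumers may end up with none. */
    static Map<String, List<Integer>> assign(List<String> consumers, int numPartitions) {
        Map<String, List<Integer>> owned = new LinkedHashMap<>();
        for (String consumer : consumers) {
            owned.put(consumer, new ArrayList<>());
        }
        if (consumers.isEmpty()) {
            return owned;
        }
        for (int p = 0; p < numPartitions; p++) {
            String owner = consumers.get(p % consumers.size());  // assumed round robin
            owned.get(owner).add(p);
        }
        return owned;
    }

    public static void main(String[] args) {
        // Two consumers share three partitions.
        System.out.println(assign(Arrays.asList("Consumer1", "Consumer2"), 3));
        // Four consumers, three partitions: one consumer is left idle.
        System.out.println(assign(
                Arrays.asList("Consumer1", "Consumer2", "Consumer3", "Consumer4"), 3));
    }
}
```

Adding consumers beyond the partition count adds no throughput, which is why adding partitions appears alongside consumer groups as a scalability option.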



RUNTIME CONSIDERATIONS



Data Retention
 Retention by Time

❖ log.retention.(hours|minutes|ms)

❖ Default is 1 week

 Retention by Partition Size

❖ log.retention.bytes

❖ Applied per partition

 Applies to the log segment
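
How the two retention limits interact can be sketched per partition. This is a simplification under stated assumptions, not the broker's actual cleanup logic: segments are dropped whole, oldest first, and a negative size limit means "no size limit". ToyRetention and its Segment type are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Toy time and size retention over one partition's segments (hypothetical, not the Kafka API). */
final class ToyRetention {
    static final class Segment {
        final long lastModifiedMs;
        final long sizeBytes;

        Segment(long lastModifiedMs, long sizeBytes) {
            this.lastModifiedMs = lastModifiedMs;
            this.sizeBytes = sizeBytes;
        }
    }

    /** Returns the segments that remain, given segments ordered oldest first. */
    static List<Segment> prune(List<Segment> oldestFirst, long nowMs,
                               long retentionMs, long retentionBytes) {
        List<Segment> kept = new ArrayList<>();
        long total = 0;
        for (Segment s : oldestFirst) {
            if (nowMs - s.lastModifiedMs <= retentionMs) {   // time limit
                kept.add(s);
                total += s.sizeBytes;
            }
        }
        // Size limit: drop whole segments from the oldest end until under the limit.
        while (retentionBytes >= 0 && !kept.isEmpty() && total > retentionBytes) {
            total -= kept.remove(0).sizeBytes;
        }
        return kept;
    }

    public static void main(String[] args) {
        List<Segment> segments = Arrays.asList(
                new Segment(0L, 100L), new Segment(1000L, 100L), new Segment(2000L, 100L));
        System.out.println(prune(segments, 2500L, 2000L, -1L).size());   // time limit only
        System.out.println(prune(segments, 2500L, 2000L, 150L).size());  // time and size limits
    }
}
```

Because whole segments are removed, individual messages can outlive the configured limits until their segment becomes eligible.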



Producer Considerations
 Fire and Forget
 Send without any confirmation

 Asynchronous Send
 Max inflight

 Synchronous
 All writes complete with verification

 Replication Acknowledgement level

 (Automatic) retry logic

 Partition Assignment
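
Several of these considerations map to producer configuration properties. The sketch below only builds a java.util.Properties object; the property names (acks, retries, max.in.flight.requests.per.connection) are standard producer settings, while the values and the class name ProducerSettingsSketch are illustrative assumptions rather than recommendations.

```java
import java.util.Properties;

/** Builds example producer settings; values are illustrative assumptions. */
final class ProducerSettingsSketch {
    static Properties reliableProducerSettings() {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092,localhost:9093");
        // Replication acknowledgement level: "0" (none), "1" (leader only), "all" (in-sync replicas)
        props.put("acks", "all");
        // Automatic retry of failed sends
        props.put("retries", "3");
        // Max in-flight requests; 1 keeps ordering intact when a send is retried
        props.put("max.in.flight.requests.per.connection", "1");
        return props;
    }

    public static void main(String[] args) {
        reliableProducerSettings().forEach((k, v) -> System.out.println(k + "=" + v));
    }
}
```

These properties would be merged with the serializer settings before creating the producer. Stronger acknowledgement and a lower in-flight limit trade throughput for delivery guarantees.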



Consumer Considerations
 Consumer Groups

 Commit Strategy

 Committed Offset (the last message committed by the consumer)

 Batch size

 Rebalancing

 Timely Interaction with Kafka
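
Why the commit strategy matters can be shown with a toy consumer that processes a batch and then commits. It is a sketch, not the Kafka consumer API: ToyCommittingConsumer is hypothetical, and the committed offset is modelled here as the position to resume from after a restart. If the consumer fails after processing but before committing, the same batch is delivered again.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Toy process-then-commit consumer (hypothetical, not the Kafka API). */
final class ToyCommittingConsumer {
    private long committedOffset = 0;                 // where to resume after a restart
    private final List<String> processed = new ArrayList<>();

    /** Processes up to batchSize messages, then commits unless a crash is simulated. */
    void poll(List<String> log, int batchSize, boolean crashBeforeCommit) {
        int start = (int) committedOffset;
        int end = Math.min(log.size(), start + batchSize);
        for (int i = start; i < end; i++) {
            processed.add(log.get(i));
        }
        if (!crashBeforeCommit) {
            committedOffset = end;
        }
    }

    long committedOffset() {
        return committedOffset;
    }

    List<String> processed() {
        return processed;
    }

    public static void main(String[] args) {
        List<String> log = Arrays.asList("a", "b", "c", "d");
        ToyCommittingConsumer consumer = new ToyCommittingConsumer();
        consumer.poll(log, 2, true);    // processed a and b, failed before the commit
        consumer.poll(log, 2, false);   // a and b are processed a second time
        System.out.println(consumer.processed() + " committed=" + consumer.committedOffset());
    }
}
```

Processing before committing gives at-least-once behaviour, so the application has to tolerate duplicates. Larger batches widen the window in which a failure causes reprocessing.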



General Considerations
 Log Retention
 "Kafka's performance is effectively constant with respect to data size so storing data
for a long time is not a problem"

 Latency between producers and consumers

 Broker health (running?)

 Distributed server health

 Runaway producers

 Other components in the stack

 “Poison” messages

 Partitioner Effectiveness

 Rebalance



Monitoring
 Kafka is instrumented with JMX



“SOME” CODE



Simple Producer
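import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.Producer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class SimpleProducer {
    public static void main(String[] args) {
        String topicName = "MyKafkaTopic";
        String key = "KeyToUse";
        String value = "Some Text to Send";

        // Broker addresses, plus serializers for the String key and value
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092,localhost:9093");
        props.put("key.serializer",
            "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer",
            "org.apache.kafka.common.serialization.StringSerializer");

        Producer<String, String> producer = new KafkaProducer<>(props);

        ProducerRecord<String, String> record =
            new ProducerRecord<>(topicName, key, value);

        // send() is asynchronous; close() waits for outstanding sends to complete
        producer.send(record);
        producer.close();
    }
}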



KAFKA VERSUS MQ



MQ to Kafka Summary
 No MQI – just code

 Conditions MQ handles out of the box may require coding

 Kafka has no “queues”, just publish and subscribe

 Publishes with no subscribers still publish

 ‘Guaranteed’ messaging

 failover for lost nodes (including file systems)

 resource consumption models

 Less “admin” involvement



KAFKA AND IBM



Kafka Integration
 IIB (10.0.0.7) Kafka consumer and producer nodes

 MQ Connector
 https://github.com/ibm-messaging/kafka-connect-mq-source

 MQ Bridge (Bluemix)



SUMMARY



So why do you care?
 Kafka is an alternative to MQ that is likely already in use in your
organization

 Kafka is part of the middleware space and needs to be managed

 Kafka can and will integrate with your existing applications creating a
hybrid messaging environment

 Kafka has advantages over MQ for some message patterns but may not work
for all

 Since it is messaging, there is an expectation you will be the expert…



Questions & Answers
