Hands-on Kafka

About Kafka

  • Messaging system
  • It doesn't transform data
  • Messages are organized into Topics
  • Producers push messages
  • Consumers pull messages
  • Kafka runs in a cluster
  • Nodes are called brokers

Why Kafka – Advantages 

  • Large number of Consumers
  • Ad-hoc consumers
  • Batch Consumers
  • Automatic recovery from Broker failures

Limitations

  • Not an end-user solution: you need to write your own producers and consumers
  • No data transformations: Kafka stores and forwards messages as-is (it does not, for example, encrypt message contents)
  • Authentication and authorization were missing from early releases; SSL, SASL and ACL support was added in Kafka 0.9

Topic

  • A topic can have multiple partitions
  • Each partition can be placed on a separate machine
  • Partitions allow a topic to be written and read in parallel
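
To make the idea of spreading a topic over partitions concrete, here is a minimal Python sketch. It assumes that messages carry a key and that the key's hash selects the partition. The function name choose_partition and the CRC32 hash are illustrative stand-ins, not Kafka's own partitioner.

```python
import zlib


def choose_partition(key: bytes, num_partitions: int) -> int:
    """Map a message key to a partition index (illustrative hash, not Kafka's own)."""
    return zlib.crc32(key) % num_partitions


# Hypothetical topic with two partitions, each of which could live on its own machine.
NUM_PARTITIONS = 2
partitions = {p: [] for p in range(NUM_PARTITIONS)}

messages = [(b"user-1", "login"), (b"user-2", "login"), (b"user-1", "logout")]
for key, value in messages:
    partitions[choose_partition(key, NUM_PARTITIONS)].append(value)
```

Because the same key always hashes to the same partition, the messages for one key stay in order inside that partition, while different partitions can be handled in parallel.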

Partitions

  • Inside partitions, each message has an id
  • Ids are called offsets
  • Consumers specify the offset from which they want to read
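
The relationship between messages and offsets can be sketched as an append-only list in which the offset is simply the position of a message. The Partition class below is a hypothetical in-memory model for illustration, not Kafka's storage format.

```python
class Partition:
    """Append-only log; a message's offset is its position in the log."""

    def __init__(self):
        self._log = []

    def append(self, message: str) -> int:
        self._log.append(message)
        return len(self._log) - 1  # offset assigned to the new message

    def read(self, offset: int, max_messages: int = 10):
        """Return (offset, message) pairs starting at the requested offset."""
        end = min(offset + max_messages, len(self._log))
        return [(i, self._log[i]) for i in range(offset, end)]


partition = Partition()
for text in ["a", "b", "c"]:
    partition.append(text)

# In this sketch the consumer keeps track of the next offset it wants to read.
next_offset = 1
batch = partition.read(next_offset)
next_offset += len(batch)
```

Reading from offset 0 corresponds to consuming a partition from the beginning.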

 

  • Each broker hosts many partitions
  • Each partition has a leader and replicas
  • All reads and writes go to the leader
  • The leader replicates its data to the replicas
  • Producers and consumers never interact with the replicas
  • Kafka retains the messages of a topic for a configurable amount of time
  • Consumers are responsible for consuming data before it is deleted
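
The retention rule can be illustrated with a small Python model. It is a simplification under stated assumptions: the RetainedLog class is hypothetical, timestamps are plain numbers in seconds, and messages are expired one by one, whereas Kafka removes data in larger units (log segments).

```python
class RetainedLog:
    """Log that drops messages older than a retention window (in seconds)."""

    def __init__(self, retention_seconds: float):
        self.retention_seconds = retention_seconds
        self.base_offset = 0   # offset of the oldest message still kept
        self._entries = []     # list of (timestamp, message)

    def append(self, timestamp: float, message: str) -> int:
        self._entries.append((timestamp, message))
        return self.base_offset + len(self._entries) - 1

    def expire(self, now: float) -> None:
        """Delete messages whose age exceeds the retention window."""
        while self._entries and now - self._entries[0][0] > self.retention_seconds:
            self._entries.pop(0)
            self.base_offset += 1

    def read(self, offset: int):
        """Return messages from `offset`, or raise if they were already deleted."""
        if offset < self.base_offset:
            raise LookupError(f"offset {offset} expired; earliest is {self.base_offset}")
        return [m for _, m in self._entries[offset - self.base_offset:]]


log = RetainedLog(retention_seconds=60)
log.append(0, "old")
log.append(50, "recent")
log.expire(now=100)  # "old" is 100 s old and is dropped; "recent" is 50 s old
```

A consumer that still wanted offset 0 after the expiry is too late, which is why consumers have to keep up with the retention window.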

Producers

  • Synchronous Producers
    • Send a message – wait for Kafka to ack
  • Asynchronous Producers
    • The application hands the message to the producer
    • The producer doesn't wait for an ack
    • It buffers messages in local memory
    • At some later point it writes the buffered messages to Kafka
    • Suitable when throughput matters more than the occasional lost message
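
The difference between the two producer styles can be sketched in a few lines of Python. All classes here (FakeBroker, SyncProducer, AsyncProducer) are hypothetical stand-ins written for this illustration; they are not the Kafka client API, and the batch size of 3 is an arbitrary assumption.

```python
class FakeBroker:
    """Stand-in for a Kafka broker: stores messages and returns an ack."""

    def __init__(self):
        self.stored = []

    def write(self, messages) -> bool:
        self.stored.extend(messages)
        return True  # acknowledgement


class SyncProducer:
    def __init__(self, broker):
        self.broker = broker

    def send(self, message) -> bool:
        # Blocks until the broker acknowledges this one message.
        return self.broker.write([message])


class AsyncProducer:
    def __init__(self, broker, batch_size=3):
        self.broker = broker
        self.batch_size = batch_size
        self.buffer = []  # held in local memory; lost if the process dies

    def send(self, message) -> None:
        # Returns immediately; no ack is reported to the caller.
        self.buffer.append(message)
        if len(self.buffer) >= self.batch_size:
            self.flush()

    def flush(self) -> None:
        if self.buffer:
            self.broker.write(self.buffer)
            self.buffer = []


broker = FakeBroker()
SyncProducer(broker).send("m1")   # stored before send() returns
async_producer = AsyncProducer(broker)
async_producer.send("m2")
async_producer.send("m3")         # both are still only buffered locally
```

At this point only "m1" has reached the broker. If the process stopped now, "m2" and "m3" would be lost, which is the trade-off the asynchronous style accepts in exchange for fewer round trips.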

Kafka On Windows

Download Kafka from: http://kafka.apache.org/

Save and untar it under C:\mykafka\kafka_2.11-0.10.0.1

Start Zookeeper

C:\mykafka\kafka_2.11-0.10.0.1\bin\windows>zookeeper-server-start.bat C:\mykafka\kafka_2.11-0.10.0.1\config\zookeeper.properties

Edit server.properties

  • broker.id=1
  • port=9091
  • log.dirs=C:\mykafka\kafka-log-2
  • zookeeper.connect=localhost:2181

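Similarly, create a server-1.properties file for the second broker, giving it its own broker.id, port and log.dirs (the producer command below connects to port 9092).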

Start Broker

kafka-server-start.bat C:\mykafka\kafka_2.11-0.10.0.1\config\server.properties

kafka-server-start.bat C:\mykafka\kafka_2.11-0.10.0.1\config\server-1.properties

Create Topics

Create a topic: first

kafka-topics.bat --zookeeper localhost:2181 --create --topic first --partitions 2 --replication-factor 2

All brokers pick up the change from ZooKeeper.

Producer Console

kafka-console-producer.bat --broker-list localhost:9092 --topic first

Consumer Console

kafka-console-consumer.bat --zookeeper localhost:2181 --topic first

Any message typed in the producer console is immediately available in the consumer console.

Get all messages from the beginning:

C:\mykafka\kafka_2.11-0.10.0.1\bin\windows>kafka-console-consumer.bat --zookeeper localhost:2181 --topic first --from-beginning

Reference: https://cwiki.apache.org/confluence/display/KAFKA/Ecosystem

 

About shalishvj : My Experience with BigData

6+ years of experience using Bigdata technologies in Architect, Developer and Administrator roles for various clients. • Experience using Hortonworks, Cloudera, AWS distributions. • Cloudera Certified Developer for Hadoop. • Cloudera Certified Administrator for Hadoop. • Spark Certification from Big Data Spark Foundations. • SCJP, OCWCD. • Experience in setting up Hadoop clusters in PROD, DR, UAT , DEV environments.