How Kerberos works



Principal:

  • Every user and service that participates in the Kerberos authentication protocol requires a principal to uniquely identify itself.
  • There are user principals and service principals.
  • e.g. alice@EXAMPLE.COM (a user principal)


Realm:

  • A realm is an authentication administrative domain, e.g. EXAMPLE.COM

Key distribution center (KDC):

  • The KDC is made up of three components: the Kerberos database, the authentication service (AS), and the ticket-granting service (TGS).
  • e.g. the KDC for the Kerberos realm EXAMPLE.COM

Kerberos workflow:

Aim: User needs to access the Service identified by myservice/

  • The user sends a request to the AS, identifying itself as the principal xyz@EXAMPLE.COM.
  • The AS responds with a TGT, in a reply encrypted with the key (derived from the password) of the principal.
  • The user is prompted for the principal's password; only the correct password can decrypt the reply.
  • The user then presents the TGT to the TGS and requests a service ticket.
  • The TGS validates the TGT and issues a service ticket, encrypted with the myservice/ principal’s key.
  • The user presents the service ticket to myservice, which decrypts it with the myservice/ key and validates it.
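
The workflow above can be sketched as a toy Java program. This is only an illustration of who can decrypt what, not the real Kerberos protocol: the class name, the secrets, the "TGT:" / "TICKET:" message formats and the SHA-256 key derivation are all made up for this example, AES in ECB mode is used purely for brevity, and the session keys and timestamps that real Kerberos exchanges are left out. The service is called "myservice" as a placeholder.

```java
import javax.crypto.Cipher;
import javax.crypto.spec.SecretKeySpec;
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.MessageDigest;
import java.util.Arrays;
import java.util.Base64;

// Toy model of the ticket flow. This is NOT the real Kerberos wire protocol.
class KerberosSketch {
    // Made-up long-term secrets, standing in for the KDC database.
    static final String USER = "xyz@EXAMPLE.COM";
    static final String USER_PASSWORD = "user-password";
    static final String TGS_SECRET = "tgs-secret";
    static final String SERVICE_SECRET = "myservice-secret";

    // Run AES with a key derived from a secret (first 16 bytes of its SHA-256 hash).
    static byte[] crypt(int mode, String secret, byte[] data) {
        try {
            byte[] hash = MessageDigest.getInstance("SHA-256")
                    .digest(secret.getBytes(StandardCharsets.UTF_8));
            Cipher cipher = Cipher.getInstance("AES/ECB/PKCS5Padding");
            cipher.init(mode, new SecretKeySpec(Arrays.copyOf(hash, 16), "AES"));
            return cipher.doFinal(data);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("crypto failure (wrong key?)", e);
        }
    }

    static byte[] seal(String secret, String text) {
        return crypt(Cipher.ENCRYPT_MODE, secret, text.getBytes(StandardCharsets.UTF_8));
    }

    static String open(String secret, byte[] sealed) {
        return new String(crypt(Cipher.DECRYPT_MODE, secret, sealed), StandardCharsets.UTF_8);
    }

    // AS: issue a TGT only the TGS can read, wrapped in a reply only the user can open.
    static byte[] asReply(String principal) {
        byte[] tgt = seal(TGS_SECRET, "TGT:" + principal);
        return seal(USER_PASSWORD, Base64.getEncoder().encodeToString(tgt));
    }

    // TGS: validate the TGT, then issue a ticket sealed with the service's key.
    static byte[] tgsReply(byte[] tgt) {
        String contents = open(TGS_SECRET, tgt);
        if (!contents.startsWith("TGT:")) throw new SecurityException("invalid TGT");
        return seal(SERVICE_SECRET, "TICKET:" + contents.substring("TGT:".length()));
    }

    // Service: decrypt the ticket with its own key and return the authenticated principal.
    static String serviceAccepts(byte[] ticket) {
        String contents = open(SERVICE_SECRET, ticket);
        if (!contents.startsWith("TICKET:")) throw new SecurityException("invalid ticket");
        return contents.substring("TICKET:".length());
    }

    // Client side of the workflow, driven by the password the user types.
    static String login(String typedPassword) {
        byte[] tgt = Base64.getDecoder().decode(open(typedPassword, asReply(USER)));
        return serviceAccepts(tgsReply(tgt));
    }

    public static void main(String[] args) {
        System.out.println("Service authenticated: " + login(USER_PASSWORD));
    }
}
```

Note that the user never sees the TGS or service secrets: the client only passes along blobs it cannot read, which is the point of the ticket design.
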
Posted in Security

Hands-on Kafka

About Kafka

  • Messaging system
  • It doesn't transform data
  • Messages are organized into Topics
  • Producers push messages
  • Consumers pull messages
  • Kafka runs in a cluster
  • Nodes are called brokers

Why Kafka – Advantages 

  • Large number of Consumers
  • Ad-hoc consumers
  • Batch Consumers
  • Automatic recovery from Broker failures


Disadvantages

  • Not an end-user solution: you need to write Producers and Consumers
  • No data transformations (no encryption)
  • No authorization or authentication yet (in the Kafka version used here)


Partitions

  • A topic has multiple partitions
  • Each partition can be placed on a separate machine
  • Partitions allow a topic to be read and written in parallel


Offsets

  • Inside a partition, each message has an id
  • These ids are called offsets
  • When consuming messages, the consumer specifies the offset to read from
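
To make the offset idea concrete, here is a minimal in-memory sketch of a single partition. It is a toy model and not the Kafka client API; the class PartitionSketch and its methods are hypothetical names invented for this example.

```java
import java.util.ArrayList;
import java.util.List;

// Toy in-memory model of one partition; not the Kafka client API.
class PartitionSketch {
    private final List<String> log = new ArrayList<>();

    // Append a message and return its offset (its position in the partition).
    long append(String message) {
        log.add(message);
        return log.size() - 1;
    }

    // Return every message from the given offset onward, as a consumer would read them.
    List<String> readFrom(long offset) {
        return new ArrayList<>(log.subList((int) offset, log.size()));
    }

    public static void main(String[] args) {
        PartitionSketch partition = new PartitionSketch();
        partition.append("a");
        partition.append("b");
        long third = partition.append("c");
        System.out.println(third);                 // prints 2
        System.out.println(partition.readFrom(1)); // prints [b, c]
    }
}
```

Reading from offset 0 corresponds to consuming the partition from the beginning.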


Brokers and replication

  • Each broker hosts many partitions
  • Each partition has a leader and replicas
  • All reads and writes go to the leader
  • The leader replicates to the replicas
  • Producers / Consumers never interact with replicas
  • Kafka retains the messages in a topic for a configured amount of time
  • Consumers are responsible for consuming data before it is deleted


Producers

  • Synchronous producer:
    • sends a message and waits for Kafka to ack
  • Async producer:
    • the application writes to the producer
    • doesn't wait for an ack
    • buffers messages in local memory
    • at some point it writes the buffered messages to Kafka
    • suitable when performance matters more than losing some data
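
The difference between the two producer styles can be sketched with a small in-memory model. This is not the Kafka producer API: ProducerSketch, its methods and the batchSize parameter are hypothetical names for this illustration, and a plain list stands in for the broker.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of sync vs async sending; not the Kafka producer API.
class ProducerSketch {
    private final List<String> brokerLog = new ArrayList<>(); // stands in for Kafka
    private final List<String> buffer = new ArrayList<>();    // the async producer's local memory
    private final int batchSize;

    ProducerSketch(int batchSize) {
        this.batchSize = batchSize;
    }

    // Synchronous: write to the "broker" and return the acknowledged offset.
    long sendSync(String message) {
        brokerLog.add(message);
        return brokerLog.size() - 1;
    }

    // Asynchronous: buffer locally and return at once; write out when the batch is full.
    void sendAsync(String message) {
        buffer.add(message);
        if (buffer.size() >= batchSize) {
            flush();
        }
    }

    // Write everything buffered so far to the "broker".
    void flush() {
        brokerLog.addAll(buffer);
        buffer.clear();
    }

    int delivered() { return brokerLog.size(); }

    int pending() { return buffer.size(); }

    public static void main(String[] args) {
        ProducerSketch producer = new ProducerSketch(3);
        producer.sendSync("m1");
        producer.sendAsync("m2");
        producer.sendAsync("m3");
        // m2 and m3 exist only in local memory and would be lost if the process died now.
        System.out.println(producer.delivered() + " delivered, " + producer.pending() + " pending");
        producer.flush();
        System.out.println(producer.delivered() + " delivered, " + producer.pending() + " pending");
    }
}
```

The pending count is exactly the data at risk with an async producer, which is the trade-off described in the list above.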

Kafka On Windows

Download Kafka from :

Save and untar it under C:\mykafka\kafka_2.11-

Start Zookeeper

C:\mykafka\kafka_2.11-\bin\windows>zookeeper-server-start.bat C:\mykafka\kafka_2.11-\config\


  • port=9091
  • log.dirs=C:\mykafka\kafka-log-2
  • zookeeper.connect=localhost:2181

Similarly create file

Start Broker

kafka-server-start.bat C:\mykafka\kafka_2.11-\config\

kafka-server-start.bat C:\mykafka\kafka_2.11-\config\

Create Topics

Create a topic: first

kafka-topics.bat --zookeeper localhost:2181 --create --topic first --partitions 2 --replication-factor 2

All brokers pick up the change from ZooKeeper.

Producer Console

kafka-console-producer.bat --broker-list localhost:9092 --topic first

Consumer Console

kafka-console-consumer.bat --zookeeper localhost:2181 --topic first

Any message typed in the Producer console is immediately available in the Consumer console.

Get all messages from the beginning:

C:\mykafka\kafka_2.11-\bin\windows>kafka-console-consumer.bat --zookeeper localhost:2181 --topic first --from-beginning



Posted in Kafka

Tips: Unix

Display available space on file system

df -h

Display disk space used by each subdirectory (in human-readable units)

du -h

du -sh (summary)


Find all files whose names start with pro

find . -name pro\*

Primary group: the group a user ID belongs to by default, mainly used for access control
Netgroup: a group whose members are allowed to access or perform operations on a machine

Get members in a netgroup:

getent netgroup fg_UP20_dev_users

View netgroup on a machine:

bash-4.2$ cat /etc/passwd-
$ grep @ /etc/passwd

Get the PID of the process listening on a port

netstat -anp | grep 8080 (the PID is in the last column)

Get Process Details

ps aux | grep <pid>

stay tuned..

Posted in Unix

Tips: Cluster Installation

  1. To stop Ambari from overriding configs, edit the template file corresponding to the service under /var/lib/ambari-agent/cache/stacks/HDP/2.0.6/services/YARN/package/templates/


Posted in Hadoop Cluster Installation, Uncategorized

Hadoop on Windows

Hadoop on Windows

  • Download the required binaries (e.g., winutils.exe) necessary to run hadoop
  • Download link:
  • Add it to $HADOOP_HOME/bin
  • Set $HADOOP_HOME and $JAVA_HOME as environment variables
  • Reference:


Spark on Windows

  • While running spark, you can refer to a local path in your computer
  • Spark Master needs to be set to local

String inpath = "C:/New/abc.txt";
String outpath = "C:/New/New1";

SparkConf conf = new SparkConf().setAppName("sparkAction").setMaster("local");

Stay tuned..

Posted in Hadoop, spark, Uncategorized

Tips: Maven


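GroupId: Identifies the group or organization that owns the project, usually a reversed domain name that matches the base package (e.g. com.example)

ArtifactId: Name of the project artifact (the JAR, without the version)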


Create Uber Jar


      <!-- Maven shade plug-in that creates uber JARs -->
Posted in Maven, Uncategorized

Tips: Java

Anonymous Classes

  • They enable you to declare and instantiate a class at the same time.
  • They are like local classes except that they do not have a name.
  • Use them if you need to use a local class only once.
  • They let you implement an interface (or extend a class) on the spot, without declaring a named class.
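
A small example of the points above. The Greeter interface and the class AnonymousClassSketch are hypothetical names made up for this illustration.

```java
// Minimal illustration of an anonymous class implementing an interface.
class AnonymousClassSketch {
    interface Greeter {
        String greet(String name);
    }

    static Greeter politeGreeter() {
        // The class is declared and instantiated in one expression, and it has no name.
        return new Greeter() {
            @Override
            public String greet(String name) {
                return "Hello, " + name;
            }
        };
    }

    public static void main(String[] args) {
        System.out.println(politeGreeter().greet("Alice")); // prints Hello, Alice
    }
}
```

Since Greeter has a single method, a lambda would also work from Java 8 onward; the anonymous class form is still needed when the type has more than one method to implement or the implementation needs its own fields.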


Posted in Java, Uncategorized