Monthly Archives: August 2016

Hands-on Kafka

About Kafka: Kafka is a messaging system. It doesn't transform data. Messages are organized into topics. Producers push messages; consumers pull messages. Kafka runs in a cluster, and its nodes are called brokers. Why Kafka (advantages): large number of consumers, ad-hoc consumers, batch consumers, automatic …

Posted in Kafka

Tips: Unix

Display available space on the file system: df -h. Display disk space used by each subdirectory: du -h, or du -sh for a summary. Find: to find all files whose names start with "pro", use find . -name pro\*. Groups: Primary group: group to which an …
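As a rough sketch of how the commands in this excerpt behave, the script below runs them against a scratch directory. The directory layout and the file names (profile.txt, project.cfg, notes.txt) are invented for illustration and do not come from the original post.

```shell
#!/bin/sh
# Illustrative only: build a throwaway directory with made-up file names.
demo_dir=$(mktemp -d)
mkdir -p "$demo_dir/src"
touch "$demo_dir/profile.txt" "$demo_dir/src/project.cfg" "$demo_dir/notes.txt"
cd "$demo_dir" || exit 1

# Available and used space on the file system holding the current directory,
# in human-readable units.
df -h .

# Disk space used by each subdirectory, then a single summary line.
du -h .
du -sh .

# Everything under the current directory whose name starts with "pro".
# Quoting the pattern (or escaping the * as pro\*) keeps the shell from
# expanding it before find sees it.
matches=$(find . -name 'pro*')
echo "$matches"
```

With this layout, find should report profile.txt and src/project.cfg but not notes.txt, since only the first two names begin with "pro".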

Posted in Unix

Tips: Cluster Installation

To keep Ambari from overriding configs, edit the file corresponding to the service, for example for YARN under /var/lib/ambari-agent/cache/stacks/HDP/2.0.6/services/YARN/package/templates/

Posted in Hadoop Cluster Installation, Uncategorized

Hadoop / Spark on Windows

Hadoop on Windows: Download the binaries required to run Hadoop (e.g., winutils.exe) and add them to $HADOOP_HOME/bin. Set $HADOOP_HOME and $JAVA_HOME under environment variables. Spark on Windows: While running Spark, you can refer to a local path in …

Posted in Hadoop, spark, Uncategorized