Monthly Archives: August 2016

Hands-on Kafka

About Kafka: It is a messaging system. It doesn't transform data. Messages are organized into topics. Producers push messages; consumers pull messages. Kafka runs in a cluster, and its nodes are called brokers. Why Kafka – Advantages: Large number of consumers. Ad-hoc consumers. Batch consumers. Automatic … Continue reading

Posted in Kafka

Tips: Unix

Display available space on the file system: df -h. Display the disk space used by each subdirectory, in human-readable units: du -h; for a summary only: du -sh. Find: find all files whose names start with pro: find . -name pro\*. Groups: Primary group: group to which an … Continue reading
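The commands in this excerpt can be tried safely on a throwaway directory. What follows is only a minimal sketch, assuming a POSIX-style shell where `mktemp -d` is available; the directory and file names (`profile.txt`, `sub/project.cfg`, `notes.txt`) are made up for illustration and do not come from the original post.

```shell
#!/bin/sh
# Hypothetical scratch directory so the commands have something to act on.
demo_dir=$(mktemp -d)
mkdir -p "$demo_dir/sub"
touch "$demo_dir/profile.txt" "$demo_dir/sub/project.cfg" "$demo_dir/notes.txt"

# Available and used space on the file system that holds the scratch directory.
df -h "$demo_dir"

# Space used by each subdirectory (human-readable), then one summary line.
du -h "$demo_dir"
du -sh "$demo_dir"

# Files whose names start with "pro". Quoting the pattern (or escaping the
# asterisk as pro\*) stops the shell from expanding it before find sees it.
matches=$(cd "$demo_dir" && find . -type f -name 'pro*' | sort)
echo "$matches"

# Clean up the scratch directory.
rm -rf "$demo_dir"
```

The `find` call should list only the two files starting with "pro" and skip `notes.txt`; the sizes reported by `df` and `du` depend on the machine.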

Posted in Unix

Tips: Cluster Installation

To keep Ambari from overriding configs, edit the template file corresponding to the service. For YARN, for example: /var/lib/ambari-agent/cache/stacks/HDP/2.0.6/services/YARN/package/templates/yarn-env.sh.j2

Posted in Hadoop Cluster Installation, Uncategorized

Hadoop / Spark on Windows

Hadoop on Windows: Download the binaries (e.g., winutils.exe) required to run Hadoop. Download link: https://github.com/srccodes/hadoop-common-2.2.0-bin/archive/master.zip. Add them to $HADOOP_HOME/bin. Set $HADOOP_HOME and $JAVA_HOME under environment variables. Reference: http://stackoverflow.com/questions/19620642/failed-to-locate-the-winutils-binary-in-the-hadoop-binary-path Spark on Windows: While running Spark, you can refer to a local path in … Continue reading

Posted in Hadoop, spark, Uncategorized