Monthly Archives: January 2016

Tips: Unix-Hadoop

Stay tuned.

Posted in Uncategorized

Developer’s template: MapReduce (Java)

The Developer’s template series is intended to make life easier for big data developers by sparing them the headache of starting an application from scratch. Here is a MapReduce Java program with its POM file. Prerequisites: Hadoop cluster, Eclipse, Maven, Java … Continue reading

Posted in Java-Maven-Hadoop

Developer’s template: Spark

The Developer’s template series is intended to make life easier for big data developers by sparing them the headache of starting an application from scratch. The following program helps you develop and execute an application using Apache Spark with Java. Prerequisites: Hadoop … Continue reading

Posted in Java-Maven-Hadoop, spark

Tips: Spark

Execute Spark Pi

From the Spark directory (usually /usr/hdp/current/spark-client on Hortonworks HDP 2.3.2), run:

./bin/spark-submit --class org.apache.spark.examples.SparkPi \
  --master yarn-cluster --num-executors 3 \
  --driver-memory 512m --executor-memory 512m --executor-cores 1 \
  lib/spark-examples*.jar 10

Stay tuned.
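The SparkPi submission can also be wrapped in a small shell function so the executor count and the final argument (10 in the tip) are easy to vary. This is only a sketch: the function name `spark_pi`, the `DRY_RUN` switch, and the `SPARK_HOME` default are assumptions made for illustration, not part of Spark or HDP.

```shell
#!/bin/sh
# Sketch only: spark_pi, DRY_RUN and the SPARK_HOME default are
# illustrative assumptions, not something Spark or HDP provides.
SPARK_HOME="${SPARK_HOME:-/usr/hdp/current/spark-client}"

spark_pi() {
  executors="${1:-3}"   # value for --num-executors
  slices="${2:-10}"     # argument passed to the SparkPi example
  # Build the argument list once so the dry run and the real run match.
  set -- \
    --class org.apache.spark.examples.SparkPi \
    --master yarn-cluster \
    --num-executors "$executors" \
    --driver-memory 512m \
    --executor-memory 512m \
    --executor-cores 1 \
    "$SPARK_HOME"/lib/spark-examples*.jar \
    "$slices"
  if [ "${DRY_RUN:-0}" = "1" ]; then
    # Print the command instead of submitting it.
    echo "$SPARK_HOME/bin/spark-submit $*"
  else
    "$SPARK_HOME/bin/spark-submit" "$@"
  fi
}
```

Calling `DRY_RUN=1 spark_pi 5 20` prints the command that would be submitted, which is handy for checking the flags before sending a job to YARN.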

Posted in spark, Tips

Tips: Sqoop-Hive

Import data from a database (e.g. SQL Server) into Hive and create an ORC table from it:

sqoop import -Dmapred.job.queue.name=default \
  --connect 'jdbc:sqlserver://<host>:<port>;username=<user>;password=<pwd>;database=<dbname>' \
  --hive-import --hive-table <db.tablename> \
  --table <sqlservertable> --split-by <splitcolumn> --as-textfile

CREATE TABLE <db.tablename_orc> LIKE <db.tablename> … Continue reading

Posted in hive, sqoop, Tips

Tips: Vertica

Copy data from HDFS to Vertica

COPY <Vertica table> SOURCE Hdfs(url='http://<NameNode>:50070/webhdfs/v1/<path in hdfs>', username='<hdfs user>') DELIMITER ','

Stay tuned.
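Before running the COPY, it can help to confirm that the WebHDFS URL actually resolves. The sketch below builds the URL in the same shape as the tip; the function name `webhdfs_url` and the example host and path are hypothetical, and 50070 is used as the default port only because the tip uses it.

```shell
#!/bin/sh
# Sketch only: webhdfs_url is a hypothetical helper, not a Hadoop or
# Vertica tool. It assembles the URL form used in the COPY statement.
webhdfs_url() {
  namenode="$1"          # NameNode host
  path="$2"              # path in HDFS, with or without a leading slash
  port="${3:-50070}"     # WebHDFS port from the tip; adjust per cluster
  echo "http://${namenode}:${port}/webhdfs/v1/${path#/}"
}

# Possible check with curl (host, path and user are placeholders):
#   curl -s "$(webhdfs_url nn.example.com /data/sales.csv)?op=GETFILESTATUS&user.name=<hdfs user>"
```

If the file is reachable, the `GETFILESTATUS` call should return a JSON description of it, and the same URL can then be pasted into the `url=` parameter.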

Posted in Tips, vertica

Tips: Sqoop

Override cluster properties

E.g. disable compression for Sqoop output when compression is turned on in the cluster:

sqoop import -Dmapred.job.queue.name=default \
  -Dmapreduce.map.output.compress=false \
  -Dmapreduce.output.fileoutputformat.compress=false \
  --driver com.ibm.db2.jcc.DB2Driver --connect jdbc:db2://<host>/<db> \
  --username <user> --password <pwd> \
  --table <db2 table> --target-dir <hdfs path> \
  … Continue reading
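When several properties need overriding, the `-D` flags can be generated from a list of `key=value` pairs. This is a minimal sketch; `d_flags` is a hypothetical helper name, and it assumes the property values contain no spaces. Sqoop expects these generic Hadoop arguments right after the tool name, ahead of tool-specific options such as `--connect`.

```shell
#!/bin/sh
# Sketch only: d_flags is a hypothetical helper, not part of Sqoop.
# It turns key=value pairs into -Dkey=value flags (values without spaces).
d_flags() {
  flags=""
  for prop in "$@"; do
    flags="$flags -D$prop"
  done
  # Drop the leading space left by the first append.
  echo "${flags# }"
}

# Possible usage, with the same properties as the tip:
#   sqoop import $(d_flags mapred.job.queue.name=default \
#       mapreduce.map.output.compress=false \
#       mapreduce.output.fileoutputformat.compress=false) \
#     --connect ...
```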

Posted in sqoop, Tips