Category Archives: Tips

Tips: Spark

Configuration: pass configuration values from a properties file. spark-submit supports loading configuration values from a file. It reads whitespace-delimited key/value pairs from this file, and you can customize the exact location of the file using the --properties-file flag to spark-submit:

  $ bin/spark-submit \
      --class …
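The excerpt describes the file format but does not show a file, so here is a minimal sketch of one. The file name, its location, and the three property values are assumptions chosen for illustration; only the whitespace-delimited layout and the --properties-file flag come from the text above. The script prints the spark-submit command instead of running it, so it works on a machine without Spark installed.

```shell
#!/bin/sh
# Hypothetical file name and location, for illustration only.
props_file="${TMPDIR:-/tmp}/spark-example.conf"

# One property per line: key, whitespace, value.
# The values below are sample settings, not recommendations.
cat > "$props_file" <<'EOF'
spark.master            yarn
spark.executor.memory   512m
spark.executor.cores    1
EOF

# Print the command that would load the file; <main class> and
# <application jar> are placeholders to fill in.
echo "bin/spark-submit --properties-file $props_file --class <main class> <application jar>"
```

If the flag is omitted, spark-submit looks for conf/spark-defaults.conf in the Spark directory, which uses the same format.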

Posted in spark, Tips

Tips: Spark

Execute Spark Pi: from the Spark directory (usually /usr/hdp/current/spark-client on Hortonworks HDP 2.3.2), run:

  ./bin/spark-submit --class org.apache.spark.examples.SparkPi \
      --master yarn-cluster --num-executors 3 \
      --driver-memory 512m --executor-memory 512m --executor-cores 1 \
      lib/spark-examples*.jar 10

Stay tuned.

Posted in spark, Tips

Tips: Sqoop-Hive

Import data from a database (e.g. SQL Server) into Hive and create an ORC table from it:

  sqoop import -Dmapred.job.queue.name=default \
      --connect 'jdbc:sqlserver://<host>:<port>;username=<user>;password=<pwd>;database=<dbname>' \
      --hive-import --hive-table <db.tablename> \
      --table <sqlservertable> --split-by <splitcolumn> --as-textfile

  CREATE TABLE <db.tablename_orc> LIKE <db.tablename> …

Posted in hive, sqoop, Tips

Tips: Vertica

Copy data from HDFS to Vertica:

  COPY <Vertica table> SOURCE Hdfs(url='http://<NameNode>:50070/webhdfs/v1/<path in hdfs>', username='<hdfs user>') DELIMITER ','

Stay tuned.

Posted in Tips, vertica

Tips: Sqoop

Override cluster properties, e.g. disable compression for Sqoop output when compression is turned on in the cluster:

  sqoop import -Dmapred.job.queue.name=default \
      -Dmapreduce.map.output.compress=false \
      -Dmapreduce.output.fileoutputformat.compress=false \
      --driver com.ibm.db2.jcc.DB2Driver --connect jdbc:db2://<host>/<db> \
      --username <user> --password <pwd> \
      --table <db2 table> --target-dir <hdfs path> \
      …

Posted in sqoop, Tips

Tips: OOZIE

I will collect important commands and other tips related to Oozie here.

Trigger a workflow:

  oozie job -oozie http://<host>:11000/oozie -config /users/oozieTest/ooziePigTest/oozie-test/job/job.properties -run

Trigger a workflow through the REST API:

  curl -X POST -H "Content-Type: application/xml" -d @oozie-testcurl.xml "http://<host>:11000/oozie/v2/jobs?action=start"

Trigger a coordinator:

  oozie job -oozie http://<host>:11000/oozie …
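The REST call posts a file named oozie-testcurl.xml whose contents the excerpt does not show. As a rough sketch, the Oozie job-submission endpoint expects a Hadoop-style configuration document that carries at least the submitting user and the HDFS path of the workflow application. The user name, NameNode host, and application path below are hypothetical values, not the ones from the original post. The script writes the payload and prints the curl command rather than sending it.

```shell
#!/bin/sh
# Hypothetical location for the payload; the file name matches the text above.
payload="${TMPDIR:-/tmp}/oozie-testcurl.xml"

# Job configuration in Hadoop XML format. The user name and the
# workflow path are placeholder values chosen for illustration.
cat > "$payload" <<'EOF'
<?xml version="1.0" encoding="UTF-8"?>
<configuration>
  <property>
    <name>user.name</name>
    <value>oozieTest</value>
  </property>
  <property>
    <name>oozie.wf.application.path</name>
    <value>hdfs://namenode.example.com:8020/user/oozieTest/workflow-app</value>
  </property>
</configuration>
EOF

# Print the submission command; <host> is a placeholder for the Oozie server.
echo "curl -X POST -H 'Content-Type: application/xml' -d @$payload 'http://<host>:11000/oozie/v2/jobs?action=start'"
```

Any further properties the workflow needs (for example the values normally kept in job.properties) would go in the same file as additional property elements.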

Posted in oozie, Rest API, Tips