Tips: Sqoop-Hive

 

Import data from a database (e.g. SQL Server) into a Hive table, then create an ORC table from it

sqoop import -Dmapred.job.queue.name=default \
--connect 'jdbc:sqlserver://<host>:<port>;username=<user>;password=<pwd>;database=<dbname>' \
--hive-import --hive-table <db.tablename> \
--table <sqlservertable> --split-by <splitcolumn> --as-textfile

CREATE TABLE <db.tablename_orc> LIKE <db.tablename> STORED AS ORC tblproperties ("orc.compress.size"="8192");

INSERT INTO TABLE <db.tablename_orc> select * from <db.tablename>;
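
The two steps above can be chained in a small wrapper script. What follows is only a sketch under stated assumptions: every default value (host, database, table names, user, password file path) is a made-up placeholder, and the run, import_to_hive and orc_hql helper names are hypothetical. It also differs from the command above in one respect: it passes credentials with Sqoop's --username and --password-file options rather than embedding them in the JDBC URL. By default it only prints the commands; set RUN=1 to actually execute them.

```shell
#!/bin/sh
# Dry-run wrapper for the Sqoop import and the ORC conversion.
# All defaults below are hypothetical placeholders, not values from the post.
set -eu

DB_HOST="${DB_HOST:-dbhost.example.com}"
DB_PORT="${DB_PORT:-1433}"
DB_NAME="${DB_NAME:-sales}"
DB_USER="${DB_USER:-etl_user}"
PW_FILE="${PW_FILE:-/user/etl/sqlserver.password}"  # file that holds only the password
SRC_TABLE="${SRC_TABLE:-orders}"
SPLIT_COL="${SPLIT_COL:-order_id}"
HIVE_TABLE="${HIVE_TABLE:-staging.orders}"

# Execute the given command when RUN=1; otherwise print it for review.
# The printed form drops shell quoting, so it is not copy-paste safe.
run() {
  if [ "${RUN:-0}" = "1" ]; then
    "$@"
  else
    printf '%s\n' "$*"
  fi
}

# Step 1: Sqoop import into a text-backed Hive table.
import_to_hive() {
  run sqoop import -Dmapred.job.queue.name=default \
    --connect "jdbc:sqlserver://${DB_HOST}:${DB_PORT};database=${DB_NAME}" \
    --username "${DB_USER}" --password-file "${PW_FILE}" \
    --hive-import --hive-table "${HIVE_TABLE}" \
    --table "${SRC_TABLE}" --split-by "${SPLIT_COL}" --as-textfile
}

# Step 2: emit the HiveQL that copies the text table into an ORC table.
orc_hql() {
  printf 'CREATE TABLE %s_orc LIKE %s STORED AS ORC tblproperties ("orc.compress.size"="8192");\n' \
    "${HIVE_TABLE}" "${HIVE_TABLE}"
  printf 'INSERT INTO TABLE %s_orc SELECT * FROM %s;\n' "${HIVE_TABLE}" "${HIVE_TABLE}"
}

import_to_hive
run hive -e "$(orc_hql)"
```

Because of set -e, a failed import stops the script before the ORC table is created, so the INSERT never runs against a missing or partial source table.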

 

Stay tuned.

About shalishvj : My Experience with BigData

6+ years of experience using Bigdata technologies in Architect, Developer and Administrator roles for various clients. • Experience using Hortonworks, Cloudera, AWS distributions. • Cloudera Certified Developer for Hadoop. • Cloudera Certified Administrator for Hadoop. • Spark Certification from Big Data Spark Foundations. • SCJP, OCWCD. • Experience in setting up Hadoop clusters in PROD, DR, UAT , DEV environments.