Phoenix and HBase


  • HBase is a distributed column-oriented database built on top of HDFS.
  • It is one of the most popular NoSQL databases.
  • It is the Hadoop application to use when you require real-time read/write random access to very large datasets.
  • It is built from the ground up to scale linearly just by adding nodes.
  • It has its own data model operations such as Get, Put, Scan and Delete, but it does not natively offer SQL-like capabilities.
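The four basic operations can be illustrated with a minimal in-memory sketch. This is an illustration only, not the real HBase client API (which lives in `org.apache.hadoop.hbase.client` as the Get, Put, Scan and Delete classes):

```python
# Toy in-memory model of HBase's four basic data model operations.
table = {}  # rowkey -> {column: value}

def put(rowkey, column, value):
    table.setdefault(rowkey, {})[column] = value

def get(rowkey):
    return table.get(rowkey)

def scan(start, stop):
    # Like HBase, scan a contiguous range of sorted rowkeys
    return {k: v for k, v in sorted(table.items()) if start <= k < stop}

def delete(rowkey):
    table.pop(rowkey, None)

put("row1", "cf:col", "a")
put("row2", "cf:col", "b")
print(scan("row1", "row3"))  # both rows, in rowkey order
delete("row1")
print(get("row1"))  # None
```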


  • The architecture is based on three key components: the HBase Master server, HBase RegionServers and ZooKeeper.
  • The client needs to find the RegionServers in order to work with the data stored in HBase.
  • Regions are the basic elements for distributing tables across the cluster.
  • To find the RegionServers, the client first has to talk to ZooKeeper.
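Conceptually, once the client has resolved the region layout, routing a request is a lookup from a sorted set of region start keys to the server hosting that region. A minimal sketch, with made-up region boundaries and host names (in reality the client learns the layout from ZooKeeper and the `hbase:meta` table):

```python
import bisect

# Hypothetical region layout: each region covers [start_key, next_start_key)
# and is hosted by one RegionServer.
region_starts = ["", "g", "p"]  # sorted start keys of three regions
region_servers = ["rs1.example.com", "rs2.example.com", "rs3.example.com"]

def locate_region_server(rowkey):
    # Find the last region whose start key is <= rowkey
    idx = bisect.bisect_right(region_starts, rowkey) - 1
    return region_servers[idx]

print(locate_region_server("apple"))   # rs1.example.com
print(locate_region_server("hbase"))   # rs2.example.com
print(locate_region_server("zebra"))   # rs3.example.com
```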

     HBase Datamodel

  • A sorted multidimensional Map.
  • The key elements in the HBase datamodel are tables, column families, columns and rowkeys.
  • The tables are made of columns and rows.
  • The individual elements at the column and row intersections (cells, in HBase terms) are versioned based on a timestamp.
  • The rows are identified by rowkeys which are sorted – these rowkeys can be considered as primary keys and all the data in the table can be accessed via them.
  • The columns are grouped into column families; at table creation time you do not have to specify all the columns, only the column families.
  • Columns have a prefix derived from the column family plus their own qualifier; a column name looks like this: ‘contents:html’.
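Putting the pieces together, a cell is addressed by (rowkey, family:qualifier, timestamp). A minimal sketch of this sorted, versioned, multidimensional map (illustration only, using the ‘contents:html’ column from above):

```python
import time
from collections import defaultdict

# rowkey -> "family:qualifier" -> {timestamp: value}
hbase_table = defaultdict(lambda: defaultdict(dict))

def put_cell(rowkey, column, value, ts=None):
    # Each write becomes a new version, keyed by its timestamp
    hbase_table[rowkey][column][ts if ts is not None else time.time_ns()] = value

def get_latest(rowkey, column):
    versions = hbase_table[rowkey][column]
    return versions[max(versions)] if versions else None

put_cell("com.example/", "contents:html", "<html>v1</html>", ts=1)
put_cell("com.example/", "contents:html", "<html>v2</html>", ts=2)
print(get_latest("com.example/", "contents:html"))  # newest version wins
```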


  • Phoenix is an open-source SQL skin for HBase.
  • Phoenix provides a command-line interface through sqlline.py – a Python launcher for the Java-based sqlline utility.

      Access Phoenix via sqlline

In a Hortonworks sandbox, navigate to /usr/hdp/current/phoenix-client/bin

$ ./sqlline.py <host>:2181:/hbase-unsecure
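Once connected, sqlline accepts standard Phoenix SQL. A short hedged session (table and column names below are made up for illustration):

```sql
-- Create a Phoenix table, load a row, query it
CREATE TABLE IF NOT EXISTS weblog (
    host VARCHAR NOT NULL PRIMARY KEY,  -- becomes the HBase rowkey
    hits BIGINT
);
UPSERT INTO weblog VALUES ('example.com', 42);
SELECT * FROM weblog;
```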

Map Phoenix table to an existing HBase table

  • All Phoenix tables can be viewed via the HBase shell.
  • To access tables that were created via the plain HBase shell from Phoenix, map them by creating a Phoenix table or a view over them.
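For example, if an HBase table "t1" with column family "cf1" already exists (names here are hypothetical), it can be mapped read-only with CREATE VIEW or read-write with CREATE TABLE; the quoted names must match the existing HBase table, column family and qualifier exactly:

```sql
-- Read-only mapping of the existing HBase table "t1"
CREATE VIEW "t1" (
    pk VARCHAR PRIMARY KEY,   -- maps to the HBase rowkey
    "cf1"."col1" VARCHAR      -- maps to column cf1:col1
);

-- Or a read-write mapping (Phoenix attaches its own metadata):
-- CREATE TABLE "t1" ( pk VARCHAR PRIMARY KEY, "cf1"."col1" VARCHAR );
```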

About shalishvj : My Experience with BigData

6+ years of experience using Big Data technologies in Architect, Developer and Administrator roles for various clients.
  • Experience using Hortonworks, Cloudera, AWS distributions.
  • Cloudera Certified Developer for Hadoop.
  • Cloudera Certified Administrator for Hadoop.
  • Spark certification from Big Data Spark Foundations.
  • SCJP, OCWCD.
  • Experience in setting up Hadoop clusters in PROD, DR, UAT, DEV environments.
This entry was posted in HBase, Phoenix. Bookmark the permalink.
