3. HBase: Features
• Strictly consistent reads and writes.
• Automatic and configurable sharding of tables
• Automatic failover support between RegionServers.
• Base classes for MapReduce jobs
• Easy-to-use Java API
• Block cache and Bloom Filters for real-time queries.
4. HBase: Features
• Query predicate pushdown via server-side Filters
• Thrift gateway and a RESTful Web service that supports XML, Protobuf, and binary data encoding options
• Extensible JRuby-based (JIRB) shell
• Support for exporting metrics via the Hadoop metrics subsystem to files or Ganglia, or via JMX
5. HBase: Installation
• HBase can be run in three modes:
– Single-node standalone
– Pseudo-distributed single-machine
– Fully-distributed cluster
• We will see how to install HBase using Docker
• Source code at https://github.com/fabiofumarola/NoSQLDatabasesCourses
7. Single-node standalone
• Source code at
https://github.com/fabiofumarola/NoSQLDatabasesCourses
• It uses the local file system instead of HDFS, so it is not suitable for production.
• Download the tar distribution
• Edit hbase-site.xml
• Start HBase via start-hbase.sh
• We can use jps to test if HBase is running
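The jps check in the last step can be scripted. What follows is a minimal sketch, not something taken from the course repository: the function name hbase_is_running is hypothetical, and it assumes that a standalone HBase appears in jps output as a single HMaster process, since all daemons share one JVM in this mode.

```shell
#!/bin/sh
# Hypothetical helper: reads `jps` output on stdin and succeeds
# if a line whose process name is exactly HMaster is present.
hbase_is_running() {
    # jps prints "<pid> <main class name>" per line; match the second field.
    awk '$2 == "HMaster" { found = 1 } END { exit !found }'
}

# Demo on canned jps output (the PIDs are made up for illustration).
if printf '4021 HMaster\n4388 Jps\n' | hbase_is_running; then
    echo "HBase is running"
else
    echo "HBase is not running"
fi
```

On a machine where HBase was started with start-hbase.sh, the same function would be fed real output: jps | hbase_is_running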
8. hbase-site.xml
The folders are created automatically by HBase
<configuration>
  <property>
    <name>hbase.rootdir</name>
    <value>file:///hbase-data/hbase</value>
  </property>
  <property>
    <name>hbase.zookeeper.property.dataDir</name>
    <value>/hbase-data/zookeeper</value>
  </property>
</configuration>
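When the data directory differs between machines, the file can be generated rather than edited by hand. The sketch below is one possible way to do that; the function name standalone_site_xml is hypothetical, and it assumes the argument is an absolute path.

```shell
#!/bin/sh
# Hypothetical helper: prints a standalone hbase-site.xml for a data directory.
# $1 must be an absolute path such as /hbase-data.
standalone_site_xml() {
    data_dir=$1
    cat <<EOF
<configuration>
  <property>
    <name>hbase.rootdir</name>
    <value>file://${data_dir}/hbase</value>
  </property>
  <property>
    <name>hbase.zookeeper.property.dataDir</name>
    <value>${data_dir}/zookeeper</value>
  </property>
</configuration>
EOF
}

# Prints the same two properties as the configuration on this slide.
standalone_site_xml /hbase-data
```

In an unpacked HBase distribution the output would normally be redirected to conf/hbase-site.xml.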
9. Single-node standalone
• Build the image
– docker build --tag=wheretolive/hbase:single ./
• Run the image
– docker run -d -p 2181:2181 -p 60010:60010 -p 60000:60000 -p 60020:60020 -p 60030:60030 -h hbase --name=hbase wheretolive/hbase:single
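The long list of -p flags is easier to maintain when it is built from a port list. The following sketch only prints the command and does not start a container. The function name hbase_docker_cmd is hypothetical, and the port roles in the comments are the defaults of HBase releases before 1.0, which is an assumption about the image used here.

```shell
#!/bin/sh
# Hypothetical helper: prints (does not execute) the docker run command
# for a given image tag, publishing each port on the same host port.
hbase_docker_cmd() {
    image=$1
    # 2181  ZooKeeper client port
    # 60000 HMaster RPC,       60010 HMaster web UI
    # 60020 RegionServer RPC,  60030 RegionServer web UI
    ports="2181 60010 60000 60020 60030"
    flags=""
    for p in $ports; do
        flags="$flags -p $p:$p"
    done
    echo "docker run -d$flags -h hbase --name=hbase $image"
}

# Prints the command for the single-node image.
hbase_docker_cmd wheretolive/hbase:single
```

Passing wheretolive/hbase:pseudo instead yields the command for the pseudo-distributed image.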
11. Pseudo-distributed
• Running HBase in this mode means that each daemon (HMaster, HRegionServer, and ZooKeeper) runs as a separate process.
• In this mode the data can be stored in HDFS, if it is available.
• The main change is in hbase-site.xml:
<configuration>
  <property>
    <name>hbase.cluster.distributed</name>
    <value>true</value>
  </property>
</configuration>
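With hbase.cluster.distributed set to true, the daemons show up as separate entries in jps, so a quick check can report which ones failed to start. This is a sketch under stated assumptions: the function name missing_hbase_daemons is hypothetical, and it assumes HBase manages its own ZooKeeper, in which case the ZooKeeper process is listed as HQuorumPeer.

```shell
#!/bin/sh
# Hypothetical helper: reads `jps` output on stdin and prints the name of
# every expected HBase daemon that is missing, one per line.
missing_hbase_daemons() {
    jps_output=$(cat)
    for daemon in HMaster HRegionServer HQuorumPeer; do
        # Match the process-name column exactly.
        if ! printf '%s\n' "$jps_output" |
            awk -v d="$daemon" '$2 == d { f = 1 } END { exit !f }'; then
            echo "$daemon"
        fi
    done
}

# Demo on canned output (made-up PIDs) in which the RegionServer is absent.
printf '101 HMaster\n102 HQuorumPeer\n103 Jps\n' | missing_hbase_daemons
```

An empty result means all three expected processes were found.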
12. Pseudo-distributed
• Build the image
– docker build --tag=wheretolive/hbase:pseudo ./
• Run the image
– docker run -d -p 2181:2181 -p 60010:60010 -p 60000:60000 -p 60020:60020 -p 60030:60030 -h hbase --name=hbase wheretolive/hbase:pseudo
14. HBase Shell
• Start the shell
• Create a table
• List the tables
$ ./bin/hbase shell
hbase(main):001:0>
hbase(main):001:0> create 'test', 'cf'
0 row(s) in 0.4170 seconds
=> Hbase::Table - test
hbase(main):002:0> list 'test'
TABLE
test
1 row(s) in 0.0180 seconds
=> ["test"]
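Once the table exists, rows can also be written from a script instead of typing each command at the prompt. The sketch below turns lines of sample data into HBase shell put commands. The function name gen_puts and the sample rows are hypothetical. Feeding the result to the shell is shown only as a comment, since it needs a running HBase and assumes a release whose shell supports the non-interactive -n option.

```shell
#!/bin/sh
# Hypothetical helper: reads "<row> <qualifier> <value>" lines on stdin and
# prints one HBase shell `put` command per line for the given table and family.
gen_puts() {
    table=$1
    family=$2
    while read -r row qualifier value; do
        printf "put '%s', '%s', '%s:%s', '%s'\n" \
            "$table" "$row" "$family" "$qualifier" "$value"
    done
}

# Demo with made-up rows for the 'test' table and its 'cf' column family.
printf 'row1 a value1\nrow2 b value2\n' | gen_puts test cf

# With a running HBase, the generated commands could be piped into the shell:
#   printf 'row1 a value1\n' | gen_puts test cf | ./bin/hbase shell -n
```

The written rows can then be inspected interactively with scan 'test' or get 'test', 'row1'.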
Editor's notes
You need to run HBase on HDFS to ensure all writes are preserved. Running against the local filesystem is intended as a shortcut to get you familiar with how the general system works, as the very first phase of evaluation.