Apache HBase
Farzad Nozarian
4/3/15 @AUT
Purpose
• This guide describes the setup of a standalone HBase instance running
against the local filesystem/HDFS.
• This is not an appropriate configuration for a production instance of
HBase, but will allow you to experiment with HBase.
• This section shows you how to create a table in HBase using the hbase
shell CLI, insert rows into the table, perform put and scan operations
against the table, enable or disable the table, and start and stop HBase.
Required Software
• Java™: HBase requires that a JDK be installed.
http://hbase.apache.org/book.html#java
• Choose a download site from this list of Apache Download Mirrors.
• Click on the suggested top link.
• Click on the folder named stable and then download the binary file that ends
in .tar.gz to your local filesystem.
• Be sure to choose the version that corresponds to the version of Hadoop you
are likely to use later (e.g., hbase-0.98.3-hadoop2-bin.tar.gz).
Prepare to Start the HBase Cluster
• Like HDFS, HBase also has 3 run modes:
• Standalone HBase (Setup of a standalone HBase instance running against the local filesystem)
• In standalone mode HBase runs all daemons within a single JVM, i.e. the HMaster, a single
HRegionServer, and the ZooKeeper daemon.
• Using HBase with a local filesystem does not guarantee durability.
• Prior to HBase 0.94.x, HBase expected the loopback IP address to be 127.0.0.1. Ubuntu and some other
distributions default to 127.0.1.1 and this will cause problems for you.
• Pseudo-Distributed Local Install
• HBase still runs completely on a single host, but each HBase daemon (HMaster, HRegionServer, and
ZooKeeper) runs as a separate process.
• Fully Distributed
Prepare to Start HBase: Standalone
• Unpack the downloaded HBase distribution. In the distribution, edit the
file conf/hbase-env.sh, uncomment the line starting with JAVA_HOME, and
set it to the appropriate location for your operating system:
# set to the root of your Java installation
export JAVA_HOME=/usr/lib/jvm/jdk1.7.0
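To confirm the shell picks up the setting, a quick sanity check from the unpacked HBase directory (the JDK path is the slide's example; yours may differ):

source conf/hbase-env.sh
echo $JAVA_HOME                # should print /usr/lib/jvm/jdk1.7.0
$JAVA_HOME/bin/java -version   # verify a JDK actually lives there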
Prepare to Start HBase: Standalone (Cont.)
• Edit conf/hbase-site.xml, which is the main HBase configuration file.
• At this time, you only need to specify the directory on the local filesystem where HBase and
ZooKeeper write data.
• By default, a new directory is created under /tmp; many systems clear /tmp on reboot, so data stored there would be lost.
<configuration>
<property>
<name>hbase.rootdir</name>
<value>file:///home/testuser/hbase</value>
</property>
<property>
<name>hbase.zookeeper.property.dataDir</name>
<value>/home/testuser/zookeeper</value>
</property>
</configuration>
Prepare to Start HBase: Standalone (Cont.)
• The bin/start-hbase.sh script is provided as a convenient way to start
HBase.
• Use the jps command to verify that you have one running process called HMaster.
• Remember that in standalone mode HBase runs all daemons within a single JVM, i.e. the
HMaster, a single HRegionServer, and the ZooKeeper daemon.
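As a quick check, a minimal standalone session might look like the following (the log path and PIDs are illustrative and will differ on your machine):

$ bin/start-hbase.sh
starting master, logging to logs/hbase-testuser-master-localhost.out
$ jps
6042 HMaster
6310 Jps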
Pseudo-Distributed Configuration
• You can re-configure HBase to run in pseudo-distributed mode:
• Stop HBase if it is running.
• Configure HBase:
• Edit the hbase-site.xml configuration.
• This directs HBase to run in distributed mode, with one JVM instance per daemon. Add the following property:
<property>
<name>hbase.cluster.distributed</name>
<value>true</value>
</property>
• Next, change hbase.rootdir from the local filesystem to the address of your HDFS instance, using the hdfs:// URI syntax:
<property>
<name>hbase.rootdir</name>
<value>hdfs://localhost:8020/hbase</value>
</property>
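Note that HBase now depends on a running HDFS instance, so Hadoop must be started first. A minimal sketch, assuming the Hadoop and HBase scripts are on your PATH and the NameNode listens on localhost:8020:

start-dfs.sh            # start HDFS first
start-hbase.sh          # then start HBase on top of it
hadoop fs -ls /hbase    # HBase creates its root directory in HDFS on first start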
Lab Assignment
1. Start HBase daemon;
2. Start HBase shell;
3. Create a Book table …
4. Add information to Book table …
5. Count the number of rows …
6. Retrieve an entire record with ID 1;
7. Only retrieve title and description for record with ID 3
8. Change a record …
9. Display all the records to the screen
10. Display title and author's last name for all the records
11. Display title and description for the first 2 records
12. Explore HBase Web-based management console …
13. Check the detailed status of your cluster via HBase shell
14. Delete a record …
15. Drop the table Book.
1- Start HBase daemon
start-hbase.sh
2- Start HBase shell
hbase shell
3- Create a table called Book whose schema can house a book's title, description, and author's
first and last names. The book's title and description should be grouped, as they will be saved and
retrieved together. The author's first and last name should also be grouped.
(hint: since title and description need to be grouped together, and so do the author's first and last
name, it would be wise to place them into 2 families such as info and author. Then title and
description become columns of the info family, and first and last become columns of the author family.)
create 'Book', {NAME=>'info'}, {NAME=>'author'}
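To confirm the schema before loading data, you can describe the new table; the output should list the two column families, info and author, with their default settings:

describe 'Book'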
4- Add the following information to the Book table:
put 'Book', '1', 'info:title', 'Faster than the speed love'
put 'Book', '1', 'info:description', 'Long book about love'
put 'Book', '1', 'author:first', 'Brian'
put 'Book', '1', 'author:last', 'Dog'
put 'Book', '2', 'info:title', 'Long day'
put 'Book', '2', 'info:description', 'Story about Monday'
put 'Book', '2', 'author:first', 'Emily'
put 'Book', '2', 'author:last', 'Blue'
put 'Book', '3', 'info:title', 'Flying Car'
put 'Book', '3', 'info:description', 'Novel about airplanes'
put 'Book', '3', 'author:first', 'Phil'
put 'Book', '3', 'author:last', 'High'
5- Count the number of rows. Make sure that every row is printed to the screen as it is being counted.
count 'Book', INTERVAL => 1
6- Retrieve an entire record with ID 1
get 'Book', '1'
7- Only retrieve title and description for record with ID 3.
get 'Book', '3', {COLUMNS=>['info:title', 'info:description']}
8- Change the last name of the author for the record with title Long Day to Happy.
• Display the record on the screen to verify the change.
• Display both the new and old value. You should be able to see both Blue and Happy. Why is that?
put 'Book', '2', 'author:last', 'Happy'
# to verify, select the record
get 'Book', '2', {COLUMNS=>'author:last'}
# to display both versions
get 'Book', '2', {COLUMNS=>'author:last', VERSIONS=>3}
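Why both versions are visible: HBase does not overwrite a cell in place; each put writes a new timestamped version, and in HBase 0.98 a column family keeps up to 3 versions by default. A sketch of how you could inspect or raise that limit (the value 5 is just an example; some releases require disabling the table before altering it):

describe 'Book'                                 # shows VERSIONS => '3' for the info and author families
alter 'Book', NAME => 'author', VERSIONS => 5   # retain up to 5 versions per author cell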
9- Display all the records to the screen.
scan 'Book'
10- Display the title and author's last name for all the records.
scan 'Book', {COLUMNS=>['info:title', 'author:last']}
11- Display the title and description for the first 2 records.
scan 'Book', {COLUMNS=>['info:title','info:description'], LIMIT=>2}
or
scan 'Book', {COLUMNS=>['info:title','info:description'], STOPROW=>'3'}
(LIMIT caps the number of rows returned, while STOPROW is exclusive: the scan stops before row '3', returning rows '1' and '2'.)
12- Explore the HBase web-based management console; try to learn as much as you can about your new table.
• The Book table is hosted on 1 RegionServer and there is only 1 region. There are no start or end keys for that region because there is only 1 region. It has 2 families, info and author. No compression is set for either family, and replication is set to 3.
13- Check the detailed status of your cluster via the HBase shell.
status 'detailed'
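For a local 0.98-era install, the Master's web UI normally listens on port 60010, so you would browse to http://localhost:60010 and drill into the Book table from the Tables list (the default port moved to 16010 in HBase 1.0+).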
14- Delete the record whose title is Flying Car, and validate that the record was
deleted by scanning all the records or by attempting to select the record.
delete 'Book', '3', 'info:title'
delete 'Book', '3', 'info:description'
delete 'Book', '3', 'author:first'
delete 'Book', '3', 'author:last'
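Deleting the row cell by cell works; the shell's deleteall command is a shorter equivalent for removing an entire row, and either way you can validate the result as asked:

deleteall 'Book', '3'   # removes every cell in row '3' in one step
get 'Book', '3'         # should come back empty
scan 'Book'             # only rows '1' and '2' should remain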
15- Drop the table Book.
disable 'Book'
drop 'Book'
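To wrap up the session (the Purpose slide also promised stopping HBase), confirm the drop and shut the daemons down:

exists 'Book'    # should report that the table does not exist
exit
stop-hbase.sh    # may take a few moments while HBase shuts down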
References:
hbase.apache.org
(http://hbase.apache.org/book.html)