Hbase: Introduction to column oriented databases

•

14 recomendaciones•20,272 vistas

Big Data is getting more attention each day, followed by new storage paradigms. This presentation shows a fast intro to HBase, a column oriented database used by Facebook and other big players to store and extract knowledge of high volume of data.

Tecnología

HBase
Introduction to
column oriented
databases

Luís Cipriani
@lfcipriani (twitter, linkedin, github, ...)
22o. GURU (2012-02-25) - Sao Paulo/Brazil

sexta-feira, 24 de fevereiro de 12

intro

“A BigTable HBase is a sparse,
distributed, persistent
multidimensional sorted map”

http://research.google.com/archive/bigtable.html
sexta-feira, 24 de fevereiro de 12

intro > data model
{ <-- table
// ...
"aaaaa" : { <-- row
"A" : { <-- column family
"foo" : { <-- column (qualifier)
15: "y", <-- timestamp, value
4: "m"
}
"bar" : {...}
},
"B" : {
"" : {...}
}
},
"aaaab" : {
"A" : {
"foo" : {...},
"bar" : {...},
"joe" : {...}
},
"B" : {
"" : {...}
}
},
// ...
}
sexta-feira, 24 de fevereiro de 12

intro > data model

(Table, RowKey, Family, Column, Timestamp) → Value

sexta-feira, 24 de fevereiro de 12

intro > hadoop stack

• hadoop HDFS (or not)
• hadoop MapReduce
• hadoop ZooKeeper
• hadoop HBase
• hadoop Hue, Whirr, etc...

sexta-feira, 24 de fevereiro de 12

architecture

sexta-feira, 24 de fevereiro de 12

key design > read/write model

• randon reads (get)
• sequential reads (scan)
• partial key scans
• writes (put = update)

sexta-feira, 24 de fevereiro de 12

key design > storage model

http://ofps.oreilly.com/titles/9781449396107/advanced.html
sexta-feira, 24 de fevereiro de 12

key design > strategies

• tall-narrow vs ﬂat-wide
• partial key scans
• pagination
• time series
• salting
• ﬁeld swap
• randomization
• secondary indexes

sexta-feira, 24 de fevereiro de 12

key design > example

sexta-feira, 24 de fevereiro de 12

development

• installation modes
• standalone, pseudo-distributed, distributed
• JRuby console
• Access
• java/jruby API (more features)
• entrypoints REST, Thrift, Avro, Protobuffers
• there several other libs

sexta-feira, 24 de fevereiro de 12

cons

• complex conﬁg and maintenance
• hot regions
• no secondary index built-in
• no transactions built-in
• complex schema design

sexta-feira, 24 de fevereiro de 12

pros

• distributed
• scalable (auto-sharding)
• built on Hadoop stack
• handles Big Data
• high performance for write and read
• no SPOF
• fault tolerant, no data loss
• active community

sexta-feira, 24 de fevereiro de 12

Reformulação Box de Login Abril ID
http://engineering.abril.com.br/
http://abr.io/hbase-intro
https://pinboard.in/u:lfcipriani/t:hbase/
http://hbase.apache.org/

? http://shop.oreilly.com/product/0636920014348.do

sexta-feira, 24 de fevereiro de 12

Más contenido relacionado

Similar a Hbase: Introduction to column oriented databases

Hive introduction 介绍ablozhou

Intro to HBase - Lars GeorgeJAX London

Hadoop Overview & Architecture EMC

Brust hadoopecosystemAndrew Brust

Generalized framework for using NoSQL DatabasesKIRAN V

Hadoop/Mahout/HBaseでテキスト分類器を作ったよNaoki Yanai

20100128ebayJeff Hammerbacher

Big Data Everywhere Chicago: Unleash the Power of HBase Shell (Conversant) BigDataEverywhere

HBase, no troubleLINE Corporation

Hadoop - Introduction to HadoopVibrant Technologies & Computers

HBaseConEast2016: HBase and Spark, State of the ArtMichael Stack

HBase and Impala Notes - Munich HUG - 20131017larsgeorge

MongoDB at GULIsrael Gutiérrez

Hadoop @ eBay: Past, Present, and FutureRyan Hennig

Similar a Hbase: Introduction to column oriented databases (14)

Hive introduction 介绍

Intro to HBase - Lars George

Hadoop Overview & Architecture

Brust hadoopecosystem

Generalized framework for using NoSQL Databases

Hadoop/Mahout/HBaseでテキスト分類器を作ったよ

20100128ebay

Big Data Everywhere Chicago: Unleash the Power of HBase Shell (Conversant)

HBase, no trouble

Hadoop - Introduction to Hadoop

HBaseConEast2016: HBase and Spark, State of the Art

HBase and Impala Notes - Munich HUG - 20131017

MongoDB at GUL

Hadoop @ eBay: Past, Present, and Future

Más de Luis Cipriani

Adventures with Raspberry Pi and Twitter APILuis Cipriani

Capturando o pulso do planeta com as APIs de Streaming do TwitterLuis Cipriani

Twitter e suas APIs de Streaming - Campus Party Brasil 7Luis Cipriani

Segurança de APIs HTTP, um guia sensato para desenvolvedores preocupadosLuis Cipriani

API Caching, why your server needs some restLuis Cipriani

Explaining A Programming Model for Context-Aware Applications in Large-Scale ...Luis Cipriani

Alexandria: um Sistema de Sistemas para Publicação de Conteúdo Digital utiliz...Luis Cipriani

Como um verdadeiro sistema REST funciona: arquitetura e performance na AbrilLuis Cipriani

Explaining Semantic WebLuis Cipriani

Fearless HTTP requests abuseLuis Cipriani

Más de Luis Cipriani (10)

Adventures with Raspberry Pi and Twitter API

Capturando o pulso do planeta com as APIs de Streaming do Twitter

Twitter e suas APIs de Streaming - Campus Party Brasil 7

Segurança de APIs HTTP, um guia sensato para desenvolvedores preocupados

API Caching, why your server needs some rest

Explaining A Programming Model for Context-Aware Applications in Large-Scale ...

Alexandria: um Sistema de Sistemas para Publicação de Conteúdo Digital utiliz...

Como um verdadeiro sistema REST funciona: arquitetura e performance na Abril

Explaining Semantic Web

Fearless HTTP requests abuse

Último

Artificial intelligence in cctv survelliance.pptxhariprasad279825

Scanning the Internet for External Cloud Exposures via SSL CertsRizwan Syed

New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada

Ensuring Technical Readiness For Copilot in Microsoft 3652toLead Limited

Commit 2024 - Secret Management made easyAlfredo García Lavilla

WordPress Websites for Engineers: Elevate Your Brandgvaughan

"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek SchlawackFwdays

Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage CostZilliz

SAP Build Work Zone - Overview L2-L3.pptxNavinnSomaal

Anypoint Exchange: It’s Not Just a Repo!Manik S Magar

Unleash Your Potential - Namagunga Girls Coding ClubKalema Edgar

Human Factors of XR: Using Human Factors to Design XR SystemsMark Billinghurst

"Debugging python applications inside k8s environment", Andrii SoldatenkoFwdays

TeamStation AI System Report LATAM IT Salaries 2024Lonnie McRorey

Dev Dives: Streamline document processing with UiPath Studio WebUiPathCommunity

How to write a Business Continuity PlanDatabarracks

How AI, OpenAI, and ChatGPT impact business and software.Curtis Poe

Advanced Test Driven-Development @ php[tek] 2024Scott Keck-Warren

H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo DaySri Ambati

Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxLoriGlavin3

Hbase: Introduction to column oriented databases

1. HBase Introduction to column oriented databases Luís Cipriani @lfcipriani (twitter, linkedin, github, ...) 22o. GURU (2012-02-25) - Sao Paulo/Brazil sexta-feira, 24 de fevereiro de 12

2. ME sexta-feira, 24 de fevereiro de 12

3. intro “A BigTable HBase is a sparse, distributed, persistent multidimensional sorted map” http://research.google.com/archive/bigtable.html sexta-feira, 24 de fevereiro de 12

4. intro > data model { <-- table // ... "aaaaa" : { <-- row "A" : { <-- column family "foo" : { <-- column (qualifier) 15: "y", <-- timestamp, value 4: "m" } "bar" : {...} }, "B" : { "" : {...} } }, "aaaab" : { "A" : { "foo" : {...}, "bar" : {...}, "joe" : {...} }, "B" : { "" : {...} } }, // ... } sexta-feira, 24 de fevereiro de 12

5. intro > data model (Table, RowKey, Family, Column, Timestamp) → Value sexta-feira, 24 de fevereiro de 12

6. intro > hadoop stack • hadoop HDFS (or not) • hadoop MapReduce • hadoop ZooKeeper • hadoop HBase • hadoop Hue, Whirr, etc... sexta-feira, 24 de fevereiro de 12

7. architecture sexta-feira, 24 de fevereiro de 12

8. key design > read/write model • randon reads (get) • sequential reads (scan) • partial key scans • writes (put = update) sexta-feira, 24 de fevereiro de 12

9. key design > storage model http://ofps.oreilly.com/titles/9781449396107/advanced.html sexta-feira, 24 de fevereiro de 12

10. key design > strategies • tall-narrow vs ﬂat-wide • partial key scans • pagination • time series • salting • ﬁeld swap • randomization • secondary indexes sexta-feira, 24 de fevereiro de 12

11. key design > example sexta-feira, 24 de fevereiro de 12

12. development • installation modes • standalone, pseudo-distributed, distributed • JRuby console • Access • java/jruby API (more features) • entrypoints REST, Thrift, Avro, Protobuffers • there several other libs sexta-feira, 24 de fevereiro de 12

13. cons • complex conﬁg and maintenance • hot regions • no secondary index built-in • no transactions built-in • complex schema design sexta-feira, 24 de fevereiro de 12

14. pros • distributed • scalable (auto-sharding) • built on Hadoop stack • handles Big Data • high performance for write and read • no SPOF • fault tolerant, no data loss • active community sexta-feira, 24 de fevereiro de 12

15. Reformulação Box de Login Abril ID http://engineering.abril.com.br/ http://abr.io/hbase-intro https://pinboard.in/u:lfcipriani/t:hbase/ http://hbase.apache.org/ ? http://shop.oreilly.com/product/0636920014348.do sexta-feira, 24 de fevereiro de 12

Hbase: Introduction to column oriented databases

Recomendados

Recomendados

Más contenido relacionado

Similar a Hbase: Introduction to column oriented databases

Similar a Hbase: Introduction to column oriented databases (14)

Más de Luis Cipriani

Más de Luis Cipriani (10)

Último

Último (20)

Hbase: Introduction to column oriented databases