SlideShare una empresa de Scribd logo
1 de 11
Descargar para leer sin conexión
HBase – Coprocessors

 Mingjie Lai, Trend Micro
       HUG NYC,
      Oct. 11, 2010
What are Coprocessors
●   Inspired by Google Bigtable Coprocessors (Jeff
    Dean's keynote talk at LADIS 09)
    ●   Arbitrary code that runs at each tablet in table server
    ●   High-level call interface for clients
         –   Calls addressed to rows or ranges of rows. coprocessor client library
             resolves to actual locations
         –   Calls across multiple rows automatically split into multiple
             parallelized RPC
    ●   Very flexible model for building distributed services
         –   automatic scaling, load balancing, request routing for app
Current Status
●   Umbrella case: HBASE-2000
●   Coprocessor framework – HBASE-2001:
    ●   Includes RegionObserver, CommandTarget, CP class
        loading
    ●   Code submitted for review
    ●   Will be commited to TRUNK very soon, and 0.92 release
●   Client side support – HBASE-2002:
    ●   Dynamic RPC, between clients and region servers
    ●   Code submitted for review
    ●   Will be commited to TRUNK soon, and 0.92 release
●   The first Coprocessor application – HBASE-3025 and
    3045: Coprocessor based access control
    ●   Code complete
RegionObserver
●   If a coprocessor implements this interface, it will be
    interposed in all region actions via upcalls
    ●   Provides hooks for client side requests: HTable.get(), put(),
        exists(), delete(), scannerOpen(), checkAndPut(), etc.
    ●   Chaining of multiple observers (by priority)
    ●   The first coprocessors application – HBase access control –
        is built on top of it
    ●   More extensions can be built on top of RegionObserver
         –   Secondary indexes
         –   Filters
●   How to develop a RegionObserver
    ●   No new client API defined for RegionObserver
    ●   Need to implement RegionObserver interface and override
        upcall methods: preGet(), postGet(), prePut(), postPut(), etc.
RegionObserver




     Client requests   Region server   CP framework   RegionObserver
CommandTarget
●   CommandTarget with Dynamic RPC provides a way to
    define one's own protocol communicated between
    client and region server, and execute arbitrary code at
    region server
●   CommandTarget methods are triggered by calling
    dynamic RPC client side method –
    Htable.coprocessorExec(...), etc.
●   How to develop
    ●   Defines protocol interface (extends CoprocessorProtocol)
    ●   Implements this protocol interface
    ●   Extend BaseCommandTarget: protocol will be automatically
        registered at coprocessor load
    ●   On client side, the CommandTarget can be triggered by:
         –   HTable.coprocessorProxy() - single region
         –   HTable.coprocessorExec() - region range
Dynamic RPC: a sample
Given CoprocessorProtocol:
public interface CountProtocol extends CoprocessorProtocol {
  int getRowCount();
}
Coprocessors Class Loading
●   Load from configuration: set coprocessors class
    names in HBase configuration
    ●   hbase.coprocessor.default.classes
    ●   Class names are comma seperated
    ●   They will be picked up when region is opened, as default
        coprocessors
●   Load from table attributes
    ●   Utilize table attribute: a path (e.g. HDFS URI) to jar file
    ●   Loaded when region is opened
●   We can utilize CommandTarget to have a way to load
    coprocessors on demand
    ●   Security is the biggest concern
Next Steps
●   See HBASE-2000 and subtasks
●   Framework
    ●   MapReduce
        –   Runs concurrently on all regions of the table
        –   Like Hadoop MapReduce: Mappers, reducers, partitioners,
            intermediates
        –   Not table MapReduce, parallel region MapReduce
    ●   Code weaving
        –   Allow arbitrary code execution right now
        –   Use a rewriting framework like ASM to weave in policies at load time
        –   Improve fault isolation and system integrity protections
        –   Wrap heap allocations to enforce limits
        –   Monitor CPU time
        –   Reject APIs considered unsafe
    ●   On demand Coprocessors class loading
Next Steps
●   Applications
    ●   HBase access control: HBASE-3025, HBASE-3045.
    ●   Aggregate: HBASE-1512
    ●   Region level indexing: HBASE-2038
    ●   Table metacolumns: HBASE-2893
    ●   Secondary indexing?
    ●   New Filtering?
Q&A

Más contenido relacionado

La actualidad más candente

Flume and Hadoop performance insights
Flume and Hadoop performance insightsFlume and Hadoop performance insights
Flume and Hadoop performance insightsOmid Vahdaty
 
SMB Traffic Analyzer @ Samba Xp 2009
SMB Traffic Analyzer @  Samba Xp 2009SMB Traffic Analyzer @  Samba Xp 2009
SMB Traffic Analyzer @ Samba Xp 2009hhetter
 
He Pi Xii2003
He Pi Xii2003He Pi Xii2003
He Pi Xii2003FNian
 
Flume-Cassandra Log Processor
Flume-Cassandra Log ProcessorFlume-Cassandra Log Processor
Flume-Cassandra Log ProcessorCLOUDIAN KK
 
Tempesta FW - Framework и Firewall для WAF и DDoS mitigation, Александр Крижа...
Tempesta FW - Framework и Firewall для WAF и DDoS mitigation, Александр Крижа...Tempesta FW - Framework и Firewall для WAF и DDoS mitigation, Александр Крижа...
Tempesta FW - Framework и Firewall для WAF и DDoS mitigation, Александр Крижа...Ontico
 
5 Stage Pipeline Semi-custom Design
5 Stage Pipeline Semi-custom Design5 Stage Pipeline Semi-custom Design
5 Stage Pipeline Semi-custom DesignJiang Xue
 
System performance monitoring pcp + vector
System performance monitoring   pcp + vectorSystem performance monitoring   pcp + vector
System performance monitoring pcp + vectorSandeep Kunkunuru
 
Ftp server linux
Ftp server linuxFtp server linux
Ftp server linuxPawan Kumar
 
File Transfer Protocol(FTP)
File Transfer Protocol(FTP)File Transfer Protocol(FTP)
File Transfer Protocol(FTP)Varnit Yadav
 
HBaseCon 2015: Taming GC Pauses for Large Java Heap in HBase
HBaseCon 2015: Taming GC Pauses for Large Java Heap in HBaseHBaseCon 2015: Taming GC Pauses for Large Java Heap in HBase
HBaseCon 2015: Taming GC Pauses for Large Java Heap in HBaseHBaseCon
 
Shared Personalization Service - How To Scale to 15K RPS, Patrice Pelland
Shared Personalization Service - How To Scale to 15K RPS, Patrice PellandShared Personalization Service - How To Scale to 15K RPS, Patrice Pelland
Shared Personalization Service - How To Scale to 15K RPS, Patrice PellandFuenteovejuna
 
October 2016 HUG: Pulsar,  a highly scalable, low latency pub-sub messaging s...
October 2016 HUG: Pulsar,  a highly scalable, low latency pub-sub messaging s...October 2016 HUG: Pulsar,  a highly scalable, low latency pub-sub messaging s...
October 2016 HUG: Pulsar,  a highly scalable, low latency pub-sub messaging s...Yahoo Developer Network
 
Health Check Your DB2 UDB For Z/OS System
Health Check Your DB2 UDB For Z/OS SystemHealth Check Your DB2 UDB For Z/OS System
Health Check Your DB2 UDB For Z/OS Systemsjreese
 
Apache flume - an Introduction
Apache flume - an IntroductionApache flume - an Introduction
Apache flume - an IntroductionErik Schmiegelow
 
[Altibase] 10 replication part3 (system design)
[Altibase] 10 replication part3 (system design)[Altibase] 10 replication part3 (system design)
[Altibase] 10 replication part3 (system design)altistory
 
Massed Refresh: An Energy-Efficient Technique to Reduce Refresh Overhead in H...
Massed Refresh: An Energy-Efficient Technique to Reduce Refresh Overhead in H...Massed Refresh: An Energy-Efficient Technique to Reduce Refresh Overhead in H...
Massed Refresh: An Energy-Efficient Technique to Reduce Refresh Overhead in H...Ishan Thakkar
 
Advanced OpenVPN Concepts on pfSense 2.4 & 2.3.3 - pfSense Hangout February 2017
Advanced OpenVPN Concepts on pfSense 2.4 & 2.3.3 - pfSense Hangout February 2017Advanced OpenVPN Concepts on pfSense 2.4 & 2.3.3 - pfSense Hangout February 2017
Advanced OpenVPN Concepts on pfSense 2.4 & 2.3.3 - pfSense Hangout February 2017Netgate
 
How a BEAM runner executes a pipeline. Apache BEAM Summit London 2018
How a BEAM runner executes a pipeline. Apache BEAM Summit London 2018How a BEAM runner executes a pipeline. Apache BEAM Summit London 2018
How a BEAM runner executes a pipeline. Apache BEAM Summit London 2018javier ramirez
 
PLNOG 13: Michał Dubiel: OpenContrail software architecture
PLNOG 13: Michał Dubiel: OpenContrail software architecturePLNOG 13: Michał Dubiel: OpenContrail software architecture
PLNOG 13: Michał Dubiel: OpenContrail software architecturePROIDEA
 

La actualidad más candente (20)

Flume and Hadoop performance insights
Flume and Hadoop performance insightsFlume and Hadoop performance insights
Flume and Hadoop performance insights
 
SMB Traffic Analyzer @ Samba Xp 2009
SMB Traffic Analyzer @  Samba Xp 2009SMB Traffic Analyzer @  Samba Xp 2009
SMB Traffic Analyzer @ Samba Xp 2009
 
He Pi Xii2003
He Pi Xii2003He Pi Xii2003
He Pi Xii2003
 
Flume-Cassandra Log Processor
Flume-Cassandra Log ProcessorFlume-Cassandra Log Processor
Flume-Cassandra Log Processor
 
Tempesta FW - Framework и Firewall для WAF и DDoS mitigation, Александр Крижа...
Tempesta FW - Framework и Firewall для WAF и DDoS mitigation, Александр Крижа...Tempesta FW - Framework и Firewall для WAF и DDoS mitigation, Александр Крижа...
Tempesta FW - Framework и Firewall для WAF и DDoS mitigation, Александр Крижа...
 
5 Stage Pipeline Semi-custom Design
5 Stage Pipeline Semi-custom Design5 Stage Pipeline Semi-custom Design
5 Stage Pipeline Semi-custom Design
 
System performance monitoring pcp + vector
System performance monitoring   pcp + vectorSystem performance monitoring   pcp + vector
System performance monitoring pcp + vector
 
Ftp server linux
Ftp server linuxFtp server linux
Ftp server linux
 
File Transfer Protocol(FTP)
File Transfer Protocol(FTP)File Transfer Protocol(FTP)
File Transfer Protocol(FTP)
 
HBaseCon 2015: Taming GC Pauses for Large Java Heap in HBase
HBaseCon 2015: Taming GC Pauses for Large Java Heap in HBaseHBaseCon 2015: Taming GC Pauses for Large Java Heap in HBase
HBaseCon 2015: Taming GC Pauses for Large Java Heap in HBase
 
Apache Flume
Apache FlumeApache Flume
Apache Flume
 
Shared Personalization Service - How To Scale to 15K RPS, Patrice Pelland
Shared Personalization Service - How To Scale to 15K RPS, Patrice PellandShared Personalization Service - How To Scale to 15K RPS, Patrice Pelland
Shared Personalization Service - How To Scale to 15K RPS, Patrice Pelland
 
October 2016 HUG: Pulsar,  a highly scalable, low latency pub-sub messaging s...
October 2016 HUG: Pulsar,  a highly scalable, low latency pub-sub messaging s...October 2016 HUG: Pulsar,  a highly scalable, low latency pub-sub messaging s...
October 2016 HUG: Pulsar,  a highly scalable, low latency pub-sub messaging s...
 
Health Check Your DB2 UDB For Z/OS System
Health Check Your DB2 UDB For Z/OS SystemHealth Check Your DB2 UDB For Z/OS System
Health Check Your DB2 UDB For Z/OS System
 
Apache flume - an Introduction
Apache flume - an IntroductionApache flume - an Introduction
Apache flume - an Introduction
 
[Altibase] 10 replication part3 (system design)
[Altibase] 10 replication part3 (system design)[Altibase] 10 replication part3 (system design)
[Altibase] 10 replication part3 (system design)
 
Massed Refresh: An Energy-Efficient Technique to Reduce Refresh Overhead in H...
Massed Refresh: An Energy-Efficient Technique to Reduce Refresh Overhead in H...Massed Refresh: An Energy-Efficient Technique to Reduce Refresh Overhead in H...
Massed Refresh: An Energy-Efficient Technique to Reduce Refresh Overhead in H...
 
Advanced OpenVPN Concepts on pfSense 2.4 & 2.3.3 - pfSense Hangout February 2017
Advanced OpenVPN Concepts on pfSense 2.4 & 2.3.3 - pfSense Hangout February 2017Advanced OpenVPN Concepts on pfSense 2.4 & 2.3.3 - pfSense Hangout February 2017
Advanced OpenVPN Concepts on pfSense 2.4 & 2.3.3 - pfSense Hangout February 2017
 
How a BEAM runner executes a pipeline. Apache BEAM Summit London 2018
How a BEAM runner executes a pipeline. Apache BEAM Summit London 2018How a BEAM runner executes a pipeline. Apache BEAM Summit London 2018
How a BEAM runner executes a pipeline. Apache BEAM Summit London 2018
 
PLNOG 13: Michał Dubiel: OpenContrail software architecture
PLNOG 13: Michał Dubiel: OpenContrail software architecturePLNOG 13: Michał Dubiel: OpenContrail software architecture
PLNOG 13: Michał Dubiel: OpenContrail software architecture
 

Similar a HBase Coprocessors @ HUG NYC

Maintaining spatial data infrastructures (SDIs) using distributed task queues
Maintaining spatial data infrastructures (SDIs) using distributed task queuesMaintaining spatial data infrastructures (SDIs) using distributed task queues
Maintaining spatial data infrastructures (SDIs) using distributed task queuesPaolo Corti
 
haproxy-150423120602-conversion-gate01.pdf
haproxy-150423120602-conversion-gate01.pdfhaproxy-150423120602-conversion-gate01.pdf
haproxy-150423120602-conversion-gate01.pdfPawanVerma628806
 
HPC Controls Future
HPC Controls FutureHPC Controls Future
HPC Controls Futurercastain
 
The new (is it really ) api stack
The new (is it really ) api stackThe new (is it really ) api stack
The new (is it really ) api stackLuca Mattia Ferrari
 
Nov. 4, 2011 o reilly webcast-hbase- lars george
Nov. 4, 2011 o reilly webcast-hbase- lars georgeNov. 4, 2011 o reilly webcast-hbase- lars george
Nov. 4, 2011 o reilly webcast-hbase- lars georgeO'Reilly Media
 
Building high performance microservices in finance with Apache Thrift
Building high performance microservices in finance with Apache ThriftBuilding high performance microservices in finance with Apache Thrift
Building high performance microservices in finance with Apache ThriftRX-M Enterprises LLC
 
Marcel Kornacker: Impala tech talk Tue Feb 26th 2013
Marcel Kornacker: Impala tech talk Tue Feb 26th 2013Marcel Kornacker: Impala tech talk Tue Feb 26th 2013
Marcel Kornacker: Impala tech talk Tue Feb 26th 2013Modern Data Stack France
 
GE IOT Predix Time Series & Data Ingestion Service using Apache Apex (Hadoop)
GE IOT Predix Time Series & Data Ingestion Service using Apache Apex (Hadoop)GE IOT Predix Time Series & Data Ingestion Service using Apache Apex (Hadoop)
GE IOT Predix Time Series & Data Ingestion Service using Apache Apex (Hadoop)Apache Apex
 
Kubernetes #1 intro
Kubernetes #1   introKubernetes #1   intro
Kubernetes #1 introTerry Cho
 
Spy hard, challenges of 100G deep packet inspection on x86 platform
Spy hard, challenges of 100G deep packet inspection on x86 platformSpy hard, challenges of 100G deep packet inspection on x86 platform
Spy hard, challenges of 100G deep packet inspection on x86 platformRedge Technologies
 
Informix HA Best Practices
Informix HA Best Practices Informix HA Best Practices
Informix HA Best Practices Scott Lashley
 
Always on high availability best practices for informix
Always on high availability best practices for informixAlways on high availability best practices for informix
Always on high availability best practices for informixIBM_Info_Management
 
HBaseCon 2012 | HBase Coprocessors – Deploy Shared Functionality Directly on ...
HBaseCon 2012 | HBase Coprocessors – Deploy Shared Functionality Directly on ...HBaseCon 2012 | HBase Coprocessors – Deploy Shared Functionality Directly on ...
HBaseCon 2012 | HBase Coprocessors – Deploy Shared Functionality Directly on ...Cloudera, Inc.
 
Towards constrained semantic web
Towards constrained semantic webTowards constrained semantic web
Towards constrained semantic web☕ Remy Rojas
 

Similar a HBase Coprocessors @ HUG NYC (20)

Maintaining spatial data infrastructures (SDIs) using distributed task queues
Maintaining spatial data infrastructures (SDIs) using distributed task queuesMaintaining spatial data infrastructures (SDIs) using distributed task queues
Maintaining spatial data infrastructures (SDIs) using distributed task queues
 
haproxy-150423120602-conversion-gate01.pdf
haproxy-150423120602-conversion-gate01.pdfhaproxy-150423120602-conversion-gate01.pdf
haproxy-150423120602-conversion-gate01.pdf
 
HAProxy
HAProxy HAProxy
HAProxy
 
Scalable Web Apps
Scalable Web AppsScalable Web Apps
Scalable Web Apps
 
HPC Controls Future
HPC Controls FutureHPC Controls Future
HPC Controls Future
 
The new (is it really ) api stack
The new (is it really ) api stackThe new (is it really ) api stack
The new (is it really ) api stack
 
Nov. 4, 2011 o reilly webcast-hbase- lars george
Nov. 4, 2011 o reilly webcast-hbase- lars georgeNov. 4, 2011 o reilly webcast-hbase- lars george
Nov. 4, 2011 o reilly webcast-hbase- lars george
 
Building high performance microservices in finance with Apache Thrift
Building high performance microservices in finance with Apache ThriftBuilding high performance microservices in finance with Apache Thrift
Building high performance microservices in finance with Apache Thrift
 
KrakenD API Gateway
KrakenD API GatewayKrakenD API Gateway
KrakenD API Gateway
 
slides (PPT)
slides (PPT)slides (PPT)
slides (PPT)
 
Marcel Kornacker: Impala tech talk Tue Feb 26th 2013
Marcel Kornacker: Impala tech talk Tue Feb 26th 2013Marcel Kornacker: Impala tech talk Tue Feb 26th 2013
Marcel Kornacker: Impala tech talk Tue Feb 26th 2013
 
GE IOT Predix Time Series & Data Ingestion Service using Apache Apex (Hadoop)
GE IOT Predix Time Series & Data Ingestion Service using Apache Apex (Hadoop)GE IOT Predix Time Series & Data Ingestion Service using Apache Apex (Hadoop)
GE IOT Predix Time Series & Data Ingestion Service using Apache Apex (Hadoop)
 
Kubernetes #1 intro
Kubernetes #1   introKubernetes #1   intro
Kubernetes #1 intro
 
Spy hard, challenges of 100G deep packet inspection on x86 platform
Spy hard, challenges of 100G deep packet inspection on x86 platformSpy hard, challenges of 100G deep packet inspection on x86 platform
Spy hard, challenges of 100G deep packet inspection on x86 platform
 
HAProxy 1.9
HAProxy 1.9HAProxy 1.9
HAProxy 1.9
 
Informix HA Best Practices
Informix HA Best Practices Informix HA Best Practices
Informix HA Best Practices
 
Always on high availability best practices for informix
Always on high availability best practices for informixAlways on high availability best practices for informix
Always on high availability best practices for informix
 
HBaseCon 2012 | HBase Coprocessors – Deploy Shared Functionality Directly on ...
HBaseCon 2012 | HBase Coprocessors – Deploy Shared Functionality Directly on ...HBaseCon 2012 | HBase Coprocessors – Deploy Shared Functionality Directly on ...
HBaseCon 2012 | HBase Coprocessors – Deploy Shared Functionality Directly on ...
 
Grpc present
Grpc presentGrpc present
Grpc present
 
Towards constrained semantic web
Towards constrained semantic webTowards constrained semantic web
Towards constrained semantic web
 

Último

Take control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteTake control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteDianaGray10
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubKalema Edgar
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebUiPathCommunity
 
Story boards and shot lists for my a level piece
Story boards and shot lists for my a level pieceStory boards and shot lists for my a level piece
Story boards and shot lists for my a level piececharlottematthew16
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek SchlawackFwdays
 
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks..."LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...Fwdays
 
DSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningDSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningLars Bell
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsMark Billinghurst
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
Vertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsVertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsMiki Katsuragi
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.Curtis Poe
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationSlibray Presentation
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfAlex Barbosa Coqueiro
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxLoriGlavin3
 
SAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxSAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxNavinnSomaal
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Mattias Andersson
 
Advanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionAdvanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionDilum Bandara
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsSergiu Bodiu
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr BaganFwdays
 

Último (20)

Take control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteTake control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test Suite
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding Club
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio Web
 
Story boards and shot lists for my a level piece
Story boards and shot lists for my a level pieceStory boards and shot lists for my a level piece
Story boards and shot lists for my a level piece
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
 
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks..."LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
 
DSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningDSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine Tuning
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR Systems
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
Vertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsVertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering Tips
 
DMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special EditionDMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special Edition
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck Presentation
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdf
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
 
SAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxSAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptx
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?
 
Advanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionAdvanced Computer Architecture – An Introduction
Advanced Computer Architecture – An Introduction
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platforms
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan
 

HBase Coprocessors @ HUG NYC

  • 1. HBase – Coprocessors Mingjie Lai, Trend Micro HUG NYC, Oct. 11, 2010
  • 2. What are Coprocessors ● Inspired by Google Bigtable Coprocessors (Jeff Dean's keynote talk at LADIS 09) ● Arbitrary code that runs at each tablet in table server ● High-level call interface for clients – Calls addressed to rows or ranges of rows. coprocessor client library resolves to actual locations – Calls across multiple rows automatically split into multiple parallelized RPC ● Very flexible model for building distributed services – automatic scaling, load balancing, request routing for app
  • 3. Current Status ● Umbrella case: HBASE-2000 ● Coprocessor framework – HBASE-2001: ● Includes RegionObserver, CommandTarget, CP class loading ● Code submitted for review ● Will be commited to TRUNK very soon, and 0.92 release ● Client side support – HBASE-2002: ● Dynamic RPC, between clients and region servers ● Code submitted for review ● Will be commited to TRUNK soon, and 0.92 release ● The first Coprocessor application – HBASE-3025 and 3045: Coprocessor based access control ● Code complete
  • 4. RegionObserver ● If a coprocessor implements this interface, it will be interposed in all region actions via upcalls ● Provides hooks for client side requests: HTable.get(), put(), exists(), delete(), scannerOpen(), checkAndPut(), etc. ● Chaining of multiple observers (by priority) ● The first coprocessors application – HBase access control – is built on top of it ● More extensions can be built on top of RegionObserver – Secondary indexes – Filters ● How to develop a RegionObserver ● No new client API defined for RegionObserver ● Need to implement RegionObserver interface and override upcall methods: preGet(), postGet(), prePut(), postPut(), etc.
  • 5. RegionObserver Client requests Region server CP framework RegionObserver
  • 6. CommandTarget ● CommandTarget with Dynamic RPC provides a way to define one's own protocol communicated between client and region server, and execute arbitrary code at region server ● CommandTarget methods are triggered by calling dynamic RPC client side method – Htable.coprocessorExec(...), etc. ● How to develop ● Defines protocol interface (extends CoprocessorProtocol) ● Implements this protocol interface ● Extend BaseCommandTarget: protocol will be automatically registered at coprocessor load ● On client side, the CommandTarget can be triggered by: – HTable.coprocessorProxy() - single region – HTable.coprocessorExec() - region range
  • 7. Dynamic RPC: a sample Given CoprocessorProtocol: public interface CountProtocol extends CoprocessorProtocol { int getRowCount(); }
  • 8. Coprocessors Class Loading ● Load from configuration: set coprocessors class names in HBase configuration ● hbase.coprocessor.default.classes ● Class names are comma seperated ● They will be picked up when region is opened, as default coprocessors ● Load from table attributes ● Utilize table attribute: a path (e.g. HDFS URI) to jar file ● Loaded when region is opened ● We can utilize CommandTarget to have a way to load coprocessors on demand ● Security is the biggest concern
  • 9. Next Steps ● See HBASE-2000 and subtasks ● Framework ● MapReduce – Runs concurrently on all regions of the table – Like Hadoop MapReduce: Mappers, reducers, partitioners, intermediates – Not table MapReduce, parallel region MapReduce ● Code weaving – Allow arbitrary code execution right now – Use a rewriting framework like ASM to weave in policies at load time – Improve fault isolation and system integrity protections – Wrap heap allocations to enforce limits – Monitor CPU time – Reject APIs considered unsafe ● On demand Coprocessors class loading
  • 10. Next Steps ● Applications ● HBase access control: HBASE-3025, HBASE-3045. ● Aggregate: HBASE-1512 ● Region level indexing: HBASE-2038 ● Table metacolumns: HBASE-2893 ● Secondary indexing? ● New Filtering?
  • 11. Q&A