De-Mystifying the Apache Phoenix QueryServer
- 2. 2 © Hortonworks Inc. 2011 – 2016. All Rights Reserved
About me
• (Recent) Apache Phoenix Committer
• Apache Calcite Committer and PMC
• Long-time NoSQL developer, re-learning SQL
Apache Calcite and Apache Phoenix are projects at the Apache Software Foundation.
These names are trademarks of the Foundation.
- 3.
Agenda
What?
Why?
How?
Apache Phoenix QueryServer
- 4.
“What” is Apache Phoenix?
Been called many things [1]
– “We put the SQL back in NoSQL!”
– “A SQL skin on HBase”
– “A relational layer on HBase”
– “Online transaction processing and operational analytics for Hadoop”
Built on HDFS and HBase
– Clients use a JDBC driver
– Lots of server-side “magic” through HBase Coprocessors
A query system capable of both OLAP and OLTP workloads
– More or less
[1] https://medium.com/salesforce-open-source/apache-phoenix-a-conversation-with-pmc-chair-james-taylor-cc0dd8c7c3e5
- 5.
“What” is the Apache Phoenix QueryServer?
An HTTP abstraction of a JDBC Driver
– Built on Apache Calcite’s Avatica sub-project
A standalone service to be run on each node in a cluster
– An HTTP server
– Configurable serialization mechanism
A new JDBC Driver to use with the QueryServer
– A glorified HTTP client
– A new sqlline script
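As a rough illustration of what the "glorified HTTP client" needs in order to connect: the thin driver is pointed at a QueryServer's HTTP endpoint rather than at ZooKeeper. The sketch below only builds the connection string. The helper name `thin_jdbc_url` is hypothetical, and the port 8765 and the `serialization` option are assumptions based on the Phoenix QueryServer documentation; check them against your deployment.

```python
def thin_jdbc_url(host, port=8765, serialization="PROTOBUF", scheme="http"):
    """Build a JDBC URL for the Phoenix thin driver (hypothetical helper).

    Assumes the documented URL shape
    jdbc:phoenix:thin:url=<scheme>://<host>:<port>;serialization=<format>
    """
    return (
        f"jdbc:phoenix:thin:url={scheme}://{host}:{port}"
        f";serialization={serialization}"
    )


# The resulting string is what a thin JDBC client would be given.
print(thin_jdbc_url("localhost"))
# prints "jdbc:phoenix:thin:url=http://localhost:8765;serialization=PROTOBUF"
```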
- 6.
“What” is Apache Calcite?
SQL Parser
– One SQL implementation usable by everyone
Cost-Based Optimizer
– “Optimizations are easy”
Pluggable Data Sources
– Implement your own SQL engine
Avatica
– Calcite sub-project
– Implements the JDBC-over-HTTP abstraction
– Written to the JDBC spec, not database-specific
The coolest project approximately one person can explain
- 7.
Agenda
What?
Why?
How?
Apache Phoenix QueryServer
- 8.
“Why” should I care?
A true “thin” client
– No required connection to HBase/ZooKeeper/HDFS
– Greatly simplifies definition of “Phoenix client”
Offload computational resources to cluster
– QueryServers run on the cluster
– Not your laptop or some “edge” node
Enables non-Java clients
– The big one
Because it’s friggin’ cool!
- 9.
“Why” are non-Java clients important?
“Native” bindings in any language
– HTTP clients are easily implemented
– Serialization approaches (often) have cross-language support
Data in HBase is suddenly easily accessible
– Standardized table format through Phoenix
– Well-defined APIs: Python Database API, Ruby ActiveRecord, etc
ODBC and BI Tools
– The moonshot.
– The hopes and dreams of services people everywhere.
Not everyone wants to use Java.
- 10.
“Why” not <insert rpc framework here> instead of HTTP?
HTTP is simple
– “You have multiple versions of Thrift on the classpath”
– “You have to use Protobuf 2.4”
Designed to be stateless
– JDBC doesn’t make this easy
– Can work around it via Avatica’s wire API
Statelessness makes scaling easier
– Pull down any HTTP load balancer
– Deploy more Avatica servers to scale up
Because portability sucks
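One way to picture how the wire API works around JDBC's statefulness: every request names the connection (and statement) it belongs to, so no request depends on an implicit session. The sketch below builds such requests in the JSON serialization because it is easy to read. The request and field names are assumptions taken from the Avatica JSON reference and may differ between Avatica versions; the function names are hypothetical.

```python
import json
import uuid


def open_connection(connection_id):
    """Request that asks the server to open a connection under a client-chosen ID."""
    return {"request": "openConnection", "connectionId": connection_id}


def create_statement(connection_id):
    """Request for a new statement on an existing connection."""
    return {"request": "createStatement", "connectionId": connection_id}


def prepare_and_execute(connection_id, statement_id, sql, max_row_count=-1):
    """Request that prepares and runs SQL; IDs travel with the request itself."""
    return {
        "request": "prepareAndExecute",
        "connectionId": connection_id,
        "statementId": statement_id,
        "sql": sql,
        "maxRowCount": max_row_count,  # assumed field name; -1 meaning "no limit"
    }


# The client, not the server, picks the connection ID.
connection_id = str(uuid.uuid4())
# Statement ID 1 is a placeholder; a real one comes from the createStatement response.
request = prepare_and_execute(connection_id, 1, "SELECT 1")
print(json.dumps(request, indent=2))
```

Because the identifiers are explicit, each request is an ordinary HTTP POST that a load balancer can forward; whether any server can answer it depends on how the servers track connections, which this sketch does not cover.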
- 11.
Agenda
What?
Why?
How?
Apache Phoenix QueryServer
- 12.
“How” does it work?
HTTP Server
– Jetty
– Phoenix “thick” Driver
Serialization mechanism
– Protocol Buffers
– JSON
Metrics system
– Dropwizard Metrics
– Apache Hadoop Metrics2
Authentication
– Kerberos via SPNEGO
– HTTP Basic or Digest
The QueryServer itself
- 13.
“How” does the serialization work?
Google Protocol Buffers (v3)
– “think XML, but smaller, faster, and simpler” [1]
– 110% supported WRT compatibility
– Native bindings in most every popular language
– Clients can use any version of protobuf3
JSON
– Nice for testing
– 110% unsupported WRT compatibility
– You will run into issues with mismatched client/server versions
Please, please, please use Protocol Buffers
[1] https://developers.google.com/protocol-buffers/
- 14.
“How” do I make a client?
Choose a language
– Find an HTTP client available for that language
– Install Protobuf bindings for that language
Read the Avatica docs [1]
– Tell us when docs are incorrect/lacking/wrong/boring/lame
Write tests
Publish the client
– And tell us!
Sit down and write code
[1] http://calcite.apache.org/avatica/docs/protobuf_reference.html
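The steps above can be sketched as a very small client. This is a minimal sketch under stated assumptions, not an official Phoenix or Avatica client: it uses the JSON serialization for readability even though Protocol Buffers is the recommended choice, the class `ThinClient` and function `http_post` are hypothetical names, the request and response field names are assumptions from the Avatica JSON reference, and it reads only the first frame of a SELECT result.

```python
import json
import urllib.request
import uuid


def http_post(url, payload):
    """POST one JSON request to the server and decode the JSON response."""
    req = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read().decode("utf-8"))


class ThinClient:
    """Hypothetical minimal client for an Avatica-style JSON endpoint."""

    def __init__(self, url, post=http_post):
        self.url = url
        self.post = post  # injectable transport, so tests need no server
        self.connection_id = str(uuid.uuid4())

    def _call(self, request, **fields):
        # Every request carries the connection ID explicitly.
        payload = {"request": request, "connectionId": self.connection_id}
        payload.update(fields)
        return self.post(self.url, payload)

    def query(self, sql):
        """Run one SELECT and return the rows of the first result frame."""
        self._call("openConnection")
        statement_id = self._call("createStatement")["statementId"]
        try:
            reply = self._call(
                "prepareAndExecute",
                statementId=statement_id,
                sql=sql,
                maxRowCount=-1,  # assumed field name; -1 meaning "no limit"
            )
            # Assumed response shape: results[0].firstFrame.rows
            return reply["results"][0]["firstFrame"]["rows"]
        finally:
            self._call("closeStatement", statementId=statement_id)
            self._call("closeConnection")
```

With a QueryServer listening on its HTTP port, usage would look like `ThinClient("http://localhost:8765").query("SELECT * FROM my_table")`, where the host, port, and table name are placeholders. The injectable `post` argument is there so the "write tests" step can run against a fake transport.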
- 15.
“How” do I get involved?
Provide servers for databases
– A simple project for a specific database
Write some tests
Proofread the docs
Contribute a client
Answer questions on Stack Overflow/mailing lists
Carpe diem
- 16.
Thanks!
Email: elserj@apache.org
Twitter: @josh_elser
Mailing lists:
Phoenix: dev@phoenix.apache.org, user@phoenix.apache.org
Calcite: dev@calcite.apache.org
Project info:
https://phoenix.apache.org/server.html
https://calcite.apache.org/avatica/