Why is data independence (still) so important?
Julian Hyde @julianhyde

http://github.com/julianhyde/optiq
http://github.com/julianhyde/optiq-splunk

Apache Drill Meeting
2012/9/13
Data independence
This is my opinion about data management systems in general. I don't claim that it is the right answer for Apache Drill.
I claim that a logical/physical separation can make a data management system more widely applicable, therefore more widely adopted, therefore better.
What “data independence” means in today's “big data” world.
About me
Julian Hyde


Database hacker (Oracle, Broadbase, SQLstream, LucidDB)
Open source hacker (Mondrian, olap4j, LucidDB, Optiq)


@julianhyde
http://github.com/julianhyde
Photo credits:
http://www.flickr.com/photos/torkildr/3462606643
http://www.flickr.com/photos/sylvar/31436961/
“Big Data”
Right data, right time
Diverse data sources / Performance / Suitable format
Volume / Velocity / Variety


Volume – solved :)
Velocity – not one of Drill's goals (?)
Variety – ?
Variety
Variety of source formats (csv, avro, json, weblogs)
Variety of storage structures (indexes, projections, sort order, materialized views), now or in future
Variety of query languages (DrQL, SQL)
Combine with other data (join, union)
Embed within other systems, e.g. Hive
Source for other systems, e.g. Drill | Cascading > Teradata
Tools generate SQL
Use case: Optiq* at Splunk
SQL interface on a NoSQL system
“Smart” JDBC driver – pushes processing down to Splunk (see the JDBC sketch below)

* Truth in advertising: I am the author of Optiq.
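For concreteness, here is a sketch of what a client of that “smart” JDBC driver looks like. It is plain JDBC; the connect URL is a placeholder assumed for illustration (check the optiq-splunk project for the real one), and the query reuses the columns from the expression-tree slides that follow.

  import java.sql.Connection;
  import java.sql.DriverManager;
  import java.sql.ResultSet;
  import java.sql.Statement;

  /** Sketch of a client of the Optiq JDBC driver. */
  public class SplunkJdbcSketch {
    public static void main(String[] args) throws Exception {
      // Placeholder connect string -- an assumption, not the documented URL.
      try (Connection c = DriverManager.getConnection("jdbc:optiq:");
           Statement s = c.createStatement();
           ResultSet r = s.executeQuery(
               "SELECT \"product_id\", COUNT(*) AS c\n"
               + "FROM \"splunk\".\"splunk\"\n"
               + "WHERE \"action\" = 'purchase'\n"
               + "GROUP BY \"product_id\"\n"
               + "ORDER BY c DESC")) {
        // The driver rewrites the plan so that the filter (and whatever else
        // it can) runs inside Splunk, instead of fetching raw events.
        while (r.next()) {
          System.out.println(r.getString(1) + ": " + r.getLong(2));
        }
      }
    }
  }

The point is that the SQL is completely ordinary; where the work actually runs is decided by the driver, not by the user.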
Expression tree

  SELECT p."product_name", COUNT(*) AS c
  FROM "splunk"."splunk" AS s
    JOIN "mysql"."products" AS p
    ON s."product_id" = p."product_id"
  WHERE s."action" = 'purchase'
  GROUP BY p."product_name"
  ORDER BY c DESC

[Diagram: plan before optimization]
  scan (Splunk, table: splunk), scan (MySQL, table: products)
  → join (key: product_id)
  → filter (condition: action = 'purchase')
  → group (key: product_name, agg: count)
  → sort (key: c DESC)
Expression tree (optimized)

(Same query as on the previous slide.)

[Diagram: plan after optimization – the filter is pushed below the join, so it runs on the Splunk side]
  scan (Splunk, table: splunk) → filter (condition: action = 'purchase')
  scan (MySQL, table: products)
  → join (key: product_id)
  → group (key: product_name, agg: count)
  → sort (key: c DESC)
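To see why the rewritten plan is better, here is a toy, self-contained Java rendering of the optimized tree. This is not Optiq code, the rows are invented, and Java streams merely stand in for the data-flow operators: because the filter runs before the join, only the 'purchase' events ever reach the join, group, and sort.

  import java.util.Comparator;
  import java.util.LinkedHashMap;
  import java.util.List;
  import java.util.Map;
  import java.util.stream.Collectors;

  /** Toy illustration only -- not Optiq code.  Runs the optimized plan from
   * the slide by hand: filter -> join -> group -> sort, over made-up rows. */
  public class OptimizedPlanDemo {
    record Event(String action, int productId) {}

    public static void main(String[] args) {
      // "splunk"."splunk": scan
      List<Event> splunk = List.of(
          new Event("view", 1), new Event("purchase", 1),
          new Event("view", 2), new Event("purchase", 2),
          new Event("purchase", 1));
      // "mysql"."products": scan
      Map<Integer, String> products = Map.of(1, "Coffee", 2, "Tea");

      Map<String, Long> counts = splunk.stream()
          .filter(e -> e.action().equals("purchase"))     // filter, pushed below the join
          .map(e -> products.get(e.productId()))          // join on product_id
          .collect(Collectors.groupingBy(name -> name,    // group by product_name, COUNT(*)
              LinkedHashMap::new, Collectors.counting()));

      counts.entrySet().stream()
          .sorted(Map.Entry.<String, Long>comparingByValue(Comparator.reverseOrder())) // sort by c DESC
          .forEach(e -> System.out.println(e.getKey() + " " + e.getValue()));
    }
  }

Optiq's planner arrives at the same shape automatically, by applying transformation rules to the expression tree rather than by asking the user to re-order the steps.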
Conventional DBMS architecture

[Diagram: one monolithic stack]
  JDBC client
  → JDBC server
  → SQL parser / validator
  → query optimizer
  → data-flow operators
  → data (several sources)
  (metadata feeds the parser/validator and the optimizer)
Drill architecture

[Diagram: the same stack, with a gap where the conventional DBMS has its optimizer]
  DrQL client
  → DrQL parser / validator
  → ?   (no query optimizer shown; metadata sits alongside this layer)
  → data-flow operators
  → data (several sources)
Optiq architecture

[Diagram: only the optimizer is core; everything around it is optional or pluggable]
  JDBC client
  → JDBC server (optional)
  → SQL parser / validator (optional)
  → query optimizer (the core), driven by a metadata SPI and pluggable rules
  → third-party operators (pluggable)
  → third-party data
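As a sketch of how the pluggable pieces are wired together from a client's point of view, here is roughly what registering a schema against the core looks like. It uses the present-day Apache Calcite API (the project Optiq grew into), so the package and class names are an assumption relative to the 2012 Optiq codebase, and the empty AbstractSchema merely stands in for a real adapter.

  import java.sql.Connection;
  import java.sql.DriverManager;
  import java.sql.ResultSet;
  import java.sql.Statement;
  import org.apache.calcite.jdbc.CalciteConnection;
  import org.apache.calcite.schema.SchemaPlus;
  import org.apache.calcite.schema.impl.AbstractSchema;

  /** Sketch: wiring pluggable schemas into the core planner, using the
   * present-day Apache Calcite API (what Optiq became). */
  public class RegisterSchemas {
    public static void main(String[] args) throws Exception {
      Class.forName("org.apache.calcite.jdbc.Driver");
      Connection connection = DriverManager.getConnection("jdbc:calcite:");
      CalciteConnection calcite = connection.unwrap(CalciteConnection.class);
      SchemaPlus root = calcite.getRootSchema();

      // Each add() makes a back end visible to SQL as a schema.  The empty
      // AbstractSchema is only a stand-in for a real adapter (Splunk, MySQL, ...).
      root.add("splunk", new AbstractSchema());

      // Sanity check that the SQL layer is up; real queries could now
      // reference "splunk" and join it to anything else that is registered.
      try (Statement statement = connection.createStatement();
           ResultSet resultSet = statement.executeQuery("VALUES 1 + 1")) {
        resultSet.next();
        System.out.println(resultSet.getInt(1));
      }
      connection.close();
    }
  }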
Analogy: Compiler architecture

[Diagram]
  front end:  C++ | C | Fortran
  middle end: optimizations
  back end:   x86 | ARM | Fortran
Conclusions
Clear logical / physical separation allows a data management system to handle a wider variety of data, query languages, and packaging.
Also provides a clear interface between the sub-teams working on query language and operators.
A query optimizer allows new operators, and alternative algorithms and data structures, to be easily added to the system.
Extra material follows...
Writing an adapter
Driver – if you want a vanity URL like “jdbc:drill:”
Schema – describes what tables exist
Table – describes the columns, and how to get the data
Operators (optional) – non-relational operators, if any
Rules (optional, but recommended) – improve efficiency by changing the question
Parser (optional) – additional source languages
A minimal sketch of a Schema and Table follows below.
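To make the list concrete, here is a minimal read-only adapter: just a Schema and a Table, with no driver, no custom operators, and no rules. It is written against the present-day Apache Calcite API (what Optiq became), so the class names are an assumption relative to the 2012 Optiq codebase, and the in-memory rows are invented for the example.

  import java.util.Map;
  import org.apache.calcite.DataContext;
  import org.apache.calcite.linq4j.Enumerable;
  import org.apache.calcite.linq4j.Linq4j;
  import org.apache.calcite.rel.type.RelDataType;
  import org.apache.calcite.rel.type.RelDataTypeFactory;
  import org.apache.calcite.schema.ScannableTable;
  import org.apache.calcite.schema.Table;
  import org.apache.calcite.schema.impl.AbstractSchema;
  import org.apache.calcite.schema.impl.AbstractTable;
  import org.apache.calcite.sql.type.SqlTypeName;

  /** Minimal adapter sketch: one schema, one table, rows held in memory.
   * The built-in relational operators do all of the work once scan()
   * hands them the rows. */
  public class ToySchema extends AbstractSchema {
    @Override protected Map<String, Table> getTableMap() {
      // Schema: describes what tables exist.
      return Map.<String, Table>of("EVENTS", new EventsTable());
    }

    /** Table: describes the columns, and how to get the data. */
    static class EventsTable extends AbstractTable implements ScannableTable {
      @Override public RelDataType getRowType(RelDataTypeFactory typeFactory) {
        return typeFactory.builder()
            .add("ACTION", SqlTypeName.VARCHAR)
            .add("PRODUCT_ID", SqlTypeName.INTEGER)
            .build();
      }

      @Override public Enumerable<Object[]> scan(DataContext root) {
        return Linq4j.asEnumerable(new Object[][] {
            {"purchase", 1},
            {"view", 2},
            {"purchase", 2}});
      }
    }
  }

Registered under a name as in the earlier sketch (for example root.add("toy", new ToySchema())), the table is immediately queryable – SELECT * FROM "toy"."EVENTS" – with filtering, joining, grouping, and sorting supplied by the built-in operators; rules only become necessary when you want some of that work pushed down to the source.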

Speaker notes

  1. The obligatory “big data” definition slide. What is “big data”? It's not really about “big”. We need to access data from different parts of the organization, when we need it (which often means we don't have time to copy it), and the performance needs to be reasonable. If the data is large, it is often larger than the disks one can fit on one machine. It helps if we can process the data in place, leveraging the CPU and memory of the machines where the data is stored. We'd rather not copy it from one system to another. It needs to be flexible, to deal with diverse systems and formats. That often means that open source is involved. Some systems (e.g. reporting tools) can't easily be changed to accommodate new formats. So it helps if the data can be presented in standard formats, e.g. SQL.
  2. (Applies to both “Expression tree” slides.) It's much more efficient if we push filters and aggregations to Splunk. But the user writing SQL shouldn't have to worry about that. This is not about processing data; it is about processing expressions – reformulating the question. The question is the parse tree of a query, and the parse tree is a data flow. In Splunk, a data flow looks like a pipeline of Linux commands. SQL systems have pipelines too (sometimes they are dataflow trees) built up of the basic relational operators. Think of the SQL SELECT, WHERE, JOIN, GROUP BY, ORDER BY clauses.
  3. (Applies to the conventional-DBMS and Drill architecture slides.) A conventional database has an ODBC/JDBC driver, a SQL parser, data sources, an expression tree, expression transformation rules, and an optimizer. For NoSQL databases, the language may not be SQL, and the optimizer may be less sophisticated, but the picture is basically the same. For frameworks, such as Hadoop, there is no planner: you end up writing code (e.g. MapReduce jobs).
  4. In Optiq, the query optimizer (we modestly call it the planner) is central. The JDBC driver/server and SQL parser are optional; skip them if you have another language. Plug-ins provide metadata (the schema), planner rules, and runtime operators. There are built-in relational operators and rules, and there are built-in operators implemented in Java. But to access data, you need to provide at least one operator.