PROPRIETARY & CONFIDENTIAL
Why Cask?
@jgrayla
SIMPLE ACCESS TO POWERFUL TECHNOLOGY
Cask’s goal is to enable every developer and enterprise to quickly and easily build and run modern data applications using open source big data technologies like Hadoop.
Introduction to Data Lakes
James Dixon (Pentaho)
Data Lake
Data streams in from sources to fill the lake, and various users of the lake can come to examine, dive in, or sample (Hadoop World NYC 2010)
Gartner
Data Lake
Enterprise-wide data management platforms for analyzing disparate sources of data in its native format
Hortonworks
Data Lake
Collect everything, dive in anywhere, give flexible access. Maximum scale and insight with the lowest possible friction and cost.
Cloudera
Data Hub
A centralized, unified data source that can quickly provide diverse business users with the information they need to do their jobs.
The Journey to Data Lakes
The Journey to Data Lakes is not Easy
Our customers are some of the most advanced users of Hadoop and have years invested into their journeys.
The goal of CDAP is to provide a framework and set of abstractions to avoid the pitfalls and long timelines that plague Hadoop projects.
CDAP drastically accelerates your adoption and utilization of big data.
Types of Water… er, Data
Raw
a.k.a. Level 0
Data that has been left in its native form without any transformation.
Defined
a.k.a. Level 1
Data that has a defined schema and has been wrangled and cleansed.
Refined
a.k.a. Level 2
Data that has been aggregated from the source records, like counts or models.
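The three levels can be illustrated with a minimal Python sketch. It is an illustration only, not anything taken from CDAP: the log format, the `Request` schema, and the function names are all assumptions made for the example.

```python
from collections import Counter
from dataclasses import dataclass

# Level 0 (raw): lines kept exactly as they arrived.
# The log format here is hypothetical.
RAW_LINES = [
    "2015-09-29T10:00:01 GET /home 200",
    "2015-09-29T10:00:02 GET /cart 500",
    "not a valid line",
    "2015-09-29T10:00:03 POST /cart 200",
]


@dataclass(frozen=True)
class Request:
    """Hypothetical schema for a Level 1 (defined) record."""
    timestamp: str
    method: str
    path: str
    status: int


def to_defined(lines):
    """Level 1: apply the schema and drop lines that do not fit it."""
    records = []
    for line in lines:
        parts = line.split()
        # Require four fields and a numeric status code.
        if len(parts) != 4 or not parts[3].isdigit():
            continue
        records.append(Request(parts[0], parts[1], parts[2], int(parts[3])))
    return records


def to_refined(records):
    """Level 2: aggregate the source records into counts per path."""
    return Counter(record.path for record in records)


defined = to_defined(RAW_LINES)
refined = to_refined(defined)
print(refined)
```

The raw lines are never modified; each higher level is derived from the one below it, so a different schema can later be applied to the same raw data.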
Types of Data Users
Analysts
Vertical Expertise
Utilizes BI Tools
No Programming
Needs UI for Access
Scientists
Mixed Expertise
Utilizes Py/R/SQL/etc.
Basic Programming
Needs Tools for Access
Developers
Horizontal Expertise
Utilizes Java/Scripting
Advanced Programming
Needs Code for Access
Data Lake Architectures
Data Reservoir
Raw + Defined Data which is governed and audited to ensure compliance and security
Data Pond
Raw Data copied from existing internal data stores and pulled from external data sources
Data Lake
Raw + Defined Data pushed from other systems into a centralized, shared storage cluster
Data Pond
Raw Data copied from existing internal data
stores and pulled from external data sources
SME / Enterprise Line of Business
Customer 360° View
Bring together siloed datasets
Combine with external data sources
Ask new questions, find unknown unknowns
Data Lake
Web Startup Company
Raw + Defined Data pushed from other systems
into a centralized, shared storage cluster
Log Storage and Analytics
Ingestion of data from multiple sources
Transforming and processing of data
Centralized storage and analytics of log data
Data Reservoir
Fortune 500 Enterprise
Raw + Defined Data which is governed and audited to
ensure compliance and security enforcement
Enterprise Data Hub
Storage and processing for all enterprise data
Provide centralized auditing and enforcement
Any data available while ensuring compliance
Data Lake Challenges
Manual processes requiring hand-coding and reliance on command-line tools
Hard to find data and its lineage for data discovery and exploration
Operationalizing processes for production and to maintain SLAs
Coupling of ingestion and processing drives architecture decisions
Ensuring data is in canonical forms with a shared schema usable by others
Sharing infrastructure in a multi-tenant environment without low-level QoS support
Multiple architectures and technologies used by different teams on different clusters
Guaranteeing compliance in a system that is designed for schema-on-read and raw data
Coding or filing tickets often required to perform manual ingestion and processing tasks
CASK DATA APPLICATION PLATFORM
Integrated Framework for Building and Running Data Applications on Hadoop
Integrates the Latest Big Data Technologies
Supports All Major Hadoop Distributions
Fully Open Source and Highly Extensible
CASK DATA APPLICATION PLATFORM
Key Features
Infrastructure INTEGRATION
Provide an integrated product experience with out-of-the-box capabilities
Architecture STANDARDS
Define a reference architecture to standardize support for mixed infrastructure
Programming ABSTRACTIONS
Utilize abstraction layers to encapsulate complex patterns and insulate developers
Production SERVICES
Provide development tools and runtime services to enable production apps and data
Self-Service Ingestion and ETL for Hadoop Data Lakes
Built for Production on CDAP
Rich Drag-and-Drop User Interface
Open Source & Highly Extensible
DISCOVER data using user- and machine-generated metadata
INGEST any data from any source in real-time and batch
BUILD drag-and-drop ETL/ELT pipelines that run on Hadoop
EGRESS any data to any destination in real-time and batch
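The ingest, build, and egress stages above follow a source, transform, sink shape. The sketch below shows that shape in plain Python. It does not use the CDAP or Hydrator API; every function name and the sample data are hypothetical, chosen only to illustrate how small pluggable stages compose into a pipeline.

```python
import json


def csv_source(text):
    """Ingest: yield one dict per CSV row, using the first line as the header."""
    lines = text.strip().splitlines()
    header = lines[0].split(",")
    for line in lines[1:]:
        yield dict(zip(header, line.split(",")))


def uppercase_country(record):
    """Transform: normalize the country code."""
    return {**record, "country": record["country"].upper()}


def drop_missing_email(record):
    """Transform: returning None filters the record out."""
    return record if record.get("email") else None


def json_sink(records):
    """Egress: serialize the surviving records as JSON lines."""
    return "\n".join(json.dumps(r, sort_keys=True) for r in records)


def run_pipeline(source, transforms, sink):
    """Pass each source record through the transforms in order, then to the sink."""
    def stream():
        for record in source:
            for transform in transforms:
                record = transform(record)
                if record is None:
                    break  # a transform filtered this record out
            else:
                yield record
    return sink(stream())


# Hypothetical sample input.
SAMPLE = "name,email,country\nada,ada@example.com,us\nbob,,de\n"

output = run_pipeline(
    csv_source(SAMPLE),
    [drop_missing_email, uppercase_country],
    json_sink,
)
print(output)
```

Because each stage only agrees on the record format, a source or sink can be swapped without touching the transforms, which is the property a plugin-based pipeline builder relies on.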
Data Lakes on CDAP
Hydrator framework with templates and plugins enables production workflows in minutes
Never lose data by ensuring all ingested data is tracked with metadata and lineage
Operationalize workflows using scheduling and SLA monitoring with time / partition awareness
Separation of ingestion and processing to support any type, format and rate
Using common transformations and a shared system for defining and exposing schema
Multi-tenant namespacing provides data and app isolation, tying together infrastructure
Reference architecture ensures a common platform across teams, orgs, ops and security
Ensure compliance by requiring the use of specific transformations and validation
Self-service access through Cask Hydrator for the discovery, ingest and exploration of data
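To make the idea of tracking ingested data with metadata and lineage concrete, here is a small Python sketch of a dataset catalog. It is an assumption-laden illustration, not how CDAP stores metadata: the `Catalog` class, its methods, and the dataset names are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Catalog:
    """Hypothetical in-memory catalog of datasets, their tags, and their parents."""
    metadata: dict = field(default_factory=dict)  # dataset name -> tags
    parents: dict = field(default_factory=dict)   # dataset name -> source datasets

    def register(self, name, derived_from=(), **tags):
        """Record a dataset together with its tags and the datasets it came from."""
        self.metadata[name] = dict(tags)
        self.parents[name] = list(derived_from)

    def lineage(self, name):
        """Return every upstream dataset of `name`, nearest first."""
        seen, queue = [], list(self.parents.get(name, []))
        while queue:
            current = queue.pop(0)
            if current not in seen:
                seen.append(current)
                queue.extend(self.parents.get(current, []))
        return seen

    def find(self, **tags):
        """Discover datasets whose metadata matches all the given tags."""
        return sorted(
            name for name, meta in self.metadata.items()
            if all(meta.get(key) == value for key, value in tags.items())
        )


catalog = Catalog()
catalog.register("weblogs_raw", level="raw", owner="web")
catalog.register("crm_raw", level="raw", owner="sales")
catalog.register("requests", derived_from=["weblogs_raw"], level="defined", owner="web")
catalog.register("customer_360", derived_from=["requests", "crm_raw"],
                 level="refined", owner="sales")

print(catalog.lineage("customer_360"))
print(catalog.find(level="raw"))
```

Registering every dataset at ingestion time is what makes both questions answerable later: where a refined dataset came from, and which datasets match a given tag.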
Demo
CDAP Community
100% Open Source (ASL2)
Website: http://cdap.io
Mailing List: cdap-user@googlegroups.com, cdap-dev@googlegroups.com
IRC: #cdap on freenode.net
CDAP Enterprise
100% Commercially Supported
Website: http://cask.co
Contact Sales: sales@cask.co
Contact Me: jon@cask.co or @jgrayla

Accelerate Your
Data Lake Journey
Tap In @ cask.co
Thank You!
Jonathan Gray
jon@cask.co
@jgrayla
Questions?

Strata+Hadoop NY 2015: Hydrate a data lake in days with CDAP
