4. Supporting the Enterprise AI Journey of
Manufacturing
Financial Services
Services
Consumer Goods
Technology
Consulting
E-Retail
Media
Healthcare
Travel
Global Presence
A WIDE USER BASE
Customers: 220+
Users: 20,000
80%+ of customers expand usage after first year
POWERED BY A STRONG ORGANIZATION
Dataikers: 220
Raised so far: $146M
BACKED BY MAJOR PARTNERS
Customers Across Industries
POWERING INDUSTRY LEADERS
5. The “Tower of Babel” Effect of Data Projects
The Classic Data Project Silos
Stages: DATA PREPARATION, ML MODELING, ML DEPLOYMENT
Roles: Business Analyst, Data Scientist, Data Engineer
Tools: Data Preparation, Data Science Notebooks & API Platforms, AutoML Solutions
6. Bring Business Analysts, Engineers, and Scientists Together
Share a common environment to have an impact
Stages: DATA PREPARATION, ML MODELING, ML DEPLOYMENT
Roles: Business Analyst, Data Engineer, Data Scientist
Single Collaborative, Governable and Auditable Environment
7. Get Results Today, Build for Tomorrow
Future proof your data effort
▪ Leverage existing skills and secure sustained availability
▪ Maximize use of the most up-to-date technologies
▪ Extend based on current and future operating requirements
▪ Use your current infrastructure and be ready for tomorrow’s
8. Fortune 500 Customer Rockets through Acceleration Phase
Customer Testimony
Quarterly Evolution of Dataiku Users
Analytics Leader: 10 Project Leaders
▪ Scale their team to deliver 10x Projects / Briefs / Models / ...
Business Analyst: 500 Business Analysts
▪ Leverage large and complex data sources
▪ Deliver new projects independently
▪ Accelerate by leveraging tools packaged by Data Scientists
Data Scientist: 100 Data Scientists
▪ Focus on complex data processing
▪ Deliver code and plugins for reuse
Data Engineer: 20 Data Engineers
▪ Ensure availability of data infrastructures
▪ Operationalize, monitor and maintain data projects
Delivering 1,000s of analyses, insights, models and optimized business processes
9. Enable Self-Service Analytics and Operationalize ML
The Two Key Modes of Data Innovation
SSA (Self-Service Analytics)
▪ Quick answers to unformulated questions
▪ Directly by the end-users
▪ Pervasive
▪ Agile and instantaneous
▪ Limited integration
▪ High volume
o16n (Operationalization)
▪ Robust solutions to business challenges
▪ Organization-driven
▪ Focused
▪ Longer term
▪ Fine integration
▪ High value projects
10. How a Major Software Player Auto-Deploys 12,000 Models
Customer Testimony
The Dataiku customer provides a sales management software platform to 4,000 B2B clients (including several Fortune 100 companies), and has deployed Dataiku in order to:
▪ Design complex recommendation engines combining price, content and demand logics (the final models actually combine 3 predictive models)
▪ Automatically generate such recommendation engines based on each of its sellers’ data and data models
▪ Operate models in real time and update them with no downtime, scaling up on a fully managed platform on top of Kubernetes
An AI-enabled Layer on top of an existing product
Powered by Dataiku
11. Leverage your full stack and skills
Dataiku Solution Overview: Architecture
LINUX SERVER, ON PREMISE OR MANAGED CLOUD
Centralized server to facilitate access to data, resources,
CENTRALIZED OR AD-HOC DATA SOURCES, DATABASES, DATA LAKE
AVAILABLE OR SPUN-UP PROCESSING RESOURCES
Leveraging best storage and compute resources
PRODUCTION SYSTEMS
Dataiku deployment servers for enterprise-grade operationalization
Browser-based interface
User/task-specific interaction modes:
▪ VISUAL DEVELOPMENT
▪ COMPLETE CODING ENVIRONMENTS
▪ VISUALIZATION
▪ COLLABORATION AND PROJECT MANAGEMENT
▪ AUDIT, MONITORING AND SCHEDULING
13. Enable In-Database Analytics & Operationalized ML
Dataiku & Pivotal® Greenplum’s Value
Roles: Data Scientist, Business Analyst, Data Engineer
Stages: Machine Learning, Model Deployment, Data Management
▪ MADlib: In-database machine learning
▪ Graph: Relationship analytics
▪ Greenplum: Integrated and cleansed data, parallel SQL processing
▪ GPText: Fast index, search, text analytics
▪ PostGIS: Location analytics
14. Solution Features
Dataiku & Pivotal® Greenplum’s Value
High-Performance Analytics at Petabyte Scale
▪ Dataiku leverages Pivotal® Greenplum for in-database parallel processing of complex queries, visual analysis and charts.
Simplify Collaboration across Data Teams
▪ End-to-end project collaboration for data scientists and engineers
▪ Self-service access to data sources
▪ Visual development experience for building comprehensive analytics pipelines
Mature Your Data Analytics Operations
▪ Enable self-service analytics of large datasets stored in Pivotal® Greenplum
▪ Enforce data governance between roles and teams
▪ Enable comprehensive machine learning pipelines and models.
15. Dataiku + Postgres and Greenplum (example)
Order
Data
Movements
(if compatible)
Dataiku Datasets:
● Index definitions
● Incremental
SQL pushback: Charts using SQL pushback
…
Storage
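The SQL pushback idea on this slide, where chart data is computed by the database rather than in the client, can be sketched roughly as follows. This is only an illustration and not Dataiku's implementation: Python's built-in `sqlite3` stands in for Postgres or Greenplum, and the `orders` table, its columns, and all function names are hypothetical.

```python
import sqlite3


def chart_series_pushed_down(conn, table, group_col, value_col):
    """Let the database aggregate, so only one row per chart bar is returned."""
    # Identifiers are interpolated for brevity; real code should validate them.
    query = (
        f'SELECT "{group_col}", SUM("{value_col}") AS total '
        f'FROM "{table}" GROUP BY "{group_col}" ORDER BY "{group_col}"'
    )
    return conn.execute(query).fetchall()


def chart_series_client_side(conn, table, group_col, value_col):
    """Baseline for comparison: fetch every row and aggregate in Python."""
    totals = {}
    rows = conn.execute(f'SELECT "{group_col}", "{value_col}" FROM "{table}"')
    for key, value in rows:
        totals[key] = totals.get(key, 0) + value
    return sorted(totals.items())


def demo_connection():
    """Build a small in-memory table of hypothetical order data."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (region TEXT, amount INTEGER)")
    conn.executemany(
        "INSERT INTO orders VALUES (?, ?)",
        [("EMEA", 120), ("APAC", 80), ("EMEA", 30), ("AMER", 200), ("APAC", 20)],
    )
    return conn


if __name__ == "__main__":
    conn = demo_connection()
    print(chart_series_pushed_down(conn, "orders", "region", "amount"))
```

Both functions return the same series. The pushed-down version transfers one row per group instead of one row per order, which is where the benefit would come from on a parallel database such as Greenplum.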