From Big Data to AI by way of Silicon Valley
Luciano Resende
IBM – CODAIT – Silicon Valley, California
© 2019 IBM Corporation
About me - Luciano Resende
Open Source AI Platform Architect – IBM – CODAIT
• Senior Technical Staff Member at IBM, contributing to open source for over 10 years
• Currently contributing to the Jupyter Notebook ecosystem, Apache Bahir, Apache
Toree, and Apache Spark, among other projects related to AI/ML platforms
lresende@us.ibm.com
https://www.linkedin.com/in/lresende
@lresende1975
https://github.com/lresende
IBM Open Source Participation
Open Source @ IBM
Learn
• Program touches 78,000 IBMers annually
Consume
• Virtually all IBM products contain some open source
• 40,363 packages per year
Contribute
• >62K OS certs per year
• ~10K IBM commits per month
Connect
• >1,000 active IBM contributors working in key OS projects
IBM Open Source Participation
IBM generated open source innovation
• 137 Code Open (dWO) projects w/ 1000+ GitHub projects
• 4 graduates to full open governance in the last year: Node-RED, OpenWhisk,
SystemML, and Blockchain Fabric
• developer.ibm.com/code/open/code/
Community
• IBM focused on 18 strategic communities
• Drive open governance in “Centers of Gravity”
• IBM Leaders drive key technologies and assure freedom
of action
The IBM OS Way is now open sourced
• Training, Recognition, Tooling
• Organization, Consuming, Contributing
IBM’s history of strong AI leadership
1968: “2001: A Space Odyssey”
• IBM was a technical advisor
• HAL is “the latest in machine intelligence”
1997: Deep Blue
• Deep Blue became the first machine to beat a world chess champion in tournament play
2011: Jeopardy!
• Watson beat two top Jeopardy! champions
2018: Project Debater
2018: Open Tech, AI & emerging standards
• New IBM centers of gravity for AI
• OS projects increasing exponentially
• Emerging global standards in AI
Center for Open Source
Data and AI Technologies
CODAIT
codait.org
codait (French)
= coder/coded
https://m.interglot.com/fr/en/codait
CODAIT aims to make AI solutions
dramatically easier to create, deploy,
and manage in the enterprise
Relaunch of the Spark Technology
Center (STC) to reflect expanded
mission
Center for Open Source Data
and AI Technologies
Using AI
- Model Asset Exchange
- Data Asset Exchange
Building AI
- AI Frameworks
- AI Platforms
Trusted AI
- Fairness
- Robustness
- Transparency and Accountability
- Explainability
AI Examples Today
Home Automation & Security
- Multiple connected or
standalone devices
- Controlled by Voice
- Amazon Echo (Alexa)
- Google Home
- Apple HomePod (Siri)
Autonomous Driving
In 2016, Google's self-driving car system was officially recognized as a driver in the US,
paving the way for the legalization of autonomous vehicles.
DoorDash is currently testing self-driving robots for food delivery.
https://www.dezeen.com/2016/02/12/google-self-driving-car-artficial-intelligence-system-recognised-as-driver-usa/
https://medium.com/@DoorDash/welcoming-our-newest-robots-to-the-doordash-fleet-with-marble-e752a85d6602
Amazon Go
Amazon Go: no lines, no checkout, just grab and go
But how simple is it to apply AI to your application?
A simple Deep Learning Model
[Figure: a driver program runs a neural network graph with its weights (not to scale) and outputs the label “cat”]
Input (3) → Dense (3×8) → Dense (8×6) → Dense (6×4) → Dense (4×2) → Output (2)
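The three pieces named on this slide (graph, weights, driver program) can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the layer sizes come from the slide, while the random weights, the ReLU and softmax activations, the input values, and the second class name are placeholders chosen here, not details from the talk.

```python
import math
import random

# Layer shapes from the slide: 3 -> 8 -> 6 -> 4 -> 2.
LAYER_SHAPES = [(3, 8), (8, 6), (6, 4), (4, 2)]


def init_weights(shapes, seed=0):
    """Random weights and zero biases; stand-ins for pre-trained values."""
    rng = random.Random(seed)
    weights = []
    for n_in, n_out in shapes:
        matrix = [[rng.gauss(0.0, 0.5) for _ in range(n_out)] for _ in range(n_in)]
        weights.append((matrix, [0.0] * n_out))
    return weights


def dense(x, matrix, bias):
    """One dense layer: y_j = sum_i x_i * W_ij + b_j."""
    return [
        sum(x[i] * matrix[i][j] for i in range(len(x))) + bias[j]
        for j in range(len(bias))
    ]


def softmax(z):
    """Turn raw scores into probabilities that sum to 1."""
    peak = max(z)
    exps = [math.exp(v - peak) for v in z]
    total = sum(exps)
    return [e / total for e in exps]


def forward(x, weights):
    """The 'graph': ReLU on hidden layers (assumed), softmax on the output."""
    h = x
    for index, (matrix, bias) in enumerate(weights):
        h = dense(h, matrix, bias)
        if index < len(weights) - 1:
            h = [max(v, 0.0) for v in h]
    return softmax(h)


# The 'driver program': load weights, feed one input, read off the label.
LABELS = ["cat", "dog"]  # "dog" is a hypothetical second class
weights = init_weights(LAYER_SHAPES)
probs = forward([0.2, -1.0, 0.7], weights)
prediction = LABELS[probs.index(max(probs))]
```

With untrained weights the predicted label is arbitrary; the point is only how the graph, the weights, and the driver fit together.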
Example: Get an Image Classifier
Step 1: Find a suitable neural network graph.
– Need to read some papers
Step 2: Find code to generate the neural network graph
[Image: TensorFlow code to build the ResNet50 neural network graph]
Step 3: Find some pre-trained weights for your graph
[Image: Caffe2 ResNet50 model weights]
Step 4: Find example code that performs model inference
[Image: TensorFlow code for training and batch inference on ResNet50]
Step 5: Write your own code to perform model inference on one image at a time
Step 6: Package your inference code, graph creation code, and pre-trained weights together
Step 7: Deploy your package
Model Marketplaces
Collections of well-understood deep learning models
Provide a central place to find known-good implementations of these models
IBM Model Asset eXchange
MAX is a one-stop, open source ecosystem where data scientists and AI
developers share and consume models that use machine learning engines
such as TensorFlow, PyTorch, and Caffe2.
It also provides a standard approach to classify, annotate, and deploy
these models for prediction and inferencing.
MAX
https://developer.ibm.com/code/exchanges/models/
Leveraging MAX
I am an application engineer and want to augment my application/solution with AI.
• Use MAX pre-trained, ready-to-use models.
• Deploy collocated with your application as a Docker container or in a Kubernetes environment
• Integrate the simple-to-use inference REST API
• Use the demo applications as examples of how to use the APIs
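As a rough sketch of the integration step, the snippet below builds the URL of a model container's prediction endpoint and picks the top label out of a JSON reply. The port (5000), the /model/predict path, and the response fields (status, predictions, label, probability) are assumptions based on how MAX image-classification containers are commonly documented; check the README of the specific model before relying on them. The sample response is made up for illustration, and the HTTP call itself (a multipart POST carrying the image) is left out so the example runs offline.

```python
import json


def predict_url(host="localhost", port=5000):
    """URL of the prediction endpoint (assumed path and default port)."""
    return f"http://{host}:{port}/model/predict"


def top_prediction(response_body):
    """Return the highest-probability prediction from a JSON response body."""
    payload = json.loads(response_body)
    if payload.get("status") != "ok":
        raise ValueError(f"model returned status {payload.get('status')!r}")
    predictions = payload.get("predictions", [])
    if not predictions:
        return None
    return max(predictions, key=lambda p: p["probability"])


# Hypothetical response body, shaped like the assumed schema above.
SAMPLE_RESPONSE = json.dumps({
    "status": "ok",
    "predictions": [
        {"label": "tiger cat", "probability": 0.03},
        {"label": "tabby", "probability": 0.94},
    ],
})

best = top_prediction(SAMPLE_RESPONSE)
```

In an application, the response body would come from posting an image to predict_url() with whatever HTTP client the application already uses.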
Learning from MAX
I am a data scientist and want to learn from MAX serving and deployment patterns.
• All MAX code is available in github.com/IBM/MAX*.
• Understand and reuse MAX’s inference code in your own projects as allowed per open source license
• Understand and reuse MAX’s packaging and deployment patterns, which are based on containers and easily deployable in Kubernetes, and apply them to your models
MAX Summary
Free, open-source models.
Wide variety of domains.
Multiple deep learning frameworks.
Vetted and tested code and IP.
Build and deploy a container-based web service in 30 seconds.
Start training on Watson Studio in minutes.
The IBM Data Asset eXchange
Also known as DAX.
A place to find curated free
and open datasets under
open data licenses.
Part of developer.ibm.com.
The MAX Named Entity Tagger
A model that identifies mentions of named entities, such as persons and
organizations, in English-language text.
Trained by Nick Pentreath on the CODAIT team
Most difficult part: Finding usable training data
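To make the tagger's job concrete, here is a small sketch of how per-token tags can be grouped into entity mentions. It assumes the common B-/I-/O tagging convention and made-up tag names (PER, ORG, GEO); the slide does not say which tag set or output format the MAX model uses, so treat both as illustrative.

```python
def group_entities(terms, tags):
    """Group tokens tagged B-X / I-X into (text, type) entity mentions."""
    entities = []
    words, entity_type = [], None

    def flush():
        # Emit the mention collected so far, if any.
        if words:
            entities.append((" ".join(words), entity_type))

    for term, tag in zip(terms, tags):
        prefix, _, tag_type = tag.partition("-")
        if prefix == "B" or (prefix == "I" and tag_type != entity_type):
            # A new mention starts (a stray I- tag is treated like B-).
            flush()
            words, entity_type = [term], tag_type
        elif prefix == "I":
            words.append(term)
        else:
            # "O": the token is outside any entity.
            flush()
            words, entity_type = [], None
    flush()
    return entities


# Hypothetical tagger output for one sentence.
terms = ["Luciano", "Resende", "works", "at", "IBM", "in", "California"]
tags = ["B-PER", "I-PER", "O", "O", "B-ORG", "O", "B-GEO"]
mentions = group_entities(terms, tags)
```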
Groningen Meaning Bank
A project at the University of
Groningen to create an open
data set for training linguistic
models like named entity
taggers.
Public domain data with
public domain annotations,
assembled by a 10-person
team with help from online
volunteers.
We needed to make further
modifications to pass IBM’s
own controls.
Contracts Proposition Bank
A collection of annotated
sentences drawn from IBM’s
public contracts, annotated
with
Created by IBM Research.
Used by IBM researchers to
train better SRL parsers for
the legal documents domain.
Available on DAX.
IBM’s Open Data
IBM Research has produced
dozens, perhaps hundreds, of
open data sets.
The data is not kept in one place.
IBM is working to improve this.
– Initiatives within IBM Research
– DAX
– The Community Data License Agreement
The Community Data License Agreement
http://cdla.io
Linux Foundation initiative to
create a new legal framework
that meets the needs of AI
data sets.
IBM is a major supporter.
Two licenses written
specifically for AI data
• CDLA-Sharing: “Copyleft” license analogous to the GPL
• CDLA-Permissive: Similar to the BSD license
Both licenses distinguish clearly
between use (analysis,
modeling) and modification of
the data set.
IBM Data Asset eXchange (DAX)
• Curated free and open datasets under open data licenses
• Standardized dataset formats and metadata
• Ready for use in enterprise AI applications
• Complement to the Model Asset eXchange (MAX)
Data Asset eXchange
ibm.biz/data-asset-exchange
Model Asset eXchange
ibm.biz/model-exchange
Is AI Fair?
And Transparent?
Unwanted bias and algorithmic fairness
Machine learning, by its very nature, is always a form of statistical discrimination
Discrimination becomes
objectionable when it
places certain privileged
groups at systematic
advantage and certain
unprivileged groups at
systematic disadvantage
Illegal in certain contexts
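One way to put a number on "systematic advantage" is to compare how often each group receives the favorable outcome. The sketch below computes two widely used group-fairness measures, statistical parity difference and disparate impact, on a made-up set of loan decisions. The data and group labels are hypothetical, and the slide does not name these particular metrics.

```python
def selection_rate(outcomes):
    """Fraction of favorable outcomes (1 = favorable, 0 = unfavorable)."""
    return sum(outcomes) / len(outcomes)


def statistical_parity_difference(unprivileged, privileged):
    """Difference in favorable-outcome rates; 0 means parity."""
    return selection_rate(unprivileged) - selection_rate(privileged)


def disparate_impact(unprivileged, privileged):
    """Ratio of favorable-outcome rates; 1 means parity."""
    return selection_rate(unprivileged) / selection_rate(privileged)


# Hypothetical loan decisions for two groups of eight applicants each.
privileged = [1, 1, 1, 0, 1, 1, 0, 1]    # 6 of 8 approved
unprivileged = [1, 0, 0, 1, 0, 0, 0, 1]  # 3 of 8 approved

spd = statistical_parity_difference(unprivileged, privileged)
di = disparate_impact(unprivileged, privileged)
```

Here the unprivileged group is approved half as often as the privileged group, which both measures flag as far from parity.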
AI Fairness 360 (AIF360)
The AIF360 toolkit is an open-source library to help detect and remove bias in machine learning models.
The AI Fairness 360 Python package includes a comprehensive set of metrics for datasets and models to test for biases, explanations for these metrics, and algorithms to mitigate bias in datasets and models.
Toolbox:
• Fairness metrics (30+)
• Fairness metric explanations
• Bias mitigation algorithms (10)
https://github.com/IBM/AIF360
https://developer.ibm.com/patterns/ensuring-fairness-when-processing-loan-applications/
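As an illustration of what a bias mitigation algorithm can look like, here is a plain-Python sketch of reweighing: each training sample gets a weight so that group membership and label become statistically independent in the weighted data. It does not use the AIF360 package or its API; the function names and the tiny dataset are invented for this example.

```python
from collections import Counter


def reweighing_weights(samples):
    """Weight for each (group, label) pair: P(group) * P(label) / P(group, label)."""
    n = len(samples)
    group_counts = Counter(group for group, _ in samples)
    label_counts = Counter(label for _, label in samples)
    joint_counts = Counter(samples)
    return {
        (group, label): (group_counts[group] * label_counts[label])
        / (n * joint_counts[(group, label)])
        for (group, label) in joint_counts
    }


def weighted_positive_rate(samples, weights, group):
    """Weighted share of favorable labels (label == 1) within one group."""
    total = sum(weights[(g, y)] for g, y in samples if g == group)
    positive = sum(weights[(g, y)] for g, y in samples if g == group and y == 1)
    return positive / total


# Hypothetical training data: (group, label) with label 1 = favorable.
samples = (
    [("privileged", 1)] * 3 + [("privileged", 0)] * 1
    + [("unprivileged", 1)] * 1 + [("unprivileged", 0)] * 3
)

weights = reweighing_weights(samples)
rate_privileged = weighted_positive_rate(samples, weights, "privileged")
rate_unprivileged = weighted_positive_rate(samples, weights, "unprivileged")
```

Before reweighing, the favorable rates are 0.75 and 0.25; with the weights applied, both groups have the same weighted favorable rate, so a learner that honors sample weights no longer sees the label as correlated with the group.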
AIF360 Demo: http://aif360.mybluemix.net
Adversarial Attacks
Defending Machine
Learning Systems
Adversarial Attacks
Sources: Explaining and Harnessing Adversarial Examples;
Robust Physical-World Attacks on Deep Learning Visual Classification
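The first paper cited here introduced the fast gradient sign method (FGSM), which nudges every input feature by a small step in the direction that increases the model's loss. The sketch below applies that idea to a tiny logistic-regression classifier in plain Python. The weights, input, and step size are made up for illustration, and a real attack on an image classifier would use a deep learning framework rather than hand-written gradients.

```python
import math


def predict_proba(x, w, b):
    """Probability of class 1 under a logistic-regression model."""
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1.0 / (1.0 + math.exp(-z))


def sign(value):
    """Return -1, 0, or 1 according to the sign of value."""
    return (value > 0) - (value < 0)


def fgsm(x, y, w, b, epsilon):
    """x_adv = x + epsilon * sign(dLoss/dx).

    For logistic regression with cross-entropy loss, dLoss/dx = (p - y) * w.
    """
    p = predict_proba(x, w, b)
    gradient = [(p - y) * wi for wi in w]
    return [xi + epsilon * sign(g) for xi, g in zip(x, gradient)]


# Hypothetical model and input; the true label is 1.
w, b = [2.0, -1.0, 0.5], 0.0
x, y = [0.5, -0.2, 0.4], 1

x_adv = fgsm(x, y, w, b, epsilon=0.5)
p_clean = predict_proba(x, w, b)
p_adv = predict_proba(x_adv, w, b)
```

In this toy setup the clean input is classified as class 1 and the perturbed input as class 0, even though no feature moved by more than epsilon.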
Adversarial Attacks - Hiding from Surveillance
https://www.technologyreview.com/f/613409/how-to-hide-from-the-ai-surveillance-state-with-a-color-printout/
IBM Adversarial Robustness Toolbox (ART)
ART is a library dedicated to adversarial machine learning. Its purpose is to allow rapid crafting and analysis of attack and defense methods for machine learning models. The Adversarial Robustness Toolbox provides an implementation for many state-of-the-art methods for attacking and defending classifiers.
Toolbox:
• Evasion attacks (11)
• Defenses (9)
• Detection methods for adversarial samples & poisoning attacks
• Robustness metrics
https://github.com/IBM/adversarial-robustness-toolbox
https://developer.ibm.com/patterns/integrate-adversarial-attacks-model-training-pipeline/
ART Demo: https://art-demo.mybluemix.net/
Building your models
interactively with
Jupyter Stack
Jupyter Notebooks
Notebooks are interactive
computational environments,
in which you can combine
code execution, rich text,
mathematics, plots and rich
media.
JupyterLab
JupyterLab is the next generation
UI for the Jupyter Ecosystem.
Brings all the previous
improvements into a single unified
platform, plus more!
Provides a modular, extensible
architecture
Retains backward compatibility
with the old notebook we know
and love
Jupyter Notebook
Simple, but Powerful
As simple as opening a web
page, with the capabilities of
a powerful, multilingual,
development environment.
Interactive widgets
Code can produce rich
outputs such as images,
videos, markdown, LaTeX
and JavaScript. Interactive
widgets can be used to
manipulate and visualize
data in real-time.
Language of choice
Jupyter Notebooks have
support for over 50
programming languages,
including those popular in
Data Science, Data
Engineering, and AI, such as
Python, R, Julia and Scala.
Big Data Integration
Leverage Big Data platforms
such as Apache Spark from
Python, R and Scala.
Explore the same data with
pandas, scikit-learn,
ggplot2, dplyr, etc.
Share Notebooks
Notebooks can be shared
with others using e-mail,
Dropbox, Google Drive,
GitHub, etc.
Enterprise Requirements
Multiuser, Self Service, Secure
Scale to support Analytics Workloads
- Processing large amounts of data in a distributed fashion
Support for Heterogeneous AI Workloads
- Resource-intensive workloads
- Heterogeneous frameworks (isolation required)
- Sharing of hardware resources (GPUs/TPUs)
Vanilla Jupyter Notebook
Single user sharing the same privileges
- Users can see and control each other’s processes using Jupyter administrative utilities
Not Scalable
- Jupyter kernels run as local processes, so resources are limited by what is available on the single node that runs all kernels and associated Spark drivers
[Chart: maximum number of simultaneous kernels (4 GB heap) by cluster size (32 GB nodes): 8 kernels at each of 4, 8, 12, and 16 nodes]
JupyterHub
JupyterHub brings the power of
notebooks to groups of users.
It gives users access to
computational environments
and resources, in a self-service
fashion, without burdening the
users with installation and
maintenance tasks.
Jupyter Enterprise Gateway
A lightweight, multi-tenant, scalable and secure gateway that enables Jupyter Notebooks to share resources across an Apache Spark or Kubernetes cluster for Enterprise/Cloud use cases
Jupyter Enterprise Gateway at IBM Code
https://developer.ibm.com/code/openprojects/jupyter-enterprise-gateway/
Jupyter Enterprise Gateway source code at GitHub
https://github.com/jupyter/enterprise_gateway
Jupyter Enterprise Gateway Documentation
http://jupyter-enterprise-gateway.readthedocs.io/en/latest/
[Figures: supported kernels; supported platforms, including Spectrum Conductor]
Jupyter Enterprise Gateway
Features
Optimized Resource Allocation
– Utilize resources on all cluster nodes by running kernels
as Spark applications in YARN Cluster Mode.
– Pluggable architecture to enable support for additional
Resource Managers
Enhanced Security
– End-to-End secure communications
Multiuser support with user impersonation
– Enhance security and sandboxing by enabling user
impersonation when running kernels (using Kerberos).
– Individual HDFS home folder for each notebook user.
– Use the same user ID for notebook and batch jobs.
[Chart: maximum number of simultaneous kernels (4 GB heap) by cluster size (32 GB nodes): 16, 32, 48, and 64 kernels at 4, 8, 12, and 16 nodes]
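The two kernel-capacity charts in this part of the deck can be reproduced with simple arithmetic. The sketch below assumes that a vanilla notebook server places every kernel on one 32 GB node with a 4 GB heap each, which matches the flat value of 8. The per-node figure for Enterprise Gateway (4 kernels per node) is read off the chart values rather than derived, since the slides do not explain it.

```python
NODE_MEMORY_GB = 32              # from the chart axis label
KERNEL_HEAP_GB = 4               # from the chart axis label
GATEWAY_KERNELS_PER_NODE = 4     # assumption: inferred from the chart (16 kernels on 4 nodes)

CLUSTER_SIZES = [4, 8, 12, 16]


def vanilla_max_kernels(cluster_nodes):
    """All kernels share one node, so cluster size does not matter."""
    return NODE_MEMORY_GB // KERNEL_HEAP_GB


def gateway_max_kernels(cluster_nodes):
    """Kernels are spread across the cluster, so capacity grows with it."""
    return cluster_nodes * GATEWAY_KERNELS_PER_NODE


vanilla = [vanilla_max_kernels(n) for n in CLUSTER_SIZES]
gateway = [gateway_max_kernels(n) for n in CLUSTER_SIZES]
```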
Jupyter Enterprise Gateway 2.x
AI Workloads with Containers
– Current version: 2.1.0
• Innovations around container environments
• Support for vanilla kernels, Spark on K8s, Docker Swarm
– Distributed kernels as individual containers in either Docker Swarm or Kubernetes environments
• Provided kernel images for:
– Python (IPython), Python w/ Spark, Python w/ TensorFlow, Python w/ TensorFlow and GPUs, Scala (Toree) w/ Spark, R (IRkernel), R w/ Spark
– JupyterHub integration
– Dynamically configurable (reloadable configuration)
– Deployment with Helm
– Jinja templates for kernel configuration
Jupyter Enterprise Gateway - Kubernetes
Jupyter Enterprise Gateway & JupyterHub
Leveraging
AI Platforms
for model training
Enterprise Machine Learning
Training/deploying models requires a lot of DevOps:
• Model Serving
• Monitoring
• Resource Management
• Configuration
• Hyperparameter Optimization
• Reproducibility
AI Platforms
Aim to enable data scientists to train their AI models (e.g. deep neural
networks) in a consistent way, independent of the framework in use or the
resources required for the job.
Leverage the Kubernetes platform’s ability to ease management of
containerized applications, with the benefits of elasticity and quality of
service, as well as sharing of restricted accelerated hardware
Kubeflow
End-to-end ML platform on Kubernetes.
Originated at Google.
Key Projects
– Model training and hyperparameter optimization
– Model Serving
– Model Management
– Pipelines:
• Combine components into complex workflows
– Metadata
• Collect data from multiple components
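The pipeline and metadata ideas listed for Kubeflow can be pictured as a dependency graph of steps plus a log of what each step produced. The sketch below shows that concept with the Python standard library only (graphlib, Python 3.9+). It is not the Kubeflow Pipelines SDK, and the step names and return values are hypothetical.

```python
from graphlib import TopologicalSorter


# Hypothetical pipeline components; each returns a short description of its output.
def preprocess():
    return "cleaned dataset"


def train():
    return "trained model"


def evaluate():
    return "accuracy report"


def serve():
    return "prediction endpoint"


STEPS = {"preprocess": preprocess, "train": train, "evaluate": evaluate, "serve": serve}

# Each step maps to the set of steps that must finish before it.
DEPENDENCIES = {
    "preprocess": set(),
    "train": {"preprocess"},
    "evaluate": {"train"},
    "serve": {"evaluate"},
}


def run_pipeline(steps, dependencies):
    """Run steps in dependency order and collect simple per-step metadata."""
    run_log = []
    for name in TopologicalSorter(dependencies).static_order():
        run_log.append({"step": name, "output": steps[name]()})
    return run_log


run_log = run_pipeline(STEPS, DEPENDENCIES)
```

A real platform would run each step in its own container and store the metadata in a shared service; the ordering logic is the part this sketch illustrates.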
Overall community, and IBM’s presence in Kubeflow
• Commits in Kubeflow compared with other companies
• IBM is the 2nd or 3rd largest contributor in the past 12 months
• IBM maintainers (approvers/reviewers) in Katib (HPO + Training), Kubeflow Serving, Manifests, Pipelines, etc.
https://www.stackalytics.com/unaffiliated?project_type=kubeflow-group
IBMers contributing:
• 590+ commits
• 924K lines of code
https://www.stackalytics.com/unaffiliated?project_type=kubeflow-group&company=ibm
Open Source Resources
Model Asset Exchange
https://developer.ibm.com/code/exchanges/models/
Data Asset Exchange
https://developer.ibm.com/exchanges/data/
AI Fairness 360
https://github.com/IBM/AIF360
Adversarial Robustness Toolbox
https://github.com/IBM/adversarial-robustness-toolbox
Jupyter Enterprise Gateway
https://github.com/jupyter/enterprise_gateway
Kubeflow
https://github.com/kubeflow
Thank you!
@lresende1975

Introduction to pyspark newIntroduction to pyspark new
Introduction to pyspark new
 
BP207: Don't Reinvent the Wheel - (Re)use Open Source Software From OpenNTF
BP207: Don't Reinvent the Wheel - (Re)use Open Source Software From OpenNTFBP207: Don't Reinvent the Wheel - (Re)use Open Source Software From OpenNTF
BP207: Don't Reinvent the Wheel - (Re)use Open Source Software From OpenNTF
 


From Data to AI - Silicon Valley Open Source projects come to you - Madrid meetup

  • 1. From Big Data to AI by Way of Silicon Valley. Luciano Resende, IBM – CODAIT – Silicon Valley, California. © 2019 IBM Corporation
  • 2. About me: Luciano Resende, Open Source AI Platform Architect – IBM – CODAIT • Senior Technical Staff Member at IBM, contributing to open source for over 10 years • Currently contributing to the Jupyter Notebook ecosystem, Apache Bahir, Apache Toree, and Apache Spark, among other projects related to AI/ML platforms • lresende@us.ibm.com • https://www.linkedin.com/in/lresende • @lresende1975 • https://github.com/lresende
  • 3. IBM Open Source Participation. Learn: the Open Source @ IBM program touches 78,000 IBMers annually. Consume: virtually all IBM products contain some open source (40,363 packages per year). Contribute: more than 62K OS certs per year and about 10K IBM commits per month. Connect: more than 1,000 active IBM contributors working in key OS projects.
  • 4. IBM Open Source Participation. IBM-generated open source innovation • 137 Code Open (dWO) projects with 1000+ GitHub projects • 4 graduates moved to full open governance in the last year: Node-RED, OpenWhisk, SystemML, Blockchain Fabric • developer.ibm.com/code/open/code/ Community • IBM is focused on 18 strategic communities • Drive open governance in “Centers of Gravity” • IBM leaders drive key technologies and assure freedom of action. The IBM OS Way is now open sourced • Training, Recognition, Tooling • Organization, Consuming, Contributing
  • 5. IBM’s history of strong AI leadership. 1997, Deep Blue • Deep Blue became the first machine to beat a world chess champion in tournament play. 2011, Jeopardy! • Watson beat two top Jeopardy! champions. 1968, “2001: A Space Odyssey” • IBM was a technical advisor • HAL is “the latest in machine intelligence”. 2018, Open Tech, AI & emerging standards • New IBM centers of gravity for AI • OS projects increasing exponentially • Emerging global standards in AI. 2018, Project Debater.
  • 6. Center for Open Source Data and AI Technologies (CODAIT), codait.org. codait (French) = coder/coded, https://m.interglot.com/fr/en/codait. CODAIT aims to make AI solutions dramatically easier to create, deploy, and manage in the enterprise. It is a relaunch of the Spark Technology Center (STC) to reflect an expanded mission.
  • 7. Center for Open Source Data and AI Technologies. Using AI: Model Asset Exchange, Data Asset Exchange. Building AI: AI Frameworks, AI Platforms. Trusted AI: Fairness, Robustness, Transparency and Accountability, Explainability.
  • 8. AI Examples Today
  • 9. Home Automation & Security. Multiple connected or standalone devices, controlled by voice: Amazon Echo (Alexa), Google Home, Apple HomePod (Siri).
  • 10. Autonomous Driving. In 2016, Google's self-driving car system was officially recognized as a driver in the US, paving the way for the legalization of autonomous vehicles. DoorDash is currently testing self-driving robots for food delivery. https://www.dezeen.com/2016/02/12/google-self-driving-car-artficial-intelligence-system-recognised-as-driver-usa/ https://medium.com/@DoorDash/welcoming-our-newest-robots-to-the-doordash-fleet-with-marble-e752a85d6602
  • 11. Amazon Go: no lines, no checkout, just grab and go.
  • 12. But how simple is it to apply AI to your application?
  • 13. A simple Deep Learning Model. [Figure: a driver program runs a neural network graph with its weights (not to scale): Input (3) → Dense (3×8) → Dense (8×6) → Dense (6×4) → Dense (4×2) → Output (2), producing the label “cat”.] May 17, 2018 / © 2018 IBM Corporation
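The graph on slide 13 can be sketched in a few lines of plain Python. This is only an illustration of the three pieces the slide names (graph, weights, driver program): the layer shapes come from the slide, while the random weights, the ReLU activation, the softmax output, and the second label "not cat" are assumptions made for the example.

```python
import math
import random

# Layer shapes taken from the slide: 3 -> 8 -> 6 -> 4 -> 2.
LAYER_SIZES = [3, 8, 6, 4, 2]


def init_weights(sizes, seed=0):
    """Create one (weights, biases) pair per dense layer (random, untrained)."""
    rng = random.Random(seed)
    layers = []
    for n_in, n_out in zip(sizes, sizes[1:]):
        weights = [[rng.uniform(-1.0, 1.0) for _ in range(n_out)] for _ in range(n_in)]
        biases = [0.0] * n_out
        layers.append((weights, biases))
    return layers


def dense(x, weights, biases):
    """Compute x @ W + b for a single input vector."""
    return [
        sum(x[i] * weights[i][j] for i in range(len(x))) + biases[j]
        for j in range(len(biases))
    ]


def softmax(z):
    """Turn raw scores into probabilities that sum to 1."""
    peak = max(z)
    exps = [math.exp(v - peak) for v in z]
    total = sum(exps)
    return [e / total for e in exps]


def predict(x, layers, labels=("cat", "not cat")):
    """Driver program: push one input through the graph and pick a label."""
    for index, (weights, biases) in enumerate(layers):
        x = dense(x, weights, biases)
        if index < len(layers) - 1:
            x = [max(0.0, v) for v in x]  # ReLU on hidden layers (assumed)
    probs = softmax(x)
    best = max(range(len(probs)), key=probs.__getitem__)
    return labels[best], probs


layers = init_weights(LAYER_SIZES)
label, probs = predict([0.2, 0.7, 0.1], layers)
print(label, probs)
```

With untrained weights the label is meaningless; the point is that a usable classifier needs all three parts, which is what the following steps set out to collect.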
  • 14. Example: Get an Image Classifier. Step 1: Find a suitable neural network graph. This means reading some papers.
  • 15. Example: Get an Image Classifier. Step 2: Find code to generate the neural network graph. [Screenshot: TensorFlow code to build the ResNet50 neural network graph]
  • 16. Example: Get an Image Classifier. Step 3: Find some pre-trained weights for your graph. [Screenshot: Caffe2 ResNet50 model weights]
  • 17. Example: Get an Image Classifier. Step 4: Find example code that performs model inference. [Screenshot: TensorFlow code for training and batch inference on ResNet50]
  • 18. Example: Get an Image Classifier. Step 5: Write your own code to perform model inference on one image at a time. Step 6: Package your inference code, graph creation code, and pre-trained weights together. Step 7: Deploy your package.
  • 19. Model Marketplaces. Collections of well-understood deep learning models. They provide a central place to find known-good implementations of these models.
  • 20. IBM Model Asset eXchange (MAX). MAX is a one-stop-shop open source ecosystem for data scientists and AI developers to share and consume models that use machine learning engines such as TensorFlow, PyTorch, and Caffe2. It also provides a standard approach to classify, annotate, and deploy these models for prediction and inferencing. https://developer.ibm.com/code/exchanges/models/
  • 25. Leveraging MAX. I am an application engineer and want to augment my application/solution with AI. • Use MAX pre-trained, ready-to-use models. • Deploy them co-located with your application as a Docker container or in a Kubernetes environment. • Integrate the simple-to-use inference REST API. • Use the demo applications as examples of how to use the APIs.
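Calling an inference REST API from an application might look like the sketch below. It assumes a text model (for instance a named entity tagger) running in a local container. The port (5000), the route (/model/predict), and the JSON field name ("text") are assumptions for illustration and should be checked against the README of the specific model being deployed.

```python
import json
import urllib.request

# Assumption: a model container is listening locally on port 5000 and
# exposes a prediction route at /model/predict that accepts JSON.
MAX_URL = "http://localhost:5000/model/predict"


def build_predict_request(text, url=MAX_URL):
    """Build (but do not send) a JSON POST request for a text model."""
    payload = json.dumps({"text": text}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def predict(text, url=MAX_URL, timeout=10):
    """Send the request and decode the JSON reply; needs a running container."""
    request = build_predict_request(text, url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Building the request works offline; predict() is only called once a
# container is actually running.
request = build_predict_request("Luciano Resende works at IBM in California.")
print(request.get_method(), request.full_url)
```

Keeping request construction separate from sending makes the integration easy to unit test without a running model.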
  • 26. Learning from MAX. I am a data scientist and want to learn from MAX serving and deployment patterns. • All MAX code is available at github.com/IBM/MAX*. • Understand and reuse MAX’s inference code in your own projects, as allowed by the open source license. • Understand MAX’s packaging and deployment patterns, which are based on containers and easily deployable in Kubernetes, and apply them to your own models.
  • 27. MAX Summary. Free, open-source models. Wide variety of domains. Multiple deep learning frameworks. Vetted and tested code and IP. Build and deploy a container-based web service in 30 seconds. Start training on Watson Studio in minutes.
  • 28. The IBM Data Asset eXchange. Also known as DAX. A place to find curated free and open datasets under open data licenses. Part of developer.ibm.com.
  • 29. The MAX Named Entity Tagger. A model that identifies mentions of named entities, such as persons and organizations, in English-language text. Trained by Nick Pentreath on the CODAIT team. Most difficult part: finding usable training data.
  • 30. Groningen Meaning Bank. A project at the University of Groningen to create an open data set for training linguistic models like named entity taggers. Public domain data with public domain annotations, assembled by a 10-person team with help from online volunteers. We needed to make further modifications to pass IBM’s own controls.
  • 31. Contracts Proposition Bank. A collection of annotated sentences drawn from IBM’s public contracts. Created by IBM Research. Used by IBM researchers to train better SRL parsers for the legal documents domain. Available on DAX.
  • 32. IBM’s Open Data. IBM Research has produced dozens, perhaps hundreds, of open data sets. The data is not kept in one place. IBM is working to improve this: – Initiatives within IBM Research – DAX – The Community Data License Agreement
  • 33. The Community Data License Agreement, http://cdla.io. A Linux Foundation initiative to create a new legal framework that meets the needs of AI data sets. IBM is a major supporter.
  • 34. The Community Data License Agreement, http://cdla.io. Two licenses written specifically for AI data • CDLA-Sharing: “copyleft” license analogous to the GPL • CDLA-Permissive: similar to the BSD license. Both licenses distinguish clearly between use (analysis, modeling) and modification of the data set.
  • 35. IBM Data Asset eXchange (DAX) • Curated free and open datasets under open data licenses • Standardized dataset formats and metadata • Ready for use in enterprise AI applications • Complement to the Model Asset eXchange (MAX). Data Asset eXchange: ibm.biz/data-asset-exchange. Model Asset eXchange: ibm.biz/model-exchange
  • 36. Is AI Fair? And Transparent?
  • 37. Unwanted bias and algorithmic fairness. Machine learning, by its very nature, is always a form of statistical discrimination. Discrimination becomes objectionable when it places certain privileged groups at a systematic advantage and certain unprivileged groups at a systematic disadvantage. This is illegal in certain contexts.
  • 38. AI Fairness 360 (AIF360). The AIF360 toolkit is an open-source library to help detect and remove bias in machine learning models. The AI Fairness 360 Python package includes a comprehensive set of metrics for datasets and models to test for biases, explanations for these metrics, and algorithms to mitigate bias in datasets and models. Toolbox: fairness metrics (30+), fairness metric explanations, bias mitigation algorithms (10). https://github.com/IBM/AIF360 https://developer.ibm.com/patterns/ensuring-fairness-when-processing-loan-applications/
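To make "fairness metric" concrete, here is a minimal sketch of two widely used group-fairness measures, statistical parity difference and disparate impact, computed by hand. It does not use the AIF360 API, and the loan decisions are made-up numbers chosen only for the example.

```python
def favorable_rate(outcomes):
    """Fraction of favorable outcomes (1 = approved, 0 = denied)."""
    return sum(outcomes) / len(outcomes)


def statistical_parity_difference(unprivileged, privileged):
    """Rate for the unprivileged group minus rate for the privileged group.

    0 means both groups are approved at the same rate; negative values
    mean the unprivileged group is approved less often.
    """
    return favorable_rate(unprivileged) - favorable_rate(privileged)


def disparate_impact(unprivileged, privileged):
    """Ratio of the two approval rates; 1.0 means parity."""
    return favorable_rate(unprivileged) / favorable_rate(privileged)


# Hypothetical loan decisions for two groups of ten applicants each.
privileged = [1, 1, 1, 0, 1, 1, 0, 1, 1, 1]    # 8 of 10 approved
unprivileged = [1, 0, 0, 1, 0, 1, 0, 0, 1, 0]  # 4 of 10 approved

spd = statistical_parity_difference(unprivileged, privileged)
di = disparate_impact(unprivileged, privileged)
print(f"statistical parity difference: {spd:.2f}")
print(f"disparate impact: {di:.2f}")
```

In this toy data the unprivileged group is approved half as often as the privileged group, which is the kind of systematic disadvantage the previous slide describes.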
  • 40. Adversarial Attacks: Defending Machine Learning Systems
  • 41. Adversarial Attacks. Sources: “Explaining and Harnessing Adversarial Examples”; “Robust Physical-World Attacks on Deep Learning Visual Classification”
  • 42. Adversarial Attacks. Sources: “Explaining and Harnessing Adversarial Examples”; “Robust Physical-World Attacks on Deep Learning Visual Classification”
  • 43. Adversarial Attacks: Hiding from Surveillance. https://www.technologyreview.com/f/613409/how-to-hide-from-the-ai-surveillance-state-with-a-color-printout/
  • 44. IBM Adversarial Robustness Toolbox (ART). ART is a library dedicated to adversarial machine learning. Its purpose is to allow rapid crafting and analysis of attack and defense methods for machine learning models. The Adversarial Robustness Toolbox provides implementations of many state-of-the-art methods for attacking and defending classifiers. Toolbox: evasion attacks (11), defenses (9), detection methods for adversarial samples and poisoning attacks, robustness metrics. https://github.com/IBM/adversarial-robustness-toolbox https://developer.ibm.com/patterns/integrate-adversarial-attacks-model-training-pipeline/
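An evasion attack can be illustrated with the fast gradient sign method from “Explaining and Harnessing Adversarial Examples”: nudge every input feature by a small step in the direction that increases the model's loss. The sketch below applies it to a tiny logistic regression in plain Python. It does not use the ART API, and the weights, input, and step size are made-up values for the example.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict_proba(x, w, b):
    """Probability of class 1 under a logistic regression model."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)


def sign(v):
    return (v > 0) - (v < 0)


def fgsm(x, y, w, b, epsilon):
    """Fast gradient sign method for logistic regression.

    For cross-entropy loss the gradient with respect to the input is
    (p - y) * w, so each feature moves by epsilon in the sign of that value.
    """
    p = predict_proba(x, w, b)
    grad = [(p - y) * wi for wi in w]
    return [xi + epsilon * sign(g) for xi, g in zip(x, grad)]


# Hypothetical model and input, chosen only for illustration.
w, b = [2.0, -1.0, 0.5], 0.0
x, y = [0.5, 0.2, 0.4], 1

x_adv = fgsm(x, y, w, b, epsilon=0.4)
print("clean score:      ", round(predict_proba(x, w, b), 3))
print("adversarial score:", round(predict_proba(x_adv, w, b), 3))
```

The perturbation is bounded by epsilon per feature, yet it is enough to push this example across the 0.5 decision threshold. Defenses and robustness metrics exist to measure and limit exactly that effect.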
  • 46. Building your models interactively with the Jupyter Stack
  • 47. Jupyter Notebooks. Notebooks are interactive computational environments in which you can combine code execution, rich text, mathematics, plots, and rich media.
  • 48. JupyterLab. JupyterLab is the next-generation UI for the Jupyter ecosystem. It brings all the previous improvements into a single unified platform, plus more. It provides a modular, extensible architecture and retains backward compatibility with the old notebook we know and love.
  • 49. Jupyter Notebook. Simple, but powerful: as simple as opening a web page, with the capabilities of a powerful, multilingual development environment. Interactive widgets: code can produce rich outputs such as images, videos, markdown, LaTeX, and JavaScript, and interactive widgets can be used to manipulate and visualize data in real time. Language of choice: Jupyter Notebooks support over 50 programming languages, including those popular in Data Science, Data Engineering, and AI, such as Python, R, Julia, and Scala. Big Data integration: leverage Big Data platforms such as Apache Spark from Python, R, and Scala, and explore the same data with pandas, scikit-learn, ggplot2, dplyr, etc. Share notebooks: notebooks can be shared with others using e-mail, Dropbox, Google Drive, GitHub, etc.
  • 50. Enterprise Requirements. Multiuser, self-service, secure. Scale to support analytics workloads: processing large amounts of data in a distributed fashion. Support for heterogeneous AI workloads: resource-intensive workloads, heterogeneous frameworks (isolation required), sharing of hardware resources (GPUs/TPUs).
  • 51. Vanilla Jupyter Notebook. Single user sharing the same privileges: users can see and control each other's processes using Jupyter administrative utilities. Not scalable: Jupyter kernels run as local processes, so resources are limited by what is available on the single node that runs all kernels and associated Spark drivers. [Chart: maximum number of simultaneous kernels (4 GB heap) by cluster size (32 GB nodes): 8 at 4, 8, 12, and 16 nodes.]
  • 52. JupyterHub: JupyterHub brings the power of notebooks to groups of users. It gives users access to computational environments and resources, in a self-service fashion, without burdening the users with installation and maintenance tasks.
  • 53. Jupyter Enterprise Gateway: A lightweight, multi-tenant, scalable and secure gateway that enables Jupyter Notebooks to share resources across an Apache Spark or Kubernetes cluster for Enterprise/Cloud use cases. Jupyter Enterprise Gateway at IBM Code: https://developer.ibm.com/code/openprojects/jupyter-enterprise-gateway/ Source code at GitHub: https://github.com/jupyter/enterprise_gateway Documentation: http://jupyter-enterprise-gateway.readthedocs.io/en/latest/ [Figure: supported kernels and supported platforms, including Spectrum Conductor.]
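To make the gateway idea concrete, here is a minimal sketch of how a client might build a "start kernel" request for an Enterprise Gateway instance. It assumes the gateway exposes the Jupyter kernel REST endpoint `POST /api/kernels` and reads the requesting user from a `KERNEL_USERNAME` entry in the request's `env`; the host, kernelspec name, and user below are hypothetical placeholders, and the request is only constructed, not sent.

```python
import json
from urllib import request


def build_kernel_request(gateway_url, kernel_name, username):
    """Build (but do not send) a kernel start request for a gateway.

    Assumption: the gateway accepts a JSON body with the kernelspec name
    and an "env" mapping, and uses KERNEL_USERNAME to identify the user.
    """
    body = {"name": kernel_name, "env": {"KERNEL_USERNAME": username}}
    return request.Request(
        url=gateway_url.rstrip("/") + "/api/kernels",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical gateway address, kernelspec name, and user.
req = build_kernel_request(
    "http://gateway.example.com:8888", "python_kubernetes", "alice"
)
print(req.get_method(), req.full_url)
```

Passing `req` to `urllib.request.urlopen` would send it to a real gateway; in practice a notebook server is usually pointed at the gateway through configuration rather than hand-written requests.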
  • 54. Jupyter Enterprise Gateway Features. Optimized resource allocation: utilize resources on all cluster nodes by running kernels as Spark applications in YARN cluster mode; pluggable architecture to enable support for additional resource managers. Enhanced security: end-to-end secure communications. Multiuser support with user impersonation: enhance security and sandboxing by enabling user impersonation when running kernels (using Kerberos); individual HDFS home folder for each notebook user; use the same user ID for notebook and batch jobs. [Chart: maximum number of simultaneous kernels (4GB heap) by cluster size (32GB nodes): 16 kernels at 4 nodes, 32 at 8 nodes, 48 at 12 nodes, 64 at 16 nodes.]
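The scaling difference between the two kernel-capacity charts (slides 51 and 54) can be reproduced with a few lines of arithmetic. This is a sketch under stated assumptions: the vanilla figure of 8 is read as 32GB of node memory divided by a 4GB kernel heap on the single notebook node, and the gateway figure of 4 kernels per node is inferred from the chart values (16 kernels on 4 nodes), not stated in the slides.

```python
NODE_MEMORY_GB = 32   # from the chart axis label (32GB nodes)
KERNEL_HEAP_GB = 4    # from the chart axis label (4GB heap)
CLUSTER_SIZES = [4, 8, 12, 16]

# Assumption inferred from the chart (16 kernels / 4 nodes), not from the text.
GATEWAY_KERNELS_PER_NODE = 4


def vanilla_max_kernels(num_nodes):
    """All kernels share one node, so cluster size does not matter."""
    return NODE_MEMORY_GB // KERNEL_HEAP_GB


def gateway_max_kernels(num_nodes):
    """Kernels are spread across the cluster, so capacity grows linearly."""
    return num_nodes * GATEWAY_KERNELS_PER_NODE


vanilla = [vanilla_max_kernels(n) for n in CLUSTER_SIZES]
gateway = [gateway_max_kernels(n) for n in CLUSTER_SIZES]
for n, v, g in zip(CLUSTER_SIZES, vanilla, gateway):
    print(f"{n:>2} nodes: vanilla={v:>2} kernels, gateway={g:>2} kernels")
```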
  • 55. Jupyter Enterprise Gateway 2.x: AI Workloads with Containers. Current version: 2.1.0. Innovations around container environments. Supports vanilla kernels, Spark on K8s, and Docker Swarm. Distributed kernels run as individual containers in either Docker Swarm or Kubernetes environments. Provided kernel images: Python (IPython), Python w/ Spark, Python w/ TensorFlow, Python w/ TensorFlow and GPUs, Scala (Toree) w/ Spark, R (IRKernel), R w/ Spark. JupyterHub integration. Dynamically configurable (reloadable configuration). Deployment with Helm. Jinja templates for kernel configuration.
  • 57. Jupyter Enterprise Gateway - Kubernetes Jupyter Enterprise Gateway & JupyterHub
  • 58. Leveraging AI Platforms for model training
  • 60. Training and deploying models requires a lot of DevOps: model serving, monitoring, resource management, configuration, hyperparameter optimization, reproducibility.
  • 61. AI Platforms: Aim to enable data scientists to train their AI models (e.g. deep neural networks) in a consistent way, independent of the framework in use or the resources required for the job. They leverage the Kubernetes platform's ability to easily manage containerized applications, with the benefits of elasticity and quality of service, as well as sharing of restricted accelerated hardware.
  • 62. Kubeflow: End-to-end ML platform on Kubernetes, originated at Google. Key projects: model training and hyperparameter optimization; model serving; model management; Pipelines (combine components into complex workflows); Metadata (collect data from multiple components).
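As an illustration of what "combine components into complex workflows" means, the sketch below wires three toy components into a dependency graph and runs them in order. It uses only the Python standard library (`graphlib`, Python 3.9+) and is not the Kubeflow Pipelines API; the component names and their logic are hypothetical.

```python
from graphlib import TopologicalSorter


def run_pipeline(components, dependencies):
    """Run each component after its upstream components, passing their outputs."""
    results = {}
    for name in TopologicalSorter(dependencies).static_order():
        upstream = {dep: results[dep] for dep in dependencies.get(name, ())}
        results[name] = components[name](upstream)
    return results


# Hypothetical components: each takes a dict of upstream outputs.
components = {
    "preprocess": lambda up: [1.0, 2.0, 3.0, 4.0],
    # Toy "model": the mean of the preprocessed values.
    "train": lambda up: sum(up["preprocess"]) / len(up["preprocess"]),
    "evaluate": lambda up: {"model": up["train"], "ok": up["train"] > 0},
}

# Maps each component to the components it depends on.
dependencies = {
    "train": {"preprocess"},
    "evaluate": {"train"},
}

results = run_pipeline(components, dependencies)
print(results["evaluate"])
```

A real pipeline system adds what this toy omits: each component runs in its own container, and outputs are passed as stored artifacts rather than in-memory values.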
  • 63. Overall community, and IBM's presence in Kubeflow. Commits in Kubeflow compared with other companies: IBM is the 2nd or 3rd largest contributor in the past 12 months. IBM maintainers (approvers/reviewers) in Katib (HPO + Training), Kubeflow Serving, Manifests, Pipelines, etc. https://www.stackalytics.com/unaffiliated?project_type=kubeflow-group
  • 64. IBMers contributing to: • 590+ Commits • 924K Lines of Code https://www.stackalytics.com/unaffiliated?project_type=kubeflow-group&company=ibm
  • 65. Open Source Resources. Model Asset Exchange: https://developer.ibm.com/code/exchanges/models/ Data Asset Exchange: https://developer.ibm.com/exchanges/data/ AI Fairness 360: https://github.com/IBM/AIF360 Adversarial Robustness Toolbox: https://github.com/IBM/adversarial-robustness-toolbox Jupyter Enterprise Gateway: https://github.com/jupyter/enterprise_gateway Kubeflow: https://github.com/kubeflow Thank you! @lresende1975