An initial version of a maturity roadmap to help guide businesses in adopting AI technology into their workflows. IBM Watson Studio is referenced as an example of technology that can help accelerate the adoption process.
2. About Me
David Solomon, Technical Evangelist, IBM
dsdlsolomo · @dlsolomo · Team-wolfpack
Proud member of the IBM WolfPack

Focus / Passion
• AI, Cognitive, Emerging Technology
• Analytics
• Data (Architecture, Modeling, Integration)
• Cloud Service Architecture
• Applying the above to real-world business problems

Education & Certification
• M.S., Software Engineering
• B.S., Physics
• Data Mgmt, AI, Cloud, Docker, DevOps, …
8. Gain value from your data, without limits
Essential elements of a hybrid data management strategy:
• Access your data: all sources and all types
• Flexibility: support all data types, all workloads, all consumption models
• Machine Learning: make better decisions, provide smarter capabilities
• Democratize access: provide data-driven decisions to everyone
• Simplicity: a unified experience in managing your data landscape
• Cloud journey: support your data regardless of location
11. Introducing an AI Readiness Maturity Model
Maturity stages: Insight Hindered → Hindsight-Driven → Data-Driven → Insight-Driven → AI-Driven

Data Readiness
• Insight Hindered: minimal data mgmt.; spreadsheets are the primary data tool; minimal standards; minimal governance
• Hindsight-Driven: centralized DBs for critical data; some governance; siloed use of unstructured data
• Data-Driven: data integration and governance practice; organized use of unstructured data; siloed Data Science practices
• Insight-Driven: Data Science practices in place; hybrid-data mgmt. practice in place; leverage both cloud and on-prem data
• AI-Driven: fully data-driven business; access to all required AI training data

Analytics Readiness
• Insight Hindered: spreadsheet analysis; desktop BI tools; minimal standards; siloed practices
• Hindsight-Driven: focus on descriptive analytics (What happened?); standardized reporting formats
• Data-Driven: diagnostic analytics; siloed use of predictive analytics; siloed use of Machine Learning models
• Insight-Driven: standard use of Machine Learning; predictive analytics; siloed use of prescriptive analytics
• AI-Driven: prescriptive analytics; fully insight-driven business

Business Outcomes
• Insight Hindered: hindered business outcomes
• Hindsight-Driven: operational efficiency and cost savings
• Data-Driven: competitiveness
• Insight-Driven: competitive advantage
• AI-Driven: market leader

AI Capability
• Insight Hindered: none
• Hindsight-Driven: siloed experimentation
• Data-Driven: limited use for siloed applications
• Insight-Driven: initial production AI applications; some alignment of AI with business strategy
• AI-Driven: standard AI practice; full alignment of AI with business strategy
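The maturity model above can be read as an assessment rubric: an organization sits at some stage in each dimension, and its overall readiness is bounded by its weakest dimension. A minimal sketch in Python (the stage names come from the slide; the "weakest dimension caps the overall stage" scoring rule and the `overall_stage` helper are my own illustration, not part of the model as presented):

```python
# Stages of the AI Readiness Maturity Model, ordered least to most mature.
STAGES = [
    "Insight Hindered",
    "Hindsight-Driven",
    "Data-Driven",
    "Insight-Driven",
    "AI-Driven",
]

def overall_stage(assessment: dict) -> str:
    """Overall readiness is capped by the weakest dimension (an assumption).

    `assessment` maps a dimension (e.g. "Data Readiness") to the stage
    the organization has reached in that dimension.
    """
    weakest = min(STAGES.index(stage) for stage in assessment.values())
    return STAGES[weakest]

org = {
    "Data Readiness": "Data-Driven",
    "Analytics Readiness": "Insight-Driven",
    "AI Capability": "Hindsight-Driven",
}
print(overall_stage(org))  # AI Capability is the bottleneck: "Hindsight-Driven"
```

Representing the model as data like this makes it easy to track an organization's progress across assessments over time.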
13. …and has grown to an entire portfolio of cognitive technologies
Language
• Conversation
• Document Conversion
• Language Translator
• Natural Language Classifier
• Natural Language Understanding
• Personality Insights
• Retrieve and Rank
• Tone Analyzer
Speech
• Speech to Text
• Text to Speech
Vision
• Visual Recognition
Data Insights
• Discovery
• Discovery News
• Watson Knowledge Studio
14. Why are enterprises struggling to capture the value of AI?
Tools & Infrastructure
• Need an environment that enables a “fail fast” approach
• Discrete tools present barriers to productivity
Governance
• If the data isn’t secure, self-service isn’t a reality
• Challenge understanding data lineage and getting to a system of truth
Skills
• Data Science skills are in low supply and high demand
• Nurturing new data professionals is challenging
Data
• Data resides in silos and is difficult to access
• Unstructured and external data wasn’t considered
How can these challenges be tackled in a timely manner?
15. Watson Studio
Supporting the end-to-end AI workflow
Connect & Access Data
• Connect and discover content from multiple data sources in the cloud or on premises.
• Bring structured and unstructured data to one toolkit.
Search and Find Relevant Data
• Find data (structured, unstructured) and AI assets (e.g., ML/DL models, notebooks, Watson Data Kits) in the Knowledge Catalog.
Prepare Data for Analysis
• Clean and prepare your data with Data Refinery, a tool to create data preparation pipelines visually.
• Use popular open source libraries to prepare unstructured data.
Build and Train ML/DL Models
• Democratize the creation of ML and DL models. Design your AI models programmatically or visually with the most popular open source and IBM ML/DL frameworks.
• Leverage transfer learning on pre-trained models using Watson tools to adapt to your business domain.
• Train at scale on GPUs and distributed compute.
Deploy Models
• Deploy your models easily and have them scale automatically for online, batch or streaming use cases.
Monitor, Analyze and Manage
• Monitor the performance of the models in production and trigger automatic retraining and redeployment of models.
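The stages of that workflow (prepare, train, deploy, monitor, retrain) are easiest to see end to end in code. The sketch below is an illustration in plain Python, not the Watson Studio API: the model is a trivial least-squares line, and the function names are my own, chosen only to mirror the stage names on the slide.

```python
# Illustrative end-to-end ML workflow: prepare -> train -> deploy -> monitor.
# Watson Studio provides managed tooling for each of these stages; this toy
# version just shows the shape of the pipeline.

def prepare(rows):
    """Drop incomplete records (data preparation)."""
    return [(x, y) for x, y in rows if x is not None and y is not None]

def train(data):
    """Fit y = a*x + b by ordinary least squares (model training)."""
    n = len(data)
    sx = sum(x for x, _ in data); sy = sum(y for _, y in data)
    sxx = sum(x * x for x, _ in data); sxy = sum(x * y for x, y in data)
    a = (n * sxy - sx * sy) / (n * sxx - sx * sx)
    b = (sy - a * sx) / n
    return a, b

def deploy(model):
    """Wrap the model as a callable scoring endpoint (deployment)."""
    a, b = model
    return lambda x: a * x + b

def monitor(predict, fresh_data, threshold=1.0):
    """Flag the model for retraining when live error drifts (monitoring)."""
    mae = sum(abs(predict(x) - y) for x, y in fresh_data) / len(fresh_data)
    return mae > threshold

raw = [(1, 2.0), (2, 4.1), (None, 3.0), (3, 5.9)]   # one row needs cleaning
model = train(prepare(raw))
predict = deploy(model)
print(monitor(predict, [(4, 8.1), (5, 9.8)]))        # False: no drift yet
```

When `monitor` returns `True`, the loop closes by calling `train` again on the accumulated data and redeploying, which is the "trigger automatic retraining and redeployment" step the slide describes.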
16. Watson Studio
Built for AI teams – enabling team productivity and collaboration

Deb, The Developer
Her job: Builds AI applications that meet the requirements of the business.
What she does:
• Starts PoCs, which include gathering content, dialog building and model training
• Focuses on app building for the team or company to use; will handle ML Ops as needed
Sometimes known as: front-end, back-end, full-stack, mobile or low-code developer

Tanya, Domain Expert
Her job: Transfers knowledge to Watson for a successful user experience.
What she does:
• Draws on a range of domain knowledge and uses it to teach Watson and develop custom models
• As Tanya gains more experience, she optimizes her knowledge to teach Watson to design better end-user experiences
Sometimes known as: subject matter expert, content strategist

Mike, Data Scientist
His job: Transforms data into knowledge for solving business problems.
What he does:
• Runs experiments to build custom models that solve business problems
• Uses techniques such as Machine Learning or Deep Learning, and works with Tanya to validate the success of trained models
Sometimes known as: ML/DL engineer, modeler, data miner

Ed, Data Engineer
His job: Architects how data is organized and ensures operability.
What he does:
• Builds data infrastructure and ETL pipelines; works with Spark, Hadoop, and HDFS
• Works with data scientists to transform research models into production-quality systems
Sometimes known as: data infrastructure engineer
17. Watson Studio
Comprehensive set of tools for the end-to-end AI workflow
Authoring Tools (Watson API Tools, Model Builder, Data Refinery)
• Best-of-breed open source & IBM tools
• Code (R, Python or Scala) and no-code/visual modeling tools
Model Lifecycle Management
• Create, collaborate, deploy, and monitor
Machine Learning Runtimes / Deep Learning Runtimes
• Most popular open source frameworks
• IBM best-in-class frameworks
Cloud Infrastructure as a Service
• Fully managed service
• Container-based resource management
• Elastic pay-as-you-go CPU/GPU power
18. Watson Studio
Differentiating Capabilities

Integrated Collaboration Environment
• Data Scientists, Subject Matter Experts, Business Analysts & Developers all in one environment to accelerate innovation, collaboration and productivity
• Built-in learning to get started or go the distance with advanced tutorials

Choice of Tools for the full AI lifecycle
• Best-of-breed open source and IBM tools that support the end-to-end AI lifecycle
• Choice of code or no-code tools to build and train your own ML/DL models, or easily train and customize pre-trained Watson APIs

Support for all levels of expertise
• Use Watson smarts and recommendations for the best algorithms to use given your data, or
• Use the rich capabilities and controls to fine-tune your models

Experiment-centric DL workflow
• Monitor batch training experiments, then compare cross-model performance without worrying about log transfers and scripts to visualize results
• You focus on designing your neural networks; we’ll manage and track your assets

Model lifecycle & management
• Deploy models into production, then monitor them to evaluate performance
• Capture new data for continuous learning, and retrain models so they continually adapt to changing conditions

Integrated with Knowledge Catalog
• Intelligent discovery of data and AI assets that enables reuse & improves productivity
• Seamlessly integrated for productive use with Machine Learning and Data Science
• Powerful governance tools to control and protect access to data
For too long, data has been held captive within our systems of record: isolated by the rigidity of platform, application, and workload choices, and segregated by business line, business function, and data type or initial usage.
The result is splintered views of segmented data that are difficult to access as a whole, making it impossible to gain true analytical insight.
And even this only speaks to today's snapshot and current models. The challenges are compounded as businesses look to change, grow, iterate on practices, innovate, or disrupt markets.
Attempts at data science, machine learning, and deep learning are rendered moot by the fact that insights are only as good as the access to supporting data, which again is too fragmented to provide full value.
We believe that, in order to change this paradigm, a hybrid data management strategy should contain these elements:
• Access to all data, regardless of source or type
• The flexibility to support changing workloads and consumption cases
• Intelligent analytics, such as machine learning, at the data source
• Access to insights across the business, its functions, and all its users, for better decision making
# # #
You need three essential elements on your journey to digital transformation.
1. You need to know your data. Typically this means building a 360-degree view of your focus area, for example a 360-degree view of your customer. You need to gather your internal data and may also need to include external data from social media, click stream, census or other relevant sources.
This data must also be accessible by all users and applications that need it. This could mean making data globally accessible or running applications in the cloud. Consider that an application may need to access data from multiple data sources, so providing a common access layer is important to reduce application coding.
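A common access layer of the kind described here can be sketched in a few lines: applications code against one interface while per-source connectors handle the details. This is a hypothetical illustration; the class and field names (`DataAccessLayer`, `customer_360`, `"customer"`) are my own and not from any particular product.

```python
# Hypothetical common access layer: one query interface fanning out to
# multiple data sources to assemble a 360-degree view of a customer.

class CsvSource:
    """Connector over rows exported from a file-based source."""
    def __init__(self, rows): self.rows = rows
    def fetch(self, key):
        return [r for r in self.rows if r.get("customer") == key]

class ApiSource:
    """Connector over records pulled from an external API."""
    def __init__(self, records): self.records = records
    def fetch(self, key):
        return [r for r in self.records if r.get("customer") == key]

class DataAccessLayer:
    """Single entry point that fans a query out to every registered source."""
    def __init__(self): self.sources = []
    def register(self, source): self.sources.append(source)
    def customer_360(self, key):
        # Merge results from all sources into one combined view.
        view = []
        for source in self.sources:
            view.extend(source.fetch(key))
        return view

dal = DataAccessLayer()
dal.register(CsvSource([{"customer": "acme", "orders": 3}]))
dal.register(ApiSource([{"customer": "acme", "tickets": 1}]))
print(dal.customer_360("acme"))  # one combined view from both sources
```

The point of the pattern is that adding a new data source means writing one connector, not touching every application.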
2. You need to be able to trust your data. Well-governed data provides confidence in not just the data itself, but the outcomes from analytics, reports and other tasks based on that data. There are two key points to data governance: First, you must have the ability to ensure the data is secure and adheres to compliance regulations. And second, you must have the ability to govern the data so your users can find and access information themselves, at the exact time they need it.
3. You must be able to use your data as a source for insights and intelligence. This means having not only the right skills and tools in place to surface insights, but also the right technology to learn from the data and improve accuracy each time that data is analyzed.
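The "improve accuracy each time that data is analyzed" idea is the core of online learning. A toy stdlib-only illustration (real systems would use proper online learning algorithms, but the shape of the loop is the same): a predictor whose estimate is refined incrementally with every new observation.

```python
# Toy model that improves as more data is analyzed: an incrementally
# updated mean predictor. Each observation refines the estimate in place.

class RunningMeanModel:
    def __init__(self):
        self.n = 0
        self.mean = 0.0

    def update(self, value):
        # Incremental mean update: no need to reprocess historical data.
        self.n += 1
        self.mean += (value - self.mean) / self.n

    def predict(self):
        return self.mean

model = RunningMeanModel()
for observed in [10.0, 12.0, 11.0, 11.5]:
    model.update(observed)
print(model.predict())  # 11.125
```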
Three years ago, Watson made its debut on the U.S. quiz show Jeopardy! in a very public proof point of radical new technology. Jeopardy! was the result of an IBM Grand Challenge: putting top scientists to work on a seemingly impossible task. IBM undertakes Grand Challenges every decade or so. The previous Grand Challenge was Deep Blue in 1997, a chess-playing computer that won its second six-game match against world champion Garry Kasparov by two wins to one, with three draws.
Whether you attended one of the many IBM watch parties, watched the show at home, viewed it on YouTube later, or just read the newspapers, you witnessed history. Watson bested the two top champions, including Ken Jennings, who had won 74 games and over $3M, the longest winning streak in Jeopardy! history.
Not only did Watson win, but in doing so it ushered in a whole new era of computing.
Additional Background:
What fascinated the IBM researchers was that Jeopardy! was the ultimate test of IT capabilities, because it relied on many human cognitive abilities traditionally seen as beyond the reach of computers, such as:
• The ability to discern double meanings of words, puns, rhymes, and inferred hints
• Extremely rapid responses (sifting through 200 million pages of information in a span of seconds)
• The ability to process vast amounts of information to make complex and subtle logical connections
A team of 15 IBM researchers worked in collaboration with a pool of top universities on the "DeepQA" project. For the Watson team, replicating these human capabilities was an enormous challenge: moving beyond keyword searches and queries of structured data to asking questions of, and assessing, a vast amount of unstructured data to find the best answer. But IBM knew that the solution to this challenge had the potential to change the way businesses use information and make decisions.