DevOps Spain 2019. Olivier Perard-Oracle


Talk. DataOps: The continuous deployment cycle in data analysis

Published in: Technology

  1. DataOps: The continuous deployment cycle in data analysis. Olivier Perard | Data Scientist at Oracle
  2. DataOps definitions. VP Technology Strategy, MapR: "DataOps is an agile methodology for developing and deploying data-intensive applications, including data science and machine learning. A DataOps workflow supports cross-functional collaboration and fast time to value." Gartner (http://www.gartner.com/it-glossary/data-ops/): "A hub for collecting and distributing data, with a mandate to provide controlled access to systems of record for customer and marketing performance data, while protecting privacy, usage restrictions, and data integrity." Tamr CEO Andy Palmer: "DataOps is an enterprise collaboration framework that aligns data-management objectives with data-consumption ideals to maximize data-derived value." Nexla CEO: "DataOps is the function within an organization that controls the data journey from source to value."
  3. DataOps. Gartner Data & Analytics Summit 2018: DataOps, private cloud database platform as a service (dbPaaS), and machine-learning-enabled data management. "DataOps is a new practice with no standards or frameworks." Nick Heudecker, research vice president at Gartner
  4. Comparing DevOps and DataOps: what’s different or the same? Developers & Architects; Data Engineers; Data Scientists; Security & Governance; Operations. DataOps; DevOps; DataOps.
  5. DataOps brings flexibility and focus: expands DevOps to include data-heavy roles; organized around data-related goals; better collaboration and communication between roles.
  6. DataOps: an agile methodology for data-driven organizations. Axioms: data is central to disruptive enterprise applications (lightweight, stateless functions do not represent the majority of workloads); data science and machine learning are an important paradigm (scientists become active users, no longer just application developers; iterative workflow with different data usage patterns); data volumes continue to grow; moving data is a performance bottleneck. DataOps goals: continuous model deployment; promote repeatability; promote productivity (focus on core competencies); promote agility; promote self-service.
  7. DataOps. Analyze and Visualize; Store and Process; Connect and Integrate. Structured data; unstructured data. Sandboxes; data lakes; data marts / warehouses; varying data types. Quick and actionable business insights. Focus on algorithms, not infrastructure. Data available from structured and unstructured sources. Data platform; data stream; data analytics.
  8. Data Science Platforms. Cloud providers; ETL & data engineering; vertical applications; BI & visualization tools; security; infrastructure; libraries; tools; data platforms; data science platforms.
  9. DataOps approach advantages. Data self-service: data scientists need to develop use cases quickly using the enterprise’s data without any restrictions from IT. Improved efficiency and better use of the team’s time: deploy the analytic platform in one click. Faster time-to-value. Improve productivity: implement use cases in parallel using the same data, but with a dedicated platform for each analytic team. Storage; compute; libraries; tools; data science platforms.
  10. DataOps: Continuous Model Deployment. Key building blocks for agility: unified data platform; data governance; self-service data and compute access; multitenancy and resource management. Data engineering; model development; model management; model deployment; model monitoring and rescoring.
  11. DataOps. Storage; Compute; Data Lab; Sandbox; Data Pod.
  12. DataOps: Data Platform Deployment. Oracle GitHub OCI Ansible Modules; Oracle Database 12c; Jupyter; Zeppelin; OML. 1 2 Data Integration CDC / ETL 3 Data Lab
  13. DataOps: Data-Driven Architecture. Traditional and modern: legacy, custom, mainframe, SaaS, microservices, … Source: Oracle Insight. Data Platform. Analytics: advanced analytics, self-service, predictive. Data Science: machine learning, deep learning. Modern Data Platform. Security & Compliance. Data Applications. Real-time Analytics: real-time marketing, fraud detection, exec dashboarding. Real-time; Real-time Services; {OOP}; SparklineData. Accessing multiple sources of data (technologies, silos/locations, clouds) … with high performance … for broader cross multi-model queries/algorithms on real-time data as well as historical data. Applications; BigData SQL.
  14. DataOps: Cloud Native & Open Source. Community: Artificial Intelligence; Blockchain; Internet of Things; Container Native; Microservices; Open Serverless Computing; DevOps; Prometheus. Open source cloud native innovation. Open source cloud native development. Istio. Cloud-native and community-driven innovation. Open source, managed and autonomous cloud native.
  15. DataOps: Data Stream. Data preparation; data replication; data ETL; logs. Oracle Cloud Infrastructure. Analytics; consumers; data platform; BI; NL / AI. Data integration; CDC / ETL. Discovering; structuring; cleaning; enriching; validating; deploying.
  16. DataOps: Data Stream. Lineage; pipeline; quality; speed; efficiency.
  17. Oracle Data Science. Data science requires a comprehensive platform to simplify operations and deliver value at scale: accelerate use of proper tools, frameworks and infrastructure; overcome restricted skillsets with a simple, collaborative platform; quickly leverage predictive analytics to drive positive business outcomes. Collaborate securely; power business; work in standardized environments; manage data and tools. A robust, easy-to-use data science platform removes barriers to deploying valuable machine learning models in production.
  18. Oracle Data Science: Projects LifeCycle. Reproducibility: data versioning, code versioning, model versioning, environment management. Model Deployment: operationalize models as scalable APIs. Model Management: monitor and optimize model performance. Data Exploration: collaborative data analysis / feature engineering. Model Build and Train with open source frameworks. Collaborators: data scientists, business stakeholders, app developers, IT admins. Business Analyst/Leader: defines the business problem and the objective of analyses. Data Engineer: prepares data, builds pipelines, and provides data access for analytical or operational uses. IT Admin: oversees underlying process, architecture, operations, resource constraints. Data Scientist: analyzes data using statistical methods and coding languages like Python, R, Scala. Application Developer: deploys data science models into applications and builds data products.
  19. Oracle Data Science: Modules. Collaborative; Integrated; Enterprise-Grade. Oracle Data Science Cloud; Oracle PaaS & IaaS. Projects; Notebooks; Open Source Languages & Libraries; Version Control; Use Case Templates; Model Build & Train; Self-Service Scalable Compute (OCI); Object Store; Catalog; Data Lake; Streaming; Autonomous Data Warehouse; Model Deployment; Model Monitoring; Access Controls & Security. Project-driven UI enables teams to easily work together on end-to-end modeling workflows with self-service access to data and resources. Support for the latest open source tools, version control, and tight integration with OCI and Oracle Big Data Platform. A fully managed platform built to meet the needs of the modern enterprise.
  20. Oracle Data Science: Environment complexity
  21. Oracle Data Science: Configure, Train & Deploy. Oracle PaaS. Language; Image; Video; HR; Emotion. Easy Deployment. 3 Deploy: Model; Train; Data Definition; Model Test; Publish API; Data Select; Code Notebook. 2 Train: frameworks; AI libraries; samples; GPU clusters; connect to data; auto scale, updates; HS network, storage; object stores; Database CS; Spark. Easy Data Access. 1 Configure: Autonomous Setup; Model Sharing; Model Library; APIs; Model Analytics. IT Persona; DevOps; Data Scientist; Data Scientist. Easy Development. Easy setup.
  22. Oracle Data Science: Build & Train. DEV; TEST; PROD.
  23. Oracle Data Science: Deploy. DEV; TEST; PROD.
  24. DataOps: Conclusions. Multi-model data access; interoperability; data preparation and pipeline; automation; elasticity; multidimensional agility; automated governance. Next-generation platform for all data. Complete, integrated, open. AI and machine learning. All in one. Oracle provides.
  25. Thank you very much. Olivier Perard. https://twitter.com/oracle_es?lang=es
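
Slide 18 names data versioning, code versioning, model versioning, and environment management as the basis for reproducibility. As a rough illustration of that idea (not something taken from the slides), the sketch below derives a single identifier for a training run from those ingredients. The function name `fingerprint`, the commit string, and the parameter dictionary are all hypothetical, and the Python version stands in for a fuller environment description.

```python
import hashlib
import json
import platform


def fingerprint(data_rows, code_version, params):
    """Return a SHA-256 hex digest identifying one training run.

    data_rows    -- JSON-serializable training data (hypothetical sample)
    code_version -- e.g. a Git commit hash (assumed to be supplied by the caller)
    params       -- JSON-serializable model hyperparameters
    """
    # Hash the data separately so large datasets reduce to a short digest.
    data_digest = hashlib.sha256(
        json.dumps(data_rows, sort_keys=True).encode("utf-8")
    ).hexdigest()

    payload = {
        "data": data_digest,
        "code": code_version,
        "params": params,
        # Stand-in for environment management: only the interpreter version.
        "python": platform.python_version(),
    }
    # sort_keys makes the serialization, and so the digest, deterministic.
    blob = json.dumps(payload, sort_keys=True).encode("utf-8")
    return hashlib.sha256(blob).hexdigest()


if __name__ == "__main__":
    rows = [{"id": 1, "x": 0.5}, {"id": 2, "x": 1.5}]
    print(fingerprint(rows, "abc123", {"lr": 0.1}))
```

Two runs with the same data, code version, parameters, and interpreter yield the same identifier, so a model stored under that identifier can be traced back to what produced it. A change to any one ingredient produces a different identifier.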
