Se ha denunciado esta presentación.
Se está descargando tu SlideShare. ×

Developing and deploying AI solutions on the cloud using Team Data Science Process (TDSP) and Azure Machine Learning (AML), Presented @ GAIC Jan 18 2018, Santa Clara

Anuncio
Anuncio
Anuncio
Anuncio
Anuncio
Anuncio
Anuncio
Anuncio
Anuncio
Anuncio
Anuncio
Anuncio

Eche un vistazo a continuación

1 de 38 Anuncio

Developing and deploying AI solutions on the cloud using Team Data Science Process (TDSP) and Azure Machine Learning (AML), Presented @ GAIC Jan 18 2018, Santa Clara

Descargar para leer sin conexión

Presented at: Global Big AI Conference, Santa Clara, Jan 2018 Developing and deploying AI solutions on the cloud using Team Data Science Process (TDSP) and Azure Machine Learning (AML)

Presented at: Global Big AI Conference, Santa Clara, Jan 2018 Developing and deploying AI solutions on the cloud using Team Data Science Process (TDSP) and Azure Machine Learning (AML)

Anuncio
Anuncio

Más Contenido Relacionado

Presentaciones para usted (20)

Similares a Developing and deploying AI solutions on the cloud using Team Data Science Process (TDSP) and Azure Machine Learning (AML), Presented @ GAIC Jan 18 2018, Santa Clara (20)

Anuncio

Más reciente (20)

Developing and deploying AI solutions on the cloud using Team Data Science Process (TDSP) and Azure Machine Learning (AML), Presented @ GAIC Jan 18 2018, Santa Clara

  1. 1. Developing and deploying AI solutions on the cloud using Team Data Science Process (TDSP) and Azure Machine Learning (AML) Global Artificial Intelligence Conference January 18, 2018, Santa Clara, CA Deck available @: https://aka.ms/tdsp-presentations
  2. 2. Agenda Process Tools (algos, IDEs, workbench) Platform (compute, storage, ...) Process, tools and platforms for AI solutions
  3. 3. Agile and iterative process to develop, deploy and manage AI applications in the cloud https://aka.ms/tdsp tdsp-feedback@microsoft.com Business Understanding Data acquisition & understanding Modeling (ML/AI) Deployment
  4. 4. The opportunity and challenge of data science in enterprises Opportunity: 17% had a well-developed Predictive/Prescriptive Analytics program in place, while 80% planned on implementing such a program within five years – Dataversity 2015 Survey Challenge: Only 27% of the big data projects are regarded as successful – CapGenimi 2014 Tools & data platforms have matured - Still a major gap in executing on the potential
  5. 5. One reason: Process challenge in Data Science Organization Collaboration Quality Knowledge Accumulation Agility Global Teams • Geographic Locations Team Growth • Onboard New Members Rapidly Varied Use Cases • Industries and Use Cases Diverse DS Backgrounds • DS have diverse backgrounds, experiences with tools, languages
  6. 6. Why is a process useful for Data Science? A process is a detailed sequence of activities necessary to perform specific business tasks It is used to standardize procedures and establish best practices Technology and tools are changing rapidly. A standardized process can provide continuity and stability of work-flow. - Based on discussions with Luis Morinigo, Dir. IoT, NewSignature
  7. 7. Data Science can borrow processes from DevOps
  8. 8. TDSP objective Integrate DevOps with data science workflows to improve collaboration, quality, robustness and efficiency in data science projects o Infrastructure as Code (IaC) o Build o Test o CI / CD o Release management o Performance monitoring
  9. 9. TDSP features for data science teams Standardized Data Science Lifecycle Project Structure, Templates & Roles Infrastructure & Toolkits Re-usable Data Science Utilities Project Execution (incl. DevOps components)
  10. 10. Data Science lifecycle  Primary stages: Lifecycle
  11. 11. TDSP lifecycle stages can be integrated with specific deliverables & checkpoints Business Understanding • Project Objective • Data, Target & Feature Definition • Data Dictionary Data acquisition and understanding Modeling Deployment Lifecycle
  12. 12. Project roles & tasks Personas: account managers, PMs, DSs, SWEs, architects  Governance and Project Management  AI Developers Structure,templates,roles NOTE: Specific roles depend on organizations
  13. 13. Shared and distributed infrastructure & toolkits Infrastructure&toolkits
  14. 14. Agile work planning and execution template o Use Agile work planning & execution template (data science specific) https://commons.wikimedia.org Scrum framework Projectexecutionguidelines
  15. 15. Collaborative development guidelines Remote Local LocalLocal Local TDSP git Template Integrated Agile planning & code development Projectexecutionguidelines
  16. 16. Integrating DevOps in projects Projectexecutionguidelines
  17. 17. Re-usable data science utilities: Analytics Interactive data exploration and reporting – IDEAR (Python, R, MRS) o Data quality assessment o Getting business insights from the data o Association between variables o Generating standardized data quality reports automatically Clustering Distribution assessment https://github.com/Azure/Azure-TDSP-Utilities DSutilities
  18. 18. Re-usable data science utilities: Analytics - modeling Automated modeling and reporting AMAR (R) Predicted vs. Actual (multiple algorithms) Feature Importance (multiple algorithms) https://github.com/Azure/Azure-TDSP-Utilities DSutilities
  19. 19. TDSP documentation: https://aka.ms/tdsp
  20. 20. TDSP trainings 20
  21. 21. TDSP in action: E2E worked-out samples  Azure Machine Learning
  22. 22. Adoption: How to stage (if needed) Level1 - One git repository per project - Standard directory structure - Standardized templates like charter, exit reports - Planning and tracking of work items Level2 - Customize templates to fit team needs - Create shared team utility repo (like IDEAR, AMAR) Leve3 - Develop process to graduate code from projects to the shared team utility repo - Develop E2E worked-out templates - Use mature work planning and tracking system (e.g. Agile) Level4 - Link git branch with work items - Code review - Manage and version model and data assets - Develop automated testing framework - Develop automated CI/CD
  23. 23. Adoption: Customers  Microsoft internal  Microsoft consulting services (MCS)  AI & R Cloud Platform: Algorithm and data sciences team  Windows Devices DS team  …  … • External partners • New Signature • BlueGranite
  24. 24. https://docs.microsoft.com/azure/machine-learning/preview
  25. 25. Apps + insightsSocial LOB Graph IoT Image CRM INGEST STORE PREP & TRAIN MODEL & SERVE Data orchestration and monitoring Data lake and storage Hadoop/Spark/SQL and ML Azure Machine Learning IoT Azure Machine Learning 25
  26. 26. Bring AI everywhere Build with the tools and platforms you know Build, deploy, and manage at scale Boost productivity with agile development Benefit from the fastest AI developer cloud 26
  27. 27. Build, deploy, and manage at scale  Build and deploy everywhere – cloud, on-premises, edge, and in-data  Deploy in minutes with data driven management and retraining of all your models  Prototype locally then scale up and out with VMs, Spark clusters, and GPUs  Real time, high through-put insights everywhere, including Excel integration 27
  28. 28. Deployment and management of models as HTTP services Container-based hosting of real time and batch processing Management and monitoring through Azure (e.g., AppInsights) First class support for SparkML, Python, CNTK, TF, TLC, R, extensible to support others (Caffe, MXnet) Service authoring in Python and .NET Core Manage models 28
  29. 29. DOCKER Single node deployment (cloud/on-prem) Azure Container Service Azure IoT Edge Spark clusters Deploy everywhere 29
  30. 30. Boost productivity with agile development collaboration and sharing with notebooks and Git version control and reproducibility metrics, lineage, run history, asset management 30
  31. 31. Build with tools and platforms you know  Choose between visual drag-and-drop or code-first authoring your favorite IDEs any framework or library with the most popular languages  Train quicker and easier with industry-leading Spark and GPUs 31
  32. 32. Use your favorite IDE Leverage all types of platforms and tools/libraries Use what you want U S E T H E M O S T P O P U L A R I N N O VAT I O N S U S E A N Y TO O L U S E A N Y F R A M E W O R K O R L I B R A RY 32
  33. 33. Using TDSP with Azure Machine Learning
  34. 34. TDSP – AML worked-out samples in AI 35 https://docs.microsoft.com/en-us/azure/machine-learning/team-data-science-process/walkthroughs
  35. 35. Sample: Sentiment-specific word embeddings (SSWE) improves classification of sentiments https://docs.microsoft.com/en-us/azure/machine-learning/team-data-science-process/predict-twitter-sentiment http://www.aclweb.org/anthology/P14-1146
  36. 36. Summary  A well-defined data science process is critical to bridge the gap between opportunities and challenges, and put AI models in production to impact businesses  A combination of appropriate process (e.g. TDSP), tools (e.g. AML) and data- platforms (e.g. DSVM) is important for efficient development and deployment of AI solutions  TDSP – process  AML – tool  A process such as TDSP can be used with a variety of tools and data platforms  Resources are available to use TDSP and AML together, along with other data platforms on Azure to efficiently build and deploy AI solutions on the cloud
  37. 37. https://aka.ms/tdsp tdsp-feedback@microsoft.com Deck available @: https://aka.ms/tdsp-presentations

×