
AI Orange Belt - Session 2

Harness the power of Artificial Intelligence abilities - session 2 of AI Orange Belt training for business managers & executives.
www.aiblackbelt.com

Published in: Education

  1. AI Orange Belt Session 2 - Harness the power of AI abilities
  2. AI ORANGE BELT SKILLS (© PROPERTY OF AI BLACK BELT). ORANGE BELT. DEFINITION: the prerequisites (what AI is and how it works in real life). PROJECT: how to manage and implement an artificial intelligence project.
  3. What have we seen last time?
     • AI is a form of advanced computer science that learns from data in order to generalize on narrow tasks, as opposed to regular software, which follows hardcoded instructions.
     • AI can be subdivided into supervised learning (the bulk of modern applications), unsupervised learning (grouping, mainly for visualization and exploration) and reinforcement learning (difficult to implement, but powerful in some optimization problems that involve taking actions).
     • The tasks AI can solve can broadly be divided into: classification, prediction, clustering, outlier detection, recommendation, data generation.
     • The subdomains of application can be determined by the input/output data types: vision, NLP (text & speech), classic structured data, robotics.
  4. Supervised Learning
  5. Can you write a computer program that does that? Labels: “apple”, “orange”, “banana”, “apple”.
         if color == "green":
             return "apple"
         elif color == "orange":
             return "orange"
         else:
             return "banana"
  6. Labels: “apple”, “orange”, “banana”, “banana”.
         if shape == "round":
             if color == "green":
                 return "apple"
             elif color == "orange":
                 return "orange"
         else:
             return "banana"
  7. Labels: “apple”, “orange”, “banana”, “banana”.
         if shape == "round":
             if color == "green":
                 return "apple"
             elif color == "orange":
                 return "orange"
         else:
             return "banana"
  8. Labels: “apple”, “orange”, “banana”, “banana”, ?, “apple”
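The hardcoded rules on slides 5 to 7 stop working as soon as an unexpected fruit shows up (the "?" on slide 8). As a rough illustration of the supervised alternative, the sketch below learns from labelled examples instead of rules. The training examples, the feature encoding and the names `TRAINING_DATA`, `similarity` and `predict` are assumptions made for this example, and the nearest-neighbour method is just one simple choice, not the method used in the deck.

```python
# Hypothetical labelled examples: (color, shape) -> fruit name.
TRAINING_DATA = [
    (("green", "round"), "apple"),
    (("red", "round"), "apple"),
    (("orange", "round"), "orange"),
    (("yellow", "long"), "banana"),
    (("green", "long"), "banana"),
]


def similarity(a, b):
    """Count how many features two examples have in common."""
    return sum(1 for x, y in zip(a, b) if x == y)


def predict(features, training_data=TRAINING_DATA):
    """Return the label of the most similar labelled example (1-nearest neighbour)."""
    _, best_label = max(
        training_data, key=lambda example: similarity(features, example[0])
    )
    return best_label


print(predict(("red", "round")))   # a red apple, which the rules on slide 7 do not cover
print(predict(("green", "long")))  # an unripe banana
```

Adding a new kind of fruit then means adding labelled examples rather than rewriting the program.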
  9. Input (A) → Output (B): Application
     • email → intent detection: Text Classification
     • audio → text transcript: Speech Recognition
     • French → Dutch: Machine Translation
     • Other applications listed without input/output text: Object Detection, Object Verification or Identification, Anomaly Detection
  10. Unsupervised Learning
  11. Reinforcement Learning
  12. What have we seen last time?
      • AI is a form of advanced computer science that learns from data in order to generalise on narrow tasks, as opposed to regular software, which follows hardcoded instructions.
      • AI can be subdivided into supervised learning (the bulk of modern applications), unsupervised learning (grouping, mainly for visualisation and exploration) and reinforcement learning (difficult to implement, but powerful in some optimisation problems that involve taking actions).
      • The tasks AI can solve can broadly be divided into: classification, prediction, clustering, outlier detection, recommendation, data generation.
      • The subdomains of application can be determined by the input/output data types: vision, NLP (text & speech), classic structured data, robotics.
  13. Our plan for today - the real world
      • Understand the limits of AI and the main biases that arise when creating intelligent machines in real life
      • The lifecycle of an AI application, and how it differs from regular workflows
      • How to detect opportunities / use cases and evaluate their impact on the company's revenue: cost per task, revenue per task
      • Team management, project management (create and deploy) and data management
  14. (no text)
  15. What barriers are faced at work?
  16. Applied AI Lifecycle: 01 Identify, 02 Data, 03 Model, 04 Evaluate, 05 Deploy, 06 Maintenance
  17. Key steps of a machine learning project. Example: Echo / Alexa. 1. Collect data. 2. Train model (iterate many times until good enough). 3. Deploy model (get data back, maintain / update the model). Lifecycle: 01 Identify, 02 Data, 03 Model, 04 Evaluate, 05 Deploy, 06 Maintenance. Source: deeplearning.ai
  18. Key steps of a machine learning project. Example: self-driving car. 1. Collect data (image → position of other cars). 2. Train model. 3. Deploy model (get data back, maintain model). Lifecycle: 01 Identify, 02 Data, 03 Model, 04 Evaluate, 05 Deploy, 06 Maintenance. Source: deeplearning.ai
  19. Applied AI Lifecycle. 01 Identify: select the right question; choose the performance metric; decide the level of explainability
  20. Discover what’s possible: what would be helpful for your business?
  21. Anything you can do with 1 second of thought can probably be automated today
  22. Input text: “The toy arrived two days late, so I wasn’t able to give it to my nephew for his birthday. Can I return it?”
      • Refund/Shipping/Order: “Refund request”
      • Simple response: “Yes you can. The refund procedure is ...”
      • Complex personalised empathetic response: “Oh sorry to hear that! I hope your nephew had a good birthday. Yes, we can help with ...”
  23. Diagnose pneumonia from ~10,000 images; diagnose pneumonia from 10 images in a medical textbook; ask the model to perform on a new type of data
  24. Take a (deep) look at your work: break down your workflow and your business unit
  25. (no text)
  26. Idea 1: ask a simple guesswork labelling question. Baby food ingredient: safe or spoiled? Patient: ideal medication dosage? Email: spam or ham? Recorded phone call to call center: issue topic? Bottle of wine: will I like it or not? Steering wheel: left or right? Photo: which animal? Game piece: which location on the board? Start of a sentence: end of that sentence? Stock: tomorrow’s price? Transaction: legitimate or fraudulent? Data center cooling system: warmer or cooler? Machine: when will it need maintenance? Inventory: when to restock? Scene description: pixels in a visual rendering? Today’s temperature: tomorrow’s temperature? Auction: how much to bid? Movie: will you like it or not? Live lecture: text captions? Poem: what does it sound like out loud? Image of an invoice: total amount in dollars? Service request: waiting time? Expense report: budget category? Sound recording: correct text captions? Song lyrics: language? Sentence in English: same meaning in Chinese? Form incorrectly filled out: correct fields? Clothing item: skirt or blouse or …? Video: which actors? Video game: joystick motion? Toilet user: did they wash their hands?
  27. (no text)
  28. Idea 2: find the ROI of (cheap) prediction. Level 1: as an optimisation tool. Level 2: as an improvement / help / recommendation. Level 3: as a new feature / product.
  29. (no text)
  30. Discover opportunities - brainstorming framework
      1. Think about automating tasks rather than jobs!
      2. What are the main drivers of business value?
      3. What are the main pain points in your business?
      4. How much data is needed? Is my data clean? Are we mature in terms of data?
  31. What AI can do (AI experts) / Valuable cases for your business (domain experts) / Cross-functional team
  32. (no text)
  33. (no text)
  34. (no text)
  35. Real life case studies: from core business to low-hanging fruits
  36. Recommendations: “35 percent of what consumers purchase on Amazon come from product recommendations” https://www.mckinsey.com/industries/retail/our-insights/how-retailers-can-keep-up-with-consumers
  37. Amazon Go: “Amazon.com Inc. may open up to 3,000 Amazon Go outlets by 2021” https://www.bloomberg.com/news/articles/2018-09-19/amazon-is-said-to-plan-up-to-3-000-cashierless-stores-by-2021
  38. Amazon Echo
  39. Amazon Robotics
  40. Amazon Prime Air
  41. (no text)
  42. Choose the performance metric: what are you willing to lose?
  43. North star metric?
  44. Optimizing vs satisficing metric
      • Use case 1: cat classifier. Possible metrics: Cost = accuracy - 0.5 x running time, or: maximize accuracy subject to running time <= 100 ms
      • Use case 2: detect trigger words for an Amazon Alexa device. Possible metrics: accuracy, or: maximize accuracy subject to <= 1 FP (false positive) every 24 hours
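To make the "maximize accuracy subject to running time <= 100 ms" formulation from slide 44 concrete, here is a minimal sketch. The candidate models and their numbers are invented for illustration, and `select_model` is a hypothetical helper, not part of any library.

```python
# Hypothetical candidate models: name, accuracy, running time in milliseconds.
CANDIDATES = [
    {"name": "A", "accuracy": 0.90, "running_time_ms": 80},
    {"name": "B", "accuracy": 0.92, "running_time_ms": 95},
    {"name": "C", "accuracy": 0.95, "running_time_ms": 1500},
]


def select_model(candidates, max_running_time_ms=100):
    """Pick the most accurate model among those that satisfy the time limit.

    Running time is the satisficing metric (it only has to be good enough);
    accuracy is the optimizing metric (higher is always better).
    """
    feasible = [m for m in candidates if m["running_time_ms"] <= max_running_time_ms]
    if not feasible:
        return None  # no model meets the constraint
    return max(feasible, key=lambda m: m["accuracy"])


best = select_model(CANDIDATES)
print(best["name"])  # model C is more accurate but far too slow, so B is selected
```

Compared with a single blended score such as accuracy - 0.5 x running time, the constraint form avoids having to choose an exchange rate between accuracy and speed.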
  45. Applied AI Lifecycle. 01 Identify: select the right question; choose the performance metric; decide the level of explainability. 02 Data: find the right data; structure / annotate data; clean data
  46. (no text)
  47. What does data look like?
  48. (no text)
  49. (no text)
  50. (no text)
  51. (no text)
  52. Data exploration
  53. (no text)
  54. (no text)
  55. Most applications require lots of data
  56. Where will you get it?
      - existing data sources
      - data enrichment (feature engineering)
      - data augmentation
      - data generation
      - manual data labeling
      - create new data sources (e.g. sensors)
      - public data, scraping, etc.
      Then prioritise by availability, accessibility & cost
  57. Data labeling
  58. What amount of data? If the benefit of the performance increase outweighs the cost of acquiring more data, get more data! Diminishing returns mean that, beyond a certain point, more data won’t help
  59. Three choices: hire, crowdsource, service
  60. Data flywheel strategy: use users to gather more (noisy) data
  61. Example: reCAPTCHA (OCR, image classifier)
  62. Synthetic data
  63. Feature engineering: could you think of other features that you could obtain that would help with this task?
  64. Increased execution risk if the data is not of good quality
  65. Garbage in, garbage out
  66. What do you see here?
  67. Bias in a typical ML paradigm
  68. Bias in a typical ML paradigm
  69. Selection bias
  70. Selection bias
  71. Outgroup homogeneity bias: they are alike, we are diverse
  72. Confirmation bias
  73. Correlation fallacy
  74. Automation bias: don’t trust the machine
  75. Design with fairness in mind
      • Consider the problem
      • Ask experts
      • Train the models to account for bias
      • Interpret outcomes
      • Publish with context
  76. Data considerations
      • Big data is always better, but not necessary
      • Clean data is better than a lot of messy data
      • Small data is almost always enough to make progress (activate the feedback loop)
      • If there is no data, don’t give up: it can be generated or augmented!
      • Design the ML model with fairness in mind
      Go talk to an ML engineer to figure it out
  77. Exercise: find your data
  78. Applied AI Lifecycle. 01 Identify: select the right question; choose the performance metric; decide the level of explainability. 02 Data: find the right data; structure / annotate data; clean data. 03 Model: select the right algorithm; tune the model
  79. Main steps for model training
      1. Select your model family (and your performance metric)
      2. Split your dataset into Train/Dev/Test sets
      3. Train the model on the training set
      4. Take care of overfitting vs underfitting
      5. Tune the hyperparameters
      6. Select the best model
  80. 1) Select the appropriate family of models
  81. (no text)
  82. 2) Dataset splitting into Train/Dev/Test
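A Train/Dev/Test split as named on slide 82 can be sketched in a few lines. The 60/20/20 proportions, the fixed seed and the helper name `split_dataset` are assumptions for this example; the slide does not prescribe any ratio.

```python
import random


def split_dataset(examples, dev_fraction=0.2, test_fraction=0.2, seed=0):
    """Shuffle the examples once, then cut them into train, dev and test sets."""
    shuffled = list(examples)
    random.Random(seed).shuffle(shuffled)  # fixed seed so the split is reproducible
    n_test = int(len(shuffled) * test_fraction)
    n_dev = int(len(shuffled) * dev_fraction)
    test = shuffled[:n_test]
    dev = shuffled[n_test:n_test + n_dev]
    train = shuffled[n_test + n_dev:]
    return train, dev, test


train, dev, test = split_dataset(range(100))
print(len(train), len(dev), len(test))  # 60 20 20
```

The test set is put aside before any training or tuning, so that the final evaluation uses examples the model has never influenced.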
  83. 3) Train model on training set (fit & tune model)
  84. 4) Take care of overfitting vs underfitting
  85. 5) Tuning hyperparameters (with cross-validation)
  86. 5) Tuning hyperparameters (with cross-validation)
  87. 6) Selecting the best model
        For each algorithm (e.g. regularized regression, random forest, etc.):
            For each set of hyperparameter values to try:
                Perform cross-validation using the training set.
                Calculate the cross-validated score.
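The loop on slide 87 can be sketched for a single algorithm as follows. This is a toy illustration only: the one-dimensional dataset (including one deliberately mislabelled point), the k-nearest-neighbours model and the function names are all assumptions made for the example, and a real project would normally rely on a library rather than hand-written folds.

```python
from collections import Counter

# Hypothetical 1-D training set: (feature value, label).
# The point (2.95, "a") is deliberately mislabelled to simulate noisy data.
TRAIN = [(1.0, "a"), (1.2, "a"), (0.8, "a"), (1.1, "a"), (2.95, "a"),
         (3.0, "b"), (3.2, "b"), (2.8, "b"), (3.1, "b"), (0.9, "a")]


def knn_predict(train, x, k):
    """Majority label among the k training points closest to x."""
    neighbours = sorted(train, key=lambda point: abs(point[0] - x))[:k]
    return Counter(label for _, label in neighbours).most_common(1)[0][0]


def cross_validated_accuracy(data, k, n_folds=5):
    """Average accuracy over n_folds held-out folds of the training set."""
    scores = []
    for fold in range(n_folds):
        held_out = data[fold::n_folds]
        rest = [p for i, p in enumerate(data) if i % n_folds != fold]
        correct = sum(knn_predict(rest, x, k) == y for x, y in held_out)
        scores.append(correct / len(held_out))
    return sum(scores) / n_folds


def select_best_k(data, candidate_ks=(1, 3, 5)):
    """Try each hyperparameter value and keep the best cross-validated score."""
    scores = {k: cross_validated_accuracy(data, k) for k in candidate_ks}
    best_k = max(scores, key=scores.get)
    return best_k, scores


best_k, scores = select_best_k(TRAIN)
print(best_k, scores)
```

On this toy data, k = 1 follows the noisy point too closely and scores lower than k = 3, which is the kind of overfitting the cross-validated score is meant to expose.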
  88. Checkpoint quiz
      • Pick one: better data or fancier algorithms?
      • When should you split your dataset into training and test sets, and why?
      • What's the key difference between model parameters and hyperparameters?
      • Explain how cross-validation helps you "tune" your models.
  89. (no text)
  90. (no text)
  91. Applied AI Lifecycle. 01 Identify: select the right question; choose the performance metric; decide the level of explainability. 02 Data: find the right data; structure / annotate data; clean data. 03 Model: select the right algorithm; tune the model. 04 Evaluate: decide on an acceptable error; test on the right scope
  92. North star metric?
  93. Acceptable rate (compared to human-level performance)
  94. Explainability of the model. Depending on the machine learning model used, the results could be:
      ● Very simple to interpret, like decision trees
      ● Very difficult to interpret, like deep-learning neural networks
  95. Explainability of the model: on a decision tree, the set of rules is well defined
  96. Explainability of the model: on a deep-learning neural network, interpreting the weights is difficult
  97. Explainability of the model: we could still use more sophisticated techniques to partially understand their predictions. This is an example on logo detection algorithms (image vs Grad-CAM)
  98. Performance of the model. Confusion matrix (Predicted vs Actual, YES/NO): TP (predicted YES, actual YES), FP (predicted YES, actual NO), FN (predicted NO, actual YES), TN (predicted NO, actual NO)
  99. Performance of the model: how a confusion matrix can help understand the model performance. Imagine you have a medical problem: do you go see your doctor?
      1. If you should and you did, the fee is 25 € (OK)
      2. If you should and you didn’t, it gets worse and you will see a specialist, the fee is 70 € (lose 45 €, the extra cost compared with the 25 € visit)
      3. If you shouldn't and you did, you still pay 25 € (lose 25 €)
      4. If you shouldn't and you didn’t, you do not pay anything (OK)
  100. Performance of the model: how a confusion matrix can help understand the model performance. Which ML model is better, according to the confusion matrices?
       First model: TP = 200, TN = 100, FP = 20, FN = 40. Loss: 20 × 25 € + 40 × 45 € = 2 300 €
       Second model: TP = 210, TN = 85, FP = 35, FN = 30. Loss: 35 × 25 € + 30 × 45 € = 2 225 €
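The comparison on slide 100 is simple arithmetic, sketched below. The counts are read from the slide in the order TP, TN, FP, FN (an interpretation of the flattened matrix), and the names `total_loss`, `MODEL_1` and `MODEL_2` are made up for this example.

```python
COST_FALSE_POSITIVE = 25  # went to the doctor without needing to: 25 € lost
COST_FALSE_NEGATIVE = 45  # did not go but should have: 70 € specialist instead of 25 €


def total_loss(fp, fn):
    """Money lost because of wrong decisions; correct decisions cost nothing extra."""
    return fp * COST_FALSE_POSITIVE + fn * COST_FALSE_NEGATIVE


MODEL_1 = {"tp": 200, "tn": 100, "fp": 20, "fn": 40}
MODEL_2 = {"tp": 210, "tn": 85, "fp": 35, "fn": 30}

print(total_loss(MODEL_1["fp"], MODEL_1["fn"]))  # 2300
print(total_loss(MODEL_2["fp"], MODEL_2["fn"]))  # 2225
```

The second model makes more mistakes overall (65 against 60) but fewer of the expensive kind, so it loses less money.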
  101. Accuracy metric. Let us speak in terms of seeing your doctor:
       ● Accuracy: over all the choices you make (to see your doctor or not), how many of them were correct? $\text{Accuracy} = \frac{TP + TN}{TP + FN + FP + TN}$
  102. Precision & Recall metrics. Let us speak in terms of seeing your doctor:
       ● Recall: over all the times you should have gone to see your doctor, how many times did you really go? $\text{Recall} = \frac{TP}{TP + FN}$
       ● Precision: over all the times you did go see your doctor, how many times did you really need to? $\text{Precision} = \frac{TP}{TP + FP}$
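As a quick check of the definitions on slides 101 and 102, the sketch below applies them to the two confusion matrices from slide 100, read as TP, TN, FP, FN (an interpretation of the flattened slide). The function names are illustrative only.

```python
def accuracy(tp, tn, fp, fn):
    """Share of all decisions that were correct."""
    return (tp + tn) / (tp + tn + fp + fn)


def recall(tp, fn):
    """Share of the actual positives that were caught."""
    return tp / (tp + fn)


def precision(tp, fp):
    """Share of the predicted positives that were real positives."""
    return tp / (tp + fp)


# Counts taken from slide 100.
for name, tp, tn, fp, fn in [("first", 200, 100, 20, 40), ("second", 210, 85, 35, 30)]:
    print(name,
          round(accuracy(tp, tn, fp, fn), 3),
          round(precision(tp, fp), 3),
          round(recall(tp, fn), 3))
```

The first model has the higher accuracy and precision while the second has the higher recall, which is why the choice depends on what each kind of error costs.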
  103. Accuracy vs Precision & Recall
       ● Accuracy should not be used when the classes are not balanced
       ● If 99% of your data belongs to just one class, an accuracy of 99% is just a majority vote
       ● Precision and recall are more useful in this case, since you can focus on each class individually
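The 99% case on slide 103 can be reproduced with a trivial "model" that always predicts the majority class. The 990/10 class counts and the function name are assumptions chosen to match the 99% figure.

```python
def majority_baseline_metrics(n_negative=990, n_positive=10):
    """Metrics of a classifier that always answers NO (the majority class)."""
    tp, fp = 0, 0                   # it never predicts YES
    tn, fn = n_negative, n_positive
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return accuracy, recall


accuracy, recall = majority_baseline_metrics()
print(accuracy, recall)  # 0.99 accuracy, yet no positive case is ever found
```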
  104. Which ML method is preferred? Use case: customer churn. Target action A: phone call to a potentially churning customer. Target action B: send a generous discount to potential churners. Which method is preferred for each target?
  105. (no text)
  106. Performance of the model (chart labels). Precision: 75%, 80%, 85%, 90%. Volume: 64%, 52%, 49%, 75%
  107. Performance of the model: manual labeling cost / cost of no action; cost of error
  108. Payout as a function of threshold
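Slide 108 only gives the title, so the following is a rough sketch of what "payout as a function of threshold" could mean under stated assumptions: the model outputs a score per case, action is taken when the score reaches the threshold, each true positive earns a fixed gain and each false positive incurs a fixed cost. All numbers and names below are hypothetical.

```python
# Hypothetical model scores with their true labels (1 = positive, 0 = negative).
SCORED = [(0.95, 1), (0.90, 1), (0.80, 0), (0.70, 1),
          (0.60, 0), (0.40, 1), (0.30, 0), (0.20, 0)]

GAIN_PER_TRUE_POSITIVE = 100   # assumed value of acting on a real positive
COST_PER_FALSE_POSITIVE = 60   # assumed cost of acting on a negative


def payout(threshold, scored=SCORED):
    """Total payout when acting on every case whose score reaches the threshold."""
    total = 0
    for score, label in scored:
        if score >= threshold:
            total += GAIN_PER_TRUE_POSITIVE if label == 1 else -COST_PER_FALSE_POSITIVE
    return total


def best_threshold(thresholds=(0.1, 0.3, 0.5, 0.7, 0.9)):
    """Threshold with the highest payout among the candidates."""
    return max(thresholds, key=payout)


for t in (0.1, 0.3, 0.5, 0.7, 0.9):
    print(t, payout(t))
print(best_threshold())  # 0.7 with these made-up numbers
```

A low threshold acts on many cases and pays for many errors; a high threshold avoids errors but misses real positives, so the payout typically peaks somewhere in between.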
  109. Exercise - regression / classification
  110. Applied AI Lifecycle. 01 Identify: select the right question; choose the performance metric; decide the level of explainability. 02 Data: find the right data; structure / annotate data; clean data. 03 Model: select the right algorithm; tune the model. 04 Evaluate: decide on an acceptable error; test on the right scope. 05 Deploy: use the right architecture; have the talents in place
  111. Basics of data engineering: a crash course
  112. (no text)
  113. Still think vertical - 3 use cases
       ・Identifying fraudulent claims so that they can select claims for deeper manual investigation; they have a business goal of reducing fraud by 5% this year.
       ・Predicting weather patterns so that they can advise customers to protect their vehicles by bringing them inside when there’s a high chance of storms, thereby reducing vehicle damage claims by 2%.
       ・Upselling other insurance products to the customer based on the products they already have. The goal is to increase the conversion rate for online upselling by 3%.
  114. (no text)
  115. Applied AI Lifecycle. 01 Identify: select the right question; choose the performance metric; decide the level of explainability. 02 Data: find the right data; structure / annotate data; clean data. 03 Model: select the right algorithm; tune the model. 04 Evaluate: decide on an acceptable error; test on the right scope. 05 Deploy: use the right architecture; have the talents in place. 06 Maintenance: monitoring & updates; have the right talents & solutions
  116. A refined way of seeing it...
  117. (no text)
