Artificial Intelligence in practice - Gerbert Kaandorp - Codemotion Amsterdam 2018

In this talk Gerbert will give an overview of Artificial Intelligence, outline the current state of the art in research and explain what it takes to actually do an AI project. Using practical cases and tools, he will give you insight into the phases of an AI project and explain some of the problems you might encounter along the way and how you might be able to solve them.


  1. 1. Gerbert Kaandorp
  2. 2. BrainCreators applies decades of experience in artificial intelligence to business challenges across all verticals ● Discover value: compile a strategic roadmap of viable business cases ● Deploy solutions: implement scalable solutions with maximum business impact ● Accelerate teams: inherit skills & best practices with expert coaching
  3. 3. Trusted by
  4. 4. Today ● Intro to AI ● Neural Networks ● State of the art ● Data ● Autoencoders ● Hardware ● Software ● Cases
  5. 5. AI is in the air...
  6. 6. In perspective..
  7. 7. AI History
  8. 8. What is Artificial Intelligence? Jacques de Vaucanson (1739)
  9. 9. 1936 Alan Turing
  10. 10. Turing Test
  11. 11. 1968 Apollo Mission IBM System/360 Model 75s 3.5 M$ (~size of car) 100 operations / sec
  12. 12. 1980s Artificial Strategy “It’s you against the computer”
  13. 13. "AI is whatever hasn't been done yet" Larry Tesler the guy who invented copy & paste while working at Xerox Research (1973-1976)
  14. 14. What is AI ? Deep learning Neural Networks Machine learning Robotics Big data Self-learning Cognitive modeling Artificial Intelligence Prediction Recognition Data Analytics Classification Semantic reasoning Regression Natural Language Processing
  15. 15. AI and Machine Learning
  16. 16. Neural Networks: Biological Neuron
  17. 17. Neural Networks: Artificial Neuron
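An artificial neuron is commonly modeled as a weighted sum of its inputs passed through an activation function. The following is a minimal sketch of that idea, assuming a sigmoid activation and hand-picked weights; neither choice comes from the talk.

```python
import math

def neuron(inputs, weights, bias):
    """Weighted sum of the inputs plus a bias, squashed by a sigmoid."""
    z = sum(x * w for x, w in zip(inputs, weights)) + bias
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical weights chosen by hand for illustration only.
output = neuron([1.0, 0.0], weights=[2.0, -1.0], bias=-1.0)
print(round(output, 3))
```

Training a network amounts to adjusting `weights` and `bias` for many such units so that the outputs match the labeled examples.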
  18. 18. 1 neuron 1960s Neural Networks
  19. 19. 1 neuron 2012 Deep Neural Networks
  20. 20. LeNet 28×28 (1998) ● AlexNet 224×224 (2012) ● GoogLeNet 224×224 (9/2014) ● Resnet (n=9, 56 Layers) 28×28 (12/2015) Source: http://josephpcohen.com/w/visualizing-cnn-architectures-side-by-side-with-mxnet/
  21. 21. Lots of architectures Source: http://www.asimovinstitute.org/neural-network-zoo/
  22. 22. The source of most of these projects is freely available.. ..but usually the data is not!
  23. 23. Unreasonable effectiveness of data Source: Scaling to Very Very Large Corpora for Natural Language Disambiguation (2001 Microsoft) Data beats algorithms!
  24. 24. Unreasonable effectiveness of data If the product is free, you are the product ● - free email, free image storage, free maps, free video storage, free search, free mobile phone OS, free video calls, free translations ● - free social media channel, free messages, free image storage, free video storage ● - shopping search data, voice data (Alexa), music & video taste data (Prime Music / Video), fashion (Echo Look) ● - free search, Office 365 usage, professional network data (LinkedIn)
  25. 25. Unreasonable effectiveness of data Source: The Internet of Things: Getting Ready to Embrace Its Impact on the Digital Economy, IDC 2016
  26. 26. So what if you are not Source: https://techcrunch.com/2017/09/30/ai-hype-has-peaked-so-whats-next/ or
  27. 27. menu
  28. 28. Data Scraping Data Data Augmentation Data Simulation Proprietary Data Public Data
  29. 29. Proprietary Data Scraping Data Data Augmentation Data Simulation Proprietary Data Public Data
  30. 30. Proprietary data Unstructured Structured Labeled
  31. 31. Proprietary data ~ 1M Articles ~ 178M words ~ 258K unique words Automatic discovery of categories
  32. 32. Public Data Scraping Data Data Augmentation Data Simulation Proprietary Data Public Data
  33. 33. Publicly available Data ● Starting point for any project ● Essential to know what is available for commercial use and what is not ● Good sources (list would be endless): ○ http://publicdata.eu/dataset.html (~48K datasets) ○ https://catalog.data.gov/dataset (~197K datasets) ○ https://datahub.io/dataset (~12K datasets) ○ http://lod-cloud.net/ (~1K datasets) ○ https://open.nasa.gov/open-data/ Source: http://lod-cloud.net/versions/2017-02-20/lod.svg
  34. 34. Public data
  35. 35. Scraping Data Scraping Data Data Augmentation Data Simulation Proprietary Data Public Data
  36. 36. Scraping Data ● Lots of relevant data can be found on specific websites ● Structure of data available on target sites allows for some level of automatic tagging
  37. 37. Data gathered from 100s of scraped webshops
  38. 38. Millions of products gathered
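The scraping slides note that the structure of a target site allows some automatic tagging. As a rough sketch of that idea, the parser below turns marked-up product fields into labeled records using only the standard library. The CSS class names (`product-name`, `product-price`) and the sample page are hypothetical; a real webshop would need its own mapping.

```python
from html.parser import HTMLParser

class ProductParser(HTMLParser):
    """Collects text from elements whose class names mark product fields."""

    # Hypothetical class names; each real site needs its own mapping.
    FIELDS = {"product-name": "name", "product-price": "price"}

    def __init__(self):
        super().__init__()
        self.products = []
        self._field = None

    def handle_starttag(self, tag, attrs):
        css_class = dict(attrs).get("class", "")
        self._field = self.FIELDS.get(css_class)
        if self._field == "name":
            self.products.append({})  # a name starts a new product record

    def handle_data(self, data):
        if self._field and self.products and data.strip():
            self.products[-1][self._field] = data.strip()

    def handle_endtag(self, tag):
        self._field = None

page = """
<div class="item"><h2 class="product-name">Red sneaker</h2><span class="product-price">59.95</span></div>
<div class="item"><h2 class="product-name">Blue jacket</h2><span class="product-price">120.00</span></div>
"""
parser = ProductParser()
parser.feed(page)
products = parser.products
print(products)
```

Because the field names come from the page markup, every scraped value arrives already tagged, which is what makes scraped data usable as training data with little manual labeling.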
  39. 39. Data Augmentation Scraping Data Data Augmentation Data Simulation Proprietary Data Public Data
  40. 40. Data Augmentation ● Automatically identify company sending an envelope given a single scan
  41. 41. Generated training data for each logo ● Domain knowledge ● Constrained domain ● Realistic modifications easy to generate
  42. 42. Learning cycles 1. Input logos 2. Augmented dataset 3. CNN training 4. Performance evaluation 5. Iterative refinement
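Step 2 of the learning cycle, building an augmented dataset from a single input logo, could look roughly like the sketch below. It assumes grayscale images stored as lists of rows with values in [0, 1]; the specific modifications (small shifts, brightness changes, pixel noise) and their ranges are illustrative guesses, not the ones used in the project.

```python
import random

def augment(image, rng, max_shift=2, noise=0.05):
    """Return a shifted, brightness-scaled, noisy copy of a grayscale image."""
    height, width = len(image), len(image[0])
    dx = rng.randint(-max_shift, max_shift)
    dy = rng.randint(-max_shift, max_shift)
    gain = rng.uniform(0.8, 1.2)  # stands in for lighting differences between scans
    out = []
    for y in range(height):
        row = []
        for x in range(width):
            sy, sx = y - dy, x - dx
            inside = 0 <= sy < height and 0 <= sx < width
            value = image[sy][sx] if inside else 0.0  # pad with background
            value = value * gain + rng.gauss(0.0, noise)
            row.append(min(1.0, max(0.0, value)))  # keep pixels in [0, 1]
        out.append(row)
    return out

rng = random.Random(0)
# Hypothetical 8x8 "logo": a bright square on a dark background.
logo = [[1.0 if 2 <= x <= 5 and 2 <= y <= 5 else 0.0 for x in range(8)]
        for y in range(8)]
dataset = [augment(logo, rng) for _ in range(100)]
print(len(dataset))
```

In a constrained domain such as scanned envelopes, modifications like these are realistic enough that a CNN trained on the generated variants can generalize to real scans.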
  43. 43. Data Simulation Scraping Data Data Augmentation Data Simulation Proprietary Data Public Data
  44. 44. Microsoft AirSim Drone Simulator DeepDrive Universe (GTA) Data simulation - drones / automotive
  45. 45. menu
  46. 46. Autoencoders: Comparing products by weight and size is trivial to most of us
  47. 47. Autoencoders: How would you compare them quantitatively by how they look?
  48. 48. Autoencoders: Color? Shape? A sense of similarity to other products?
  49. 49. Autoencoders: Encoder. Read the pixels into a network that gets smaller at each step, compressing the representation
  50. 50. Autoencoders: Decoder. Then decompress the image to predict itself as output
  51. 51. Autoencoders: Repeat for every image in the set (Image 1, Image 2, Image 3) and repeat the process thousands of times
  52. 52. Autoencoders: We can then take the embedding (‘code’) representation and use it as a measure of the image's similarity to other images
  53. 53. Autoencoders: Learn a compact representation of the data
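The encode, decode, repeat loop described on the autoencoder slides can be sketched in a few lines. This is a deliberately tiny, linear autoencoder in plain Python, assuming 4-pixel toy "images" and a 2-number code; the data, sizes and learning rate are all made up for illustration, and a real system would use a deep network in one of the ML frameworks.

```python
import random

def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]

def train_autoencoder(images, code_size, epochs=500, lr=0.05, seed=0):
    """Train a linear autoencoder with per-sample gradient descent.

    Returns the encoder, the decoder and, for each epoch, the squared
    reconstruction error averaged over the images.
    """
    rng = random.Random(seed)
    n = len(images[0])
    encoder = [[rng.uniform(-0.5, 0.5) for _ in range(n)] for _ in range(code_size)]
    decoder = [[rng.uniform(-0.5, 0.5) for _ in range(code_size)] for _ in range(n)]
    losses = []
    for _ in range(epochs):
        total = 0.0
        for x in images:
            code = matvec(encoder, x)      # compress
            recon = matvec(decoder, code)  # decompress
            err = [r - t for r, t in zip(recon, x)]
            total += sum(e * e for e in err)
            # Backpropagate the reconstruction error through the decoder.
            grad_code = [sum(decoder[i][k] * err[i] for i in range(n))
                         for k in range(code_size)]
            for i in range(n):
                for k in range(code_size):
                    decoder[i][k] -= lr * err[i] * code[k]
            for k in range(code_size):
                for j in range(n):
                    encoder[k][j] -= lr * grad_code[k] * x[j]
        losses.append(total / len(images))
    return encoder, decoder, losses

# Four hypothetical 4-pixel "images": two bright on the left, two on the right.
images = [[1.0, 1.0, 0.0, 0.0], [0.9, 1.0, 0.1, 0.0],
          [0.0, 0.0, 1.0, 1.0], [0.0, 0.1, 1.0, 0.9]]
encoder, decoder, losses = train_autoencoder(images, code_size=2)
codes = [matvec(encoder, x) for x in images]  # one 2-number embedding per image
print(losses[0], losses[-1])
```

The entries of `codes` are the embeddings: distances between them can then serve as a similarity measure between the original images.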
  54. 54. Autoencoders: Arithmetic in word embedding space: vec("king") - vec("man") + vec("woman") ≈ vec("queen")
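The king/queen arithmetic can be reproduced on a toy scale. The vectors below are hand-made 2-D stand-ins (roughly "royalty" and "gender" axes), not learned embeddings, so the example only illustrates the mechanics: add and subtract vectors, then look up the nearest remaining word by cosine similarity.

```python
import math

# Hand-made toy vectors (hypothetical), not output of a trained model.
vectors = {
    "king":   [0.9, 0.9],
    "queen":  [0.9, -0.9],
    "man":    [0.1, 0.9],
    "woman":  [0.1, -0.9],
    "prince": [0.7, 0.8],
    "apple":  [-0.8, 0.1],
}

def cosine(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))

def analogy(positive, negative):
    """Nearest word to sum(positive) - sum(negative), excluding the query words."""
    dims = len(next(iter(vectors.values())))
    target = [sum(vectors[w][i] for w in positive) - sum(vectors[w][i] for w in negative)
              for i in range(dims)]
    candidates = [w for w in vectors if w not in positive + negative]
    return max(candidates, key=lambda w: cosine(vectors[w], target))

result = analogy(positive=["king", "woman"], negative=["man"])
print(result)
```

With real embeddings the match is only approximate, which is why the nearest neighbour is looked up rather than testing for equality.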
  55. 55. Exploiting embeddings for labeling ● Clustering ● Sorting
  56. 56. BrainMatter© platform
  57. 57. BrainMatter© platform
  58. 58. BrainMatter© platform
  59. 59. BrainMatter© platform
  60. 60. BrainMatter© platform
  61. 61. BrainMatter© platform
  62. 62. Hardware: The Human Brain ● 10¹¹ neurons ● 10⁴ synapses per neuron ● 10¹⁶ “operations” per second ● 100 peta-FLOPS? ● Cortex: 2,500 cm², 2 mm thick ● 1.4 kg, 1.7 liters ● 250 million neurons per mm³ ● 180,000 km of “wires” ● 25 Watts
  63. 63. Hardware: CPU (Core i7 8700k) ● 0.2 tera-FLOPS ● 6 cores ● 95 Watts
  64. 64. Hardware: GPU (NVIDIA 1080 Ti) ● 10.6 tera-FLOPS ● 3584 cores ● 250 Watts
  65. 65. Hardware: Dedicated Solution (NVIDIA DGX Station, 4x V100) ● 500 tera-FLOPS ● 4x5120 cores ● 1500 Watts
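One way to compare the hardware slides is compute per watt. The sketch below divides the tera-FLOPS figures quoted on the slides by the quoted power draw. The brain entry uses the slide's own tentative "100 peta-FLOPS?" estimate, so that row is speculative by construction.

```python
# (tera-FLOPS, watts) as quoted on the slides.
systems = {
    "Human brain (slide estimate)": (100_000.0, 25.0),  # 100 peta-FLOPS = 100,000 tera-FLOPS
    "CPU Core i7 8700k": (0.2, 95.0),
    "GPU NVIDIA 1080 Ti": (10.6, 250.0),
    "NVIDIA DGX Station (4x V100)": (500.0, 1500.0),
}

# Tera-FLOPS delivered per watt of power.
efficiency = {name: tflops / watts for name, (tflops, watts) in systems.items()}

for name, value in sorted(efficiency.items(), key=lambda item: item[1]):
    print(f"{name}: {value:.4f} tera-FLOPS per watt")
```

On these figures each step from CPU to GPU to dedicated hardware improves efficiency by roughly an order of magnitude, while the brain estimate remains several orders of magnitude ahead.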
  66. 66. NVIDIA Partner :)
  67. 67. Hardware: CLOUD 93.014 tera-FLOPS
  68. 68. Hardware: AI chips ● Startup valued at ~900M ● Google TPU (Tensor Processing Unit) ● Intel Nervana Neural Network Processor ● Graphcore IPU (Intelligent Processing Unit), ~110M funding ● Microsoft Project Brainwave
  69. 69. Containerization. IT needs control: ● Portability (on-premise, cloud) ● Data Security / Network Isolation ● Agility and elasticity ● Standardized environments (dev, test, production) ● Higher resource utilization. Data scientists need flexibility: ● Faster development lifecycles ● Different set of tools ● Different versions ● Default packaging ● Repeatable builds
  70. 70. ML Frameworks
  71. 71. Google Tensorflow Pros: ● Computational graph abstraction. ● TensorBoard for visualization. Cons: ● Computational graph abstraction. ● Lack of pre-trained models. ● Not completely open-source.
  72. 72. Microsoft Cognitive Toolkit Pros: ● It is very flexible. ● Allows for distributed training. ● Supports C++, C#, Java, and Python. ● Significant Recurrent Neural Network modelling capabilities Cons: ● It is implemented in a new language, Network Description Language (NDL). ● Lack of visualizations.
  73. 73. Berkeley Caffe Pros: ● Supports Python and MATLAB ● Great performance. ● Allows for the training of models without writing code. Cons: ● Bad for recurrent networks. ● Not great with new architectures.
  74. 74. PyTorch (Facebook, Twitter, NVIDIA, Salesforce) Pros: ● Great development and debugging experience ● Love all things Pythonic Cons: ● Lack of visualizations. ● Deployment
  75. 75. “We decided to marry PyTorch and Caffe2 which gives the production-level readiness for PyTorch”
  76. 76. Things to consider ● Ecosystem and code availability: code examples, latest research available ● Research versus production: hardened for serving at scale ● Mobile support: fast kernels for ARM, Metal, etc. ● Language bindings: support for R, Scala, Java ● Programming style: imperative versus declarative ● Compute and memory footprint: platform scalability ● Scalability and performance: efficient multi-GPU and multi-instance support
  77. 77. Cases ● Logistics: delivery optimization ● Industry: steel sheet quality control ● Medical: stroke diagnostics
  78. 78. Case: Logistics. Before application of AI: ● Manual identification of address data for 15% of total volume ● 4% delivered to wrong address ● Geographical location of delivery points imprecise ● Delivery window too coarse
  79. 79. Case: Logistics. AI under the hood: ● Fuzzy logic address matching ● GPS delivery point prediction ● Time window estimation & optimisation ● Automated location mapping (incl. PO boxes) ● Trained on historic data and self-learning
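To give a feel for the address-matching step, here is a small sketch that matches a misspelled address against a list of known delivery points. It uses plain string similarity from Python's `difflib` as a stand-in; the slides do not say how the project's fuzzy matching was built, and the second known address and the 0.8 cutoff are invented for the example.

```python
import difflib

def normalize(address):
    """Lowercase the address and collapse punctuation and whitespace."""
    return " ".join(address.lower().replace(",", " ").replace(".", " ").split())

def match_address(raw, known_addresses, cutoff=0.8):
    """Return the known address most similar to `raw`, or None if none is close."""
    lookup = {normalize(a): a for a in known_addresses}
    hits = difflib.get_close_matches(normalize(raw), list(lookup), n=1, cutoff=cutoff)
    return lookup[hits[0]] if hits else None

known = [
    "Prinsengracht 697, 1017JV Amsterdam",
    "Damrak 1, 1012LG Amsterdam",  # hypothetical second delivery point
]
best = match_address("Prinsengragt 697 1017 JV Amsterdm", known)
print(best)
```

Addresses that fall below the cutoff return `None` and would still go to manual correction, which is the share the case study reports reducing.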
  80. 80. Case: Logistics. Results: ● Manual correction reduced to <2% of total volume ● Delivery failures reduced by 50% ● 2000 man hours saved per month ● Improved customer service through better time windows
  81. 81. Case: Steel sheet quality control. General: ● A major European steel producer ● Total of 7.1 million tonnes of steel products in 2016 ● High quality sheet and strip steel ● Automotive, packaging, and construction sectors
  82. 82. Case: Steel sheet quality control. Initial situation: ● Kilometers of steel sheet each day ● Accurate quality assessment enables more profitable trading ● Defects need to be detected to prevent machine breaks ● Manual inspection supported by automatic camera system
  83. 83. Case: Steel sheet quality control. Camera system: ● Infrared cameras inspect moving steel sheet on conveyor belts ● Basic image processing detects regions of interest ● Manual inspection often needed ● Accuracy can still be improved
  84. 84. Case: Steel sheet quality control. Data sets: ● Up to 50 different defect types ● 5 million (!) new images each day ● Currently only 25 thousand annotated images available in total ● Severely imbalanced data sets ● Manual annotation is costly
  85. 85. Case: Steel sheet quality control. Solution: ● Deep Learning for robust image classification ● AI & Active Learning approach for efficient image annotation ● Integration in existing systems ● Knowledge transfer to customer’s own tech team ● Already more than 90% accurate
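With millions of new images per day and costly manual annotation, active learning tries to spend the annotation budget where it helps most. A common strategy is uncertainty sampling: ask humans to label the images the current model is least sure about. The sketch below assumes that strategy and uses made-up image ids and class probabilities; the slides do not specify which selection rule the project used.

```python
import math

def entropy(probs):
    """Shannon entropy of a predicted class distribution (higher = less certain)."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def select_for_annotation(predictions, budget):
    """Return ids of the `budget` images with the most uncertain predictions."""
    ranked = sorted(predictions,
                    key=lambda image_id: entropy(predictions[image_id]),
                    reverse=True)
    return ranked[:budget]

# Hypothetical model outputs: probabilities over three defect types.
predictions = {
    "img_001": [0.98, 0.01, 0.01],
    "img_002": [0.40, 0.35, 0.25],
    "img_003": [0.70, 0.20, 0.10],
    "img_004": [0.34, 0.33, 0.33],
}
to_label = select_for_annotation(predictions, budget=2)
print(to_label)
```

After the selected images are annotated, the classifier is retrained and the selection repeats, so each round of manual work targets the cases the model currently handles worst.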
  86. 86. Case: Stroke diagnostics. Initial situation: ● 46,000 patients affected by a stroke per year in the Netherlands ● Limited time to decide which hospital to send patient to ● ~6 minutes for a skilled radiologist to identify stroke location ● Small hospitals do not always have trained radiologists on staff
  87. 87. Case: Stroke diagnostics. Initial situation: ● Imitate the process of expert radiologists ● Train a deep neural network on 3D volumes from left/right hemispheres ● Compare intensities and local brain structures to discern affected from healthy regions
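The idea of comparing the left and right hemispheres can be illustrated without any learning at all. The sketch below mirrors a toy 2-D slice and flags pixels whose intensity differs strongly from their mirror image. It is a hand-written threshold rule on made-up numbers, shown only to convey the symmetry intuition; the project itself trained a deep neural network on 3D volumes.

```python
def asymmetry_map(slice_2d):
    """Absolute intensity difference between each pixel and its left/right mirror."""
    return [[abs(value - row[-1 - x]) for x, value in enumerate(row)]
            for row in slice_2d]

def suspicious_pixels(slice_2d, threshold):
    """Coordinates (row, column) where the two hemispheres differ by more than `threshold`."""
    diff = asymmetry_map(slice_2d)
    return [(y, x) for y, row in enumerate(diff)
            for x, d in enumerate(row) if d > threshold]

# Hypothetical 3x4 slice: one darker spot in the left half of the middle row.
scan = [
    [0.5, 0.5, 0.5, 0.5],
    [0.5, 0.2, 0.5, 0.5],
    [0.5, 0.5, 0.5, 0.5],
]
flagged = suspicious_pixels(scan, threshold=0.2)
print(flagged)
```

The difference is symmetric, so both the dark spot and its healthy mirror position are flagged; deciding which side is affected requires the intensity and structure cues the network learns from labeled scans.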
  88. 88. Case: Stroke diagnostics. Result: ● 95% classification accuracy in detecting “blocks” with an occlusion present ● Complete process in under 30 seconds ● Visualization tools to localize area of interest
  89. 89. http://www.braincreators.com
  90. 90. Contact: BrainCreators, http://www.braincreators.com, Prinsengracht 697, 1017JV Amsterdam, +31 (0)20 369 7260
