
Deep learning introduction


  1. 1. Deep Learning: History, Introduction and Opportunities (Jan 2016)
  2. 2. What am I going to cover in this talk? • General view of AI, machine learning and deep learning. • Understand the basics of deep learning. • Some exciting opportunities for applying deep learning.
  3. 3. Artificial Intelligence • What is intelligence? Why create it artificially? • Strong artificial intelligence • Agent and Environment • Intelligence is the capacity to learn and solve problems • Ability to interact with the real world • Reasoning and Planning • Learning and Adaptation
  4. 4. Poster boy of AI – IBM Deep Blue • ~200 million moves / second = 3.6 × 10^10 moves in 3 minutes • 3 min corresponds to ~7 plies of uniform-depth minimax search • 1 sec corresponds to 380 years of human thinking time • 32-node RS6000 SP multicomputer, 16 chess chips, 32 GB opening & endgame database
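The arithmetic on this slide, and the idea behind a minimax search, can be sketched as follows. The toy game tree and its leaf scores are invented for illustration; this is a minimal sketch of plain minimax, not Deep Blue's actual search.

```python
# Back-of-the-envelope check of the slide's figure.
MOVES_PER_SECOND = 200_000_000      # ~200 million moves per second (from the slide)
SECONDS = 3 * 60                    # 3 minutes
total_moves = MOVES_PER_SECOND * SECONDS   # 3.6 * 10^10


def minimax(node, maximizing=True):
    """Plain minimax over a toy tree.

    A node is either a number (a leaf score, from the maximizer's point of
    view) or a list of child nodes. Each list level is one ply.
    """
    if not isinstance(node, list):
        return node
    scores = [minimax(child, not maximizing) for child in node]
    return max(scores) if maximizing else min(scores)


# Hypothetical 2-ply tree: the maximizer picks a branch, then the minimizer
# picks the worst leaf for the maximizer inside that branch.
toy_tree = [[3, 5], [2, 9], [0, 7]]
best_score = minimax(toy_tree)
```

The minimizer would reduce the three branches to 3, 2 and 0, so the maximizer's best guaranteed score here is 3.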
  5. 5. Artificial Intelligence Impact • Complex but repetitive movements with confined cognition of the environment. • Searching large spaces of possible answers. • Predicting based on what has been seen so far in the environment.
  6. 6. Evolution of AI • Machines that search and eliminate irrelevant possibilities. • Machines that store knowledge about the world and then use the stored knowledge to answer questions. • Machines that learn to generalize from the examples they have seen.
  7. 7. Learning by examples • Humans are good pattern matchers at an unconscious level. We all learn by examples. • Learning from examples = learning from data. • What are you learning? A model. • How is a computer scientist going to create it? With probability and mathematics. • Learning = tuning the model. • How do you tune it and make it the best possible? By measuring error.
  8. 8. Applications so far… • image recognition • voice recognition • image search • effective text search • marketing targeting • sales prediction • optimization of advertisements • store shelf or space planning • movements of the stock market • Yes, machine learning is powerful!
  9. 9. It’s all about features. • More data. • Advanced algorithms. • Feature engineering: ultimately, the model is only as smart as its features. Finding the correct features is critical to success. • Data → Features → Model
  10. 10. Machine learning engineer’s fears • A machine learning algorithm works well only under the assumption that the training data represents all the real data. If unseen data has a different distribution, the learned model does not generalize well. • What you see is not always what you will get next. • There is no reason*. • I need data in the format I like.
  11. 11. Pause and think. • A machine can’t recognize what knowledge it should use when it is assigned a task. • A machine can’t understand a concept that puts pieces of knowledge together; it is at the mercy of the chunks of examples fed in. • A machine can’t find out which features should be considered while learning from examples.
  12. 12. Intuitive Example • Imagine that you don’t speak a word of Chinese, but your company is moving you to China next month. The company will sponsor Chinese lessons for you once you are there, but you want to prepare before you go. • You decide to listen to a Chinese radio station. • For a month, you bombard yourself with Chinese radio. • You don’t know the meaning of the Chinese words. • Suppose that your brain somehow develops the capacity to recognize a few commonly occurring patterns without knowing their meaning. In other words, you have developed a different level of representation for some part of Chinese by becoming more tuned to its common sounds and structures. • Hopefully, when you arrive in China, you’ll be in a better position to start the lessons. Example loosely taken from the lecture series by Prof. Yaser Abu-Mostafa.
  13. 13. Welcome to deep learning • Learn features without being explicit - automatic feature extraction. • Multiple linear and non-linear transformations. • Build hierarchy of notable features into more informative features, keep doing it. • Work with very large number of examples. Modern data sets are enormous. • Beat the benchmarks.
  14. 14. Biological Neuron • The brain is composed of a large number of interconnected neurons. Each neuron is connected to many other neurons. • Neurons transmit signals to each other. • Whether a signal is transmitted is an all-or-nothing event (threshold). • The strength of the signal that is sent depends on the strength of the bond (synapse) between two neurons. • About 10^11 neurons are connected by about 10^14 synapses. • The brain learns by 1) altering the strength of the connections between neurons and 2) creating or deleting connections.
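  15. 15. Artificial Neuron • Inputs $x_1, x_2, \dots, x_n$ with weights $w_1, w_2, \dots, w_n$, plus a bias input $x_0$ with weight $w_0$. • Weighted sum (Σ): $y = \sum_{i=0}^{n} w_i x_i$, which is then passed through an activation function. • Some activation functions: step (threshold) function, sigmoid function.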
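A single artificial neuron of this kind can be sketched in a few lines. The function names, the convention that the bias input x_0 is fixed at 1, and the hand-picked AND-gate weights are all assumptions made for illustration.

```python
import math


def step(z):
    """Threshold activation: outputs 1 only when the weighted sum is non-negative."""
    return 1 if z >= 0 else 0


def sigmoid(z):
    """Smooth activation that squashes the weighted sum into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def neuron(inputs, weights, bias_weight, activation):
    """Weighted sum of x_1..x_n with w_1..w_n, plus w_0 times a bias input x_0 = 1."""
    z = bias_weight * 1.0 + sum(w * x for w, x in zip(weights, inputs))
    return activation(z)


# Hypothetical weights that make a step neuron behave like a logical AND.
and_weights, and_bias = [1.0, 1.0], -1.5
and_outputs = [neuron([a, b], and_weights, and_bias, step)
               for a in (0, 1) for b in (0, 1)]
```

With these weights the weighted sum is non-negative only for the input (1, 1), so `and_outputs` is `[0, 0, 0, 1]`. Swapping `step` for `sigmoid` gives a graded output instead of an all-or-nothing one.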
  16. 16. Neural Net one example (slide credit: Eric Xing, CMU)
  17. 17. Back propagation idea • Treat the problem as one of minimizing the error between the example label and the network output, given the example and the network weights as input. • Error(example) = (true value – calculated value from inputs)^2 • Sum this error term over all examples: $E(w) = \sum_i \mathrm{Error}_i = \sum_i (y_i - f(x_i, w))^2$ • Minimize the error using an optimization algorithm; stochastic gradient descent is typically used. • Forward pass: signal = activity = y. Backward pass: signal = dE/dx.
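To make the error function and stochastic gradient descent concrete, here is a minimal sketch for the simplest possible model, f(x, w) = w · x with a single weight. The model, the data set, the learning rate and the number of epochs are assumptions chosen for illustration; a real network has many weights, but the update has the same shape.

```python
import random


def f(x, w):
    """Toy model with one weight (an assumption; stands in for the network)."""
    return w * x


def total_error(data, w):
    """E(w) = sum over examples of (y_i - f(x_i, w))^2."""
    return sum((y - f(x, w)) ** 2 for x, y in data)


def sgd(data, w=0.0, learning_rate=0.01, epochs=200, seed=0):
    """Stochastic gradient descent: update w after every single example."""
    rng = random.Random(seed)
    data = list(data)
    for _ in range(epochs):
        rng.shuffle(data)
        for x, y in data:
            # d/dw of (y - w*x)^2 is -2 * (y - w*x) * x
            grad = -2.0 * (y - f(x, w)) * x
            w -= learning_rate * grad
    return w


# Hypothetical data generated by y = 2x, so the best weight is 2.
data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]
learned_w = sgd(data)
```

Starting from w = 0 the summed error is 4 + 16 + 36 = 56; each update moves w toward 2, and the error shrinks toward zero.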
  18. 18. Back propagation algorithm • Initialize all weights to small random numbers. • Until the stopping condition is met (number of epochs reached, or no errors), for each training input: 1. Feed the training example to the network and propagate the computations forward to the output. 2. Compute the error by comparing the true value with the calculated value. 3. Adjust the weights according to the delta rule, propagating the errors back. • The weights are nudged so that the network learns to give the desired output, and they begin to converge to a point where the error across the training inputs is at a minimum.
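The loop described above can be sketched for a tiny network with one hidden layer of sigmoid units. Everything specific here is an assumption for illustration: the network size, the learning rate, the fixed number of epochs as the stopping condition, and the logical-OR training set.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def init_network(n_in, n_hidden, rng):
    """Small random weights; the last entry of each weight list is the bias weight."""
    hidden = [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)]
              for _ in range(n_hidden)]
    output = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)]
    return hidden, output


def forward(net, x):
    """Step 1: propagate the input forward; returns hidden activities and output."""
    hidden, output = net
    h = [sigmoid(sum(w * v for w, v in zip(row, x + [1.0]))) for row in hidden]
    y = sigmoid(sum(w * v for w, v in zip(output, h + [1.0])))
    return h, y


def train_step(net, x, target, lr):
    hidden, output = net
    h, y = forward(net, x)
    # Step 2: error signal at the output (delta rule for a sigmoid unit).
    delta_out = (target - y) * y * (1.0 - y)
    # Step 3: propagate the error back through the (not yet updated) output weights.
    delta_hidden = [delta_out * output[j] * h[j] * (1.0 - h[j])
                    for j in range(len(h))]
    for j, v in enumerate(h + [1.0]):
        output[j] += lr * delta_out * v
    for j, row in enumerate(hidden):
        for i, v in enumerate(x + [1.0]):
            row[i] += lr * delta_hidden[j] * v


def total_error(net, data):
    return sum((t - forward(net, x)[1]) ** 2 for x, t in data)


def train(data, n_hidden=3, lr=0.5, epochs=5000, seed=1):
    net = init_network(len(data[0][0]), n_hidden, random.Random(seed))
    for _ in range(epochs):          # stopping condition: fixed number of epochs
        for x, t in data:
            train_step(net, x, t, lr)
    return net


# Hypothetical training set: logical OR.
data = [([0.0, 0.0], 0.0), ([0.0, 1.0], 1.0), ([1.0, 0.0], 1.0), ([1.0, 1.0], 1.0)]
untrained = init_network(2, 3, random.Random(1))   # same seed as train()
trained = train(data)
error_before = total_error(untrained, data)
error_after = total_error(trained, data)
```

Comparing `error_before` with `error_after` shows how far the weights have been nudged toward the desired outputs; rounding the trained network's outputs should reproduce the OR targets.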
  19. 19. Back propagation thoughts • Is powerful: a network with enough hidden units can, in principle, approximate any continuous function. • Has the standard problem of generalization vs. memorization. With too many units, the network will tend to memorize the input and not generalize well. Some schemes exist to “prune” the neural network. • Networks require extensive training, with many parameters to fiddle with. Training can be extremely slow and may not find the best possible combination of weights. • Inherently parallel algorithm, ideal for multiprocessor hardware. • Despite these drawbacks, it is a very powerful algorithm that has seen widespread successful deployment.
  20. 20. Do more… • Create columns of artificial neurons. • Connect the columns. Create depth. • Go deep. How deep can you go? • Keep feeding massive amounts of data. And labels too… • Give more days to learn. • Use machines good at multiplying large matrices. • At the end… tune it! tune it!
  21. 21. Learning Representations
  22. 22. Multiple levels of abstraction • Layer 1: presence/absence of edge at particular location & orientation. • Layer 2: motifs formed by particular arrangements of edges; allows small variations in edge locations • Layer 3: assemble motifs into larger combinations of familiar objects • Layer 4 and beyond: higher order combinations Key Idea: the layers are not designed by an engineer, but learned from data using a general-purpose learner.
  23. 23. Features by labels • Examples of learned object parts from object categories: faces, cars, elephants, chairs.
  24. 24. AlexNet – Classify Images • Human performance: 5.1% error
  25. 25. What does it look like? • One-hidden-layer network • Deep network • Biggest NN so far: ~10^4 neurons, ~10^8 connections
  26. 26. Deep Nets Go deeper
  27. 27. Deep Learning Impact • Computer Vision: image recognition (e.g., tagging faces in photos) • Audio Processing: voice recognition (e.g., voice-based search, Siri) • Natural Language Processing: automatic translation • Pattern detection (e.g., handwriting recognition)
  28. 28. C for Cat… Learning the DL way • Google scientists created one of the largest deep neural networks by connecting 16,000 computer processors. They presented this network, called Google Brain, with 10 million digital images found in YouTube videos. What did Google Brain learn after viewing these images for three days?
  29. 29. Latest buzz: AlphaGo • DeepMind’s AlphaGo beats Lee Sedol in Go. • AlphaGo used 40 search threads, 48 CPUs, and 8 GPUs. • AlphaGo learned using a general-purpose algorithm that allowed it to interpret the game’s patterns. • The AlphaGo program applied deep learning.
  30. 30. Anatomy of deep nets • Batches and Epochs • Layers and stacking • Preprocessing • Objective function and Optimizer • Activations • Initialization • train - model - test
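Two of the items above, "Batches and Epochs" and "train - model - test", can be illustrated with a small data-handling sketch. The helper names, the 25% test fraction and the batch size are assumptions for illustration; the actual model update is left as a placeholder comment.

```python
import random


def train_test_split(examples, test_fraction=0.25, seed=0):
    """Shuffle once, then hold out a fraction of the examples for testing."""
    rng = random.Random(seed)
    shuffled = list(examples)
    rng.shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def minibatches(examples, batch_size):
    """Yield consecutive slices of at most batch_size examples."""
    for start in range(0, len(examples), batch_size):
        yield examples[start:start + batch_size]


def run_epochs(train_set, batch_size, epochs, seed=0):
    """One epoch = one full pass over the training set, one batch at a time."""
    rng = random.Random(seed)
    updates = 0
    for _ in range(epochs):
        order = list(train_set)
        rng.shuffle(order)               # reshuffle at the start of every epoch
        for batch in minibatches(order, batch_size):
            # A real trainer would compute gradients on `batch` and update weights here.
            updates += 1
    return updates


examples = list(range(100))              # stand-in for 100 labeled examples
train_set, test_set = train_test_split(examples)
updates = run_epochs(train_set, batch_size=32, epochs=3)
```

With 75 training examples and a batch size of 32, each epoch has three batches (32, 32 and 11 examples), so three epochs give nine weight updates. The 25 held-out examples are never seen during training and are used only to test the model.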
  31. 31. What can it solve? • Classification: classify visual objects and identify objects (e.g., faces) in images and video; classify audio and text. • Prediction: predict the probability that a customer will choose a product; forecast demand for a product; predict what happens next in videos. • Generation: generate pictures and paintings (cool artsy stuff); generate writing (headlines, articles and novels); give captions.
  32. 32. ML in automotive industry • Identify and navigate roads and obstructions in real time for autonomous driving. • Predict failure and recommend proactive maintenance on vehicle components. • In-vehicle recommendation engine. • Discover anomalies across a fleet’s vehicle sensor data to identify potential failure risks.
  33. 33. ML in manufacturing • Predict failure and recommend proactive maintenance for production and moving equipment. • Predict supply chain failures and demand cycles. • Detect product defects visually.
  34. 34. ML in stores and e-commerce • Optimize in-store product assortment to maximize sales. • Personalize product recommendations and advertising to target individual consumers. • Classify visual features from in-store video. • Product search.
  35. 35. ML in finance • Personalize product offerings to target individual consumers. • Fraud detection. • Optimize branch/ATM network based on diverse signals of demand. • Predict asset price movements based on greater data. • Predict risk of churn for individual customers/clients and recommend renegotiation strategy. • Loan. How much? How long? Customize.
  36. 36. ML in agriculture • Customize growing techniques specific to individual plot characteristics. • Optimize pricing in real time based on future market, weather, and other forecasts. • Predict yield for farming or production leveraging IoT sensor data. • Predict new high-value crop strains based on past crops, weather/soil trends, and other data. • Construct detailed map of farm characteristics based on aerial video. • Intrusion detection from video.
  37. 37. ML in energy • Predict failure and recommend proactive maintenance for mining, drilling, power generation, and moving equipment. • Replicate human-made decisions to control room environments to reduce cost. • Optimize energy scheduling/dispatch of power plants based on energy pricing, weather, and other real-time data. • Predict energy demand.
  38. 38. ML in healthcare • Diagnose known diseases from scans, biopsies, audio, and other data. • Predict personalized health outcomes to optimize recommended treatment. • Identify fraud, waste, and abuse patterns in clinical and operations data. • Detect major trauma events from wearables sensor data and signal emergency response. • Optimize design of clinical trials. • Predict outcomes from fewer or diverse (e.g., animal testing) experiments • Identify target patient subgroups that are underserved (e.g., not diagnosed).
  39. 39. ML in public service and social sector • Optimize public resource allocation for urban development to improve quality of life. (e.g., reduce traffic, minimize pollution) • Replicate back-office decision processes for applications, permits and tax auditing. • Predict individualized educational and career paths to maximize engagement and success. • Predict risk of failure for physical assets (e.g., military, infrastructure) and recommend proactive maintenance. • Predict risk of illicit activity or terrorism using historical crime data, intelligence data, and other available sources (e.g., predictive policing).
  40. 40. ML in media • Discover new trends in consumption patterns. Serve content and advertisements. • Optimize pricing for services/offerings based on customer-specific data.
  41. 41. ML in telecom • Predict regional demand trends for voice/data/other traffic. • Discover new trends in consumer behaviour using mobile data and other relevant data.
  42. 42. ML in logistics • Read addresses/bar codes in mail/parcel sorting. • Identify performance and risk for drivers/pilots through driving patterns. • Personalize loyalty programs and promotional offerings to individual customers. • Predict failure and recommend proactive maintenance for planes, trucks, and other moving equipment. • Optimize pricing and scheduling based on real-time demand updates.
  43. 43. Acknowledgements • Images and slides taken from various deep learning courses. • Use cases in various industries taken from a McKinsey Analytics survey. • This presentation was created for a deep learning audience, for no monetary benefit. Please inform the uploader if you want some part to be taken out.
  44. 44. Obtaining an understanding of the human mind is one of the final frontiers of modern science. • Thanks. Adwait Bhave