Applications of Machine Learning at USC

Applications of Machine Learning at USC presentation by Alex Tellez

- Powered by the open source machine learning software H2O.ai. Contributors welcome at: https://github.com/h2oai
- To view videos on H2O open source machine learning software, go to: https://www.youtube.com/user/0xdata

Published in: Software

  1. APPLICATIONS OF MACHINE LEARNING: Alex Tellez + Amy Wang + H2O Team. USC, 4/8/2015
  2. AGENDA: 1. Introduction to Big Data / ML 2. What is H2O.ai? 3. Use Cases: a) Beat Bill Belichick b) Fight Crime in Chicago c) Whiskey Recommendation Engine d) Bordeaux Wine Vintage 4. Data Science Competition
  3. 1. INTRO TO BIG DATA / ML: "BIG DATA IS LIKE TEENAGE SEX: everyone talks about it, nobody really knows how to do it, everyone thinks everyone else is doing it, so everyone claims they are doing it…" (Dan Ariely, Prof. @ Duke)
  4. BIG VS. SMALL DATA: SMALL = data fits in RAM. BIG = data does NOT fit in RAM (when you try to open the file in Excel, Excel CRASHES). Basically, Big Data is data too big to process using conventional methods (e.g. Excel, Access).
  5. V + V + V: Today, we have access to more data than we know what to do with! 1) Wearables (Fitbit, iWatch, etc.) 2) Click streams from web visitors 3) Sensor readings 4) Social media outlets (e.g. Twitter, Facebook, etc.). Volume: data volumes are becoming unmanageable. Variety: more data types are being captured. Velocity: data arrives rapidly and must be processed / stored.
  6. THE HOPE OF BIG DATA: 1. Data contains information of great business / personal value. Examples: a) Predicting future stock movements = $$$ b) Netflix movie recommendations = better experience = $$$. 2. IF you can extract those insights from the data, you can make better decisions. Enter Machine Learning (ML)… So how the hell do you do it?
  7. MACHINE LEARNING: The Wikipedia definition: "…a scientific discipline that explores the construction and study of algorithms that can learn from data. Such algorithms operate by building a model…." ZZZzzzzzZZZzzzzzz. My definition: the development, analysis, and application of algorithms that enable machines to make predictions and / or better understand data. 2 types of learning: SUPERVISED + UNSUPERVISED
  8. SUPERVISED LEARNING: What is it? Methods that infer a function from labeled training data. Key task: Predicting ________ (insert your task here). Examples of supervised learning tasks: 1. Classification tasks: benign / malignant tumor 2. Regression tasks: predicting future stock market prices 3. Image recognition: highlighting faces in pictures
  9. UNSUPERVISED LEARNING: What is it? Methods to understand the general structure of input data where no prediction is needed. NO CURATION NEEDED! Examples of unsupervised learning tasks: 1. Clustering: discovering customer segments 2. Topic extraction: what topics are people tweeting about? 3. Information retrieval: IBM Watson question + answer 4. Anomaly detection: detecting irregular heartbeats
  10. 2. WHAT IS H2O? What is H2O? (water, duh!) It is ALSO an open-source, parallel processing engine for machine learning. What makes H2O different? Cutting-edge algorithms + parallel architecture + ease-of-use = Happy Data Scientists / Analysts
  11. TEAM @ H2O.AI: 16,000 commits. H2O World Conference 2014
  12. COMMUNITY REACH: 120 meetups in 2014. 11,000 installations. 2,000 corporations. First Friday Hack-A-Thons
  13. TRY IT! Don’t take my word for it… www.h2o.ai. Simple instructions: 1. cd to the download location 2. Unzip the h2o file 3. Run java -jar h2o.jar 4. Point your browser to localhost:54321. GUI / R
  14. 3. USE CASES (LOTS OF ’EM): BEAT BILL BELICHICK
  15. TB + BB: Tom Brady + Bill Belichick = 15 years together, 3 Super Bowls
  16. PASS OR RUN? On any given offensive play, Coach Bill can call either a PASS or a RUN. What determines this? Game situation, opposing team, time remaining, yards to go (until 1st down), personnel, etc. Basically, LOTS of stuff.
  17. BUT WHAT IF?? Question: Can we predict whether the next play will be a PASS or a RUN using historical data? Approach: Download every offensive play from the Belichick-Brady era since 2000. Extract known features to build model inputs. Use various machine learning approaches to model PASS / RUN. Disclaimer: I’m not a Seahawks fan!
  18. DATA COLLECTION: Data: 13 years of data (2002 - 2013 seasons), 194 games total, 14,547 total offensive plays (excludes punts, kickoffs, returns). Response variable: PASS / RUN. Model inputs: Quarter, Minutes, Seconds, Opposing Team, Down, Distance, Line of Scrimmage, NE Score, Opposing Team Score, Season, Formation, Game Status (is NE losing / winning / tied)
  19. FIGHTING CRIME IN CHICAGO: Spark + H2O
  20. OPEN CRIME DATA: Crime dataset: crimes from 2001 to present day, ~4.6 million crimes
  21. THE WINDY CITY: Harvest Chicago weather data since 2001
  22. SOCIOECONOMIC FACTORS: Crimes segmented into Community Area IDs. Percent of households below poverty, unemployed, etc.
  23. SPARK + H2O: Crimes, Census, and Weather data > data munging > Spark SQL join > Deep Learning > evaluate models. GOAL: For a given crime, predict if an arrest is more / less likely to be made!
  24. JOIN DATASETS: Crime data + weather data + census data. Using Spark, we join the 3 datasets together to make one mega dataset!
  25. DATA VISUALIZATION: Arrest rate; season of crime; temperature during crime; community crime is committed in
  26. SPLIT DATA INTO TEST/TRAIN SETS: Training set: train the model on this segment (80% of data). Test set: validate the model on this segment (remaining 20%). Training set arrest rate / test set arrest rate: ~40% of crimes lead to arrest
  27. DEEP LEARNING: Problem: For a given crime, is an arrest more / less likely? Deep Learning: a multi-layer feed-forward neural network that starts with an input layer (crime + weather data) followed by multiple layers of non-linear transformations
  28. HOW’D WE DO? Nice! ~10 mins
  29. SINGLE-MALT SCOTCH: A whiskey made at one particular distillery from a mash that only uses malted grain (barley). Solid standards: must be aged at least 3 years in oak casks. Many famous distilleries are in the northern regions of Scotland
  30. OF COURSE, THERE’S A DATASET FOR THAT! THE Single Malt Dataset: 85 distilleries from Northern Scotland; 12 descriptor features (e.g. Sweetness, Smoky, Tobacco, Honey, Spicy, Malty, etc.); each descriptor rated 0 (weak) to 4 (strong). Problem: Can we build a whiskey recommendation engine based on whiskeys I have tried (and liked!) already?
  31. DIMENSIONALITY REDUCTION + K-MEANS: First, let’s reduce the 12 features to a lower-dimensional space using a linear transformation (Principal Components Analysis): 7 principal components explain ~85% of the variance in the dataset. Then let’s use a clustering algorithm to determine distinct groups of whiskeys using the new PCA’d dataset: 11 clusters are appropriate. Pipe out the cluster assignments and start buying whiskey!
  32. MODEL RESULTS: I ENJOY: / OTHER WHISKEYS THAT CLUSTER WITH THESE:
  33. OTHER POPULAR BRANDS: APPARENTLY, LOTS OF PEOPLE LIKE: / OTHER WHISKEYS THAT CLUSTER WITH THESE:
  34. AUTOENCODER + H2O: Input (x1, x2, x3, x4) > Hidden Features > Output (x1, x2, x3, x4). Information Flow. Dogs, Dogs and Dogs
  35. ANOMALY DETECTION OF VINTAGE YEAR BORDEAUX WINE
  36. BORDEAUX WINE: Largest wine-growing region in France. 700+ million bottles of wine produced / year! Some years are better than others: Great ($$$) vs. Typical ($). Last Great years: 2010, 2009, 2005, 2000
  37. GREAT VS. TYPICAL VINTAGE? Question: Can we study weather patterns in Bordeaux leading up to harvest to identify ‘anomalous’ weather years >> correlates to Great ($$$) vs. Typical ($) Vintage? The Bordeaux Dataset (1952 - 2014, yearly): amount of winter rain (Oct > Apr of harvest year); average summer temp (Apr > Sept of harvest year); rain during harvest (Aug > Sept); years since last Great Vintage
  38. AUTOENCODER + ANOMALY DETECTION: Goal: ‘en primeur of en primeur’: can we use weather patterns to identify anomalous years >> indicates great vintage quality? ML workflow: 1) Train autoencoder to learn the ‘typical’ vintage weather pattern 2) Append ‘great’ vintage year weather data to the original dataset 3) IF great vintage year weather data does NOT match the learned weather pattern, the autoencoder will produce a high reconstruction error (MSE)
  39. RESULTS (MSE > 0.10): Mean Square Error chart; vintages labeled: 1961, 2009, 2005, 2000, 1990, 1989, 1982, 2010
  40. 2014 BORDEAUX?? Mean Square Error chart; years labeled: 2013, 2014 ?
  41. 4. DATA SCIENCE COMPETITION: Apply / learn more @ apps.h2o.ai. Check out our YouTube channel for last year’s talks @ H2O World
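
Slide 18 lists "Game Status (is NE losing / winning / tied)" among the model inputs. A minimal sketch of how that input could be derived from the two score columns follows. The field names (`ne_score`, `opp_score`) and the example play are hypothetical, not taken from the talk's dataset.

```python
def game_status(ne_score: int, opp_score: int) -> str:
    """Label the game state from New England's point of view."""
    if ne_score > opp_score:
        return "winning"
    if ne_score < opp_score:
        return "losing"
    return "tied"


def build_model_inputs(play: dict) -> dict:
    """Copy the raw play fields and add the derived game_status input."""
    inputs = dict(play)
    inputs["game_status"] = game_status(play["ne_score"], play["opp_score"])
    return inputs


# Hypothetical play, for illustration only (not real game data).
example_play = {
    "quarter": 4, "minutes": 2, "seconds": 0,
    "down": 3, "distance": 7,
    "ne_score": 17, "opp_score": 20,
}
example_inputs = build_model_inputs(example_play)
print(example_inputs["game_status"])
```

The response variable (PASS / RUN) is left out on purpose: only inputs known before the snap belong in the feature row.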
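
Slide 24 joins crime, weather, and census data into one dataset using Spark. The sketch below shows the same inner-join idea in plain Python rather than Spark, so it runs without a cluster. The join keys are assumptions: weather is matched on `date`, and census on `community_area` (slide 22 says crimes are segmented into Community Area IDs). All rows are made up.

```python
def join_datasets(crimes, weather_by_date, census_by_area):
    """Inner-join each crime with its weather row and its census row."""
    joined = []
    for crime in crimes:
        weather = weather_by_date.get(crime["date"])
        census = census_by_area.get(crime["community_area"])
        if weather is None or census is None:
            continue  # inner join: drop crimes without a match on both sides
        joined.append({**crime, **weather, **census})
    return joined


# Made-up rows, for illustration only.
crimes = [
    {"date": "2015-01-01", "community_area": 8, "arrest": True},
    {"date": "2015-01-02", "community_area": 99, "arrest": False},
]
weather_by_date = {
    "2015-01-01": {"mean_temp_f": 20.0},
    "2015-01-02": {"mean_temp_f": 25.0},
}
census_by_area = {8: {"pct_below_poverty": 12.9}}

mega_dataset = join_datasets(crimes, weather_by_date, census_by_area)
print(mega_dataset)
```

The second crime is dropped because community area 99 has no census row, which is the behavior an inner join would give in Spark SQL as well.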
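
Slide 26 splits the joined data 80% / 20% and compares the arrest rate in both segments. A minimal sketch of that step is below. The shuffle seed, the helper names, and the synthetic rows (built to have a 40% arrest rate, echoing the slide's "~40%") are assumptions for illustration.

```python
import random


def train_test_split(rows, train_fraction=0.8, seed=42):
    """Shuffle a copy of the rows, then cut it into train and test segments."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


def arrest_rate(rows):
    """Fraction of rows whose 'arrest' flag is true."""
    return sum(1 for row in rows if row["arrest"]) / len(rows)


# Synthetic rows: exactly 2 of every 5 crimes lead to an arrest.
rows = [{"arrest": i % 5 < 2} for i in range(1000)]
train, test = train_test_split(rows)
print(len(train), len(test))
print(arrest_rate(train), arrest_rate(test))
```

Checking that the two arrest rates are close is a quick sanity test that the random split did not skew the response variable.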
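
Slide 31 reports that 7 principal components explain ~85% of the variance. Given the variance captured by each component, picking that count is a cumulative sum. The sketch below assumes the per-component variances are already available from a PCA run; the numbers used are invented and are not the whiskey dataset's values.

```python
def components_needed(component_variances, target=0.85):
    """Smallest number of components whose cumulative share reaches target."""
    total = sum(component_variances)
    cumulative = 0.0
    ranked = sorted(component_variances, reverse=True)
    for count, variance in enumerate(ranked, start=1):
        cumulative += variance
        if cumulative / total >= target:
            return count
    return len(ranked)


# Invented variances, for illustration only.
toy_variances = [4.0, 3.0, 2.0, 1.0]
print(components_needed(toy_variances))
```

With the toy values, the cumulative shares are 0.4, 0.7, and 0.9, so three components are the first to cross the 0.85 target.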
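
Slides 31 to 33 end with "pipe out the cluster assignments" and then show whiskeys that cluster with the ones already liked. One way to turn cluster assignments into recommendations is sketched below. The assignment table and distillery names are placeholders, and the talk does not say this is how the results slides were produced.

```python
from collections import defaultdict


def recommend(assignments, liked):
    """Return unseen whiskeys that share a cluster with any liked whiskey."""
    members = defaultdict(set)
    for name, cluster in assignments.items():
        members[cluster].add(name)
    recommendations = set()
    for name in liked:
        recommendations |= members[assignments[name]]
    return sorted(recommendations - set(liked))


# Placeholder names and cluster ids, for illustration only.
assignments = {
    "Distillery A": 0, "Distillery B": 0, "Distillery C": 1,
    "Distillery D": 1, "Distillery E": 2,
}
suggestions = recommend(assignments, liked=["Distillery A", "Distillery C"])
print(suggestions)
```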
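
The workflow on slides 38 and 39 flags a year as anomalous when the autoencoder's reconstruction error (MSE) exceeds 0.10. The scoring step can be sketched as follows. The `reconstruct` function here is only a stand-in that always returns a fixed "typical" pattern; a trained autoencoder would take its place. The feature scaling to the 0 to 1 range, the year labels, and all values are assumptions.

```python
def mean_squared_error(original, reconstructed):
    """Average squared difference between two equal-length vectors."""
    if len(original) != len(reconstructed):
        raise ValueError("vectors must have the same length")
    squared = [(o - r) ** 2 for o, r in zip(original, reconstructed)]
    return sum(squared) / len(squared)


def flag_anomalies(rows, reconstruct, threshold=0.10):
    """Return {label: mse} for rows whose reconstruction error is too high."""
    flagged = {}
    for label, features in rows.items():
        mse = mean_squared_error(features, reconstruct(features))
        if mse > threshold:
            flagged[label] = mse
    return flagged


TYPICAL_PATTERN = [0.5, 0.5, 0.5, 0.5]  # stand-in for the learned pattern


def reconstruct(features):
    """Stand-in for a trained autoencoder: always outputs the typical pattern."""
    return list(TYPICAL_PATTERN)


# Made-up, pre-scaled weather features: winter rain, summer temp,
# harvest rain, years since last great vintage.
weather_rows = {
    "year_a": [0.5, 0.6, 0.4, 0.5],
    "year_b": [0.9, 0.1, 0.9, 0.1],
}
flagged = flag_anomalies(weather_rows, reconstruct)
print(flagged)
```

Only `year_b` is flagged: its features sit far from the typical pattern, so its MSE (about 0.16) is above the 0.10 cutoff, while `year_a` reconstructs almost perfectly.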
