1415 track 1 wu_using his laptop

674 views

#EMSNYCDAY2

Published in: Marketing

  1. 10/24/2017 · confidential · twitter: @mich8elwu · linkedin.com/in/MichaelWuPhD
     the black art of machine learning
     Michael Wu, PhD (@mich8elwu), chief scientist @ lithium tech, 2017.10.31
     Michael Wu, PhD (@mich8elwu), chief scientist @ lithium tech, 2017.09.28
  2. THE POWER OF BIG DATA + DATA SCIENCE
     • data → info → insight: buying calcium, zinc, magnesium, cotton balls, and switching to unscented lotions + soaps is a predictor of pregnancy
     • decision → action: coupons for moms, timed to specific stages of pregnancy
     • result: ↗ revenue $44B (2002) → $67B (2010)
     btw, did you know your daughter is pregnant? (big data + analytics)
     THE POWER OF BIG DATA + DATA SCIENCE
     • data → info → insight: filling out a loan application with only capital or only lower case letters is predictive of loan default
     • decision → action: augment the traditional underwriting regression model w/ thousands of variables & 10+ models
     • result: ↘ loan default rate by 40%, ↗ market share by 25%
  3. DATA ≠ INFORMATION ≠ INSIGHT
     • data has a huge amount of statistical redundancy
       - duplication
       - spatial + temporal correlation
       - collinearity (causality)
     • much of the info we extract from the data is not insightful
     • insights must be
       - interpretable
       - relevant
       - novel (not already known)
     big data
       - that’s not statistically redundant = information
       - that’s not already known = insight
  4. NOT ALL DATA/INFORMATION ARE RELEVANT
     • relevant data: signal vs. noise
     • relevance is context specific
       - who: one man’s signal is another man’s noise
     [diagram: data, information, insight; relevant to me; relevant to you; noise]
     NOT ALL DATA/INFORMATION ARE RELEVANT
     • relevant data: signal vs. noise
     • relevance is context specific
       - who: one man’s signal is another man’s noise
       - when?
       - where?
       - what’s relevant is determined by the problem you are trying to solve or the question you are trying to answer
     • context is usually specified in the problem/question
     [diagram: data, information, insight; relevant to me when I am traveling in Istanbul today; noise]
  5. big data is very noisy
     WHY IS BIG DATA SO NOISY?
     • how did people use data before big data?
     • data is collected specifically to address the problem/question
     • data is almost always relevant
     [diagram: problem/question Q → data collection → data]
  6. WHAT HAPPENS W/ BIG DATA TECHNOLOGIES?
     • enables data capture/storage before we have a question
     • data is collected irrespective of any specific problem / question / purpose
     • must find the “relevant data” whenever we have a problem/question
     • most big data will be irrelevant (only a tiny % of it will be relevant)
     [diagram: data collection → problem/question Q]
     DATA ≠ INFORMATION ≠ INSIGHT
     • for all data (any data): data ≥ information ≥ insight
     • for big data: data >> information >> insight
     • “a single grain of rice can tip the scale”
     • “1 bit of insightful info. may be the difference between victory and defeat”
  7. WHERE DO YOU LOOK IN YOUR BIG DATA TO FIND INSIGHTS?
     • look beyond what’s relevant
       - look at what you thought were the irrelevant data/info
     • don’t look too far beyond your relevance boundary
       - it’s costly and wasteful
       - hard to establish causality
     • you might not find anything, but when you do, it will be insightful
       - zest finance
     [diagram: data, information, insight; relevant signal; noise]
     DO BIZ REALLY WANT BIG DATA?
     • business needs: insight
     • big data tech.: hadoop, hive, hbase, pig, noSQL, impala, spark, storm …
     [diagram: big data, information, insight, with a huge gap between big data tech. and business needs]
  8. THE BIG DATA GAP: FROM DATA TO INSIGHTS
     [diagram: big data, information, insight, with a “?” over the gap]
     FROM DATA TO INSIGHTS
     [diagram: big data, information, insight]
     data scientist is currently the only way companies know how to fill this gap
  9. so what do data scientists do?
     DATA SCIENCE INPUT + OUTPUT
     [diagram: input → output]
  10. STEP 1: GET THE DATA
      data scientist is ~50% data janitor
      • normalization: type, range, unit, format, foreign key ref …
      • exception handling: spam, missing data, incomplete data …
      • dedupe, metadata tagging …
      • POS tagging
      • entity detection
      • sampling + sample selection
      • special handling for rich media …
      RAW DATA USUALLY DON’T PERFORM WELL
      raw data: text, image, sound, video, directly measured data, etc.
      • “Hello, how are you?” = 072 101 108 108 111 044 032 104 111 119 032 097 114 101 032 121 111 117 063
      • can a machine tell a bird from a plane? how?
  11. RAW DATA USUALLY DON’T PERFORM WELL
      raw data: text, image, sound, video, directly measured data, etc.
      [chart: probability of bird/plane (bird = 0, plane = 1) vs. any pixel’s color/intensity; ~50% / ~50%]
      the info in a pixel is not discriminating enough for this task
  12. RAW DATA USUALLY DON’T PERFORM WELL
      raw data: text, image, sound, video, directly measured data, etc.
      [chart: probability of bird/plane; axes: any pixel’s color/intensity and another pixel’s color/intensity; points labeled bird (0) and plane (1)]
      any pixel can be part of a bird or a plane.
      (raw data) are bad features
  13. raw data are not only noisy and “dirty,” they are bad features!
      FEATURES AND FEATURE ENGINEERING
      • raw data: text, image, sound, video, directly measured data, etc.
      • features: any information you derive from the raw data and make explicit to the learning algorithm
      • feature engineering: the extraction of implicit (or externally derived) information in (from) the raw data
      • online application x = name, age, ID info, loan amount, income, spending + payment habit …
      [chart: normalized default rate vs. any raw data (name, age, loan amount, income): ~10% / ~10%]
      [chart: normalized default rate vs. any feature]
      example features:
        - loan / income
        - debt / income
        - avg. monthly spent / income
        - frequency of late (or early) payment, income (or spent) volatility (stdev), married or not, have kids or not …
        - use of proper capitalization in the application
        - # saves before submitting, avg. time between saving (or opening) the application, date, time, day of week when filling the application …
        - hair color, eye color, height, weight …
        - where did they fill out the application, sunny or rainy when filling the application …
      the info doesn’t even have to be in the raw data, it just has to be derivable
  14. COMING UP WITH BETTER FEATURES
      • raw data: text, image, sound, video, directly measured data, etc.
      • feature engineering
        - birds: have beak, have eyes, have feet, have feathers …
        - plane: have stabilizers, have engines, have windows …
      • features: any information you derive from the raw data and make explicit to the learning algorithm
      • statistics
      • model: obtained by optimizing some objective function (error, likelihood, etc.) + model validation
      [chart: occurrence probability of beak vs. occurrence probability of stabilizer]
      COMING UP WITH BETTER FEATURES
      [same pipeline, with stages labeled: feature engineering, statistics, machine learning]
  15. data science is ~25% handcrafting … of features
      “Coming up with features is difficult, time-consuming, requires expert knowledge. Applied machine learning is basically feature engineering.” (Andrew Ng)
      hand crafted features are:
        - domain specific,
        - task specific,
        - not generalizable
  16. CAN WE LEARN “GOOD FEATURES” DIRECTLY FROM THE DATA?
      • raw data: text, image, sound, video, directly measured data, etc.
      • feature engineering
        - birds: have beak, have eyes, have bird feet, have feathers …
        - plane: have stabilizers, have engines, have windows …
      • features: any information you derive from the raw data and make explicit to the learning algorithm
      • statistics
      • model: obtained by optimizing some objective function (error, likelihood, etc.) + model validation
  17. CAN WE LEARN “GOOD FEATURES” DIRECTLY FROM THE DATA?
      • raw data: text, image, sound, video, directly measured data, etc.
      • feature engineering
        - birds: have beak, have eyes, have bird feet, have feathers …
        - plane: have stabilizers, have engines, have windows …
      • features: any information you derive from the raw data and make explicit to the learning algorithm
      • statistics
      • model: obtained by optimizing some objective function (error, likelihood, etc.) + model validation
      [diagram labels: pixels, edges, shapes]
      DEEP LEARNING
      • traditional machine learning: features handcrafted by experts; works for most (80%) of the problems in business
      • deep learning: deep neural network; features automatically learned from the data with different levels of abstraction
        - combination of pixels → edges
        - combination of edges → object parts
        - combination of parts → the object
      [diagram: input, layer 1, layer 2, layer 3; examples: faces, cars, elephants, chairs, +cars, +airplanes, +motorbikes]
  18. DEEP LEARNING
      • traditional machine learning: features handcrafted by experts; works for most (80%) of the problems in business
      • deep learning: deep neural network; features automatically learned from the data with different levels of abstraction
      • google brain: 16,000 cpu, 1,000,000,000+ connections, 10,000,000 training images from youtube
      • extraordinarily generalizable: makes machines behave & think more like humans, but requires lots of data to train
      • success stories:
        - computer vision: image labeling, search …
        - audio signal processing: speaker ID, speech recognition (speech-text) …
        - text processing: machine translation, etc.
      • interesting problems in the industry:
        - sentiment analysis
        - actionability & intention prediction
        - fraud, spam detection …
      WHAT DO DATA SCIENTISTS DO?
      • raw data: text, image, sound, video, directly measured data, etc.
      • feature engineering (domain expertise): plumbing, cleaning, janitoring, handcrafting
      • features: any information you derive from the raw data and make explicit to the learning algorithm
      • statistics (computer science, math + statistics)
      • model: obtained by optimizing some objective function (error, likelihood, etc.) + model validation
      • communication (domain expertise): data visualization, storytelling, translation of data to business insights, decisions, and action
  19. WHAT DO DATA SCIENTISTS DO?
      • raw data: text, image, sound, video, directly measured data, etc.
      • feature engineering (domain expertise): plumbing, cleaning, janitoring, handcrafting
      • features: any information you derive from the raw data and make explicit to the learning algorithm
      • statistics (computer science, math + statistics)
      • model: obtained by optimizing some objective function (error, likelihood, etc.) + model validation
      • communication (domain expertise): data visualization, storytelling, translation of data to business insights, decisions, and action
      [diagram: data science = domain expertise + math + statistics + computer science]
      thank you, q&a, + follow me
      twitter: @mich8elwu
      linkedin.com/in/MichaelWuPhD
  20. want to dig deeper?
      • sos: http://pages.lithium.com/science-of-social
      • sos2: http://www.lithium.com/library/science-of-social-2
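
The FEATURES AND FEATURE ENGINEERING slide lists ratios and behavioral signals that can be derived from a raw loan application. A minimal Python sketch of that idea follows. The `LoanApplication` fields, the function names, and the choice of population standard deviation for "volatility" are all assumptions made for illustration; the slides do not specify a schema or an implementation.

```python
from dataclasses import dataclass
from statistics import pstdev
from typing import Dict, List


@dataclass
class LoanApplication:
    # Hypothetical raw fields, loosely following the slide's
    # "name, age, ID info, loan amount, income, spending + payment habit".
    name: str
    loan_amount: float
    income: float          # assumed to be positive
    debt: float
    monthly_spend: List[float]


def uses_proper_capitalization(text: str) -> bool:
    """Return False when the text is written only in capitals or only in lower case."""
    letters = [c for c in text if c.isalpha()]
    if not letters:
        return False
    all_upper = all(c.isupper() for c in letters)
    all_lower = all(c.islower() for c in letters)
    return not (all_upper or all_lower)


def engineer_features(app: LoanApplication) -> Dict[str, float]:
    """Derive explicit features from the raw application (illustrative only)."""
    avg_spend = sum(app.monthly_spend) / len(app.monthly_spend)
    return {
        "loan_to_income": app.loan_amount / app.income,
        "debt_to_income": app.debt / app.income,
        "avg_monthly_spend_to_income": avg_spend / app.income,
        "spend_volatility": pstdev(app.monthly_spend),
        "proper_capitalization": float(uses_proper_capitalization(app.name)),
    }


example = LoanApplication(
    name="JANE DOE",
    loan_amount=20000.0,
    income=50000.0,
    debt=10000.0,
    monthly_spend=[1000.0, 3000.0],
)
features = engineer_features(example)
```

None of these values is stored in the raw record as such; each is computed from it, which is the point the slide makes about information only needing to be derivable.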

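The RAW DATA slide shows “Hello, how are you?” as a row of three-digit numbers, which is how the text looks to a machine before any features are derived. The sketch below reproduces that row with Python's built-in `ord` and then derives two explicit features from the same string. The helper names and the two example features are assumptions for illustration, not something taken from the slides.

```python
from typing import Dict, List


def to_char_codes(text: str) -> List[int]:
    """Return the code point of each character: the 'raw data' view of the text."""
    return [ord(ch) for ch in text]


def simple_text_features(text: str) -> Dict[str, float]:
    """Two hypothetical hand-crafted features derived from the raw text."""
    return {
        "word_count": float(len(text.split())),
        "is_question": float(text.strip().endswith("?")),
    }


codes = to_char_codes("Hello, how are you?")
formatted = " ".join(f"{c:03d}" for c in codes)
print(formatted)
# → 072 101 108 108 111 044 032 104 111 119 032 097 114 101 032 121 111 117 063

text_features = simple_text_features("Hello, how are you?")
```

No single code in the row tells a learning algorithm much, whereas a derived value such as `is_question` states something useful about the text directly.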