
Minne analytics presentation 2018 12 03 final compressed


Monday was another great conference by MinneAnalytics! #MinneFRAMA was a great success, with over 1,100 attendees at the Science Museum of Minnesota. Alison Rempel Brown is a great host! A Teradata colleague told me that her post about my presentation "blew up" with hits: she got over 2K views and 60+ likes. I'm proud to be a part of this great #datascience organization bringing #machinelearning and #artificialintelligence #analytics to our #bigdata clients. If you want my slides, here they are.


  1. The Road Ahead: Artificial Intelligence, Machine Learning, Data Analytics, and Visualization. Bonnie K. Holub, Ph.D., Principal Data Scientist, Midwest Geo Data Science Lead. December 3, 2018.
  2. Introductions: Bonnie Holub, PhD
      • Created over $1B value for companies
      • PhD, Artificial Intelligence
      • Career: correlating disparate sets of Big Data for actionable results
      • Entrepreneur: KPMI, Adventium Labs, ArcLight Inc.
      • Researcher & Academic: University of Minnesota, University of St. Thomas Graduate Programs in Software, Carnegie Mellon University
      • Business Professional: Teradata, Cognizant, Honeywell, PwC, Korn Ferry, UCare, Object Partners
  3. The road ahead: where are we coming from? ©2018 Teradata
  4. Abstract
      “Big data is like teenage sex: everyone talks about it, nobody really knows how to do it, everyone thinks everyone else is doing it, so everyone claims they are doing it...” Professor Dan Ariely (Duke University)
      This talk will discuss working examples of how some of the 12 million worldwide Teradata users are contributing to $10 trillion worth of revenue through 11 trillion annual queries utilizing just under one zettabyte of data. She will cover hyper-segmentation of large data sets, fraud detection, preventive maintenance, and techniques working systems employ to dig deep, aim high, and manage operations at scale.
  5. AI/ML is everywhere
  6. NFL Next Gen Stats • Fantasy League
  7. NFL Next Gen Stats
  8. AI/ML is everywhere
  9. (no text on this slide)
  10. Sprint Super Bowl 2018 Commercial • Evelyn
  11. (no text on this slide)
  12. (no text on this slide)
  13. Exponential Growth: Moore’s Law. Source: Moravec (Robot: Mere Machine to Transcendent Mind, 2000).
  14. Exponential Growth. Source: O’Keefe, Brian, “The Smartest (or the Nuttiest) Futurist on Earth,” Fortune, 2007/05/14, innovation-artificial-intelligence/.
  15. Exponential Growth Detailed
      • Today’s smartphone has the same computing power as the whole US government in 1983.
      • 3D printing is the only technology where a more complex object doesn’t cost more to make.
      • 6 US states now have licensed autonomous vehicles (cars that drive themselves).
      • The average lifespan of an S&P 500 company has gone from 67 years in the 1920s to 12 years today.
      • The shift to autonomous cars means that a current 3-year-old will not get a driver’s license (as cars will drive themselves in 14 years), and it will prevent 30,000 road deaths in the US.
      Source:, downloaded 2018/02/09
  16. Oxford Study: The Future of Employment: how susceptible are jobs to computerization? Source: Frey, Carl Benedikt, and Osborne, Michael A., “The Future of Employment: how susceptible are jobs to computerization?”, Oxford Martin School, 9/17/2013. FIGURE III. The distribution of BLS 2010 occupational employment over the probability of computerisation, along with the share in low, medium and high probability categories. Note that the total area under all curves is equal to total US employment.
  17. Source: jietang, 2013.
  18. Technologies: Artificial Intelligence, Machine Learning, Deep Learning, Data Science/Analytics, Data Mining
  19. Context Summarized
      • There is a lot of hype.
      • There are various (data based) predictions.
      • There will be vast business and social upheaval due to accelerating change.
      Our challenge: ride the wave of change, don’t be swamped by it. (I promise you it will be thrilling!) © 2018 Bonnie K. Holub
  20. The road ahead: where are we coming from? The road ahead: where are we headed? The road ahead: how can we get there first (and safely)?
  21. Practical Steps to Take
      Themes: • Hiring/Onboarding/Retaining Talent • Business Intimacy • Balance • Outsourcing • Attitudes • Leadership • Problem Choice • Open source
      Source: Alexander Linden, Carlie Idoine, Peter Krensky, Neil Chandler, “15 Insights for Managing Data Science Teams,” Gartner, 2018/10/19.
  22. Data Science Discipline Needs & Expertise Required
      Expertise required: Statistical Skill; Usability Knowledge (HCI); Technology Skills (including a Historical Perspective); Business Understanding; Implementation Expertise (Ops Savvy)
      Discipline needs: • Best Practices/Playbooks • Implementation Patterns • Explainable AI (DARPA)
      Source: Arun Batchu & Bonnie Holub, Private Conversations, 2018/11/05.
  23. MSP Leaders to Follow • Arun Batchu • Sona Maniyan • Patrick Sanchez • Stephen Thompson, League of Extraordinary Algorithms Meetup
  24. © 2018 Bonnie K. Holub
  25. Tools the Market is Choosing. Source: “KDnuggets Analytics/Data Science 2016 Software Poll: top 10 most popular tools in 2016,” data-mining-data-science-software.html, downloaded 2018/11/13.
  26. Animations
      • Hans Rosling
      • Time Magazine Top 100 in 2012.
      • Swedish public health researcher.
      • star (you should watch his talks…)
      • 200 Countries, 200 Years, 4 Minutes - The Joy of Stats - BBC Four (4 minutes)
      • “Wealth and Health of Nations”
      © 2011-2012 ArcLight, Inc.
  27. Data Science SQUADs: Cross-Functional Squads To Accelerate Delivery
      Our World-Class Technologists: Data Science, Data Foundation, BI & Cognitive Design, Analytics, Software Dev, Architecture
      • SQUADs (Monthly Subscription): An agile cross-functional team that executes to achieve a customer business outcome. Right person for the job as project and customer needs shift.
      • AI & ML Solutions (Outcome Focused): Detailed, scoped Data Science, Artificial Intelligence, or Machine Learning projects focused on delivering value tied to outcome.
      • Accelerators (Library of IP): We draw upon our ever-increasing portfolio of accelerators, algorithms, code, and frameworks to accelerate delivery.
  28. Data Science Accelerators
  29. Hyper Segmentation: Bringing human-interpretable, actionable insights for better marketing and customer personalization
      Business Challenge: Organizations with large numbers of customers do not have an algorithmic capability to create named personas for their top customers. Without refined segments or hypersegments, organizations are limited to coarse offers across their customer base and limited optimization techniques.
      What We Did:
      • A self-contained Most Valuable Persona application that identifies HyperSegments and describes these persona segments in a human-interpretable and actionable format.
      • The personas described can be ingested into a data warehouse and/or marketing campaign tools.
      • Built using Aster/Teradata Analytic Platform, Spark, and Python.
      What’s the Customer Value? By leveraging the long tail of customer uniqueness, tailored offers and messages can drive incremental revenue through marketing channels. Intuition of human marketers supported by data and evidence.
      Proof (Large Online Retailer): Enabled an algorithmic capability to create named personas for the 100K most valuable buyers (MVBs) in Computers, Tablets, & Networking, utilizing buying behaviors of the last 2 years and their account attributes. Enabled fine-grained behavioral and demographic segmentation based on extended customer and purchase history data, which resulted in improved personalization and context to drive revenue.
      Key figures: 100K MVBs; 8K Hyper-segments
  30. Customer Complaints: Bringing human-interpretable, actionable insights for better marketing and customer personalization
      Business Challenge: Organizations with large numbers of customers often get inundated with thousands of complaints per day. In highly regulated industries, like consumer financial, you must respond to and resolve every single complaint in a timely manner. Complaints may contain early warning signs for systemic problems. Organizations must be able to detect and understand these critical insights to better manage risk. Addressing an issue before it becomes a news headline will help improve customer satisfaction and save millions in regulatory fines.
      What We Did:
      • Developed a Customer Complaints Analysis Application that helps the analyst quickly resolve complaints and deliver insights to business leaders
      • AI/ML used to prioritize complaints and prescribe resolution methods
      • Ability to detect emerging issues and drill down to determine root cause
      • Available on Teradata Analytics Platform and open source technologies (spark, python, nltk)
      What’s the Customer Value? By leveraging the Customer Complaints application, businesses gain a complete understanding of all emerging customer issues and are able to dramatically increase the effectiveness of the analyst.
      Proof (Large US Consumer Bank): Proven test cases on historical data to measure analytic effectiveness: Wells Fargo fake accounts, Citibank false introductory offer advertisement, Equifax hack.
      Key figures: 3 emerging issues; 3x productivity. Dramatic increase in time saved by identifying and solving global issues instead of resolving case by case. Recommended resolutions decrease redundant work to determine root cause.
  31. Transportation Model Optimization: Helping manufacturing improve on-time delivery and lower costs of their supply chain
      Business Challenge: Global manufacturing companies often struggle to ensure that their logistics network delivers the lowest cost possible. Today’s complex manufacturing distribution networks may include:
      • A network of contract manufacturing centers and storage facilities
      • Constantly changing freight rates, which make optimal mode selection challenging
      • A lowest-cost logistics solution that is difficult to achieve with manual calculations and analysis
      What We Did: Teradata was engaged to develop a detailed transportation optimization model and user interface tool. The optimization tool takes future forecasted volumes and determines the optimal transportation mode and routes to deliver the product to the customer. Capacities, capability constraints, and various business rules are all utilized to maximize results.
      • Optimization models that select the lowest-cost solutions for each material/customer pair
      • A User Interface Tool (UIT) that graphically presents results and “what-if” analysis of potential scenarios to aid in further reduction of total costs
      • Built using Teradata Database and open source R libraries
      What’s the Customer Value? The optimized network is estimated to save the facility over $6.3MM, or 10% of transportation costs. The UIT has further identified cost savings attributed to potential relocation of processing centers. The business users can now quickly assess and estimate changes to transportation costs and plan accordingly.
      Proof (Commodity Processing): $6.3M Transportation Cost Savings
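The idea on slide 31 of picking the lowest-cost option for each material/customer pair, and then re-running it for a "what-if" volume, can be sketched as follows. This is only an illustrative sketch: the slide says the real models were built with R libraries, whereas Python is used here; `ModeOption`, the lane names, rates, and capacities are all hypothetical; and the sketch ignores routes and shared capacities, which a full optimization model would have to handle.

```python
from typing import NamedTuple, Optional


class ModeOption(NamedTuple):
    mode: str             # e.g. "rail" or "truck" (hypothetical)
    cost_per_unit: float  # hypothetical freight rate
    capacity: int         # maximum units this mode can carry on the lane


def cheapest_feasible(options: list[ModeOption], volume: int) -> Optional[ModeOption]:
    """Return the lowest-cost mode that can carry the volume, or None."""
    feasible = [o for o in options if o.capacity >= volume]
    return min(feasible, key=lambda o: o.cost_per_unit) if feasible else None


def plan_cost(lanes: dict[str, list[ModeOption]], forecast: dict[str, int]) -> float:
    """Total cost of serving each forecast lane with its cheapest feasible mode."""
    total = 0.0
    for lane, volume in forecast.items():
        choice = cheapest_feasible(lanes[lane], volume)
        if choice is None:
            raise ValueError(f"no feasible mode for {lane}")
        total += choice.cost_per_unit * volume
    return total


# Hypothetical lanes (material/customer pairs) and their mode options.
lanes = {
    "plant1->customerA": [ModeOption("rail", 0.8, 500), ModeOption("truck", 1.2, 2000)],
    "plant1->customerB": [ModeOption("barge", 0.5, 100), ModeOption("truck", 1.0, 2000)],
}
baseline = plan_cost(lanes, {"plant1->customerA": 300, "plant1->customerB": 50})
what_if = plan_cost(lanes, {"plant1->customerA": 1000, "plant1->customerB": 50})
```

In the what-if scenario the larger volume no longer fits on rail, so the lane falls back to truck and the total cost rises; comparing `baseline` and `what_if` is the kind of question the UIT is described as answering.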
  32. Data Science Project Portfolio
  33. Cell Tower Coverage Optimization: Improving customer satisfaction via better call quality
      Business Challenge:
      • Cell tower and antenna position can have a dramatic impact on the call quality of a mobile service provider, despite having good coverage
      • Misconfigured towers can lead to dropped calls, poor customer experience, and customer churn.
      What We Did:
      • Developed a method for identifying undershooting cell towers that could be uptilted to improve cell quality
      • Combined geospatial and statistical techniques
      • Identified overshooting towers that have the highest negative impact on neighboring cells
      • Automated antenna tilt changes based on the algorithm
      • Built using Teradata and Tableau
      What’s the Customer Value?
      • New state-of-the-art algorithm to identify call quality issues for providers before their customers see them.
      • Significant improvement in Net Promoter Score via increased customer satisfaction and lower customer churn.
      Proof (Large Telecom Provider): Teradata Consulting expert team worked with a leading telecom provider to identify problem spots and improve cell service quality. In a single region, better service is estimated to provide $10M in savings.
      Key figures: $10M Savings. Figure labels: 6 deg: Contained; 4 deg: Overshooting
  34. Fraud Detection using Deep Learning: Adapting the best in Artificial Intelligence, Deep Learning, and GPU computing from Computer Vision into Financial Fraud
      Business Challenge:
      • In 2016, a large international bank set a strategic goal to use enterprise data insights and AI to help detect fraud in business transactions.
      • The bank’s ‘human-written’ rule engine was outdated: fraud detection rates were as low as 30% and non-fraud cases up around 99.5%, which was losing them millions a month
      What We Did:
      • Developed Machine Learning and Deep Learning algorithms to detect fraudulent transactions
      • Explored LSTM, auto-encoders, and ConvNets to find better fraud detection at scale.
      • Built using GPUs, CUDA, Python, and TensorFlow
      What’s the Customer Value?
      • State-of-the-art experimentation framework for testing and deploying new fraud detection models.
      • Lower false-positive rates mean lower review costs and better customer experiences
      • Higher detection rates mean lower loss rates.
      Proof (Large International Bank): Teradata Consulting expert team used deep learning in real time to accelerate fraud detection, reducing false positives by 50% and increasing the detection rate by 60%, thus saving the bank millions.
      Key figures: 50% False Positive; 60% Detection Rate
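To make the two headline figures on slide 34 concrete, here is a minimal sketch of how a detection rate and a false-positive rate are computed from labelled outcomes, and how a before/after change would be expressed. The counts are hypothetical and chosen only so the arithmetic is easy to follow; reading the slide's "50%" and "60%" as relative changes is an assumption, since the slide does not say.

```python
from dataclasses import dataclass


@dataclass
class ConfusionCounts:
    """Outcome counts from scoring labelled transactions (all hypothetical)."""
    true_pos: int   # fraud correctly flagged
    false_pos: int  # legitimate transactions wrongly flagged
    false_neg: int  # fraud that was missed
    true_neg: int   # legitimate transactions correctly passed


def detection_rate(c: ConfusionCounts) -> float:
    """Share of actual fraud that was flagged (also called recall)."""
    return c.true_pos / (c.true_pos + c.false_neg)


def false_positive_rate(c: ConfusionCounts) -> float:
    """Share of legitimate transactions that were wrongly flagged."""
    return c.false_pos / (c.false_pos + c.true_neg)


def relative_change(before: float, after: float) -> float:
    """Relative change from before to after; -0.5 means a 50% reduction."""
    return (after - before) / before


# Hypothetical rule-engine and model results on the same 10,000 transactions.
rules = ConfusionCounts(true_pos=30, false_pos=400, false_neg=70, true_neg=9500)
model = ConfusionCounts(true_pos=48, false_pos=200, false_neg=52, true_neg=9700)

detection_gain = relative_change(detection_rate(rules), detection_rate(model))
false_pos_drop = relative_change(false_positive_rate(rules), false_positive_rate(model))
```

With these made-up counts, the detection rate moves from 30% to 48% (a 60% relative increase) and the false-positive rate halves, which is one way numbers like those on the slide could arise.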
  35. Communications Compliance: Stopping Financial Crimes before they happen
      Business Challenge: In Consumer Financial and many other regulated industries, companies are required to monitor communications to pre-emptively resolve financial crimes or prevent misleading information from affecting customers. Failure to do this properly leads to millions of dollars in fines per year. The task of screening thousands of employee-to-customer communications can be monumental, requiring many intelligent resources and computational horsepower. The benefit of saving millions in fines and keeping your name out of the news headlines is very much worth the effort.
      What We Did:
      • Developed ML-based noise reduction techniques to clear the junk out of email and messaging, dramatically reducing false positives
      • Created an NLP-based workflow to predict risk and categorize messages correctly, with dramatic increases in performance over the current process
      • UI application delivered for compliance analysts to assist with case management and resolution of compliance violations
      • Available on Teradata Analytics Platform and open source technologies (spark, python, nltk)
      What’s the Customer Value? By leveraging the Communications Compliance IP accelerator, customers with regulatory burdens on communication can reduce millions of dollars in fines per year, dramatically increase the effectiveness of their staff, and promote a better customer experience.
      Proof (Large US Consumer Bank): Sophisticated noise reduction techniques and NLP/AI models dramatically reduce the false positives created by rules-based systems. Compliance analysts are given far fewer false positives and a much greater level of intelligence to make informed decisions regarding emerging threats to compliance.
      Key figures: 40x Fewer False Positives; 3x Productivity
  36. Predictive Asset Maintenance: Using machine learning and AI to prevent unplanned downtime, as well as optimize costs, scheduling, and resources
      Business Challenge:
      • You have frequent unplanned downtime impacting operations.
      • You are not sure which assets should have maintenance to avoid problems.
      • Non-routine maintenance impacts operations.
      • You are not sure how to schedule maintenance resources and to be proactive rather than reactive to incidents.
      What We Did:
      • Highly scalable asset maintenance and asset survivability models applied to both existing data sources and components as well as deployed IoT sensor data
      • Identify assets in most need of maintenance, integrated with repair parts identification, engineering, and repair scheduling, all on a single platform.
      • Models to assess the probability of failure, used to represent risk. Our risk numbers may be factored into a client’s composite risk model to generate a composite number.
      • Output of the model was a probability of failure, which was combined with impact of failure to represent risk and prioritize maintenance.
      • Built using Aster/Teradata Analytic Platform and open source Python libraries
      What’s the Customer Value? Reducing the number of outages by even a small percentage will result in large savings to the utility. Although pole failures are rare, they have huge impact. A 2007 fire, attributed to a downed pole caused by Santa Ana winds, cost the utility $351M (above insurance payments).
      Proof (Utility Client): $350M Cost Avoidance
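Slide 36 describes combining a model's probability of failure with the impact of failure to represent risk and prioritize maintenance. A minimal sketch of that step is shown below, assuming the simplest combination (probability times impact, that is, expected loss); the slide does not state the exact formula, and the `Asset` type, asset IDs, and numbers are hypothetical.

```python
from typing import NamedTuple


class Asset(NamedTuple):
    asset_id: str
    failure_probability: float  # model output, between 0 and 1
    failure_impact: float       # estimated cost if the asset fails


def risk(asset: Asset) -> float:
    """Expected loss: probability of failure times impact (an assumed formula)."""
    return asset.failure_probability * asset.failure_impact


def prioritize(assets: list[Asset]) -> list[Asset]:
    """Order assets so the highest-risk ones are maintained first."""
    return sorted(assets, key=risk, reverse=True)


# Hypothetical poles: a rare but very costly failure can outrank a likely cheap one.
poles = [
    Asset("pole-A", 0.20, 10_000.0),
    Asset("pole-B", 0.01, 1_000_000.0),
    Asset("pole-C", 0.05, 50_000.0),
]
ranked = prioritize(poles)
```

Here `pole-B` ranks first despite its 1% failure probability, which mirrors the slide's point that rare pole failures can still dominate the risk picture.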
  37. Leading-edge innovation in Artificial Intelligence and Deep Learning to address a 40-year-old data management problem
      Business Challenge: A typical enterprise spends 80% of the time integrating, wrangling, and managing data. In addition, 60% of data-driven projects are an exact replica of what has been built 5 times before. There is an opportunity to save tens of millions annually on DBA, ETL, data management, and other data-assimilation-related activities.
      What We Are Doing: Developing novel Deep Learning and Artificial Intelligence-driven data signature generation to automatically develop a data catalogue that enables:
      • Automated assimilation of new data sources
      • Automated identification of duplicate data or improper data usage
      • Automated recommendation of data for usage by analysts & data scientists for wrangling and analysis.
      What’s the Customer Value? Drives millions in OpEx reduction and reduces risk management exposure.
      • Efficiencies: Find and consolidate duplicate datasets and ETL jobs.
      • Compliance: Audit data usage, and enforce standards and best practices.
      • Faster onboarding: Find matching source datasets and recommend known transformation jobs.
      • Data Self-Service: Given a sample dataset, help business analysts and data scientists find the data they need without an ETL expert.
      Proof (In-Progress): Currently in a joint Research and Development project with a large financial institution. Data: Coming Soon
  38. Product Portfolio Bundle Targeting: AnalyticOps and insight-driven interfaces to improve product performance visibility and enable targeted interventions
      Business Challenge:
      • All businesses have a set of traditional problems including churn, revenue diminishment, new business, cross-sell and upsell.
      What We Are Doing:
      • Provide insights-driven dashboard to show problem groups and provide propensity score for at-risk customers.
      • Allow business executives to see at a glance where the high impact areas are through consolidated cohort groups.
      • Allow the segment attributes and features to speak for the specific
      What’s the Customer Value?
      • Fast time to identify problem product groups.
      • ~$1.5M yearly savings due to better visibility of at-risk customers for targeted intervention.
      Proof: 25% more effective targeting of at-risk customers vs. traditional solutions. 125%
  39. Product Recommendation: Improving customer personalization using analytics on product purchasing behavior
      Business Challenge: Retailers have billions of transaction details about their customers. The ability to deliver relevant personalized offers at scale could lead to significant increases in revenue. Retailers also have to account for item seasonality, different customer segmentations, and too many products to effectively analyze and market.
      What We Did:
      • Developed product and customer affinity recommendation models using collaborative filtering to identify most frequently purchased items
      • Most recommended products are offered to customers with similar shopping behavior
      • Compatible with Teradata Analytic Platform or R in Database on Teradata
      What’s the Customer Value? By leveraging the product recommendation solution, businesses are able to make more personalized offers from better recommendations. This leads to increased revenue from more completed offers.
      Proof (Large Retailer): If a large retailer sends an average of ½ billion offers per month, then a $0.01 improvement over 1 billion offers is worth $10 million.
      Key figures: $10M Additional Revenue
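Slide 39 names collaborative filtering but gives no algorithmic detail. One very simple form of the idea, shown here purely as an illustrative sketch, is item co-occurrence: recommend the items most often bought together with what a customer already has. The function names and basket data are hypothetical, and production systems typically use more elaborate similarity or factorization models.

```python
from collections import Counter, defaultdict
from itertools import permutations


def build_cooccurrence(baskets: list[set[str]]) -> dict[str, Counter]:
    """Count how often each ordered pair of items appears in the same basket."""
    co: dict[str, Counter] = defaultdict(Counter)
    for basket in baskets:
        for a, b in permutations(basket, 2):
            co[a][b] += 1
    return co


def recommend(owned: set[str], co: dict[str, Counter], top_n: int = 3) -> list[str]:
    """Rank items not yet owned by how often they co-occur with owned items."""
    scores: Counter = Counter()
    for item in owned:
        for other, count in co.get(item, {}).items():
            if other not in owned:
                scores[other] += count
    # Sort by score (descending), then by name so ties are deterministic.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [item for item, _ in ranked[:top_n]]


# Hypothetical purchase baskets.
baskets = [
    {"laptop", "mouse", "bag"},
    {"laptop", "mouse"},
    {"laptop", "dock"},
    {"tablet", "case"},
]
co = build_cooccurrence(baskets)
suggestions = recommend({"laptop"}, co, top_n=2)
```

For a customer who owns only a laptop, `mouse` scores highest because it appears with `laptop` in two baskets, followed by `bag` on the alphabetical tie-break.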
  40. Hard Disk Failure Prediction: Improving customer experience by predicting imminent disk failure
      Business Challenge: Hard disk failures can have a significant productivity impact and in some cases be catastrophic for customers’ businesses or personal memories. Predicting imminent hard disk failure allows manufacturers to ship a new hard disk to the customer before it fails, and allows customers to back up their data and minimize disruption.
      What We Did: Developed a data aggregation and modeling system that:
      • Collected 100s of time series of hard disk sensor and system data daily from 1000s of laptops and desktops.
      • Used Correlation, PCA, Symbolic Aggregate Approximation (SAX), and Naïve Bayes to model and predict likelihood of disk failure.
      What’s the Customer Value? Hard disk manufacturers can identify underlying issues with their products as they ramp up production early in the scale-out phase. Proactively offering customer service creates a high-value personalized experience that translates into deeper brand loyalty.
      Proof (Large PC Manufacturer): Developed as an integrated solution for one of the leading global PC manufacturers to predict disk failure 1 week out with an F1 score of 70%. This allowed the PC manufacturer to offer premium services for their high-value customers and develop a better brand experience.
      Key figures: Predict disk failure 1 week ahead
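Symbolic Aggregate Approximation (SAX), one of the techniques listed on slide 40, turns a numeric time series into a short string that a classifier such as Naïve Bayes can consume. The sketch below follows the standard SAX recipe (z-normalize, average over equal-width segments, map each average to a letter using equiprobable Gaussian breakpoints). It is not taken from the presentation; the function name, alphabet, and sample readings are hypothetical, and it assumes the series length is a multiple of the number of segments.

```python
from statistics import NormalDist, mean, pstdev


def sax(series: list[float], segments: int, alphabet: str = "abcd") -> str:
    """Convert a numeric series into a SAX word with one letter per segment."""
    if len(series) % segments != 0:
        raise ValueError("series length must be a multiple of segments")
    mu, sigma = mean(series), pstdev(series)
    if sigma == 0:
        # A flat series has no shape; map every segment to a middle symbol.
        return alphabet[len(alphabet) // 2] * segments
    z = [(x - mu) / sigma for x in series]

    # Piecewise Aggregate Approximation: mean of each equal-width segment.
    width = len(z) // segments
    paa = [mean(z[i:i + width]) for i in range(0, len(z), width)]

    # Breakpoints split the standard normal into equally probable regions.
    k = len(alphabet)
    breakpoints = [NormalDist().inv_cdf(i / k) for i in range(1, k)]

    # A segment's letter is the index of the region its mean falls into.
    return "".join(alphabet[sum(v > b for b in breakpoints)] for v in paa)


# Hypothetical daily sensor readings from one disk.
rising = sax([0, 1, 2, 3, 4, 5, 6, 7], segments=4)
step = sax([1, 1, 2, 2, 8, 8, 9, 9], segments=4)
```

A steadily rising series becomes `abcd`, while a series that jumps halfway through becomes `aadd`; words like these could then be counted as features for a downstream model.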
  41. Identify Manufacturing Issues: Combining Natural Language Processing and Graph Analytics for early detection of product issues
      Business Challenge: Manufacturing issues that make their way to shipped products can have a disastrous impact on the brand. Identifying emerging issues early enough allows manufacturers to take corrective actions to reduce recalls and warranty claims and recognize higher margins. The challenge is getting ahead of the public perception before it damages the brand.
      What We Did: We used Natural Language Processing (NLP) and topic modeling on customer support notes from customer complaints to identify systematic emerging patterns. By connecting the issues with the Bill of Materials using Graph Analytics (Npath, Ntree, etc.), we were able to identify the manufacturer and plant that was causing the faults.
      What’s the Customer Value? By identifying product issues early in the product life cycle, manufacturers can reduce the negative impact on the brand, as well as reduce the cost of recalls by taking corrective actions on their production line.
      Proof (US Phone Manufacturer): We engaged with a leading US mobile phone manufacturer to analyze freeform customer support notes for emerging issues for their latest product launch. We identified 4 critical defects that impacted part of their inventory before they were seen in consumer or media reports. The manufacturer was able to identify the root cause, shut off the faulty production line, and protect the brand.
      Key figures: Millions saved in recalls
  42. Thank you.