Data Intelligence: How the Amalgamation of Data, Science, and Technology is Changing the Way We Do Business

  1. Data Intelligence: How the Amalgamation of Data, Science, and Technology is Changing the Way We Do Business. January 22, 2017. Presented by: Joe Caserta
  2. Caserta Timeline: Launched Big Data practice; Co-author, with Ralph Kimball, The Data Warehouse ETL Toolkit (Wiley); Data Analysis, Data Warehousing and Business Intelligence since 1996; Began consulting database programming and data modeling; 25+ years hands-on experience building database solutions; Founded Caserta Concepts, LLC in NYC; Web log analytics solution published in Intelligent Enterprise magazine; Launched Data Science, Data Interaction and Cloud practices; Laser focus on extending Data Analytics with Big Data solutions; Dedicated to Data Governance Techniques on Big Data (Innovation); Awarded Top 20 Big Data Companies; Top 20 Most Powerful Big Data consulting firms; Launched Big Data Warehousing (BDW) Meetup NYC: 4,000 Members; Top 20 Most Admired Tech Leaders in Business; Established best practices for big data ecosystem implementations; Caserta Innovation Lab invents Blockchain, AI, AR Solutions. Years shown on the timeline: 1986, 1996, 2001, 2004, 2009, 2012, 2013, 2016, 2017
  3. About Caserta: Data Intelligence and Strategic Consulting; Data Lakes, Data Laboratories, Data Warehouses; Award-winning company for Data Innovation; Data Science, Machine Learning, Artificial Intelligence; Internationally recognized work force; Best Practices, Authors, Educators, Mentors; Strategy, Governance, Architecture, Implementation
  4. Our Clients: Retail/eCommerce & Manufacturing; Finance, Healthcare, Energy & Insurance; Digital Media/AdTech; Education & Services
  5. Evolution of Analytics (Source: Gartner). Axes: Data Analytics Sophistication vs. Business Value. Stages: Reports (What happened?) > Correlations (Why did it happen?) > Predictions (What will happen?) > Recommendations (How can we make it happen?) > Artificial Intelligence (How to interact with the customer?)
  6. Why is Data so Important? Timeline: 1500s Printing Press; 1840s Penny Post; 1850s Telegraph; 1850s Rural Free Post; 1890s Telephone; 1900s Radio; 1950s TV; 1970s PCs; 1980s Internet; 1990s Web; 2000s Social Media, Mobile, Big Data, Cloud. Every 60 seconds: 98,000+ tweets; 695,000 status updates; 11 million instant messages; 698,445 Google searches; 168 million+ emails sent; 1,829 TB of data created; 217 new mobile web users
  7. Data Analytics is your Differentiator Acquiring, analyzing and acting on data with a focus on speed to action
  8. Artificial Intelligence: “AI is one of the most important things that humanity is working on. It’s more profound than electricity or fire.” - Sundar Pichai, CEO, Google
  9. The Customer Journey PR Radio TV Print Outdoor Word of Mouth Direct Mail Customer Service Physical Touchpoints Digital Touchpoints Search Paid Content email Website/ Landing Pages Social Media Community Chat Social Media Call Center Offers Mailings Survey Loyalty Programs email Agents Partners Ads Website Mobile 3rd Party Sites Offers Web self-service
  10. Learning the Path-to-Purchase (example touchpoints: Ad-Click, Mailing, E-mail)
      Attribution type: Single Touch
        Description: Assign the credit to the first or last exposure
        Example credit split: 100% to a single touchpoint
        Comments: Last touch only; ignores bulk of customer journey; undervalues other interactions and influencers
      Attribution type: Rules-Based
        Description: Assign the credit to each interaction based on business rules
        Example credit split: 33% / 33% / 33%
        Comments: Subjective; assigns arbitrary values to each interaction; lacks analytics rigor to determine weights
      Attribution type: Statistically Driven
        Description: Assign the credit to interactions based on a data-driven model
        Example credit split: 27% / 49% / 24%
        Comments: Looks at full behavior patterns; considers all touch points; can apply different models for best results; uses data to find correlations between touch points (winning combinations)
  11. Data Science in Practice Source:
  12. Data Science for the Enterprise. CRISP-DM: Cross Industry Standard Process for Data Mining
      1. Business Understanding • Solve a single business problem
      2. Data Understanding • Discovery • Data Munging • Cleansing Requirements
      3. Data Preparation • ETL
      4. Modeling • Evaluate various models • Iterative experimentation
      5. Evaluation • Does the model achieve business objectives?
      6. Deployment • PMML; application integration; data platform; Excel
      Diagram: Business Understanding, Data Understanding, Data Preparation, Modeling, Evaluation and Deployment, arranged around the Data
  13. Governing Data Innovation
  14. Data Analytics Innovation Ecosystem. Stages: Data Sources, Ingest, Storage, ETL, Presentation, Visualization. Relational Datasets: • OPRA • Equifax • CDS • Moody’s • BlackBox. Flat File Datasets: • Barclay • Eureka • Hedge Fund Intelligence • Hedge Fund Research • Lipper • Morningstar • MF Holdings • BD/ADV. Streaming Data Sets: • CAT. Ingest: S/FTP Push, Kinesis. Storage (S3): Landing, Data Lake (Tier 1), Data Lake (Tier 2), Data Science (Ephemeral), Redshift. ETL: Spark (Streaming* / Batch), Lambda • Clean • Match • Derive • Aggregate • MLlib • CoreNLP • Prepare • Deliver. Data Science: • Python • SQL • Scala • Predictive Analytics • Text Analytics • Business Intelligence. Structured Data: Redshift. Metadata Repository • Data Marketplace. Other sources: SAP, Oracle Financials, Marketing, Relational DBs, Salesforce, Workday, RESTful APIs, Cloud DBs, Bloomberg, Capital IQ, FactSet, Quandl, Alternative Data, Web logs, IoT, Streaming Data
  15. Data Quality & Monitoring • Build a robust data quality subsystem: • Metadata and error event facts • Orchestration • Based on Data Warehouse ETL Toolkit • Each error instance of each data quality check is captured • Implemented as sub-system after ingestion • Each fact stores unique identifier of the defective source row
  16. Change Management (forces acting on the Status Quo)
      Forces for Change: Global economics; Intensity of competition; Reduce costs; Move to cross-functional teams; New executive leadership; Social trends and changes; Speed of technical change
      Forces Resisting Change: Period of time in present role; Status & perks of office/dept under threat; No apparent reasons for proposed changes; Lack of understanding of proposed changes; Fear of inability to cope with new technology; Concern over job security
  17. Agile Data Organization
  18. Cloud Platform Components (Cloud Component: AWS; Google; Microsoft)
      Scalable distributed storage: S3; GCS; Azure Storage
      Pluggable fit-for-purpose processing: EMR; DataProc; HDInsight
      Compute Services: EC2; GCE; VMs
      Consistent extensible framework: Spark; Spark; Spark
      Dimensional MPP Data Warehouse: Redshift/Snowflake; BigQuery; Azure SQL Data Warehouse
      Data Streaming: Kinesis; PubSub; Azure Stream
      Common Interface: Jupyter; DataLab; Azure Notebook
      Machine Learning: SageMaker; TensorFlow; Azure ML
  19. Customer Journey Dashboard
  20. What the Future Holds • DevOps for Analytics • Search-Based BI (NLP) • Artificial Intelligence (AI) • Virtual Reality BI (VR) • Virtual Assistant BI (Voice) • Reporting/Predictions Converge • Citizen Data Scientists Emerge
  21. Thank You @Joe_Caserta Joe Caserta President, Caserta Concepts
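
The single-touch and rules-based attribution types on slide 10 can be illustrated with a small Python sketch. This is only an illustration under stated assumptions, not the presenter's implementation: the function names `single_touch` and `rules_based`, the touchpoint order in the example path, and the default of equal weights are all hypothetical. The statistically driven type would replace the hand-set weights with weights estimated from conversion data, which is not attempted here.

```python
from typing import Dict, List, Optional


def single_touch(path: List[str], position: str = "last") -> Dict[str, float]:
    """Give 100% of the credit to the first or last touchpoint in the path."""
    if not path:
        return {}
    winner = path[-1] if position == "last" else path[0]
    return {winner: 1.0}


def rules_based(path: List[str],
                weights: Optional[Dict[str, float]] = None) -> Dict[str, float]:
    """Split credit across every touchpoint using business-rule weights.

    Touchpoints missing from `weights` get a weight of 1.0, so calling this
    without weights yields an equal split. Credits are normalized to sum to 1.
    """
    if not path:
        return {}
    raw = [(weights or {}).get(touchpoint, 1.0) for touchpoint in path]
    total = sum(raw)
    if total <= 0:
        raise ValueError("weights must sum to a positive number")
    credit: Dict[str, float] = {}
    for touchpoint, weight in zip(path, raw):
        # A touchpoint that appears twice in the path accumulates credit.
        credit[touchpoint] = credit.get(touchpoint, 0.0) + weight / total
    return credit


# Hypothetical journey; the order of touchpoints is an assumption.
journey = ["Ad-Click", "Mailing", "E-mail"]
last_touch_credit = single_touch(journey)   # all credit to the final touchpoint
equal_split_credit = rules_based(journey)   # one third each
```

Passing a `weights` dictionary to `rules_based` reproduces any fixed business rule; the subjectivity the slide criticizes lives entirely in how those weights are chosen.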

Editor's notes

  1. Reports: 1970s to 1990s (20 years); Correlations (DW): 1990s to 2000; Predictions (data mining): 2005; Recommendations (ML): 2007; Artificial Intelligence: 2017
  2. In August 2001, robots beat humans in a simulated financial trading competition. AI has reduced fraud and financial crimes by monitoring behavioral patterns of users for abnormal changes or anomalies.
  3. Teaching half-day class on this at the Data Summit in Boston in May
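
The data quality subsystem described on slide 15 (each failed check recorded as an error event fact that stores the identifier of the defective source row) might look roughly like the following Python sketch. Everything named here is an assumption for illustration: the `Screen` and `ErrorEvent` classes, the `run_screens` function, the `id` column and the sample rows are hypothetical and are not taken from the slides or from The Data Warehouse ETL Toolkit.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Any, Callable, Dict, Iterable, List

Row = Dict[str, Any]


@dataclass(frozen=True)
class Screen:
    """One data quality check; `check` returns True when the row passes."""
    name: str
    check: Callable[[Row], bool]


@dataclass(frozen=True)
class ErrorEvent:
    """One error event fact: which check failed, on which source row, and when."""
    screen_name: str
    source_row_id: Any
    detected_at: datetime


def run_screens(rows: Iterable[Row], screens: List[Screen],
                id_column: str = "id") -> List[ErrorEvent]:
    """Apply every screen to every row and record one event per failure."""
    events: List[ErrorEvent] = []
    for row in rows:
        for screen in screens:
            if not screen.check(row):
                events.append(ErrorEvent(
                    screen_name=screen.name,
                    source_row_id=row[id_column],
                    detected_at=datetime.now(timezone.utc),
                ))
    return events


# Hypothetical rows as they might look right after ingestion.
rows = [
    {"id": 1, "price": 10.5},
    {"id": 2, "price": None},
    {"id": 3, "price": -4.0},
]
screens = [
    Screen("price_not_null", lambda r: r["price"] is not None),
    Screen("price_non_negative", lambda r: r["price"] is None or r["price"] >= 0),
]
events = run_screens(rows, screens)
failures = [(e.screen_name, e.source_row_id) for e in events]
```

Keeping the events in their own structure, separate from the ingested rows, is what lets the checks run as a sub-system after ingestion without blocking the load.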