Building Modern Data Platform with Microsoft Azure

This presentation will cover Cloud history and Microsoft Azure Data Analytics capabilities. Moreover, it has a real-world example of DW modernization. Finally, we will check the alternative solution on Azure using Snowflake and Matillion ETL.


  1. 1. Building Modern Cloud Analytics Solution Dmitry Anoshin
  2. 2. Outline • About Me • Role of Analytics • History of Cloud • Analytics powered by Microsoft Azure • DW modernization Project • Use cases and Challenges • Alternative Solution with Azure
  3. 3. About Myself
  4. 4. About Myself • Working with Business Intelligence since 2007
  5. 5. #dimaworkplace
  6. 6. Technical Skills Matrix (2007, 2010, 2015, 2019) • Data Warehouse • ETL/ELT • Business Intelligence • Big Data • Cloud Analytics (AWS, Azure, GCP) • Machine Learning
  7. 7. Other Activities • Jumpstart Snowflake: A Step-by-Step Guide to Modern Cloud Analytics • Victoria Power BI and Victoria SQL Server meetup • Victoria and Vancouver Tableau User Group • Conferences (EDW 2018, 2019, Data Architecture Summit) • Amazon internal conferences
  8. 8. Role of Analytics
  9. 9. Business Value • Stakeholders • Employees • Customers • Value • "The goal of any organization is to generate value." The Future of Competition. https://www.amazon.com/Future-Competition-Co-Creating-Unique-Customers/dp/1578519535
  10. 10. BI Value Chain • Stakeholders • Employees • Customers • Value • Decisions • Data • Value creation is based on effective decisions • Effective decisions are based on accurate information
  11. 11. For Data to be a differentiator, customers need to be able to… • Capture and store new non-relational data at PB-EB scale in real time • Discover value in new types of analytics that go beyond batch reporting to incorporate real-time, predictive, voice, and image recognition • Democratize access to data in a secure and governed way (Diagram labels: New types of analytics: Dashboards, Predictive, Image Recognition, Voice, Real-time; New types of data)
  12. 12. Cloud Analytics Introduction
  13. 13. Cloud Early History • 1970: Time-sharing concept by GE • 1977: Cloud symbol used in ARPANET • 1990: VPN by telecom • 1993: Cloud refers to distributed computing • 1994: Cloud metaphor for virtualized services
  14. 14. Cloud Recent History • 2002: AWS • 2006: AWS Elastic Compute Cloud • 2006: Google Docs • 2008: Google App Engine • 2008: Microsoft announced Azure • 2010: Microsoft Azure
  15. 15. Why move to the Cloud? • Elasticity • Pay for what you need • Fail fast • Fast time to market • Secure • Reliable • Business SLA
  16. 16. Downsides of an on-premises solution • Scale constrained • Up-front cost • Maintenance • Resources • Tuning and deployment
  17. 17. Cloud Restrictions -> Hybrid Clouds • Sensitive data • Data moving cost • Public/private cloud
  18. 18. Cloud Service Models
  19. 19. Cloud Service Models – friendly version
  20. 20. Cloud Analytics with Microsoft Azure
  21. 21. Microsoft Azure for Analytics
  22. 22. Data Analytics with Azure • Data Integration and Transformation: Data Factory, Integration Service, Kafka, Event Hubs • Big Data: Data Lake Gen 1, Data Lake Gen 2, Blob Storage, HDInsight, Data Lake Analytics, Stream Analytics, PolyBase, Cosmos DB • Data Warehouse and Databases: SQL DW, Analysis Services, SQL Database, SQL Server in VM, Cosmos DB • Analytics: Analysis Services, ML Analytics, Business Intelligence
  23. 23. DW Modernization Use Case
  24. 24. BI/DW (before) (Diagram: Source Layer with Files, Inventory, Sales; SFTP; ETL (PL/SQL); Storage Layer with Data Warehouse; Access Layer with Ad-hoc SQL)
  25. 25. Cloud Migration Strategy • Lift & Shift: typical approach; move all at once; target platform, then evolve; gets you to the cloud quickly; relatively small barrier to learning new technology, since it tends to be a close fit • Split & Flip: split the application into logical functional data layers; match the data functionality with the right technology; leverage the wide selection of tools on AWS to best fit the need; move data in phases (prototype, learn, and perfect)
  26. 26. Migration Approach Useful tools: • Total Cost of Ownership (TCO) Calculator • Azure Database Migration Service • Azure Migration Assistant
  27. 27. Cloud Data Warehouse
  28. 28. What is Azure DW? • Decoupled storage and compute • MPP • Distribution styles: Hash/Round-Robin/Replicate
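
The three distribution styles named on slide 28 can be illustrated with a small sketch. This is a toy model only, not how the engine is implemented: the function names are hypothetical, and `zlib.crc32` stands in for the service's internal hash function. The figure of 60 distributions per table matches the documented behavior of Azure SQL Data Warehouse.

```python
import zlib

NUM_DISTRIBUTIONS = 60  # Azure SQL DW spreads each table over 60 distributions


def hash_distribute(rows, key):
    """Place each row in the distribution chosen by a hash of its key column."""
    placement = {d: [] for d in range(NUM_DISTRIBUTIONS)}
    for row in rows:
        # crc32 is an assumption: a deterministic stand-in for the engine's hash.
        d = zlib.crc32(str(row[key]).encode("utf-8")) % NUM_DISTRIBUTIONS
        placement[d].append(row)
    return placement


def round_robin_distribute(rows):
    """Deal rows out evenly, ignoring their content."""
    placement = {d: [] for d in range(NUM_DISTRIBUTIONS)}
    for i, row in enumerate(rows):
        placement[i % NUM_DISTRIBUTIONS].append(row)
    return placement


def replicate(rows, num_nodes):
    """Keep a full copy of the table on every compute node."""
    return {node: list(rows) for node in range(num_nodes)}
```

With hash distribution, rows that share a key value always land together, which helps joins and aggregations on that key. Round-robin gives an even spread and suits staging tables, and replication suits small dimension tables.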
  29. 29. MPP?
  30. 30. SQL Database vs SQL Data Warehouse
  31. 31. What is Azure Data Factory? Azure Data Factory (ADF) is Microsoft’s fully managed ELT service in the cloud that’s delivered as a Platform as a Service (PaaS)
  32. 32. Lack of Notification Problem: Users miss emails, or the emails land in spam. Solution: Send notifications to a messenger through webhooks (Slack, Chime, and so on).
  33. 33. Lack of Logging Problem: We didn’t have any detailed logs about our ETL performance, so we had no insight into it. Solution: Collect logs and events. In addition, we can collect logs at any level of jobs and transformations.
  34. 34. Self-Service BI Problem: Business users want an interactive, self-service tool with fast time to market and less dependency on IT. Solution: Implement a modern visual analytics platform.
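
A webhook notification such as the one slide 32 describes could look roughly like the sketch below. It assumes a Slack-style incoming webhook that accepts a JSON body with a `text` field. The function names, job name, and message wording are hypothetical, and other messengers may expect a different body.

```python
import json
import urllib.request


def build_payload(job_name, status, details=""):
    """Build a Slack-style message body for an ETL job result."""
    text = f"ETL job '{job_name}' finished with status {status}."
    if details:
        text += f" {details}"
    return {"text": text}


def send_notification(webhook_url, payload, timeout=10):
    """POST the payload as JSON to the webhook and return the HTTP status."""
    request = urllib.request.Request(
        webhook_url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status
```

An orchestration tool would call `send_notification` from a failure (or success) branch of the pipeline, passing the webhook URL from a secret store rather than hard-coding it.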
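
One possible shape for the per-step logging mentioned on slide 33 is a context manager that times each step and emits one structured event. This is a sketch under assumptions: the event fields, the `logged_step` name, and the optional `sink` list are illustrative and do not come from the presentation.

```python
import json
import logging
import time
from contextlib import contextmanager

logger = logging.getLogger("etl")


@contextmanager
def logged_step(job, step, sink=None):
    """Time one ETL step and emit a JSON event when it ends."""
    event = {"job": job, "step": step, "status": "succeeded"}
    start = time.perf_counter()
    try:
        yield event  # the step may attach extra fields such as row counts
    except Exception as exc:
        event["status"] = "failed"
        event["error"] = repr(exc)
        raise  # the failure still propagates to the caller
    finally:
        event["duration_s"] = round(time.perf_counter() - start, 3)
        logger.info(json.dumps(event))
        if sink is not None:
            sink.append(event)  # hypothetical in-memory sink for inspection
```

Wrapping each job and each transformation in `logged_step` gives events at both levels, which can then be shipped to whatever log store the platform uses.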
  35. 35. Marketing Automation Problem: The Marketing team wants to “Move Fast and Break Things”. Solution: Using ADF, we gave Marketing template jobs, and they now run their jobs themselves. (Diagram label: Affiliates Insights)
  36. 36. Integration with BI Problem: Having the best BI tool doesn’t guarantee a good SLA. Solution: Build a trigger-based integration between Matillion ETL and Tableau. Add data quality checks.
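
The idea on slide 36 of running data quality checks before the BI refresh fires can be sketched as a simple gate. Everything here is illustrative: the check functions, the `trigger_refresh` callback, and the row format are assumptions, and no Matillion or Tableau API is shown.

```python
def check_row_count(rows, minimum):
    """Pass when the load produced at least `minimum` rows."""
    return len(rows) >= minimum, f"row count {len(rows)} (minimum {minimum})"


def check_not_null(rows, column):
    """Pass when no row is missing a value in `column`."""
    missing = sum(1 for row in rows if row.get(column) is None)
    return missing == 0, f"{missing} null values in '{column}'"


def run_quality_gate(rows, checks):
    """Run every check and return the messages of the ones that failed."""
    failures = []
    for check in checks:
        ok, message = check(rows)
        if not ok:
            failures.append(message)
    return failures


def refresh_if_clean(rows, checks, trigger_refresh):
    """Call the refresh callback only when all checks pass."""
    failures = run_quality_gate(rows, checks)
    if not failures:
        trigger_refresh()
    return failures
```

Gating the refresh this way means dashboards keep showing the last good data set instead of a partial or broken load.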
  37. 37. Evolving to Cloud Data Analytics Platform
  38. 38. Streaming Data Problem: The organization uses a NoSQL database and a mobile application. It is critical to deliver near-real-time analytics. Solution: Using Apache Kafka, we are able to stream data into the data lake and query this data in near real time. (Diagram: Mobile App, Cosmos DB, Kafka, Data Lake, Dashboard)
  39. 39. Clickstream Analytics Problem: The business wants to analyze bot traffic and discover broken URLs. Access logs are ~50 GB per day, in 5,600 log files per day. Solution: Leverage Databricks to produce Parquet files and store them in Azure Data Lake Gen2. Users are able to query them with T-SQL and BI tools. (Diagram: Load Balancer, Access Logs, Blob Storage, Databricks, Parquet, Data Lake, Data Factory, SQL DW; query with SQL or Databricks)
  40. 40. DevOps onboarding Problem: The solution isn’t reliable and can easily break. As a result, end users have a bad experience, which affects business decisions. Solution: Onboard Continuous Integration methodology for the Cloud Data Platform • Agile and Kanban board • Code branching (Git) • Gated check-ins • Automated tests • Build • Release
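
The kind of analysis slide 39 describes (bot traffic and broken URLs) can be sketched on a small scale in plain Python. This is not the Databricks job from the project: it assumes the access logs follow the common "combined" web log format, and the bot markers and function names are hypothetical.

```python
import re
from collections import Counter

# Assumption: logs use the "combined" access log format.
LOG_PATTERN = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<url>\S+) [^"]*" '
    r'(?P<status>\d{3}) \S+ "[^"]*" "(?P<agent>[^"]*)"'
)
BOT_MARKERS = ("bot", "crawler", "spider")  # illustrative, not exhaustive


def parse_line(line):
    """Turn one log line into a dict, or None if it does not match."""
    match = LOG_PATTERN.match(line)
    if match is None:
        return None
    record = match.groupdict()
    record["status"] = int(record["status"])
    agent = record["agent"].lower()
    record["is_bot"] = any(marker in agent for marker in BOT_MARKERS)
    return record


def summarize(lines):
    """Count hits per bot user agent and 404 responses per URL."""
    bot_hits = Counter()
    broken_urls = Counter()
    for line in lines:
        record = parse_line(line)
        if record is None:
            continue  # skip lines that are not in the expected format
        if record["is_bot"]:
            bot_hits[record["agent"]] += 1
        if record["status"] == 404:
            broken_urls[record["url"]] += 1
    return bot_hits, broken_urls
```

At the volumes quoted on the slide, the same parse-then-aggregate logic would run in a distributed engine and write its output as Parquet, but the per-line parsing step stays essentially the same.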
  41. 41. Evolving to Cloud Data Analytics Platform
  42. 42. Alternative Implementation
  43. 43. What is Matillion ETL?
  44. 44. What is Snowflake?

Editor's notes

  • The cloud symbol was used to represent networks of computing equipment in the original ARPANET as early as 1977.

    The term cloud was used to refer to platforms for distributed computing as early as 1993, when Apple spin-off General Magic and AT&T used it in describing their (paired) Telescript and PersonaLink technologies.