Taming the shrew Power BI

  1. 1. Taming of the Shrew: Tricks to Optimizing Power BI
      Kellyn Pot’Vin-Gorman, Data Platform Architect, Power BI and AI
  2. 2. Kellyn Pot’Vin-Gorman
      Technical Solution Professional at Microsoft, Data Platform in Power BI and AI
      • Former Technical Intelligence Manager, Delphix
      • Multi-platform DBA (Oracle, MSSQL, MySQL, Sybase, PostgreSQL, Informix…)
      • Oracle ACE Director (Alumni)
      • Oak Table Network Member
      • Idera ACE Alumni 2018
      • STEM education with Raspberry Pi and Python, including DevOxx4Kids, Oracle Education Foundation and TechGirls
      • Former President, Rocky Mtn Oracle User Group
      • Current President, Denver SQL Server User Group
      • Linux and DevOps author, instructor and presenter
      • Blogger, @DBAKevlar
  3. 3. Gaining just 10% more access to data can result in over $65 million in revenue
  4. 4. Our User Story
      • User needs updated information from a Power BI report.
      • User chooses to refresh the report.
      • User gets in the car to get a cup of coffee in the next town while waiting for the refresh.
  5. 5. Finding All the Fish in the Ocean (Power BI landscape)
      • Relational data: Oracle, SQL Server, Teradata, Salesforce
      • Cloud data: Azure, AWS, Google
      • Big data: Data Lake, Hadoop, Hortonworks
      • Other data: Excel, Access, SharePoint, etc.
      • Model & serve: Azure Analysis Services, Azure SQL Data Warehouse, Power BI
      • Also shown: SQL Server Integration Services, Data Factory
  6. 6. Power BI is Guilty Until Proven Innocent
  7. 7. Finding All the External Latency (Power BI landscape)
      • Relational data: Oracle, SQL Server, Teradata, Salesforce
      • Cloud data: Azure, AWS, Google
      • Big data: HDInsight, Data Lake, Hortonworks
      • Other data: Excel, Access, SharePoint, etc.
      • Model & serve: Azure Analysis Services, Azure SQL Data Warehouse, Power BI
      • Also shown: SQL Server Integration Services, Data Factory
  8. 8. Optimization Exercise Process
      • Data sources: identify by type and bring in expertise for each; bring wait times to the data specialist.
      • Network layer, once data sources are verified as a non-issue: bring data to the network specialist.
      • Power BI layer, once the network is verified as a non-issue: review the data model, data sets, resources, concurrency, and visuals and dashboards; the data modeler addresses what is found.
      • Repeat and verify resolved.
  10. 10. Why Time Should Be Your Main Focus for Optimization
      • It is a scientific approach to optimization.
      • Optimizing on cost or assumptions does not guarantee results.
      • It removes finger pointing and the “Blame Game.”
      • It simplifies the process of identifying real latency.
      • When time is addressed, long-term resolution is often experienced.
  11. 11. DATA SOURCES
  12. 12. Steps for Optimizing Data Sources
      • Data sources can be relational databases, big data, CSV/Excel, or structured/unstructured data files.
      • If onsite or remote specialists are available, partner with them to gather distinct data and identify waits and patterns.
      • Along with execution plans, tracing can help identify deeper, multi-tier issues that traditional performance tools don’t reveal.
      • Infrastructure tools, cloud monitoring tools and tracing can also provide more information than traditional tools.
  13. 13. NETWORK
  14. 14. The Network – The Final Bottleneck
      Diagram elements: on-premises data sources, cloud data sources, SQL DB Managed Instance, SQL Server, Microsoft SQL Server Integration Services, VNET, data, user, Power BI.
      The firewall is our best friend and worst enemy.
  15. 15. NETWORK
      • Networks are still largely limited by “Shannon’s Law”: c = b log2(1 + s), where c is the channel capacity, b is the bandwidth and s is the signal-to-noise ratio.
      • Filter data to avoid creating bottlenecks on the network.
      • Become friends with the network admin to isolate issues with firewalls and network bottlenecks.
      • Consider how often refreshes are performed, and where the data is sent from and to.
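
The formula on this slide can be tried out with a few lines of Python. This is a minimal sketch, not part of the original deck: the function names and the example link (1 MHz of bandwidth, a linear signal-to-noise ratio of 15) are made up for illustration.

```python
import math


def channel_capacity_bps(bandwidth_hz: float, snr_linear: float) -> float:
    """Upper bound on throughput, c = b * log2(1 + s), in bits per second."""
    if bandwidth_hz < 0 or snr_linear < 0:
        raise ValueError("bandwidth and signal-to-noise ratio must be non-negative")
    return bandwidth_hz * math.log2(1 + snr_linear)


def snr_db_to_linear(snr_db: float) -> float:
    """Convert a signal-to-noise ratio given in decibels to a linear ratio."""
    return 10 ** (snr_db / 10)


# Hypothetical link: 1 MHz of bandwidth, signal-to-noise ratio of 15 (linear).
capacity = channel_capacity_bps(1_000_000, 15)
print(f"Capacity bound: {capacity:,.0f} bits per second")
```

The point for Power BI refreshes is that the bound depends on the link, not on the report: once a refresh moves more data than the link can carry in the time available, only filtering or moving less data helps.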
  16. 16. POWER BI LAYER
  17. 17. The columnar data store makes Power BI forgiving of large data sets. But Power BI depends on the data it sources from, along with multiple other features. Performance can be hindered by numerous items Power BI depends on:
      • Data model
      • Data size
      • Resources allocated for processing
      • Data types
  18. 18. POWER BI QUERY EDITOR
      • Avoid complex queries in Query Editor; combinations of filter with context transition are some of the worst.
      • Don’t use relative date filtering in the Query Editor.
      • Keep measures simple initially, adding complexity incrementally.
      • Avoid relationships on calculated columns and unique identifier columns.
      • Try setting “Assume Referential Integrity” on relationships; this may improve query performance.
      • Ensure relationships are set up properly, and use the new many-to-many relationships sparingly.
  19. 19. Tips for Dashboards
      • Use a maximum of eight visuals per dashboard or report.
      • Set filters in the filter pane of reports.
      • Understand where performance hits are coming from.
      • Test and track refreshes over time for reports and dashboards. Don’t assume.
      • Don’t build complicated measures or aggregates at the data model layer.
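
One way to act on “test and track refreshes over time” is to keep the measured refresh durations and compare each new run against the earlier ones. The sketch below is an illustration only: the function name, the 20% tolerance and the sample durations are assumptions, not figures from the deck.

```python
from statistics import mean


def is_regression(history_seconds, latest_seconds, tolerance=0.20):
    """Return True when the latest refresh is more than `tolerance`
    (as a fraction) slower than the mean of the earlier refreshes."""
    if not history_seconds:
        return False  # nothing to compare against yet
    baseline = mean(history_seconds)
    return latest_seconds > baseline * (1 + tolerance)


# Hypothetical refresh durations, in seconds, recorded on earlier days.
history = [62.0, 58.5, 60.5]

print(is_regression(history, 75.0))  # slower than the baseline by more than 20%
print(is_regression(history, 65.0))  # within the tolerance
```

Recording the numbers replaces “it feels slower” with a measurement that can be handed to whoever owns the layer in question.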
  20. 20. Power BI Tips
      • Narrow tables are faster.
      • Prefer integers over strings (text).
      • Slicers use multiple steps (queries) to process.
      • Use powerful DAX functions that can eliminate complex or poorly performing expressions.
      • Certain filters can hinder performance if they examine each row. Identify when this occurs.
      • Simplify queries whenever possible.
      • Follow best practices for relationships in your data model.
      • Add indexes and foreign keys whenever possible.
  21. 21. Resource Constrictions Can Hinder Performance, too!
      • Consider increasing the memory allocated for data loads.
      • Increase the data cache for large processing jobs.
      • Monitor and alert on resource-demand thresholds for enterprise reporting.
  23. 23. What is DAX Studio?
      • A tool to write, execute, and analyze DAX queries.
      • Works with Power BI, Power Pivot for Excel, and Analysis Services Tabular.
      • Includes an Object Browser, query editing and execution, formula and measure editing, syntax highlighting and formatting.
      We’ll focus today on integrated tracing, along with query execution breakdowns.
  24. 24. USE CASE: SP BLITZ POWER BI REPORTING
      • Part of Brent Ozar Unlimited’s SQL Server First Responder Kit.
      • Contains a great interface for monitoring SQL Server environments, including PerfMon, query performance and wait stats.
      • Can be downloaded from GitHub: SQL-Server-First-Responder-Kit
  25. 25. Opportunity
      • With large data collection comes slow refresh times and impacted performance.
      • Upon opening the report to view new data, it never returns. What is to blame?
  26. 26. USE DAX STUDIO TO TRACE PERFORMANCE ISSUES IN POWER BI
      • Open the Power BI report (if not already open).
      • Open DAX Studio.
      • Connect to the Power BI report.
  27. 27. SET UP TRACING
      • Click on All Queries once connected.
      • Verify that the output states “Query Trace Started.”
  28. 28. RUN OR REFRESH REPORT
      • Perform the long-running tasks just as you would when working with the report regularly.
      • DAX Studio will trace all executions performed as part of the report: queries, data refreshes, etc.
  29. 29. Inspect DAX Studio
      Just as with any optimization exercise, sort the data by “Duration.” Double-clicking on the query will display the formatted query above.
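
The same “sort by duration” step can be applied to trace rows once they are outside DAX Studio. This is a minimal sketch in Python; the row layout (`query`, `duration_ms`) and the sample values are hypothetical, not a DAX Studio export format.

```python
from operator import itemgetter


def slowest_queries(rows, top_n=3):
    """Return the `top_n` rows with the longest duration, slowest first."""
    return sorted(rows, key=itemgetter("duration_ms"), reverse=True)[:top_n]


# Hypothetical rows, labeled by the visual that issued each query.
trace_rows = [
    {"query": "Wait Stats visual", "duration_ms": 420},
    {"query": "PerfMon visual", "duration_ms": 5310},
    {"query": "Date slicer", "duration_ms": 95},
    {"query": "Query Stats table", "duration_ms": 1870},
]

slowest = slowest_queries(trace_rows, top_n=2)
for row in slowest:
    print(f'{row["duration_ms"]:>6} ms  {row["query"]}')
```

Starting from the top of that list keeps the effort on the queries that account for most of the wait.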
  30. 30. Tips with DAX Studio
      • Remember to clear the cache before each test so that no cached data is mistakenly used for testing.
      • Once a performance impact has been identified, run the DAX query by itself in DAX Studio to verify the issue.
      • Use the Clear All option to clear all queries from the All Queries pane when attempting to isolate a specific report or visual.
      • Retain all query times, and filter them, if you want to track optimization changes.
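
Retained query times are most useful when they are compared run against run. The sketch below assumes each timing was taken with the cache cleared first, as the tips recommend; the function name and the numbers are illustrative assumptions.

```python
from statistics import median


def improvement_pct(before_ms, after_ms):
    """Percentage reduction in median duration between two sets of runs.

    A positive result means the change made the query faster.
    """
    before = median(before_ms)
    after = median(after_ms)
    return (before - after) / before * 100


# Hypothetical timings in milliseconds, three cold-cache runs each.
before_runs = [4000, 4200, 3900]
after_runs = [3000, 3100, 2900]

print(f"Median improvement: {improvement_pct(before_runs, after_runs):.1f}%")
```

The median is used here so that a single unusually slow or fast run does not decide whether a change is kept.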
  34. 34.
      let
          // Read the log file as five columns using Windows-1252 encoding
          Source = Csv.Document(File.Contents("<logfile>"),5,"",null,1252),
          #"Changed Type" = Table.TransformColumnTypes(Source,{{"Column1", type text}, {"Column2", type text}, {"Column3", Int64.Type}, {"Column4", type text}, {"Column5", type text}}),
          // Keep only the process type, the PID and the message column
          #"Removed Columns" = Table.RemoveColumns(#"Changed Type",{"Column2", "Column4"}),
          #"Renamed Columns" = Table.RenameColumns(#"Removed Columns",{{"Column3", "PID"}, {"Column1", "Process Type"}}),
          // Strip the "{Start:" prefix, then split the start time from the rest at ",Action:"
          #"Replaced Value" = Table.ReplaceValue(#"Renamed Columns","{Start:","",Replacer.ReplaceText,{"Column5"}),
          #"Split Column by Delimiter" = Table.SplitColumn(#"Replaced Value", "Column5", Splitter.SplitTextByEachDelimiter({",Action:"}, QuoteStyle.Csv, false), {"Column5.1", "Column5.2"}),
          #"Changed Type1" = Table.TransformColumnTypes(#"Split Column by Delimiter",{{"Column5.1", type datetime}, {"Column5.2", type text}}),
          #"Renamed Columns1" = Table.RenameColumns(#"Changed Type1",{{"Column5.1", "Start"}}),
          // Drop the closing brace, then split the message from the duration at the last ",Duration:"
          #"Replaced Value1" = Table.ReplaceValue(#"Renamed Columns1","}","",Replacer.ReplaceText,{"Column5.2"}),
          #"Split Column by Delimiter1" = Table.SplitColumn(#"Replaced Value1", "Column5.2", Splitter.SplitTextByEachDelimiter({",Duration:"}, QuoteStyle.Csv, true), {"Column5.2.1", "Column5.2.2"}),
          // Remove the "00:00:" prefix so the remaining seconds can be typed as a number
          #"Replaced Value2" = Table.ReplaceValue(#"Split Column by Delimiter1","00:00:","",Replacer.ReplaceText,{"Column5.2.2"}),
          #"Renamed Columns2" = Table.RenameColumns(#"Replaced Value2",{{"Column5.2.2", "Duration"}}),
          #"Changed Type2" = Table.TransformColumnTypes(#"Renamed Columns2",{{"Duration", type number}}),
          #"Renamed Columns3" = Table.RenameColumns(#"Changed Type2",{{"Column5.2.1", "Message"}}),
          #"Removed Columns1" = Table.RemoveColumns(#"Renamed Columns3",{"Process Type"})
      in
          #"Removed Columns1"
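
For readers who want to check the parsing logic outside Power BI, the steps applied to the fifth column can be mirrored in Python. This is a sketch under stated assumptions, not the author's code: the field shape `{Start:…,Action:…,Duration:00:00:…}` is inferred from the delimiters the query removes, and the sample value is invented.

```python
from datetime import datetime


def parse_message(field: str) -> dict:
    """Split one log field into Start, Message and Duration (seconds)."""
    # Remove the "{Start:" prefix and the closing brace.
    body = field.replace("{Start:", "").replace("}", "")
    # Split at the first ",Action:" and at the last ",Duration:".
    start_text, _, rest = body.partition(",Action:")
    message, _, duration_text = rest.rpartition(",Duration:")
    # Drop the "00:00:" prefix so only seconds remain.
    seconds = float(duration_text.replace("00:00:", ""))
    return {
        "Start": datetime.fromisoformat(start_text),
        "Message": message,
        "Duration": seconds,
    }


# Hypothetical field value in the assumed shape.
sample = "{Start:2019-03-01T10:15:02,Action:PBIData.get,Duration:00:00:01.25}"
parsed = parse_message(sample)
print(parsed)
```

As written, this sketch only handles durations under one minute: a value such as `00:01:05` keeps its colons after the prefix is removed and fails the numeric conversion, so longer waits would need a real duration parser.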
  35. 35.
      Term               Function                              Log Source
      SimpleDocument     Local object                          Multiple logs
      RemoteDocument     Remote Excel or CSV file              Multiple logs
      PackageStorage     Disk waits (database, often Access)   Power BI logs
      PBIDashboard       Dashboard waits                       PBI logs, inspect message
      PBIVisualConsent   Row-level permissions                 PBI logs, inspect message
      PBIData.get        Get Data waits                        PBI logs, inspect message
      PBITrustedVisual   Open visual view                      PBI logs
      PBIModuleLoad      Load of dashboard                     PBI logs
      FirewallDocument   Cloud or remote document              MSMdsrv logs
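
Once log rows carry a term and a duration, the terms in this table can be ranked by total wait. A minimal Python sketch follows; the term names come from the table, while the `(term, seconds)` row layout and the sample durations are hypothetical.

```python
from collections import defaultdict


def total_wait_by_term(rows):
    """Sum durations per term and return (term, seconds) pairs, largest first."""
    totals = defaultdict(float)
    for term, seconds in rows:
        totals[term] += seconds
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


# Hypothetical (term, duration in seconds) pairs taken from parsed log rows.
rows = [
    ("PBIData.get", 1.5),
    ("RemoteDocument", 4.0),
    ("PBIData.get", 3.0),
    ("PBIDashboard", 0.5),
]

ranked = total_wait_by_term(rows)
for term, seconds in ranked:
    print(f"{term:<18}{seconds:>6.1f} s")
```

Ranking by total time rather than by count keeps the focus on where the user actually waits.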
  40. 40. SUMMARY
      • Remember to stay with the process.
      • Use time as the reason to optimize.
      • Use data, not assumptions.
      • Use tools (DAX Studio, SQL Profiler, Power BI) to analyze logs and traces, just as you would other data.
      • Collaborate with the user to identify what’s important to them, too.
  41. 41. Thanks to
      • Chris Webb for sharing test data and ideas.
      • Brent Ozar for creating the sp_blitz data model that offered the opportunity to optimize.
      • The EDU group at Microsoft for providing a full environment, including cloud resources, to build and test in for this presentation.
  42. 42. Questions? Twitter: @dbakevlar