
Microsoft SQL Server 2017 Level 300 technical deck


This deck covers new features in SQL Server 2017, as well as carryover features from 2012 onwards, including high availability, columnstore, Always On, In-Memory tables, and other enterprise features.


  1. 1. SQL Server 2017 Everything Built In—Technical Overview
  2. 2. Microsoft vision for a new era: a unified platform for the modern service provider, spanning the customer datacenter, the service provider, and Microsoft Azure on one consistent platform. Enterprise-grade: global reach, scale, and security to meet business demands. Hybrid cloud: a consistent platform across multiple environments and clouds. People-focused: expands technical skill sets to the cloud for new innovation.
  3. 3. SQL Server 2017: industry-leading performance and security, now on Linux and Docker. End-to-end mobile BI on any device, at a fraction of the cost (self-service BI per user: Microsoft $120, Tableau $480, Oracle $2,230). Choice of platform and language: T-SQL, Java, C/C++, C#/VB.NET, PHP, Node.js, Python, Ruby. Most secure over the last 7 years (fewest vulnerabilities, 2010-2016). The only commercial DB with AI built in: R and Python + in-memory at massive scale, with native T-SQL scoring. Industry-leading performance: #1 OLTP performance, #1 DW performance, #1 price/performance, at 1/10th the cost of Oracle. The most consistent data platform: private cloud, public cloud, and in-memory across all workloads.
  4. 4. SQL Server 2017: meeting you where you are. It's the same SQL Server Database Engine, with many features and services available for all your applications regardless of your operational ecosystem: any data, any application, anywhere, with your choice of platform and language (T-SQL, Java, C/C++, C#/VB.NET, PHP, Node.js, Python, Ruby).
  5. 5. How we develop SQL • Cloud-first but not cloud-only • Use SQL Database to improve core SQL Server features and cadence • Many interesting and compelling on-premises-to-cloud scenarios. Offerings: SQL Server and APS, Azure SQL Virtual Machines, Azure SQL Database, Azure SQL Data Warehouse
  6. 6. A consistent experience from SQL Server on-premises to Microsoft Azure IaaS and PaaS • On-premises, private cloud, and public cloud • SQL Server local (Windows and Linux), VMs (Windows and Linux), containers, and SQL Database • Common development, management, and identity tools including Active Directory, Visual Studio, Hyper-V, and System Center • Scalability, availability, security, identity, backup and restore, and replication • Many data sources • Reporting, integration, processing, and analytics All supported in the hybrid cloud Consistency and integration
  7. 7. SQL Server 2017—new features
  8. 8. Database Engine new features Linux/Docker support • RHEL, Ubuntu, SLES, and Docker Adaptive query processing • Faster queries just by upgrading • Interleaved execution • Batch-mode memory grant feedback • Batch-mode adaptive joins
  9. 9. Database Engine new features Graph • Store relationships using nodes/edges • Analyze interconnected data using node/edge query syntax SELECT r.name FROM Person AS p, likes AS l1, Person AS p2, likes AS l2, Restaurant AS r WHERE MATCH(p-(l1)->p2-(l2)->r) AND p.name = 'Chris' Automatic tuning • Automatic plan correction—identify, and optionally fix, problematic query execution plans causing query performance problems • Automatic index management—make index recommendations (Azure SQL Database only)
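     For context, a minimal sketch of the node and edge tables that a MATCH query like the one above assumes (schema and sample values are illustrative):
     CREATE TABLE Person (ID INTEGER PRIMARY KEY, name VARCHAR(100)) AS NODE;
     CREATE TABLE Restaurant (ID INTEGER PRIMARY KEY, name VARCHAR(100)) AS NODE;
     CREATE TABLE likes (rating INTEGER) AS EDGE;
     -- Edges are inserted by referencing the auto-generated $node_id columns
     INSERT INTO likes ($from_id, $to_id, rating)
     SELECT p.$node_id, r.$node_id, 5
     FROM Person AS p, Restaurant AS r
     WHERE p.name = 'Chris' AND r.name = 'Amici';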
  10. 10. Database Engine new features Enhanced performance for natively compiled T-SQL modules • OPENJSON, FOR JSON, JSON • CROSS APPLY operations • Computed columns New string functions • TRIM, CONCAT_WS, TRANSLATE, and STRING_AGG with support for WITHIN GROUP (ORDER BY) Bulk import now supports CSV format and Azure Blob storage as file source
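     A quick sketch of the new string functions (sample values are illustrative):
     SELECT TRIM('  .NET  ');                           -- '.NET'
     SELECT CONCAT_WS(', ', 'Redmond', 'WA', '98052');  -- 'Redmond, WA, 98052'
     SELECT TRANSLATE('2*[3+4]/{7-2}', '[]{}', '()()'); -- '2*(3+4)/(7-2)'
     SELECT STRING_AGG(name, ', ') WITHIN GROUP (ORDER BY name)
     FROM sys.databases;                                -- 'master, model, msdb, tempdb, ...'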
  11. 11. Database Engine new features Native scoring with T-SQL PREDICT Resumable online index rebuild • Pause/resume online index rebuilds Clusterless read-scale availability groups • Unlimited, geo-distributed, linear read scaling P S1 S2 S3 S4
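     A minimal sketch of native scoring with PREDICT, assuming a previously trained model serialized into a hypothetical dbo.Models table and scored against a hypothetical dbo.NewCustomers table:
     DECLARE @model varbinary(max) =
         (SELECT model FROM dbo.Models WHERE model_name = 'CustomerChurn');
     SELECT d.CustomerID, p.ChurnProbability
     FROM PREDICT(MODEL = @model, DATA = dbo.NewCustomers AS d)
     WITH (ChurnProbability float) AS p;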
  12. 12. Integration Services new features Integration Services scale out • Distribute SSIS package execution more easily across multiple workers, and manage executions and workers from a single master computer Integration Services on Linux • Run SSIS packages on Linux computers • Currently some limitations Connectivity improvements • Connect to the OData feeds of Microsoft Dynamics AX Online and Microsoft Dynamics CRM Online with the updated OData components
  13. 13. Analysis Services new features 1400 compatibility level for tabular models Object-level security for tabular models Get data enhancements • New data sources, parity with Power BI Desktop and Excel 2016 • Modern experience for tabular models Enhanced ragged hierarchy support • New Hide Members property to hide blank members in ragged hierarchies Detail Rows • Custom row set contributing to a measure value • Drill through to the detail rows behind an aggregated value in tabular models
  14. 14. Reporting Services new features Comments • Comments are now available for reports, to add perspective and collaborate with others—you can also include attachments with comments Broader DAX support • With Report Builder and SQL Server Data Tools, you can create native DAX queries against supported tabular data models by dragging desired fields to the query designers Standalone installer • SSRS is no longer distributed through SQL Server setup • Power BI Report Server
  15. 15. Machine Learning Services new features Python support • Python and R scripts are now supported • Revoscalepy—Pythonic equivalent of RevoScaleR—parallel algorithms for data processing with a rich API MicrosoftML • Package of machine learning algorithms and transforms (with Python bindings), as well as pretrained models for image extraction or sentiment analysis
  16. 16. SQL Server on Linux
  17. 17. Evolution of SQL Server: 36% of the enterprise DB market runs on Linux. Multiple data types, heterogeneous environments, different development languages, and on-premises, cloud, and hybrid environments. Already delivered: HDInsight on Linux, R Server on Linux, Linux in Azure, SQL Server drivers and connectivity, Visual Studio Code extension for SQL Server. 20K+ applications signed up for the private preview.
  18. 18. Power of the SQL Server Database Engine on the platform of your choice. Linux distributions: Red Hat Enterprise Linux (RHEL), Ubuntu, and SUSE Linux Enterprise Server (SLES). Docker: Windows and Linux containers. Windows: Windows Server and Windows 10.
  19. 19. Same license, new choice. Buying a SQL Server license gives you the option to use it on Windows Server, Linux, or Docker. Regardless of where you run it—VM, Docker, physical, cloud, on-premises—the licensing model is the same; available features depend on which edition of SQL Server you use.
  20. 20. Linux-native user experience
  21. 21. Supported platforms: system requirements for SQL Server on Linux.
     Platform | Supported version(s) | Supported file system(s)
     Red Hat Enterprise Linux | 7.3 | XFS or EXT4
     SUSE Linux Enterprise Server | v12 SP2 | EXT4
     Ubuntu | 16.04 | EXT4
     Docker Engine (on Windows, Mac, or Linux) | 1.8+ | N/A
  22. 22. Cross-system architecture: the SQL Platform Abstraction Layer (SQLPAL). The RDBMS, AS, IS, and RS components run unchanged on Windows or Linux on top of SQLPAL, with a host extension (Windows Host Extension or Linux Host Extension) mapping to OS system calls (IO, memory, CPU scheduling). SQLPAL exposes the SQL OS API and Win32-like APIs (SQL OS v2): system-resource and latency-sensitive code paths go through the host extension; everything else goes through SQLPAL.
  23. 23. Installing SQL Server on Linux: add the SQL Server repository to your package manager; install the mssql-server package; run mssql-conf setup to configure the SA password and edition; and, optionally, configure the firewall to allow remote connections. On RHEL, for example (see the SQL Server on Linux overview page):
     sudo curl -o /etc/yum.repos.d/mssql-server.repo https://packages.microsoft.com/config/rhel/7/mssql-server-2017.repo
     sudo yum update
     sudo yum install -y mssql-server
     sudo /opt/mssql/bin/mssql-conf setup
     sudo firewall-cmd --zone=public --add-port=1433/tcp --permanent
     sudo firewall-cmd --reload
  24. 24. What’s installed? SQL Server runtime and associated libraries: /opt/mssql/bin/ /opt/mssql/lib/ /opt/mssql/ Data and log files for SQL Server databases: /var/opt/mssql/data/ /var/opt/mssql/log/ /var/opt/mssql/
  25. 25. Tools and programmability • Windows-based SQL Server tools—like SSMS, SSDT, and Profiler—work when connected to SQL Server on Linux • All existing drivers and frameworks supported • Third-party tools continue to work • Native command-line tools—sqlcmd, bcp • Visual Studio Code mssql extension
  26. 26. Client connectivity. SQL Server client drivers are available for many programming languages, including:
     Language | Platform | Driver
     C# | Windows, Linux, macOS | Microsoft ADO.NET for SQL Server
     Java | Windows, Linux, macOS | Microsoft JDBC Driver for SQL Server
     PHP | Windows, Linux, macOS | PHP SQL Driver for SQL Server
     Node.js | Windows, Linux, macOS | Node.js Driver for SQL Server
     Python | Windows, Linux, macOS | Python SQL Driver
     Ruby | Windows, Linux, macOS | Ruby Driver for SQL Server
     C++ | Windows, Linux, macOS | Microsoft ODBC Driver for SQL Server
  27. 27. What’s available on Linux? Operations features • Support for RHEL, Ubuntu, SLES, Docker • Package-based installs • Support for Open Shift, Docker Swarm • Failover clustering via Pacemaker • Backup/Restore • SSMS on Windows connected to Linux • Command-line tools: sqlcmd, bcp • Transparent Data Encryption • Backup Encryption • SCOM management pack • DMVs • Table partitioning • SQL Server Agent • Full-Text Search • Integration Services • Active Directory (integrated) authentication • TLS for encrypted connections
  28. 28. What’s available on Linux? Programming features • All major language driver compatibility • In-Memory OLTP • Columnstore indexes • Query Store • Compression • Always Encrypted • Row-Level Security, Data Masking • Auditing • Service Broker • CLR • JSON, XML • Third-party tools
  29. 29. Features not currently supported on Linux
  30. 30. Operational features
  31. 31. In-Memory OLTP
  32. 32. In-Memory Online Transaction Processing (OLTP) In-Memory OLTP is the premier technology available in SQL Server and Azure SQL Database for optimizing performance of transaction processing, data ingestion, data load, and transient data scenarios. Memory-optimized tables outperform traditional disk-based tables, leading to more responsive transactional applications. Memory-optimized tables also improve throughput and reduce latency for transaction processing, and can help improve performance of transient data scenarios such as temp tables and ETL.
  33. 33. Steps for In-Memory OLTP. SQL Server provides In-Memory OLTP features that can greatly improve the performance of application systems. It is recommended to set the database to the latest compatibility level, particularly for In-Memory OLTP:
     ALTER DATABASE CURRENT SET COMPATIBILITY_LEVEL = 140;
     GO
     When a transaction involves both a disk-based table and a memory-optimized table, it's essential that the memory-optimized portion of the transaction operates at the transaction isolation level named SNAPSHOT:
     ALTER DATABASE CURRENT SET MEMORY_OPTIMIZED_ELEVATE_TO_SNAPSHOT = ON;
     GO
     Before you can create a memory-optimized table, you must first create a memory-optimized FILEGROUP and a container for data files (path shown for a default Linux installation):
     ALTER DATABASE AdventureWorks ADD FILEGROUP AdventureWorks_mod CONTAINS MEMORY_OPTIMIZED_DATA;
     GO
     ALTER DATABASE AdventureWorks ADD FILE (NAME='AdventureWorks_mod', FILENAME='/var/opt/mssql/data/AdventureWorks_mod') TO FILEGROUP AdventureWorks_mod;
     GO
  34. 34. Memory-optimized tables In short, memory-optimized tables are stored in main memory as opposed to on disk. Memory-optimized tables are fully durable by default; data is persisted to disk in the background. Memory-optimized tables can be accessed with T-SQL, but are accessed more efficiently with natively compiled stored procedures.
  35. 35. Memory-optimized tables. The primary store for memory-optimized tables is main memory; unlike disk-based tables, data does not need to be read into memory buffers from disk. To create a memory-optimized table, use the MEMORY_OPTIMIZED = ON clause:
     CREATE TABLE dbo.ShoppingCart (
         ShoppingCartId INT IDENTITY(1,1) PRIMARY KEY NONCLUSTERED,
         UserId INT NOT NULL INDEX ix_UserId NONCLUSTERED HASH WITH (BUCKET_COUNT=1000000),
         CreatedDate DATETIME2 NOT NULL,
         TotalPrice MONEY
     ) WITH (MEMORY_OPTIMIZED=ON)
     GO
     Insert records into the table:
     INSERT dbo.ShoppingCart VALUES (8798, SYSDATETIME(), NULL)
     INSERT dbo.ShoppingCart VALUES (23, SYSDATETIME(), 45.4)
     INSERT dbo.ShoppingCart VALUES (80, SYSDATETIME(), NULL)
     INSERT dbo.ShoppingCart VALUES (342, SYSDATETIME(), 65.4)
  36. 36. Natively compiled stored procedures Natively compiled stored procedures are Transact-SQL stored procedures that are compiled to native code and can access memory-optimized tables. For information on creating natively complied stored procedures, see: https://docs.microsoft.com/en-us/sql/relational-databases/in-memory-oltp/creating-natively-compiled-stored-procedures Natively compiled stored procedures implement a subset of T-SQL. For more information, see: https://docs.microsoft.com/en-us/sql/relational-databases/in-memory-oltp/supported-features-for-natively-compiled-t-sql-modules This allows for efficient execution of the queries and business logic in the stored procedure. Native compilation enables faster data access and more efficient query execution than interpreted (traditional) Transact-SQL.
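     A minimal sketch of a natively compiled procedure against the dbo.ShoppingCart table from the previous slide (the procedure name is illustrative):
     CREATE PROCEDURE dbo.usp_AddToCart
         @UserId INT, @TotalPrice MONEY
     WITH NATIVE_COMPILATION, SCHEMABINDING
     AS
     BEGIN ATOMIC WITH (TRANSACTION ISOLATION LEVEL = SNAPSHOT, LANGUAGE = N'us_english')
         -- Runs as compiled native code against the memory-optimized table
         INSERT dbo.ShoppingCart (UserId, CreatedDate, TotalPrice)
         VALUES (@UserId, SYSDATETIME(), @TotalPrice);
     END;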
  37. 37. In-Memory OLTP enhancements (SQL Server 2016). New T-SQL surface area: {LEFT|RIGHT} OUTER JOIN, disjunction (OR, NOT), UNION [ALL], SELECT DISTINCT, subqueries (EXISTS, IN, scalar). Better T-SQL coverage, including: • Full collations support in native modules • Query surface area improvements • Nested stored procedures (EXECUTE) • Natively compiled scalar user-defined functions • Query Store support. Other improvements: • Full schema change support: add/alter/drop column/constraint • Increased size allowed for durable tables • ALTER TABLE support, for example: ALTER TABLE Sales.SalesOrderDetail ALTER INDEX PK_SalesOrderID REBUILD WITH (BUCKET_COUNT=100000000) • Multiple Active Result Sets (MARS) support
  38. 38. In-Memory OLTP enhancements (SQL Server 2017) • sp_spaceused is now supported for memory-optimized tables. • sp_rename is now supported for memory-optimized tables and natively compiled T-SQL modules. • CASE statements are now supported for natively compiled T-SQL modules. • The limitation of eight indexes on memory-optimized tables has been eliminated. • TOP (N) WITH TIES is now supported in natively compiled T-SQL modules. • ALTER TABLE against memory-optimized tables is now substantially faster in most cases. • Transaction log redo of memory-optimized tables is now done in parallel, delivering faster recovery times and significantly increasing the sustained throughput of an Always On Availability Group configuration. • Memory-optimized filegroup files can now be stored on Azure Storage; backup/restore of memory-optimized files on Azure Storage is supported. • Support for computed columns in memory-optimized tables, including indexes on computed columns. • Full support for JSON functions in natively compiled modules, and in check constraints. • CROSS APPLY operator in natively compiled modules. • Performance of B-tree (nonclustered) index rebuild for memory-optimized tables during database recovery has been significantly optimized, substantially reducing database recovery time when nonclustered indexes are used.
  39. 39. Real-time analytics/ HTAP
  40. 40. Real-time analytics/HTAP SQL Server’s support for columnstore and In-Memory allows you to generate analytics in real time, direct from your transactional databases. This pattern is called Hybrid Transactional and Analytical Processing (HTAP), because it combines OLTP and OLAP in one database. • Analytics can be performed on operational data with minimal overhead • Improving the timeliness of analytics adds significant business value
  41. 41. Traditional operational/analytics architecture (application servers feeding an operational database, with ETL into a separate warehouse for BI analysts). Key issues • Complex implementation • Requires two servers (capital expenditures and operational expenditures) • Data latency in analytics • High demand requires real-time analytics
  42. 42. Minimizing data latency for analytics (BI analysts querying the operational database directly). Challenges • Analytics queries are resource intensive and can cause blocking • Minimizing impact on operational workloads • Sub-optimal execution of analytics on a relational schema. Benefits • No data latency • No ETL • No separate data warehouse
  43. 43. Real-time analytics/HTAP The ability to run analytics queries concurrently with operational workloads using the same schema. Goals: • Minimal impact on operational workloads with concurrent analytics • Performance analytics for operational schema Not a replacement for: • Extreme analytics performance queries that are possible only using customized schemas (for example, Star/Snowflake) and preaggregated cubes • Data coming from nonrelational sources • Data coming from multiple relational sources requiring integrated analytics
  44. 44. HTAP: disk-based tables
  45. 45. HTAP with columnstore index Key points • Create an updateable NCCI for analytics queries • Drop all other indexes that were created for analytics • No application changes • Columnstore index is maintained just like any other index • Query optimizer will choose columnstore index where needed B-tree index Delta row groups Nonclustered columnstore index (NCCI)
  46. 46. Columnstore index overhead: DML operations on an OLTP workload.
     Operation | B-tree (NCI) | Nonclustered columnstore index (NCCI)
     Insert | Insert row into B-tree. | Insert row into B-tree (delta store).
     Delete | (a) Seek row(s) to be deleted. (b) Delete the row. | (a) Seek row in delta stores (there can be multiple rows). (b) If found, delete the row. (c) If not found, insert the key into the delete row buffer.
     Update | (a) Seek the row(s). (b) Update. | (a) Delete the row (same steps as above). (b) Insert the updated row into the delta store.
  47. 47. Minimizing columnstore overhead. Key points • Create a columnstore only on cold data by using a filtered predicate to minimize maintenance • Analytics queries access both columnstore and "hot" data transparently • Example: in an order management application, create a nonclustered columnstore index where order_status = 'SHIPPED' (see the sketch below)
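     A sketch of that filtered-index pattern on a hypothetical dbo.Orders table:
     -- Only cold (shipped) orders are maintained in the columnstore;
     -- hot rows remain covered by the existing B-tree indexes
     CREATE NONCLUSTERED COLUMNSTORE INDEX NCCI_Orders_Shipped
     ON dbo.Orders (OrderID, OrderDate, Quantity, UnitPrice, order_status)
     WHERE order_status = 'SHIPPED';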
  48. 48. Using Availability Groups instead of data warehouses. Key points • Mission critical operational workloads are typically configured for high availability using Always On Availability Groups • You can offload analytics to a readable secondary replica
  49. 49. HTAP: In-Memory Tables
  50. 50. Columnstore on In-Memory tables No explicit delta row group • Rows (tail) not in columnstore stay in In-Memory OLTP table • No columnstore index overhead when operating on tail • Background task migrates rows from tail to columnstore in chunks of 1 million rows • Deleted Rows Table (DRT)—tracks deleted rows • Columnstore data fully resident in memory • Persisted together with operational data • No application changes required In-Memory OLTP table Deleted rows table Tail Nonclustered index Hash index Columnstore index
  51. 51. Operational analytics: columnstore overhead. DML operations on In-Memory OLTP.
     Operation | Hash or range index | HK + CCI
     Insert | Insert row into HK. | Insert row into HK.
     Delete | (a) Seek the row(s) to be deleted. (b) Delete the row. | (a) Seek the row(s) to be deleted. (b) Delete the row in HK. (c) If the row is in the TAIL, then return; if not, insert <colstore-RID> into the DRT.
     Update | (a) Seek the row(s) to be updated. (b) Update (delete/insert). | (a) Seek the row(s) to be updated. (b) Update (delete/insert) in HK. (c) If the row is in the TAIL, then return; if not, insert <colstore-RID> into the DRT.
  52. 52. Minimizing columnstore index overhead. Syntax: CREATE NONCLUSTERED COLUMNSTORE INDEX <name> ON <table> (<columns>) WITH (COMPRESSION_DELAY = 30). Key points • A delta rowgroup is only compressed after the COMPRESSION_DELAY duration has elapsed • Minimizes/eliminates index fragmentation
  53. 53. High availability
  54. 54. Mission critical availability • Provides unified, simplified solution • Streamlines deployment, management, and monitoring • Reuses existing investments • Offers SAN/DAS environments • Allows use of HA hardware resources • Supports fast, transparent failover • Detects failures reliably • Handles multiple failures at once
  55. 55. Always On • Failover on SQL Server instance level • Shared storage (SAN/SMB) • Failover can take minutes based on load • Multi-node clustering • Passive secondary nodes • Failover on database level • Direct attached storage • Failover takes seconds • Multiple secondaries • Active secondaries Failover cluster instances for servers Availability Groups for groups of databases
  56. 56. Failover cluster instances • Server failover • Shared storage • Multi-node clustering • Passive secondary nodes • Failover in minutes. Both Windows and Linux failover clusters are supported in SQL Server 2017.
  57. 57. Configuring failover clusters on Linux 1. Set up and configure the operating system on each cluster node. 2. Install and configure SQL Server on each cluster node. 3. Configure shared storage and move database files. 4. Install and configure Pacemaker on each cluster node. 5. Create the cluster. https://docs.microsoft.com/en-us/sql/linux/sql-server-linux-shared-disk-cluster-configure
  58. 58. Always On Availability Groups. Availability Groups: a high availability and disaster recovery solution in which one or several databases fail over together. SQL Server 2017 supports one primary and up to eight secondaries, for a total of nine replicas. Secondaries can be enabled as read-only replicas, which can be load balanced.
  59. 59. High availability and disaster recovery. Simple HADR (VM failure): resilience against guest and OS level failures; planned and unplanned events; minimum downtime for patching and upgrades; minutes RTO. Backup/restore: protection against accidental or malicious data corruption; DR protection; minutes-to-hours RTO. Standard HADR (failover cluster): instance-level protection; automatic failure detection and failover; seconds-to-minutes RTO; resilience against OS and SQL Server failures. Log shipping: warm standbys for DR. Basic Availability Groups: an AG with two replicas; replaces Database Mirroring. Mission critical HADR (Availability Groups): database-level protection; seconds RTO; no data loss; recovery from unplanned outages; no downtime for planned maintenance; offload read/backup workloads to active secondaries; failover to a geographically distributed secondary site.
  60. 60. Availability Groups and failover clustering (Windows). Always On Failover Cluster Instances and Availability Groups work together to ensure data is accessible despite failures: a Windows Server Failover Clustering (WSFC) cluster spans nodes, network subnets, and storage; an Always On SQL Server Failover Cluster Instance runs on shared storage under an instance network name; and an Always On Availability Group maintains primary and secondary replicas, reached through an Availability Group Listener (a virtual network name).
  61. 61. Availability Groups and failover clustering (Linux). Always On Failover Cluster Instances and Availability Groups work together to ensure data is accessible despite failures: a Pacemaker cluster spans nodes, network subnets, and storage; an Always On SQL Server Failover Cluster Instance runs on shared storage; and an Always On Availability Group maintains primary and secondary replicas behind a cluster virtual IP, with the DNS name registered manually.
  62. 62. Always On cross-platform capabilities. Mission critical availability on any platform • Always On Availability Groups for Linux (new) and Windows, for HA and DR • Flexibility for HA architectures (new) • Ultimate HA with OS-level redundancy and failover • Load balancing of readable secondaries. Benefits: high availability, offloaded backups, scaled-out BI reporting, easier testing, easier migrations.
  63. 63. Enhanced Always On Availability Groups (SQL Server 2016). A unified HA solution (for example, an AG listener over a New York primary, with synchronous data movement to a New Jersey secondary and asynchronous data movement to a Hong Kong secondary). Greater scalability • Load balancing of readable secondaries • Increased number of automatic failover targets • Improved log transport performance • Distributed Availability Groups. Improved manageability • DTC support • Database-level health monitoring • Group Managed Service Accounts • Domain-independent Availability Groups • Basic HA in Standard Edition
  64. 64. Enhanced Always On Availability Groups (SQL Server 2017). Guarantee commits on synchronous secondary replicas: use REQUIRED_COPIES_TO_COMMIT with CREATE AVAILABILITY GROUP or ALTER AVAILABILITY GROUP. When REQUIRED_COPIES_TO_COMMIT is set to a value higher than 0, transactions at the primary replica databases wait until the transaction is committed on the specified number of synchronous secondary replica database transaction logs. If enough synchronous secondary replicas are not online, write transactions to primary replicas stop until communication with sufficient secondary replicas resumes.
  65. 65. Enhanced Always On Availability Groups (SQL Server 2017). CLUSTER_TYPE: use with CREATE AVAILABILITY GROUP. Identifies the type of server cluster manager that manages an availability group. Can be one of the following types: WSFC (Windows Server failover cluster; on Windows, this is the default value for CLUSTER_TYPE), EXTERNAL (a cluster manager that is not a Windows Server failover cluster, for example Pacemaker on Linux), or NONE (no cluster manager; used for a read-scale availability group).
  66. 66. Build a mission critical enterprise application. Scenario • All-Linux infrastructure • Application-level protection • Automatic and "within seconds" failover during unplanned outages • No downtime during planned maintenance • Performance-sensitive application • DR required for regulatory compliance. Solution: HADR with Always On Availability Groups on Linux or Windows: synchronous log synchronization to local secondaries for HA (which can also serve backups and reports), and asynchronous log synchronization to a remote secondary for DR.
  67. 67. Provide responsive regional BI with Azure and Availability Groups. Scenario • Primary replica in an on-premises datacenter • Secondary read-only replicas in the on-premises datacenter used for reporting/BI • BI generated in other geographical regions performs poorly because of network bandwidth limitations • No on-premises datacenters in other geographical regions. Solution: a hybrid Availability Group with a read-only secondary in Azure (other region).
  68. 68. Scale/DR with Distributed Availability Groups. Scenario • The Availability Group must span multiple datacenters • Not possible to add all servers to a single WSFC (datacenter networks/inter-domain trust) • Secondary datacenter provides DR • Geographically distributed read-only replicas required. Solution: Distributed Always On Availability Groups on Linux or Windows, with asynchronous log synchronization between the datacenters.
  69. 69. Migration/testing. Scenarios • ISV solution built on SQL Server on Windows • Linux certification • Enterprise moving to an all-Linux infrastructure • Rigorous business requirements • Seamless migration. Solution: minimum downtime and HA for cross-platform migrations with Distributed Availability Groups.
  70. 70. Improve read concurrency with read-scale Availability Groups. Scenario • SaaS app (website) • Catalog database with a high volume of concurrent read-only transactions • Bottlenecks on the Availability Group primary due to read workloads • Increased response time • HA/DR elements of Availability Groups not required. Solution: read-scale Availability Groups • No cluster required • Both Linux and Windows (see the sketch below).
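     A minimal sketch of creating a clusterless read-scale Availability Group (server, endpoint, and database names are hypothetical):
     CREATE AVAILABILITY GROUP [ag_readscale]
         WITH (CLUSTER_TYPE = NONE)            -- no WSFC or Pacemaker required
         FOR DATABASE [CatalogDB]
         REPLICA ON
             N'sqlnode1' WITH (ENDPOINT_URL = N'tcp://sqlnode1:5022',
                 AVAILABILITY_MODE = ASYNCHRONOUS_COMMIT, FAILOVER_MODE = MANUAL,
                 SECONDARY_ROLE (ALLOW_CONNECTIONS = ALL)),
             N'sqlnode2' WITH (ENDPOINT_URL = N'tcp://sqlnode2:5022',
                 AVAILABILITY_MODE = ASYNCHRONOUS_COMMIT, FAILOVER_MODE = MANUAL,
                 SECONDARY_ROLE (ALLOW_CONNECTIONS = ALL));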
  71. 71. Automatic tuning
  72. 72. Automatic tuning. Automatic plan correction identifies problematic plans and fixes SQL plan performance problems in a continuous learn, adapt, and verify cycle.
  73. 73. Automatic plan choice regression detection: detected regressions and suggested fixes are surfaced in the sys.dm_db_tuning_recommendations DMV.
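     A sketch of inspecting those recommendations, including the script that would force the last known good plan:
     SELECT reason, score,
            JSON_VALUE(details, '$.implementationDetails.script') AS force_plan_script
     FROM sys.dm_db_tuning_recommendations;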
  74. 74. Automatic plan correction: the recommendations from sys.dm_db_tuning_recommendations can be applied automatically by enabling the AUTOMATIC_TUNING database property: ALTER DATABASE current SET AUTOMATIC_TUNING ( FORCE_LAST_GOOD_PLAN = ON );
  75. 75. Adaptive query processing
  76. 76. Adaptive query processing Three features to improve query performance Enabled when the database is in SQL Server 2017 compatibility mode (140) ALTER DATABASE current SET COMPATIBILITY_LEVEL = 140; Adaptive Query Processing Interleaved Execution Batch Mode Memory Grant Feedback Batch Mode Adaptive Joins
  77. 77. Query processing and cardinality estimation. During optimization, the cardinality estimation (CE) process is responsible for estimating the number of rows processed at each step in an execution plan. CE uses a combination of statistical techniques and assumptions. When estimates are accurate (enough), we make informed decisions around the order of operations and physical algorithm selection.
  78. 78. Common reasons for incorrect cardinality estimates Missing statistics Stale statistics Inadequate statistics sample rate Bad parameter sniffing scenarios Out-of-model query constructs • For example, MSTVFs, table variables, XQuery Assumptions not aligned with data being queried • For example, independence versus correlation
  79. 79. Cost of incorrect estimates Slow query response time due to inefficient plans Excessive resource utilization (CPU, Memory, IO) Spills to disk Reduced throughput and concurrency T-SQL refactoring to work around off-model statements
  80. 80. Interleaved execution. Problem: multi-statement table-valued functions (MSTVFs) are treated as a black box by query processing, so a fixed optimization guess of 100 rows is used; when the actual row count is much larger (say, 500,000 rows), the skewed estimate causes performance issues. With interleaved execution (2017+), optimization pauses when an MSTVF is identified, the MSTVF is executed, and its actual row count is materialized; optimization then resumes, and downstream operations benefit from the corrected MSTVF cardinality estimate.
  81. 81. Batch mode memory grant feedback Problem: Queries can spill to disk or take too much memory, based on poor cardinality estimates. Memory grant feedback (MGF) will adjust memory grants based on execution feedback. MGF will remove spills and improve concurrency for repeating queries.
  82. 82. Batch mode adaptive joins. Problem: if cardinality estimates are skewed, we might choose an inappropriate join algorithm. Batch mode adaptive joins (AJ) defer the choice of hash join or nested loop until after the first join input has been scanned: if the build input is below the adaptive threshold, a nested loop is used; otherwise, a hash join. AJ uses nested loops for small inputs, and hash joins for large inputs.
  83. 83. About interleaved execution. Expected performance improvements? Benefits workloads with skews and downstream operations. (Chart: original execution time versus interleaved execution time, in seconds, for two sample workloads.)
  84. 84. About interleaved execution. Expected overhead? Minimal, because MSTVFs are always materialized. Cached plan considerations: the first execution cached will be used by consecutive executions. Plan attributes: contains interleaved execution candidates; is interleaved executed. XEvents: execution status, CE update, disabled reason.
  85. 85. Interleaved execution candidates SELECT statements 140 compatibility level MSTVF not used on the inside of a CROSS APPLY Not using plan forcing Not using USE HINT with DISABLE_PARAMETER_SNIFFING (or TF 4136)
  86. 86. About batch mode memory grant feedback. Expected performance improvements? Benefits workloads with spills or overages.
  87. 87. About batch mode memory grant feedback. Expected overhead? If a plan has oscillating memory requirements, the feedback loop for that plan is disabled. XEvents: spill report, and updates by feedback. Expected decrease and increase size? For spills, the spill size plus a buffer; for overages, reduce based on waste and add a buffer. RECOMPILE or eviction scenarios: the memory grant size goes back to the original.
  88. 88. About batch mode adaptive joins. Expected performance benefit? Performance gains occur for workloads where, prior to adaptive joins being available, the optimizer chooses the wrong join type due to incorrect cardinality estimates. (Chart: adaptive join test results, in seconds, with adaptive joins enabled versus disabled, for a customer invoice transactions query and a retail top products report.)
  89. 89. About batch mode adaptive join Queries involving columnstore indexes can dynamically switch between nested loop join and hash join operators at execution time:
  90. 90. About batch mode adaptive joins. Expected overhead? Memory is granted even for a nested loop scenario; if a nested loop is always optimal, there is greater overhead. Plan attributes: adaptive threshold rows, estimated and actual join type. XEvents: adaptive join skipped. Cached plan considerations: a single compiled plan can accommodate both low and high row count scenarios.
  91. 91. About batch mode adaptive join Eligible statements • The join is eligible to be executed both by an indexed nested loop join or a hash join physical algorithm. • The hash join uses batch mode—either through the presence of a columnstore index in the query overall or a columnstore indexed table being referenced directly by the join. • The generated alternative solutions of the nested loop join and hash join should have the same first child (outer reference).
  92. 92. Adaptive join threshold. (Chart: cost as a function of row count, from 1 to 50,000 rows; a nested loop join is cheaper below the adaptive join threshold, and a hash join is cheaper above it.)
  93. 93. Query Store
  94. 94. Problems with query performance. Fixing query plan choice regressions is difficult, and the query plan cache is not well suited for performance troubleshooting. Long time to detect the issue (TTD): Which query is slow? Why is it slow? What was the previous plan? Long time to mitigate (TTM): Can I modify the query? How do I use a plan guide? Temporary performance issues ("the website is down", "the database is not working") often follow a DB upgrade, where it's impossible to predict the root cause of a regression caused by new bits.
  95. 95. The solution: Query Store Dedicated store for query workload performance data • Captures the history of plans for each query • Captures the performance of each plan over time • Persists the data to disk (works across restarts, upgrades, and recompiles) Significantly reduces TTD/TTM • Find regressions and other issues in seconds • Allows you to force previous plans from history DBA is now in control
  96. 96. Query Store architecture. Compile and execute events feed a plan store and a runtime stats store; durability latency is controlled by the DB option DATA_FLUSH_INTERVAL_SECONDS. • Collects query texts (plus all relevant properties) • Stores all plan choices and performance metrics • Works across restarts/upgrades/recompiles • Dramatically lowers the bar for performance troubleshooting • New views • Intuitive and easy plan forcing
  97. 97. Query Store write architecture. Query text and plan are captured at compile time; execution statistics are captured at execute time. Both are written asynchronously to the query and plan store and the runtime stats store, which are backed by internal tables.
  98. 98. Query Store read architecture. The Query Store views merge in-memory and on-disk content, so users always see the "latest" data.
  99. 99. Keeping stability while upgrading to SQL Server 2017. SQL Server 2017 Query Optimizer (QO) enhancements are tied to the database compatibility level: 1. Install the bits, keeping the existing compatibility level. 2. Run Query Store (create a baseline). 3. Move to compatibility level 140. 4. Fix any regressions with plan forcing.
  100. 100. Monitoring performance by using the Query Store The Query Store feature provides DBAs with insight on query plan choice and performance
  101. 101. Working with Query Store /* (1) Turn ON Query Store */ ALTER DATABASE MyDB SET QUERY_STORE = ON; /* (2) Review current Query Store parameters */ SELECT * FROM sys.database_query_store_options /* (3) Set new parameter values */ ALTER DATABASE MyDB SET QUERY_STORE ( OPERATION_MODE = READ_WRITE, CLEANUP_POLICY = ( STALE_QUERY_THRESHOLD_DAYS = 30 ), DATA_FLUSH_INTERVAL_SECONDS = 3000, MAX_SIZE_MB = 500, INTERVAL_LENGTH_MINUTES = 15 ); /* (4) Clear all Query Store data */ ALTER DATABASE MyDB SET QUERY_STORE CLEAR; /* (5) Turn OFF Query Store */ ALTER DATABASE MyDB SET QUERY_STORE = OFF; /* (6) Performance analysis using Query Store views*/ SELECT q.query_id, qt.query_text_id, qt.query_sql_text, SUM(rs.count_executions) AS total_execution_count FROM sys.query_store_query_text qt JOIN sys.query_store_query q ON qt.query_text_id = q.query_text_id JOIN sys.query_store_plan p ON q.query_id = p.query_id JOIN sys.query_store_runtime_stats rs ON p.plan_id = rs.plan_id GROUP BY q.query_id, qt.query_text_id, qt.query_sql_text ORDER BY total_execution_count DESC /* (7) Force plan for a given query */ exec sp_query_store_force_plan 12 /*@query_id*/, 14 /*@plan_id*/ DB-level feature exposed through T-SQL extensions • ALTER DATABASE • Catalog views (settings, compile, and runtime stats) • Stored procedures (plan forcing, query/plan/stats cleanup)
  102. 102. Query Store enhancements (SQL Server 2017) • Query Store now tracks wait stats summary information. Tracking wait stats categories per query in Query Store enables the next level of performance troubleshooting experience. It provides even more insight into the workload performance and its bottlenecks while preserving the key Query Store advantages.
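     A sketch of summarizing the new wait stats via the sys.query_store_wait_stats catalog view added in SQL Server 2017:
     SELECT ws.wait_category_desc,
            SUM(ws.total_query_wait_time_ms) AS total_wait_ms
     FROM sys.query_store_wait_stats AS ws
     GROUP BY ws.wait_category_desc
     ORDER BY total_wait_ms DESC;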
  103. 103. Live query statistics View CPU/memory usage, execution time, query progress, and more. Enables rapid identification of potential bottlenecks for troubleshooting query performance issues. Allows drill-down to live operator level statistics: • Number of generated rows • Elapsed time • Operator progress • Live warnings
  104. 104. Summary: Query Store Capability • Query Store helps customers quickly find and fix query performance issues • Query Store is a “flight data recorder” for database workloads Benefits • Greatly simplifies query performance troubleshooting • Provides performance stability across SQL Server upgrades • Allows deeper insight into workload performance
  105. 105. Resource Governor
  106. 106. Resource Governor Resource Governor enables you to specify limits on the amount of CPU, physical IO, and memory that incoming application requests to the Database Engine can use. With Resource Governor, you can: • Provide multitenancy and resource isolation on single instances of SQL Server that serve multiple client workloads. • Provide predictable performance and support SLAs for workload tenants in a multiworkload and multiuser environment. • Isolate and limit runaway queries or throttle IO resources for operations such as DBCC CHECKDB that can saturate the IO subsystem and negatively affect other workloads. • Add fine-grained resource tracking for resource usage chargebacks and to provide predictable billing to consumers of the server resources.
  107. 107. Resource Governor architecture
  108. 108. Defining resource pools A resource pool represents the physical resources of the server. A pool is defined as minimum and/or maximum constraints on server resources (CPU, memory, and physical IO): • MIN_CPU_PERCENT and MAX_CPU_PERCENT • CAP_CPU_PERCENT • MIN_MEMORY_PERCENT and MAX_MEMORY_PERCENT • AFFINITY • MIN_IOPS_PER_VOLUME and MAX_IOPS_PER_VOLUME
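     A minimal sketch tying the pieces together: a pool, a workload group, and a classifier function (names and limits are illustrative):
     USE master;
     GO
     CREATE RESOURCE POOL ReportingPool
         WITH (MAX_CPU_PERCENT = 30, MAX_MEMORY_PERCENT = 25);  -- cap reporting workloads
     GO
     CREATE WORKLOAD GROUP ReportingGroup USING ReportingPool;
     GO
     -- The classifier runs at login time and routes sessions to a workload group
     CREATE FUNCTION dbo.fn_rg_classifier() RETURNS sysname
     WITH SCHEMABINDING
     AS
     BEGIN
         RETURN (CASE WHEN SUSER_SNAME() = N'report_user'  -- hypothetical login
                      THEN N'ReportingGroup' END);          -- NULL maps to the default group
     END;
     GO
     ALTER RESOURCE GOVERNOR WITH (CLASSIFIER_FUNCTION = dbo.fn_rg_classifier);
     ALTER RESOURCE GOVERNOR RECONFIGURE;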
  109. 109. Data warehousing and big data
  110. 110. Columnstore
  111. 111. SQL Server performance features: columnstore. A technology for storing, retrieving, and managing data by using a columnar data format, called a columnstore, in which data is stored as columns. You can use columnstore indexes for real-time analytics on your operational workload. Key benefits: a very high level of data compression, typically 10x, to reduce your data warehouse storage cost significantly; indexing on a column with repeated values vastly improves performance for analytics; and improved performance overall, because more data fits in memory and queries use batch-mode execution.
  112. 112. Columnstore: Clustered vs. nonclustered indexes In SQL Server, rowstore refers to a table where the underlying data storage format is a heap, clustered index, or memory-optimized table. Data that is logically organized as a table with rows and columns, and then physically stored in a row- wise data format. Rowstore Data that is logically organized as a table with rows and columns, and physically stored in a column- wise data format. Columnstore
  113. 113. Columnstore: Clustered vs. nonclustered indexes A secondary index on the standard table (rowstore). Nonclustered index The primary storage for the entire table. Clustered index Both columnstore indexes offer high compression (10x) and improved query performance. Nonclustered indexes enable a standard OLTP workload on the underlying rowstore, and a separate simultaneous analytical workload on the columnstore—with negligible impact to performance (Real-Time Operational Analytics).
  114. 114. Steps to create a columnstore index (NCCI). Add a columnstore index to the table by executing the T-SQL:
     CREATE NONCLUSTERED COLUMNSTORE INDEX [IX_SalesOrderDetail_ColumnStore] ON Sales.SalesOrderDetail (UnitPrice, OrderQty, ProductID)
     GO
     Execute a query that should use the columnstore index to scan the table:
     SELECT ProductID, SUM(UnitPrice) SumUnitPrice, AVG(UnitPrice) AvgUnitPrice, SUM(OrderQty) SumOrderQty, AVG(OrderQty) AvgOrderQty FROM Sales.SalesOrderDetail GROUP BY ProductID ORDER BY ProductID
     Verify that the columnstore index was used by looking up its object_id and confirming that it appears in the usage stats for the table:
     SELECT * FROM sys.indexes WHERE name = 'IX_SalesOrderDetail_ColumnStore'
     GO
     SELECT * FROM sys.dm_db_index_usage_stats WHERE database_id = DB_ID('AdventureWorks') AND object_id = OBJECT_ID('AdventureWorks.Sales.SalesOrderDetail');
  115. 115. Columnstore index enhancements (SQL Server 2016) Improvements SQL Server 2016 Clustered columnstore index • Master copy of the data (10x compression) • Additional B-tree indexes for efficient equality, short-range searches, and PK/FK constraints • Locking granularity at row level using NCI index path • DDL: ALTER, REBUILD, REORGANIZE Updatable nonclustered index • Updatable • Ability to mix OLTP and analytics workload • Ability to create filtered NCCI • Partitioning supported Equality and short-range queries • Optimizer can choose NCI on column C1; index points directly to rowgroup • No full index scan • Covering NCI index String predicate pushdown • Apply filter on dictionary entries • Find rows that refer to dictionary entries that qualify (R1) • Find rows not eligible for this optimization (R2) • Scan returns (R1 + R2) rows • Filter node applies string predicate on (R2) • Row returned by Filter node = (R1 + R2)
  116. 116. Columnstore index enhancements (SQL Server 2017) • Clustered columnstore indexes now support LOB columns (nvarchar(max), varchar(max), varbinary(max)) • Online nonclustered columnstore index build and rebuild support added
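     For example, the nonclustered columnstore index from the earlier slide can now be rebuilt without blocking the workload (a minimal sketch):
     ALTER INDEX IX_SalesOrderDetail_ColumnStore
     ON Sales.SalesOrderDetail
     REBUILD WITH (ONLINE = ON);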
  117. 117. PolyBase
  118. 118. Interest in big data spurs customer demand Adoption of big data technologies like Hadoop Increase in number and variety of data sources that generate large quantities of data Realization that data is “too valuable” to delete Dramatic decline in hardware cost, especially storage $
  119. 119. PolyBase: query relational and non-relational data with T-SQL, on-premises and in Azure. Applications submit a single T-SQL query to SQL Server, and PolyBase transparently reaches the data in Hadoop.
  120. 120. PolyBase in SQL Server 2017 • Execute T-SQL queries against relational data in SQL Server and semi-structured data in Hadoop or Azure Blob storage • Use existing T-SQL skills and BI tools to gain insights from different data stores
  121. 121. PolyBase use cases
  122. 122. PolyBase components • PolyBase Engine Service • PolyBase Data Movement Service (with HDFS Bridge) • External table constructs • MapReduce pushdown computation support. The head node runs SQL Server 2017 with the PolyBase Engine and PolyBase DMS.
  123. 123. PolyBase architecture. PolyBase T-SQL queries are submitted to the head node of a PolyBase group and can refer only to tables and/or external tables there. The head node (SQL Server 2017 with the PolyBase Engine and PolyBase DMS) distributes work across compute nodes (each running SQL Server 2017 with PolyBase DMS), which read from the Hadoop cluster (namenode, datanodes, and their file systems).
  124. 124. Supported big data sources: Hortonworks HDP 1.3 on Linux/Windows Server; Hortonworks HDP 2.0-2.6 on Linux; Hortonworks HDP 2.0-2.3 on Windows Server; Cloudera CDH 4.3 on Linux; Cloudera CDH 5.1-5.12 on Linux; Azure Blob storage. What happens behind the scenes? SQL Server loads the right client jars to connect to the Hadoop distribution; different configuration values map to the various Hadoop flavors (for example, value 4 stands for HDP 2.x on Linux, value 5 for HDP 2.x on Windows, and value 6 for CDH 5.x on Linux).
  125. 125. After setup • Compute nodes are used for scale-out query processing on external tables in HDFS • Tables on compute nodes cannot be referenced by queries submitted to the head node • The number of compute nodes can be dynamically adjusted by the DBA • Hadoop clusters can be shared among multiple PolyBase groups
  126. 126. Creating PolyBase objects. Create an external data source:
     CREATE EXTERNAL DATA SOURCE HadoopCluster WITH( TYPE = HADOOP, LOCATION = 'hdfs://10.14.0.4:8020' );
     Create an external file format:
     CREATE EXTERNAL FILE FORMAT CommaSeparatedFormat WITH( FORMAT_TYPE = DELIMITEDTEXT, FORMAT_OPTIONS (FIELD_TERMINATOR = ',', USE_TYPE_DEFAULT = TRUE) );
     Create an external table for unstructured data:
     CREATE EXTERNAL TABLE [dbo].[SensorData]( vin varchar(255), speed int, fuel int, odometer int, city varchar(255), datatimestamp varchar(255) ) WITH( LOCATION = '/apps/hive/warehouse/sensordata', DATA_SOURCE = HadoopCluster, FILE_FORMAT = CommaSeparatedFormat );
  127. 127. PolyBase queries. Query an external table as SQL data (data is returned as defined in the external table):
     SELECT [vin], [speed], [datetimestamp] FROM dbo.SensorData
     Join SQL data with external data (joins between internal and external tables; all T-SQL commands are supported, and PolyBase optimizes between SQL-side execution and pushdown to MapReduce):
     SELECT [make], [model], [modelYear], [speed], [datetimestamp] FROM dbo.AutomobileData LEFT JOIN dbo.SensorData ON dbo.AutomobileData.[vin] = dbo.SensorData.[vin]
  128. 128. Resumable online indexing
  129. 129. Resumable online indexing. With resumable online index rebuild you can resume a paused index rebuild operation from where the rebuild operation was paused, rather than having to restart the operation at the beginning. In addition, this feature rebuilds indexes using only a small amount of log space. • Resume an index rebuild operation after an index rebuild failure, such as following a database failover or after running out of disk space. There's no need to restart the operation from the beginning. This can save a significant amount of time when rebuilding indexes for large tables. • Pause an ongoing index rebuild operation and resume it later—for example, to temporarily free up system resources to execute a high-priority task. Instead of aborting the index rebuild process, you can pause the index rebuild operation, and resume it later without losing prior progress. • Rebuild large indexes without using a lot of log space or holding a long-running transaction that blocks other maintenance activities. This helps log truncation and avoids the out-of-log errors that are possible for long-running index rebuild operations.
  130. 130. Using resumable online index rebuild Start a resumable online index rebuild ALTER INDEX test_idx on test_table REBUILD WITH (ONLINE=ON, RESUMABLE=ON) ; Pause a resumable online index rebuild ALTER INDEX test_idx on test_table PAUSE ; Resume a paused online index rebuild ALTER INDEX test_idx on test_table RESUME ; Abort a resumable online index rebuild (which is running or paused) ALTER INDEX test_idx on test_table ABORT ; View metadata about resumable online index operations SELECT * FROM sys.index_resumable_operations ;
  131. 131. Partitioning
  132. 132. Partitioning SQL Server supports partitioning of tables and indexes. In partitioning, a logical table (or index) is split into two or more physical partitions, each containing a portion of the data. Allocation of data to (and retrieval from) the partitions is managed automatically by the Database Engine, based on a partition function and partition scheme that you define. Partitioning can enhance the performance and manageability of large data sets, enabling you to work with a subset of the data.
  133. 133. Steps to create a partitioned table. Create a partition function:
     -- Creates a partition function called myRangePF1 that will partition a table into four partitions
     CREATE PARTITION FUNCTION myRangePF1 (int) AS RANGE LEFT FOR VALUES (1, 100, 1000) ;
     GO
     Create a partition scheme (assumes four filegroups, test1fg to test4fg):
     -- Creates a partition scheme called myRangePS1 that applies myRangePF1 to four database filegroups
     CREATE PARTITION SCHEME myRangePS1 AS PARTITION myRangePF1 TO (test1fg, test2fg, test3fg, test4fg) ;
     GO
     Create a partitioned table based on the partition scheme:
     -- Creates a partitioned table called PartitionTable that uses myRangePS1 to partition col1
     CREATE TABLE PartitionTable (col1 int PRIMARY KEY, col2 char(10)) ON myRangePS1 (col1) ;
     GO
     New data added to the table will be assigned to a partition, based on the values provided for col1.
  134. 134. Manage large tables with table partitioning Scenario • Log data table grows by millions of rows a day • Removing old data (for regulatory compliance) exceeds maintenance window Solution • Partition the table with a partition function based on date (day or month) • New data loaded into the current active partition • Historic data can be removed by clearing down partitions
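     A sketch of the clear-down step, assuming a hypothetical dbo.LogData table partitioned by date (partition-level TRUNCATE requires SQL Server 2016 or later):
     -- Remove the oldest partition's rows directly
     TRUNCATE TABLE dbo.LogData WITH (PARTITIONS (1));
     -- Or switch the partition out to an identically structured archive table, then drop it
     ALTER TABLE dbo.LogData SWITCH PARTITION 1 TO dbo.LogData_Archive;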
  135. 135. Security
  136. 136. SQL Server security. Protect data: encryption at rest (Transparent Data Encryption, Backup Encryption, Cell-Level Encryption), encryption in transit (Transport Layer Security (SSL/TLS)), and encryption in use on the client (Always Encrypted). Control access: database access (SQL Authentication, Active Directory Authentication, Granular Permissions) and application access (Row-Level Security, Dynamic Data Masking). Monitor access: tracking activities (Fine-Grained Audit).
  137. 137. SQL Server 2017 and GDPR compliance Control access to personal data • Authentication • Authorization • Dynamic Data Masking • Row-Level Security Safeguarding data • Transparent Data Encryption • Transport Layer Security (TLS) • Always Encrypted • SQL Server Audit
  138. 138. Always Encrypted
  139. 139. Always Encrypted Always Encrypted allows clients to encrypt sensitive data inside client applications, and never reveal the encryption keys to the Database Engine. As a result, Always Encrypted provides a separation between those who own the data (and can view it) and those who manage the data (but should have no access). Always Encrypted makes encryption transparent to applications. An Always Encrypted-enabled driver installed on the client computer achieves this by automatically encrypting and decrypting sensitive data in the client application. The driver encrypts the data in sensitive columns before passing the data to the Database Engine, and automatically rewrites queries so that the semantics to the application are preserved. Similarly, the driver transparently decrypts data that is stored in encrypted database columns, and contained in query results.
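     A sketch of an encrypted column definition, assuming a column master key and a column encryption key (CEK1) have already been provisioned:
     CREATE TABLE dbo.Patients (
         PatientID int IDENTITY(1,1) PRIMARY KEY,
         SSN char(11) COLLATE Latin1_General_BIN2
             ENCRYPTED WITH (COLUMN_ENCRYPTION_KEY = CEK1,
                             ENCRYPTION_TYPE = DETERMINISTIC,  -- deterministic allows equality lookups
                             ALGORITHM = 'AEAD_AES_256_CBC_HMAC_SHA_256'),
         Name nvarchar(60) NULL
     );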
  140. 140. Always Encrypted: protect your data at rest and in motion, without impacting database performance. A trusted app issues SELECT Name FROM Patients WHERE SSN=@SSN with @SSN='198-33-0987'; the enhanced ADO.NET library uses the column master key and column encryption key to encrypt the parameter on the client side, so SQL Server only ever sees ciphertext (for example, @SSN=0x7ff654ae6d) in the SSN column of dbo.Patients, and the result set is transparently decrypted back to plaintext on the client.
  141. 141. Row-Level Security
  142. 142. Row-Level Security Row-Level Security (RLS) enables customers to control access to rows in a database table based on the characteristics of the user executing a query (for example, in a group membership or execution context). • The database system applies the access restrictions every time a tier attempts to access data • This makes the security system more reliable and robust by reducing the surface area of your security system • RLS works with a predicate (condition) which, when true, allows access to appropriate rows • Can be either a filter or block predicate • A filter predicate “filters out” rows from a query—the filter is transparent, and the end user is unaware of any filtering • A block predicate prevents unauthorized action, and will throw an exception if the action cannot be performed
  143. 143. Configure Row-Level Security. 1. Create user accounts to test Row-Level Security:
     USE AdventureWorks2014;
     GO
     CREATE USER Manager WITHOUT LOGIN;
     CREATE USER SalesPerson280 WITHOUT LOGIN;
     2. Grant read access to the users on the required table:
     GRANT SELECT ON Sales.SalesOrderHeader TO Manager;
     GRANT SELECT ON Sales.SalesOrderHeader TO SalesPerson280;
     3. Create a new schema and an inline table-valued function:
     CREATE SCHEMA Security;
     GO
     CREATE FUNCTION Security.fn_securitypredicate(@SalesPersonID AS int) RETURNS TABLE WITH SCHEMABINDING AS RETURN SELECT 1 AS fn_securitypredicate_result WHERE ('SalesPerson' + CAST(@SalesPersonId as VARCHAR(16)) = USER_NAME()) OR (USER_NAME() = 'Manager');
     4. Create a security policy, adding the function as both a filter and a block predicate on the table:
     CREATE SECURITY POLICY SalesFilter ADD FILTER PREDICATE Security.fn_securitypredicate(SalesPersonID) ON Sales.SalesOrderHeader, ADD BLOCK PREDICATE Security.fn_securitypredicate(SalesPersonID) ON Sales.SalesOrderHeader WITH (STATE = ON);
     5. Execute queries against the table as each user to see the filtered result (you can also alter the security policy to disable it).
  144. 144. Dynamic Data Masking
  145. 145. Dynamic Data Masking: prevent the abuse of sensitive data by hiding it from users. • Configuration made easy in the new Azure portal • Policy-driven at the table and column level, for a defined set of users • Data masking applied in real time to query results based on policy • Multiple masking functions available (for example, full or partial masking) for various sensitive data categories (credit card numbers, SSNs, and so on). Available in SQL Database and SQL Server 2017; for example, real-time partial masking of Table.CreditCardNo values such as 4465-6571-7868-5796.
  146. 146. Dynamic Data Masking walkthrough
1. The security officer defines a Dynamic Data Masking policy in T-SQL over sensitive data in the Employee table (the built-in partial(), email(), and random() masks are used here; the original slide's SSN() is not a shipped function):
ALTER TABLE [Employee] ALTER COLUMN [SocialSecurityNumber] ADD MASKED WITH (FUNCTION = 'partial(0,"XXX-XX-",4)');
ALTER TABLE [Employee] ALTER COLUMN [Email] ADD MASKED WITH (FUNCTION = 'email()');
ALTER TABLE [Employee] ALTER COLUMN [Salary] ADD MASKED WITH (FUNCTION = 'random(1,20000)');
GRANT UNMASK TO admin1;
2. An application user selects from the Employee table:
SELECT [Name], [SocialSecurityNumber], [Email], [Salary] FROM [Employee];
3. The Dynamic Data Masking policy obfuscates the sensitive data in the query results for any user without the UNMASK permission.
  147. 147. Configure Dynamic Data Masking
1. Use an ALTER TABLE statement to add a masking function to the required column in the table:
USE AdventureWorks2014;
GO
ALTER TABLE Person.EmailAddress ALTER COLUMN EmailAddress ADD MASKED WITH (FUNCTION = 'email()');
2. Create a new user with SELECT permission on the table:
CREATE USER TestUser WITHOUT LOGIN;
GRANT SELECT ON Person.EmailAddress TO TestUser;
3. Execute a query as that user to verify that the masking function masks the required column:
EXECUTE AS USER = 'TestUser';
SELECT EmailAddressID, EmailAddress FROM Person.EmailAddress;
REVERT;
  148. 148. Auditing
  149. 149. SQL Server 2017 Auditing • SQL Server Audit is the primary auditing tool in SQL Server • It tracks and logs server-level events in addition to individual database events • SQL Server Audit uses Extended Events to create and run audit-related events • SQL Server Audit includes several components: Server audit: the container that holds audit specifications and defines the audit destination; you can define multiple server audits to run simultaneously. Server audit specification: tracks server-level events and invokes the necessary Extended Events as defined by the user; you can define only one server audit specification per audit (container). Database audit specification: also belongs to a server audit; user-defined database-level events are tracked and logged, and predefined templates help you define a database audit.
  150. 150. SQL Server Audit The server audit is the parent component of a SQL Server audit, and can contain both: • Server audit specifications • Database audit specifications It resides in the master database, and defines where the audit information is stored, the file rollover policy, the queue delay, and how SQL Server should react if auditing is not possible. The following server audit configuration is required: • The server audit name • The action to take when logging fails: continue and ignore the log issue, shut down the server, or fail the operation • The audit destination Note that auditing cannot yet be configured at column level. Permissions required: ALTER ANY SERVER AUDIT or CONTROL SERVER.
  151. 151. Database Audit Specification This operates at the database level; using more granular auditing can minimize the performance impact on your server. This is done with a database audit specification, which is only available in Enterprise Edition. With a database audit specification, auditing can be performed at the object or user level. Configuration requires: • The database audit specification name (optional; a default name is assigned otherwise) • The server audit that the specification must be linked to • The audit action type: either individual audit actions (SELECT, INSERT, UPDATE, or DELETE) or audit action groups • The object name of the object to be audited, when an audit action has been selected • The schema of the selected object • The principal name (to audit all users, use the keyword "public" in this field) Note that auditing cannot yet be configured at column level. Permissions required: ALTER ANY DATABASE AUDIT SPECIFICATION, plus ALTER or CONTROL permission on the database to which you would like to add the audit.
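A minimal sketch combining both components: a server audit that writes to a file, and a database audit specification tracking DML on one table (the audit names and file path are illustrative):
USE master;
GO
-- Server audit: defines the destination and the failure behavior
CREATE SERVER AUDIT OrdersAudit TO FILE (FILEPATH = 'C:\SQLAudit\') WITH (ON_FAILURE = CONTINUE);
ALTER SERVER AUDIT OrdersAudit WITH (STATE = ON);
GO
USE AdventureWorks2014;
GO
-- Database audit specification: tracks SELECT/INSERT/UPDATE by all users on one table
CREATE DATABASE AUDIT SPECIFICATION OrdersAuditSpec
FOR SERVER AUDIT OrdersAudit
ADD (SELECT, INSERT, UPDATE ON Sales.SalesOrderHeader BY public)
WITH (STATE = ON);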
  153. 153. Advanced analytics
  154. 154. Machine Learning Services
  155. 155. In-Database analytics with SQL Server In SQL Server 2016, Microsoft launched two server platforms for integrating the popular open source R language with business applications: • SQL Server R Services (In-Database), for integration with SQL Server • Microsoft R Server, for enterprise-level R deployments on Windows and Linux servers In SQL Server 2017, the name has been changed to reflect support for the popular Python language: • SQL Server Machine Learning Services (In-Database) supports both R and Python for in- database analytics • Microsoft Machine Learning Server supports R and Python deployments on Windows servers—expansion to other supported platforms is planned for late 2017
  156. 156. Machine Learning Services Capability • Extensible in-database analytics, integrated with R and Python, exposed through T-SQL • Centralized enterprise library for analytic models Benefits • Data scientists publish algorithms and interact directly with data • Business analysts analyze through T-SQL, familiar tools, and vetted algorithms • DBAs manage storage and analytics together [Diagram: analytical engines for R/Python integrate with the SQL Server data management layer; relational data is streamed in-memory through the T-SQL interface into a shared analytics library for collaboration, management, and deployment]
  157. 157. Enhanced Machine Learning Services (SQL Server 2017) • Python support • Microsoft Machine Learning package included • Process multiple related models in parallel with the rxExecBy function • Create a shared script library with R script package management • Native scoring with T-SQL PREDICT • In-place upgrade of R components
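A sketch of native scoring with the T-SQL PREDICT function, assuming a model already trained with a RevoScaleR algorithm (for example, rxLinMod), serialized with rxSerializeModel, and stored in a hypothetical dbo.models table; the input table and output column names are likewise illustrative:
DECLARE @model varbinary(max) =
    (SELECT model FROM dbo.models WHERE model_name = 'iris_model');
-- Native scoring runs inside the Database Engine; no R runtime is started at query time
SELECT d.*, p.*
FROM PREDICT(MODEL = @model, DATA = dbo.iris_new AS d)
WITH (Sepal_Length_Pred float) AS p;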
  158. 158. Setup and configuration SQL Server 2017 setup • Install Machine Learning Services (In-Database) • Consent to install Microsoft R Open/Python • Optional: install additional R packages on the SQL Server 2017 machine Instance and database configuration • Enable the external scripts option on the instance • Grant script execution permission to users (the pre-release CREATE EXTERNAL EXTENSION syntax on the original slide was replaced by the configuration shown here)
EXEC sp_configure 'external scripts enabled', 1;
RECONFIGURE; -- restart the instance for the setting to take effect
GRANT EXECUTE ANY EXTERNAL SCRIPT TO DataScientistsRole; /* User-defined role / users */
  159. 159. Management and monitoring ML runtime usage • Resource governance via an external resource pool • Monitoring via DMVs • Troubleshooting via XEvents/DMVs
CREATE EXTERNAL RESOURCE POOL ML_runtimes
WITH (MAX_CPU_PERCENT = 20, MAX_MEMORY_PERCENT = 10);
ALTER RESOURCE GOVERNOR RECONFIGURE;
SELECT * FROM sys.dm_resource_governor_external_resource_pools
WHERE name = 'ML_runtimes';
  160. 160. External script usage from SQL Server Original R script:
IrisPredict <- function(data, model){
  library(e1071)
  predicted_species <- predict(model, data)
  return(predicted_species)
}
library(RODBC)
conn <- odbcConnect("MySqlAzure", uid = myUser, pwd = myPassword);
Iris_data <- sqlFetch(conn, "Iris_Data");
Iris_model <- sqlQuery(conn, "select model from my_iris_model");
IrisPredict(Iris_data, model);
Calling the R script from SQL Server:
/* Input table schema */
create table Iris_Data (name varchar(100), length int, width int);
/* Model table schema */
create table my_iris_model (model varbinary(max));
declare @iris_model varbinary(max) = (select model from my_iris_model);
exec sp_execute_external_script
  @language = 'R'
, @script = '
IrisPredict <- function(data, model){
  library(e1071)
  predicted_species <- predict(model, data)
  return(predicted_species)
}
IrisPredict(input_data_1, model);
'
, @parallel = default
, @input_data_1 = N'select * from Iris_Data'
, @params = N'@model varbinary(max)'
, @model = @iris_model
with result sets ((name varchar(100), length int, width int, species varchar(30)));
• The SQL queries embedded in the original R script (sqlFetch/sqlQuery) become the @input_data_1 query and the @model parameter • R variables bind to SQL parameters by name: input_data_1 receives the @input_data_1 result set, and model receives @model
  161. 161. The SQL extensibility architecture [Diagram] • sqlservr.exe (MSSQLSERVER service): a query calls sp_execute_external_script; SQLOS and XEvents provide scheduling and tracing • launchpad.exe (MSSQLLAUNCHPAD service, one per SQL Server instance): is told what "launcher" to start and how to launch it • BxlServer.exe with sqlsatellite.dll: the Windows satellite process that runs the external script and exchanges data with SQL Server over named pipes
  162. 162. SQL Server Machine Learning Services is scalable More efficient than standalone clients • Data does not all have to fit in memory • Reduced data transmission over the network Most R Open (and Python) functions are single-threaded • The ScaleR and RevoScalePy APIs in scripts support multi-threaded processing on the SQL Server computer • SQL Server can stream data in parallel and in batches to and from the script Use the power of SQL Server and ML to develop, train, and operationalize • SQL Server compute context (remote compute context) • T-SQL queries • Memory-optimized tables • Columnstore indexes • Data compression • Parallel query execution • Stored procedures
  163. 163. SQL Server Machine Learning Services is secure Reduced surface area and isolation “external scripts enabled” is required Script execution outside of SQL Server process space Script execution requires explicit permission sp_execute_external_script requires EXECUTE ANY EXTERNAL SCRIPT for non-admins SQL Server login/user required and db/table access Satellite processes have limited privileges Satellite processes run under low privileged, local user accounts in the SQLRUserGroup Each execution is isolated — different users with different accounts Windows firewall rules block outbound traffic
  164. 164. MicrosoftML package MicrosoftML is a package for Microsoft R Server, Microsoft R Client, and SQL Server Machine Learning Services that adds state-of-the-art data transforms, machine learning algorithms, and pretrained models to Microsoft R functionality. • Data transforms help you compose, in a pipeline, a custom set of transforms that are applied to your data before training or testing; their primary purpose is to let you featurize your data. • Machine learning algorithms enable you to tackle common machine learning tasks such as classification, regression, and anomaly detection. These high-performance functions run locally on Windows or Linux machines or on Azure HDInsight (Hadoop/Spark) clusters. • Pretrained models for sentiment analysis and image featurization can also be installed and deployed with the MicrosoftML package.
  165. 165. Hybrid cloud
  166. 166. Back up to Azure
  167. 167. Managed backup • Granular control of the backup schedule • Local staging support for faster recovery and resiliency to transient network issues • Support for system databases • Support for simple recovery mode Back up to Azure block blobs • Cost savings on storage • Significantly improved restore performance • More granular control over Azure Storage Azure Storage snapshot backup • Fastest method for creating backups and running restores • SQL Server database files on Azure Blob storage Back up to Azure
  168. 168. Managed backup • Support for system databases • Support for databases in simple recovery mode • Backup to block blobs: more granular control • Customized backup schedules: full backup and log backup
  169. 169. Customized scheduling
Step 1: Run the scheduling SP to configure custom scheduling:
EXEC msdb.managed_backup.sp_backup_config_schedule
 @database_name = 'testDB'
,@scheduling_option = 'Custom'
,@full_backup_freq_type = 'weekly'
,@days_of_week = 'Saturday'
,@backup_begin_time = '11:00'
,@backup_duration = '02:00'
,@log_backup_freq = '00:05'
Step 2: Run the basic SP to enable managed backup:
EXEC msdb.managed_backup.sp_backup_config_basic
 @database_name = 'testDB'
,@enable_backup = 1
,@container_url = 'https://<storage account name>.blob.core.windows.net/<container name>'
,@retention_days = 30
  170. 170. Back up to Azure block blobs • Roughly half the storage cost of page blobs • Backup striping and faster restore • Maximum backup size over 12 TB • Granular access and a unified credential story (SAS URLs) • Support for all existing backup/restore features (except append)
CREATE CREDENTIAL [https://<account>.blob.core.windows.net/<container>]
WITH IDENTITY = 'Shared Access Signature',
SECRET = 'sig=mw3K6dpwV%2BWUPj8L4Dq3cyNxCI'
BACKUP DATABASE database
TO URL = N'https://<account>.blob.core.windows.net/<container>/<blob1>',
   URL = N'https://<account>.blob.core.windows.net/<container>/<blob2>'
  171. 171. Back up to Azure with file snapshots
BACKUP DATABASE database
TO URL = N'https://<account>.blob.core.windows.net/<container>/<backupfileblob1>'
WITH FILE_SNAPSHOT
[Diagram: the instance's MDF and LDF files live as blobs in Azure Storage; the backup (BAK) is a snapshot of those blobs]
  172. 172. Back up to Azure with file snapshots • Available to users whose database files are located in Azure Storage • Copies database using a virtual snapshot within Azure Storage • Database data does not move between storage system and server instance, removing IO bottleneck • Uses only a fraction of the space that a traditional backup would consume • Very fast
  173. 173. Point-in-time restore with file snapshots Traditional backup • Multiple backup types (full, differential, log) • Complex point-in-time restore process (a full backup, plus differentials, plus the chain of log backups) Back up to Azure with file snapshots • Full backup only once • Point-in-time restore needs only the two adjacent log-snapshot backups
  174. 174. SQL Server on Azure VM
  175. 175. Why SQL Server in an Azure VM? 1. Reduced capex/pay-as-you-go pricing 2. Fast deployment 3. Reduced configuration 4. Elastic scale 5. Lift and shift legacy applications
  176. 176. SQL Server in Azure VM—deploying Microsoft gallery images • SQL Server 2008 R2 / 2012 / 2014 / 2016 / 2017 • SQL Server Web / Standard / Enterprise / Developer / Express editions • Windows Server 2008 R2 / 2012 R2 / 2016 • Linux RHEL / Ubuntu SQL licensing • Based on SQL Server edition and core count (VM sizes) • Pay-per-minute Bring your own license • Move an existing license to Azure through BYOL images Commissioned in ~10 minutes • Connect via RDP, ADO.NET, OLEDB, JDBC, PHP, and so on • Manage via the Azure portal, SSMS, PowerShell, CLI, System Center, and so on
  177. 177. Azure VM sizes DSv3, DSv2, DS, and FS series VMs • Recommended for SQL Server production workloads • Local SSD storage • Premium storage • The portal optimizes the VM for SQL Server workloads GS and Ls series VMs • Premium performance • Local SSD storage • Premium storage • Intel® Xeon® processor E5 v3 family https://docs.microsoft.com/en-us/azure/virtual-machines/windows/sizes https://docs.microsoft.com/en-us/azure/virtual-machines/linux/sizes
  178. 178. Azure VM—availability An Azure Availability Set distributes VMs in different failure domains (racks) and upgrade domains • VMs are not impacted at the same time by: • Rack/host failures • Azure host upgrades Managed disks • Distributes disks of different VMs to different storage stamps • Higher isolation for Always On or SQL HA
  179. 179. Azure VM—availability SLAs Single-VM SLA: 99.9% (<43 minutes downtime per month) • 99.46% of single VMs achieve 99.999% (<26 seconds downtime per month) Multi-VM SLA: 99.95% (<22 minutes downtime per month) • 99.99% of multi-VM deployments achieve 99.999% • Includes: planned downtime due to host OS patching, and unplanned downtime due to physical failures • Doesn't include servicing of the guest OS or software inside it (for example, SQL Server) SQL Server multi-VM deployments need Always On • If a VM becomes unavailable, fail over to another (~15 s) • Detects SQL instance failures (for example, service down or hung)
  180. 180. Azure VM—storage • Each disk has three copies in Azure Storage • An extent is replicated three times across different fault and upgrade domains, with replica placement randomized for fast MTTR • Remote storage is connected over a high-speed network, with quorum writes • All stored data is checksummed: verified on every client read, and scrubbed every few days • Automated disk verification and decommissioning • Data is re-replicated on disk/node/rack failure or checksum mismatch
  181. 181. Solid-State Drives (SSDs) • Up to 7,500 IOPS or 250 MB/s per disk • Average latency under 5 ms Local read cache (SSD) supported • Average 1 ms latency • Frees VM bandwidth to Azure Storage (for the log) [Diagram: the VM uses a local SSD and cached premium storage disks through the read cache; uncached premium storage disks connect directly]
  182. 182. SQL Azure VM—many layers of security Physical security • Datacenters monitored constantly • Microsoft Ops and Support personnel don't have access to customer storage Infrastructure security • Virtual Networks: deployments are isolated in their own private networks • Storage: encrypted and authenticated via strong keys Many certifications • ISO 27001/27002, SOC 1/SSAE 16/ISAE 3402 and SOC 2 • Cloud Security Alliance CCM, FISMA, HIPAA, EU model clauses, FIPS 140-2 SQL security • Auto patching • Encryption of databases and backups • Encryption of connections • Authentication: Windows/SQL • Row-Level Security and Always Encrypted (SQL Server 2016 and later)
  183. 183. Azure VM—connectivity • Over the internet • Over a secure site-to-site tunnel between the on-premises network and the Azure virtual network: on a public connection, or over a dedicated connection (ExpressRoute), which is recommended • Apps transparently connect to the primary replica via a listener • Listeners are supported through Azure Load Balancer: internal (VNET) or external (internet), plus hybrid (VNET to VNET)
  184. 184. Stretch Database
  185. 185. Ever-growing data, ever-shrinking IT • Massive tables (hundreds of millions/billions of rows, TBs in size) • Users want/need to retain data indefinitely • Cold data infrequently accessed but must stay online • Datacenter consolidation • Maintenance challenges • Business SLAs at risk What to do? • Expand server and storage • Move data elsewhere • Delete
  186. 186. Capability • Stretch large operational tables from on-premises to Azure with the ability to query Benefits • Cost-effective online cold data • Entire table is online and remains queryable from on-premises apps • No application changes • Support for Always Encrypted and Row-Level Security • Stretching the history tables of temporal tables is a great scenario Stretch SQL Server into Azure Securely stretch cold tables to Azure with remote query processing SQL SERVER 2017 Azure
  187. 187. Stretch Database architecture How it works • Creates a secure linked server definition in the on-premises SQL Server, targeting the remote endpoint • Provisions remote resources and begins to migrate eligible data, if migration is enabled • Queries against stretched tables run against both the local database and the remote endpoint [Diagram: the local database holds local and eligible data; eligible data migrates across the internet boundary via the linked server to remote data at the Azure remote endpoint]
  188. 188. Typical workflow to enable Stretch Database High-level steps • Configure the local server for remote data archive • Create a credential with administrator permission • Alter the specific database for remote data archive • Create a filter predicate (optional) to select rows to migrate • Alter the table to enable Stretch (shown here with the released SET (REMOTE_DATA_ARCHIVE ...) syntax rather than the pre-release ENABLE form) • The Stretch Wizard in SQL Server Management Studio makes all this easy (it does not currently support creating filter predicates)
-- Enable local server
EXEC sp_configure 'remote data archive', '1';
RECONFIGURE;
-- Provide administrator credential to connect to Azure SQL Database
CREATE CREDENTIAL <server_address> WITH IDENTITY = <administrator_user_name>, SECRET = <administrator_password>
-- Alter database for remote data archive
ALTER DATABASE <database name> SET REMOTE_DATA_ARCHIVE = ON (SERVER = <server name>);
GO
-- Alter table for remote data archive
ALTER TABLE <table name> SET (REMOTE_DATA_ARCHIVE = ON (MIGRATION_STATE = OUTBOUND));
GO
  189. 189. Queries continue working • Business applications continue working without disruption • DBA scripts and tools work as before (all controls still held in local SQL Server) • Developers continue building or enhancing applications with existing tools and methods
  190. 190. Advanced security features supported • Data in motion always via secure channels (TLS 1.1/1.2) • Always Encrypted supported if enabled by user (encryption key remains on-premises) • Row-Level Security and auditing supported
  191. 191. Backup and restore benefits • DBAs only back up/restore local SQL Server hot data • Stretch Database ensures that remote data is transactionally consistent with local data • Upon completion of a local restore, SQL Server reconciles with the remote data using a metadata operation, not a data copy • Restore time does not depend on the size of the remote data
  192. 192. Current limitations that block stretching a table • Tables with more than 1,023 columns or more than 998 indexes cannot be stretched • FileTables or FILESTREAM data are not supported • Replicated tables and memory-optimized tables • CLR data types (including geometry, geography, hierarchyid, and CLR user-defined types) • Column features (COLUMN_SET, computed columns) • Constraints (default and check constraints) • Foreign key constraints that reference the table in a parent-child relationship; you can still stretch the child table (for example, Order_Detail) • Full-text indexes • XML indexes • Spatial indexes • Indexed views that reference the table
  193. 193. Programmability and data structures
  194. 194. Graph processing
  195. 195. What is a graph? A graph is a collection of nodes and edges. Common variants: • Undirected graph • Directed graph • Weighted graph • Property graph [Diagram: two Person nodes connected by an edge]
  196. 196. Typical scenarios for graph databases • Hierarchical or interconnected data; entities with multiple parents • Analyzing interconnected data, materializing new information from existing facts • Identifying connections that are not obvious • Complex many-to-many relationships • Organically growing connections as the business evolves
  197. 197. Introducing SQL Server Graph • A collection of node and edge tables in the database • Language extensions • DDL extensions—create node and edge tables • DML extensions—SELECT supports the T-SQL MATCH clause for pattern matching and traversals; DELETE, UPDATE, and INSERT support graph tables • Graph support is integrated into the SQL Server ecosystem [Diagram: a database contains a graph, which is a collection of node tables and edge tables; nodes and edges may have properties, and edges connect nodes]
  198. 198. DDL Extensions • Create node and edge tables • Properties associated with nodes and edges CREATE TABLE Product (ID INTEGER PRIMARY KEY, name VARCHAR(100)) AS NODE; CREATE TABLE Supplier (ID INTEGER PRIMARY KEY, name VARCHAR(100)) AS NODE; CREATE TABLE hasInventory AS EDGE; CREATE TABLE located_at(address varchar(100)) AS EDGE;
  199. 199. DML Extensions Multi-hop navigation and join-free pattern matching using the MATCH predicate (a data-insertion sketch follows below):
SELECT Prod.name AS ProductName, Sup.name AS SupplierName
FROM Product Prod, Supplier Sup, hasInventory hasIn,
     located_at supp_loc, Customer Cus, located_at cust_loc,
     orders, location loc
WHERE MATCH( Cus-(orders)->Prod<-(hasIn)-Sup
             AND Cus-(cust_loc)->loc<-(supp_loc)-Sup );
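A minimal sketch of inserting graph data into the node and edge tables defined two slides back; edge rows are written by looking up the built-in $node_id pseudo-column of each endpoint (the sample values are illustrative):
INSERT INTO Product (ID, name) VALUES (1, 'Widget');
INSERT INTO Supplier (ID, name) VALUES (10, 'Contoso');
-- Edge direction matches the MATCH pattern above: Supplier -(hasInventory)-> Product
INSERT INTO hasInventory ($from_id, $to_id)
VALUES ((SELECT $node_id FROM Supplier WHERE ID = 10),
        (SELECT $node_id FROM Product WHERE ID = 1));
-- Traverse the new edge
SELECT Sup.name
FROM Product AS Prod, Supplier AS Sup, hasInventory AS hasIn
WHERE MATCH(Sup-(hasIn)->Prod) AND Prod.name = 'Widget';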
  200. 200. Spatial
  201. 201. Spatial Spatial data represents information about the physical location and shape of geometric objects. These objects can be point locations, or lines, or more complex objects such as countries, roads, or lakes. SQL Server supports two spatial data types: the geometry data type and the geography data type. • The geometry type represents data in a Euclidean (flat) coordinate system. • The geography type represents data in a round-earth coordinate system.
  202. 202. Spatial functionality • Simple and compound spatial data types supported • Import and export spatial data in industry-standard formats (Open Geospatial Consortium WKT and WKB) • Functions to query the properties of, the behaviors of, and the relationships between spatial data instances • Spatial columns can be indexed to improve query performance
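A small sketch of the geography type in practice: computing the distance in meters between two points supplied as WKT (the coordinates are illustrative; SRID 4326 is WGS 84, and WKT points are ordered longitude then latitude):
DECLARE @p1 geography = geography::STGeomFromText('POINT(-122.33 47.61)', 4326);
DECLARE @p2 geography = geography::STGeomFromText('POINT(-122.12 47.67)', 4326);
SELECT @p1.STDistance(@p2) AS DistanceInMeters;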
  203. 203. Spatial enhancements (SQL Server 2017) • The FullGlobe geography instance—FullGlobe is a special type of polygon that covers the entire globe. FullGlobe has an area, but no borders or vertices.
  204. 204. JSON and XML
  205. 205. JSON support • Not a built-in data type—JSON is stored as varchar or nvarchar • Format SQL data or query results as JSON • Convert JSON to SQL data • Query JSON data • Index JSON data
  206. 206. FOR JSON Export data from SQL Server as JSON, or format query results as JSON, by adding the FOR JSON clause to a SELECT statement. • When you use the FOR JSON clause, you can specify the structure of the output explicitly, or let the structure of the SELECT statement determine the output. • When you use PATH mode with the FOR JSON clause, you maintain full control over the format of the JSON output. You can create wrapper objects and nest complex properties. • When you use AUTO mode with the FOR JSON clause, the JSON output is formatted automatically based on the structure of the SELECT statement. Use the FOR JSON clause to delegate the formatting of JSON output from your client applications to SQL Server.
  207. 207. FOR JSON In PATH mode, you use dot syntax—for example, 'Item.Price'—to format nested output. You can also use the ROOT option to specify a named root element, as in the sketch below.
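A minimal sketch of PATH mode with dot syntax and the ROOT option, reusing the hypothetical SalesOrder table from the FOR XML example later in this deck:
SELECT Number AS [Order.Number],
       Date AS [Order.Date],
       Price AS [Item.Price],
       Quantity AS [Item.Quantity]
FROM SalesOrder
FOR JSON PATH, ROOT('Orders');
-- Output is shaped like:
-- {"Orders":[{"Order":{"Number":"SO43659","Date":"2011-05-31T00:00:00"},
--             "Item":{"Price":59.99,"Quantity":1}}, ...]}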
  208. 208. OPENJSON Import JSON data into SQL Server by using the OPENJSON rowset function. • You can also use OPENJSON to convert JSON data to rows and columns • You can call OPENJSON with or without an explicit schema: Use JSON with the default schema. When you use OPENJSON with the default schema, the function returns a table with one row for each property of the JSON object or for each element in the JSON array. Use JSON with an explicit schema. When you use OPENJSON with an explicit schema, the function returns a table with the schema that you define in the WITH clause. In the WITH clause, you specify the output columns, their types, and the paths of the JSON source properties for each output column.
  209. 209. OPENJSON
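A minimal sketch of OPENJSON with an explicit schema, using illustrative data; each JSON property is mapped to a typed output column in the WITH clause:
DECLARE @json nvarchar(max) = N'[
  {"OrderNumber":"SO43659","Customer":"AW29825","Price":59.99},
  {"OrderNumber":"SO43661","Customer":"AW73565","Price":24.99}]';
SELECT *
FROM OPENJSON(@json)
WITH (OrderNumber varchar(20) '$.OrderNumber',
      Customer    varchar(20) '$.Customer',
      Price       money       '$.Price');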
  210. 210. Query JSON data Built-in functions for JSON: ISJSON tests whether a string contains valid JSON SELECT id, json_col FROM tab1 WHERE ISJSON(json_col) > 0 JSON_VALUE extracts a scalar value from a JSON string SET @town = JSON_VALUE(@jsonInfo, '$.info.address.town') JSON_QUERY extracts an object or array from a JSON string SELECT FirstName, LastName, JSON_QUERY(jsonInfo, '$.info.address') AS Address FROM Person.Person ORDER BY LastName JSON_MODIFY updates the value of a property in a JSON string and returns the updated JSON string DECLARE @info NVARCHAR(100) = '{"name":"John","skills":["C#","SQL"]}' SET @info = JSON_MODIFY(@info, '$.name', 'Mike')
  211. 211. XML support • Built-in data type (since SQL Server 2005) • Format SQL data or query results as XML • Convert XML to SQL data • Query XML data • Index XML data
  212. 212. FOR XML Export data from SQL Server as XML, or format query results as XML, by adding the FOR XML clause to a SELECT statement. When you use the FOR XML clause, you can specify the structure of the output explicitly, or let the structure of the SELECT statement determine the output. • The RAW mode generates a single <row> element per row in the rowset that is returned by the SELECT statement. You can generate XML hierarchy by writing nested FOR XML queries. • The AUTO mode generates nesting in the resulting XML by using heuristics based on the way the SELECT statement is specified. You have minimal control over the shape of the XML generated. Nested FOR XML queries can be written to generate XML hierarchy beyond the XML shape that is generated by AUTO mode heuristics. • The EXPLICIT mode allows more control over the shape of the XML. You can mix attributes and elements at will in deciding the shape of the XML. It requires a specific format for the resulting rowset that is generated because of query execution. • The PATH mode, together with the nested FOR XML query capability, provides the flexibility of the EXPLICIT mode in a simpler manner. Use the FOR XML clause to delegate the formatting of XML output from your client applications to SQL Server.
  213. 213. FOR XML In PATH mode, you can use the @ symbol to return columns as attributes. This example also uses the ROOT option to specify a named root element. SELECT Date AS [@OrderDate], Number AS [@OrderNumber], Customer AS AccountNumber, Price AS UnitPrice, Quantity AS UnitQuantity FROM SalesOrder AS Orders FOR XML PATH('Order'), ROOT('Orders') <Orders> <Order OrderDate="2011-05-31T00:00:00" OrderNumber="SO43659" > <AccountNumber>AW29825</AccountNumber> <UnitPrice>59.99</UnitPrice> <UnitQuantity>1</UnitQuantity> </Order> <Order OrderDate="2011-06-01T00:00:00" OrderNumber="SO43661" > <AccountNumber>AW73565</AccountNumber> <UnitPrice>24.99</UnitPrice> <UnitQuantity>3</UnitQuantity> </Order> </Orders>
  214. 214. Query XML data The xml data type supports methods to query XML data using XQuery (based on XPath): • query()—return matching XML nodes as XML • value()—return a matching XML node as a SQL Server data type • exist()—verify whether a matching node exists • nodes()—shred XML into multiple rows • modify()—update or insert matching nodes DECLARE @x xml SET @x = '<ROOT><a>111</a></ROOT>' SELECT @x.query('/ROOT/a') AS Result Result ---------- <a>111</a>
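A short companion sketch for value() and exist(); note that value() requires a singleton path, written as (...)[1], and a target SQL type:
DECLARE @x xml = '<ROOT><a>111</a></ROOT>';
SELECT @x.value('(/ROOT/a)[1]', 'int') AS ValueResult, -- returns 111 as int
       @x.exist('/ROOT/a') AS ExistResult;             -- returns 1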
  215. 215. Temporal tables
  216. 216. Data changes over time • Tracking and analyzing changes is often important Temporal in DB • Automatically tracks history of data changes • Enables easy querying of historical data states Advantages over workarounds • Simplifies app development and maintenance • Efficiently handles complex logic in DB engine Why temporal? Time travel Data audit Slowly changing dimensions Repair record-level corruptions
  217. 217. How does temporal work? • DDL: CREATE temporal TABLE ... PERIOD FOR SYSTEM_TIME..., or ALTER an existing regular table with ADD PERIOD... (a DDL sketch follows below) • DML: no change in programming model—INSERT / BULK INSERT, UPDATE, DELETE, and MERGE work as before • Querying: SELECT * FROM temporal for current data; new insights via FOR SYSTEM_TIME AS OF, FROM..TO, BETWEEN..AND, and CONTAINED IN • Temporal querying is ANSI SQL:2011 compliant
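A minimal DDL sketch of a system-versioned temporal table (the table and column names are illustrative):
CREATE TABLE dbo.Employee (
    EmployeeID int NOT NULL PRIMARY KEY CLUSTERED,
    Salary money NOT NULL,
    ValidFrom datetime2 GENERATED ALWAYS AS ROW START NOT NULL,
    ValidTo   datetime2 GENERATED ALWAYS AS ROW END NOT NULL,
    PERIOD FOR SYSTEM_TIME (ValidFrom, ValidTo)
)
WITH (SYSTEM_VERSIONING = ON (HISTORY_TABLE = dbo.EmployeeHistory));
-- From now on, updates and deletes automatically copy the old row versions
-- to dbo.EmployeeHistory, which the FOR SYSTEM_TIME clauses query transparently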
  218. 218. Provides correct information about stored facts at any point in time, or between two points in time. There are two orthogonal sets of scenarios with regards to temporal data: • System (transaction)-time • Application-time SELECT * FROM Person.BusinessEntityContact FOR SYSTEM_TIME BETWEEN @Start AND @End WHERE ContactTypeID = 17 Temporal database support: BETWEEN
  219. 219. How does system time work? [Diagram: inserts and bulk inserts write to the temporal table (actual data); on update or delete, the old row versions are moved to the history table]
  220. 220. How does system time work? [Diagram: regular queries read current data from the temporal table; temporal queries (time travel, and so on) also include the historical versions from the history table]
  221. 221. Temporal enhancements (SQL Server 2017) • System-versioned temporal tables now support CASCADE DELETE and CASCADE UPDATE • Temporal tables retention policy support added
  222. 222. Upgrading and migrating to SQL Server 2017
  223. 223. Upgrade and migration tools Data Migration Assistant (DMA) • Upgrade from previous version of SQL Server (on-premises or SQL Server 2017 in Azure VM) SQL Server Migration Assistant • Migrate from Oracle, MySQL, SAP ASE, DB2, or Access to SQL Server 2017 (on-premises or SQL Server 2017 in Azure VM) Azure Database Migration Service • Migrate from SQL Server, Oracle, or MySQL to Azure SQL Database or SQL Server 2017 in Azure VM
  224. 224. Upgrading to SQL Server 2017 In-place or side-by-side upgrade path from: • SQL Server 2008 • SQL Server 2008 R2 • SQL Server 2012 • SQL Server 2014 • SQL Server 2016 Side-by-side upgrade path from: • SQL Server 2005 Use Data Migration Assistant to prepare for migration
  225. 225. DMA: Assess and upgrade schema Using Data Migration Assistant against a legacy SQL Server instance, targeting SQL Server 2017: 1. Assess and identify issues 2. Fix issues 3. Upgrade the database
  226. 226. Choosing a migration target “What’s the best path for me?”
  227. 227. Migrating to SQL Server 2017 from other platforms (Oracle, SAP ASE, DB2) 1. Identify apps for migration 2. Use migration tools and partners: SQL Server Migration Assistant and the global partner ecosystem 3. Deploy to production on SQL Server 2017 on Windows or SQL Server 2017 on Linux
  228. 228. Migration Assistant: database and application migration process • Database discovery: architecture requirements (HADR, performance, locale, maintenance, dependencies, and so on) • Migration assessment: complexity, effort, risk • Migration: schema conversion, data migration, embedded SQL statements, ETL and batch, system and DB interfaces • Remediation: database connectivity, user logins and permissions, performance tuning
  229. 229. SQL Server Migration Assistant (SSMA) Automates and simplifies all phases of database migration • Migration Analyzer: assess migration complexity • Schema Converter: convert schema and business logic • Data Migrator: migrate data • Migration Tester: validate converted database code Supports migration from DB2, Oracle, SAP ASE, MySQL, or Access to SQL Server
  230. 230. Using SQL Server Migration Assistant (SSMA) SSMA automates components of database migrations to SQL Server; DB2, Oracle, Sybase, Access, and MySQL analyzers are available • Assess the migration project (SSMA migration analyzer) • Migrate schema and business logic (SSMA schema converter) • Migrate data (SSMA data migrator) • Convert the application • Test, integrate, and deploy
  231. 231. Azure solution paths • Azure SQL Database (PaaS): no VMs, OS, or database software to manage, including upgrades, high availability, and backups • SQL Server in an Azure VM (IaaS): a highly customizable system to address the application's specific performance and availability requirements
  232. 232. Azure migration tools and services span two phases: assess and migrate.
  233. 233. DMA: Assess and migrate schema Using Data Migration Assistant against a legacy SQL Server instance: 1. Assess and identify issues 2. Fix issues 3. Convert and deploy the schema
  234. 234. Azure Database Migration Service Accelerating your journey to the cloud • Streamline database migration to Azure SQL Database (PaaS) • Managed service platform for migrating databases • Migrate SQL Server and third-party databases (for example, Oracle) to Azure SQL Database
  235. 235. Editions, features, and capacity
  236. 236. SQL Server Editions SQL Server Edition Definition Enterprise The premium offering, SQL Server Enterprise Edition delivers comprehensive high-end datacenter capabilities with extremely fast performance, unlimited virtualization, and end-to-end business intelligence—enabling high service levels for mission-critical workloads and end-user access to data insights. Standard SQL Server Standard Edition delivers basic data management and a business intelligence database for departments and small organizations to run their applications. It supports common development tools for on-premises and the cloud—enabling effective database management with minimal IT resources. Web SQL Server Web Edition is a low total-cost-of-ownership option for web hosters and web VAPs to provide scalability, affordability, and manageability capabilities for small to large scale web properties. Developer SQL Server Developer Edition lets developers build any kind of application on top of SQL Server. It includes all the functionality of Enterprise Edition but is licensed for use as a development and test system, not as a production server. SQL Server Developer is an ideal choice for people who build and test applications on SQL Server. Express Express Edition is the entry-level, free database and is ideal for learning and building desktop and small server data-driven applications. It's the best choice for independent software vendors, developers, and hobbyists who build client applications. If you need more advanced database features, SQL Server Express can be seamlessly upgraded to other higher-end versions of SQL Server. SQL Server Express LocalDB is a lightweight version of Express that has all of its programmability features, yet runs in user mode and has a fast, zero-configuration installation and a short list of prerequisites.
  237. 237. Capacity limits by edition (Enterprise / Standard / Web / Express)
• Maximum compute capacity, single instance (SQL Server Database Engine): operating system maximum / lesser of 4 sockets or 24 cores / lesser of 4 sockets or 16 cores / lesser of 1 socket or 4 cores
• Maximum compute capacity, single instance (Analysis Services or Reporting Services): operating system maximum / lesser of 4 sockets or 24 cores / lesser of 4 sockets or 16 cores / lesser of 1 socket or 4 cores
• Maximum memory for buffer pool per Database Engine instance: operating system maximum / 128 GB / 64 GB / 1410 MB
• Maximum memory for columnstore segment cache per Database Engine instance: unlimited / 32 GB / 16 GB / 352 MB
• Maximum memory-optimized data size per database: unlimited / 32 GB / 16 GB / 352 MB
• Maximum relational database size: 524 PB / 524 PB / 524 PB / 10 GB
