AZURE Data Related Services


  1. 1. AZURE DATA RELATED SERVICES Azure Data services overview, scale options
  2. 2. WHAT IS MICROSOFT AZURE? Your app. Your framework. Your platform. All welcome. Microsoft Azure is a rapidly growing collection of integrated cloud services (analytics, computing, database, mobile, networking, storage, and web) for moving faster, achieving more, and saving money.
  3. 3. WHY PUBLIC CLOUD?
     Cost savings:
     • Lower TCO
     • Pay for usage, avoid over-provisioning of capacity
     Scalability:
     • Rapid expansion, local and global
     • Disaster recovery (no need to pay up front for capacity that may never be needed)
     Flexibility:
     • Change hardware configuration on the fly, or with at most a reboot
     • Adapt the platform to the baseline dynamically
     • Easily integrate systems in the cloud
     Training:
     • Set up a lab instantly
     • Try new features and technology
     List of all services: https://azure.microsoft.com/en-us/documentation/
  4. 4. Azure SQL Database
  5. 5. AZURE SQL DATABASE
     Azure SQL Database: managed relational SQL database (PaaS)
     • SQL Server database in the cloud
     • The usual management tools can be used: SSMS, Visual Studio
     • Fully compatible with Azure services (Data, Storage, Web)
     • Easily scalable: single databases and elastic pools
     • SLA of at least 99.99%
     • Learns, adapts, and grows with your application: Database Advisor, automatic tuning (since V12)
     • Threat detection and alerts; auditing
     • Security, encryption, and compliance requirements can all be met
  6. 6. SQL DATABASE OPTIONS AND PERFORMANCE TIERS
     Basic
     • Best suited for a small database, typically supporting a single active operation at a given time. Examples include databases used for development or testing, or small-scale, infrequently used applications.
     Standard
     • The go-to option for most cloud applications, supporting multiple concurrent queries. Examples include workgroup or web applications.
     Premium
     • Designed for high transactional volume, supporting a large number of concurrent users and requiring the highest level of business continuity capabilities. Examples are databases supporting mission-critical applications.
  7. 7. UNDERSTANDING DTU The Database Transaction Unit (DTU) is the unit of measure in SQL Database that represents the relative power of databases based on a real-world measure: the database transaction. The benchmark takes a set of operations typical of an online transaction processing (OLTP) request and measures how many transactions can be completed per second under fully loaded conditions (that is the short version; the details are in the Benchmark overview).
  8. 8. PERFORMANCE BENCHMARKS FOR DTU
     Benchmark overview: https://azure.microsoft.com/en-us/documentation/articles/sql-database-benchmark-overview/
     • Read Lite [35%] - SELECT; in-memory; read-only
     • Read Medium [20%] - SELECT; mostly in-memory; read-only
     • Read Heavy [5%] - SELECT; mostly not in-memory; read-only
     • Update Lite [20%] - UPDATE; in-memory; read-write
     • Update Heavy [3%] - UPDATE; mostly not in-memory; read-write
     • Insert Lite [3%] - INSERT; in-memory; read-write
     • Insert Heavy [2%] - INSERT; mostly not in-memory; read-write
     • Delete [2%] - DELETE; mix of in-memory and not in-memory; read-write
     • CPU Heavy [10%] - SELECT; in-memory; relatively heavy CPU load; read-only
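As a quick sanity check on the transaction mix listed above, the shares can be tabulated and summed. This is only an illustrative sketch: the percentages come from the slide, while the dictionary layout and the helper names (`total_share`, `read_only_share`) are assumptions made for this example.

```python
# Transaction mix from the benchmark slide: name -> (share in percent, is_read_only).
TRANSACTION_MIX = {
    "Read Lite": (35, True),
    "Read Medium": (20, True),
    "Read Heavy": (5, True),
    "Update Lite": (20, False),
    "Update Heavy": (3, False),
    "Insert Lite": (3, False),
    "Insert Heavy": (2, False),
    "Delete": (2, False),
    "CPU Heavy": (10, True),
}


def total_share(mix):
    """Sum of all shares; a complete mix should add up to 100 percent."""
    return sum(share for share, _ in mix.values())


def read_only_share(mix):
    """Combined share of the read-only transaction types."""
    return sum(share for share, read_only in mix.values() if read_only)


print("total:", total_share(TRANSACTION_MIX))
print("read-only:", read_only_share(TRANSACTION_MIX))
```

Under these figures the workload is read-dominated, which is worth keeping in mind when comparing a DTU rating against a write-heavy application.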
  9. 9. PERFORMANCE BENCHMARKS FOR DTU Tiers requirements
  10. 10. SQL DATABASE PERFORMANCE TIERS
  11. 11. SINGLE SQL DATABASE OPTIONS
  12. 12. GEO REPLICATION
      Standard Geo-Replication (will be retired in April 2017)
      • Standard geo-replication creates an offline secondary database in a pre-paired Azure region within the same geographic area that is at least 500 miles away.
      • Secondary standard geo-replication databases are priced at 0.75x of primary database prices.
      Active Geo-Replication
      • Active geo-replication creates up to 4 online (readable) secondaries in any Azure region.
      • Secondary active geo-replication databases are priced at 1x of primary database prices.
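The pricing multipliers above translate into a simple cost estimate. The sketch below assumes a single secondary for standard geo-replication and a caller-supplied primary price; the function name `geo_replicated_cost` and its parameters are hypothetical, and real prices should be taken from the Azure pricing page.

```python
# Multipliers and limits taken from the slide.
STANDARD_SECONDARY_FACTOR = 0.75
ACTIVE_SECONDARY_FACTOR = 1.0
MAX_ACTIVE_SECONDARIES = 4


def geo_replicated_cost(primary_price, mode, secondaries=1):
    """Estimated total price of a primary plus its geo-replicated secondaries.

    primary_price: price of the primary database (any currency, any period).
    mode: "standard" (one offline secondary, assumed) or "active" (1 to 4 secondaries).
    """
    if mode == "standard":
        if secondaries != 1:
            raise ValueError("this sketch assumes exactly one standard secondary")
        return primary_price * (1 + STANDARD_SECONDARY_FACTOR)
    if mode == "active":
        if not 1 <= secondaries <= MAX_ACTIVE_SECONDARIES:
            raise ValueError("active geo-replication allows 1 to 4 secondaries")
        return primary_price * (1 + ACTIVE_SECONDARY_FACTOR * secondaries)
    raise ValueError(f"unknown mode: {mode!r}")


print(geo_replicated_cost(100.0, "standard"))
print(geo_replicated_cost(100.0, "active", secondaries=2))
```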
  13. 13. SQL DATABASE – ELASTIC POOL
      Elastic pool characteristics:
      • It is given a set number of eDTUs, for a set price
      • Within the pool, individual databases are given the flexibility to auto-scale within set parameters
      • Under heavy load a database can consume more eDTUs to meet demand
      • Databases under light loads consume less
      • Databases under no load don’t consume any eDTUs
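The sharing behaviour described above can be pictured with a toy model. This is a sketch under stated assumptions, not how the service allocates resources internally: each database is capped at a per-database maximum, and when the capped demand exceeds the pool budget every database is scaled down proportionally. The proportional rule and the name `allocate_edtus` are assumptions made for illustration.

```python
def allocate_edtus(pool_edtus, per_db_max, demands):
    """Toy allocation of a shared eDTU budget.

    pool_edtus: total eDTUs purchased for the pool.
    per_db_max: upper limit any single database may consume.
    demands: mapping of database name -> eDTUs it would like to use.
    """
    # An idle database (demand 0) consumes nothing; a busy one is capped.
    capped = {name: min(demand, per_db_max) for name, demand in demands.items()}
    total = sum(capped.values())
    if total <= pool_edtus:
        return capped
    # Assumed policy: shrink everyone proportionally to fit the pool.
    scale = pool_edtus / total
    return {name: value * scale for name, value in capped.items()}


print(allocate_edtus(100, 50, {"busy": 80, "light": 10, "idle": 0}))
```

The point of the model is the economics: the pool is sized for the combined peak rather than for the sum of individual peaks.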
  14. 14. SQL DATABASE – ELASTIC POOL
  15. 15. ELASTIC POOL AND ELASTIC DB OPTIONS https://azure.microsoft.com/en-us/documentation/learning-paths/sql-database-elastic-scale/
  16. 16. SQL DATABASE V12
      Increased application compatibility with SQL Server
      A key goal for SQL Database V12 (compatibility level 130) was to improve compatibility with Microsoft SQL Server 2014, and to maintain that compatibility as new versions of SQL Server are released. Among other areas, V12 achieves parity with SQL Server in the important area of programmability. For example:
      • Built-in JSON support
      • Window functions, with OVER
      • XML indexes and selective XML indexes
      • Change tracking
      • SELECT...INTO
      • Full-text search
      • ALTER DATABASE SCOPED CONFIGURATION (Transact-SQL)
      A small set of features is not yet supported in Azure SQL Database.
  17. 17. SCALING WITH AZURE SQL DATABASE
      Sharding
      A technique to distribute large amounts of identically structured data across a number of independent databases. Typical reasons to shard:
      • The total amount of data is too large to fit within the constraints of a single database
      • The transaction throughput of the overall workload exceeds the capabilities of a single database
      • Tenants may require physical isolation from each other, so separate databases are needed for each tenant
      • Different sections of a database may need to reside in different geographies for compliance, performance, or geopolitical reasons
  18. 18. HORIZONTAL AND VERTICAL SCALING
      Scaling options
      • Horizontal scaling ("scaling out"): with sharding, data is partitioned across a collection of identically structured databases. It is managed using the Elastic Database client library.
      • Vertical scaling: accomplished using Azure PowerShell cmdlets to change the service tier, or by placing databases in an elastic pool.
  19. 19. SCALING WITH AZURE SQL DATABASE
      Elastic Database tools
      1. A set of Azure SQL databases is hosted on Azure using a sharding architecture.
      2. The Elastic Database client library is used to manage a shard set.
      3. A subset of the databases is put into an Elastic Database pool.
      4. An Elastic Database job runs T-SQL scripts against all databases.
      5. The Split-merge tool is used to move data from one shard to another.
      6. The Elastic Database query allows you to write a query that spans all databases in the shard set.
      7. Elastic transactions allow you to run transactions that span several databases.
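The idea behind a query that spans all databases in a shard set (item 6 above) is a fan-out followed by a merge. The sketch below illustrates the pattern only: in-memory SQLite databases stand in for the shards, and the `orders` table, `make_shard`, `fan_out_query`, and `total_amount` are hypothetical names, not part of the Elastic Database tooling.

```python
import sqlite3


def make_shard(rows):
    """Create an in-memory SQLite database standing in for one shard."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (customer_id INTEGER, amount REAL)")
    conn.executemany("INSERT INTO orders VALUES (?, ?)", rows)
    return conn


def fan_out_query(shards, sql, params=()):
    """Run the same query on every shard and concatenate the result rows."""
    results = []
    for conn in shards:
        results.extend(conn.execute(sql, params).fetchall())
    return results


def total_amount(shards):
    """Aggregate across shards: per-shard partial sums, merged on the client."""
    partials = fan_out_query(shards, "SELECT COALESCE(SUM(amount), 0) FROM orders")
    return sum(row[0] for row in partials)


shards = [make_shard([(1, 10.0), (2, 5.0)]), make_shard([(3, 2.5)])]
print(total_amount(shards))
```

Because every shard has the same schema, the same statement can be sent to each one unchanged; only the merge step depends on the kind of query.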
  20. 20. SCALING WITH AZURE SQL DATABASE
      Shard map manager
      The shard map manager is a special database that maintains global mapping information about all shards (databases) in a shard set.
      More details: https://azure.microsoft.com/en-us/documentation/articles/sql-database-elastic-scale-shard-map-management/
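To make the mapping idea concrete, here is a minimal in-memory sketch of a range-based shard map: each half-open key range `[low, high)` points to one shard. It is not the Elastic Database client library API; the class `RangeShardMap` and its methods are hypothetical, and a real shard map manager persists this information in a database.

```python
import bisect


class RangeShardMap:
    """Maps half-open integer key ranges [low, high) to shard names."""

    def __init__(self):
        self._lows = []    # sorted lower bounds, parallel to self._ranges
        self._ranges = []  # (low, high, shard) tuples sorted by low

    def add_range(self, low, high, shard):
        if low >= high:
            raise ValueError("low must be smaller than high")
        i = bisect.bisect_left(self._lows, low)
        # Reject ranges that overlap the neighbour on either side.
        if i > 0 and self._ranges[i - 1][1] > low:
            raise ValueError("range overlaps an existing mapping")
        if i < len(self._ranges) and self._ranges[i][0] < high:
            raise ValueError("range overlaps an existing mapping")
        self._lows.insert(i, low)
        self._ranges.insert(i, (low, high, shard))

    def lookup(self, key):
        """Return the shard responsible for key, or raise KeyError."""
        i = bisect.bisect_right(self._lows, key) - 1
        if i >= 0:
            low, high, shard = self._ranges[i]
            if low <= key < high:
                return shard
        raise KeyError(key)


shard_map = RangeShardMap()
shard_map.add_range(0, 100, "shard0")
shard_map.add_range(100, 200, "shard1")
print(shard_map.lookup(42), shard_map.lookup(100))
```

An application would consult such a map first and then open a connection to the returned shard, which is the routing role the shard map manager plays.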
  21. 21. PRICING https://azure.microsoft.com/en-us/pricing/details/sql-database/
  22. 22. Azure SQL Server Stretch Database
  23. 23. AZURE STRETCH DATABASE
      SQL Server Stretch Database: dynamically stretch SQL Server databases to Azure
      • Scale SQL Server 2016 using bottomless cloud storage
      • Make warm and cold data available to users at low cost
      • Access and query stretched data online
      • Move data easily, with no query or application changes required
      • Use with advanced security features like Always Encrypted
      • Reduce maintenance and storage costs for on-premises data
  24. 24. AZURE STRETCH DATABASE https://azure.microsoft.com/en-us/pricing/details/sql-server-stretch-database/ - Pricing
  25. 25. Azure Document DB
  26. 26. AZURE DOCUMENT DB
      DocumentDB: NoSQL database as a service, designed around the JSON and JavaScript programming standards
      In DocumentDB, you can store and query schema-less JSON documents with order-of-millisecond response times at any scale. DocumentDB provides containers for storing data called collections.
      Key features:
      • Schema-free and highly scalable
      • Allows querying with a SQL-like syntax
      • Allows JavaScript programming to execute transactional application logic using JavaScript-based triggers, UDFs, and stored procedures
      • Data is always indexed automatically
      • Easily integrates with Azure HDInsight, Azure Search, and other Azure services
  27. 27. AZURE DOCUMENT DB
      RU
      • Request Units (RU) per second is the unit of throughput measurement.
      • A single request unit represents the processing capacity required to read a single 1 KB document.
      • When you query a collection, Azure returns the request charge in the portal, or through the x-ms-request-charge response header in code, so you can get an idea of the cost of your queries.
      • Many factors are involved in request unit measurement, such as the number of document properties, indexes, document size, and data consistency. The RU cost therefore differs from one application to another.
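Reading the request charge from response headers can be sketched as follows. Only the header name `x-ms-request-charge` comes from the slide; the helper names and the plain-dictionary representation of headers are assumptions, since the actual response object depends on the HTTP client or SDK in use.

```python
REQUEST_CHARGE_HEADER = "x-ms-request-charge"


def request_charge(headers):
    """Return the RU charge reported in a response's headers as a float.

    headers: mapping of header name -> value. Header names are compared
    case-insensitively because HTTP header names are not case-sensitive.
    """
    for name, value in headers.items():
        if name.lower() == REQUEST_CHARGE_HEADER:
            return float(value)
    raise KeyError(REQUEST_CHARGE_HEADER)


def total_charge(responses):
    """Sum the RU charges over several responses (for example, paged results)."""
    return sum(request_charge(headers) for headers in responses)


pages = [{"x-ms-request-charge": "2.85"}, {"X-MS-Request-Charge": "1.15"}]
print(total_charge(pages))
```

Logging this value per query during development gives a rough picture of which operations dominate the provisioned throughput.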
  28. 28. AZURE DOCUMENT DB Simplified structure
  29. 29. AZURE DOCUMENT DB Scaling
  30. 30. Azure Data Warehouse
  31. 31. AZURE DATA WAREHOUSE
      SQL Data Warehouse
      • Petabyte scale with massively parallel processing
      • Independent scaling of compute and storage, in seconds
      • Transact-SQL queries across relational and non-relational data
      • Full enterprise-class SQL Server experience
      • Works seamlessly with Power BI, Machine Learning, HDInsight, and Data Factory
      • Combines the proven SQL Server relational database with Azure cloud scale-out capabilities
      You can increase, decrease, pause, or resume compute in seconds. The MPP architecture spreads data across 60 shared-nothing storage and processing units. The data is stored in Premium locally redundant storage and linked to compute nodes for query execution.
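The effect of spreading data across 60 units can be illustrated with a small sketch. Only the number 60 comes from the slide: the use of SHA-256 as the hash function, the assumption that distributions divide evenly among compute nodes, and the names `distribution_for` and `distributions_per_node` are all assumptions made for this example.

```python
import hashlib

DISTRIBUTIONS = 60  # number of storage and processing units named on the slide


def distribution_for(key):
    """Map a distribution key (string) to one of the 60 units.

    SHA-256 is a stand-in here; the service's real hash function is not
    described in the source.
    """
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % DISTRIBUTIONS


def distributions_per_node(compute_nodes):
    """Units handled by each compute node, assuming an even split."""
    if compute_nodes < 1 or DISTRIBUTIONS % compute_nodes != 0:
        raise ValueError("this sketch assumes the node count divides 60 evenly")
    return DISTRIBUTIONS // compute_nodes


print(distribution_for("customer-42"), distributions_per_node(6))
```

Under these assumptions, changing the number of compute nodes only changes how many units each node serves; the assignment of rows to units stays fixed, which is consistent with scaling compute without moving data.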
  32. 32. AZURE DATA WAREHOUSE
      DWUs
      A Data Warehouse Unit (DWU) is a measure based on three precise metrics that are highly correlated with data warehousing workload performance:
      • Scan/Aggregation: this workload metric takes a standard data warehousing query that scans a large number of rows and then performs a complex aggregation. This is an I/O- and CPU-intensive operation.
      • Load: this metric measures the ability to ingest data into the service. Loads are completed with PolyBase loading a representative dataset from an Azure Storage Blob. This metric is designed to stress the network and CPU aspects of the service.
      • CREATE TABLE AS SELECT (CTAS): CTAS measures the ability to create a copy of a table. This involves reading data from storage, distributing it across the nodes of the appliance, and writing it to storage again. It is a CPU- and network-intensive operation.
      Pricing: https://azure.microsoft.com/en-us/pricing/details/sql-data-warehouse/
  33. 33. AZURE DATA WAREHOUSE
      MPP architecture
      • Grow or shrink storage independent of compute
      • Grow or shrink compute without moving data
      • Pause compute capacity while keeping data intact
      • Resume compute capacity at a moment's notice
      Control node: The Control node manages and optimizes queries. It coordinates all of the data movement and computation required to run parallel queries on your distributed data.
      Compute nodes: The Compute nodes serve as the power behind SQL Data Warehouse. They are SQL Databases which store your data and process your query.
      Storage: Data is stored in Azure Storage Blobs. When Compute nodes interact with data, they write and read directly to and from blob storage. Since Azure storage expands transparently and limitlessly, SQL Data Warehouse can do the same.
      Data Movement Service: Data Movement Service (DMS) is Microsoft technology for moving data between the nodes. DMS gives the Compute nodes access to the data they need for joins and aggregations. DMS is not an Azure service. It is a Windows service that runs alongside SQL Database on all the nodes.
  34. 34. AZURE DATA WAREHOUSE LOAD DATA
      Load options/utilities:
      Load from Azure blob storage
      • PolyBase: loads in parallel using the MPP architecture
      • Azure Data Factory: a pipeline that uses PolyBase to load data from Azure blob storage into SQL Data Warehouse
      Load from SQL Server
      • SSIS: does not perform the load in parallel. Unsupported data types should be converted.
      • AzCopy: moves flat files to Blob Storage (CLI). Consider it if the data size is < 10 TB.
      • bcp: if you have a small amount of data, you can use bcp to load directly into Azure SQL Data Warehouse.
      • Import/Export disk shipping service (recommended for > 10 TB of data)
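The size guidance for moving files towards Blob Storage can be written down as a tiny helper. The 10 TB threshold comes from the slide; treating exactly 10 TB as still suitable for AzCopy is an assumption (the slide leaves that case open), and the function name `suggest_transfer_tool` is hypothetical. bcp is left out because the slide gives no numeric threshold for "a small amount of data".

```python
SIZE_THRESHOLD_TB = 10  # threshold named on the slide


def suggest_transfer_tool(size_tb):
    """Suggest a way to move flat files to Azure based on total size in TB."""
    if size_tb < 0:
        raise ValueError("size must not be negative")
    if size_tb > SIZE_THRESHOLD_TB:
        return "Import/Export"  # disk shipping service
    # Assumption: the boundary case of exactly 10 TB falls on the AzCopy side.
    return "AzCopy"


for size in (0.5, 10, 25):
    print(size, "TB ->", suggest_transfer_tool(size))
```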
  35. 35. Azure Storage
  36. 36. AZURE STORAGE TYPES
      Blob: for users with large amounts of unstructured object data to store in the cloud
      • Good choice for storing documents, media files, backups, etc.
      Table: a key-attribute store, meaning that every value in a table is stored with a typed property name
      • Table storage can be used to store flexible datasets, such as user data for web applications, address books, device information, and any other type of metadata that your service requires
      Queues: provide a reliable messaging solution for asynchronous communication between application components
      • Queue storage also supports managing asynchronous tasks and building process workflows
      Files: cloud-based SMB file shares
      • Applications running in Azure virtual machines or cloud services can mount a file share in the cloud. Data in the share can be accessed via file system I/O APIs.
      • On-premises applications can call the File storage REST API to access data in a file share.
  37. 37. Azure Redis Cache
  38. 38. AZURE REDIS CACHE
      Redis Cache: high-throughput, low-latency data access to build fast and scalable apps
      • Advanced in-memory key-value store
      • Secure, dedicated open source Redis cache, managed by Microsoft
      • Helps your application become more responsive even as user load increases
      Redis is an open source (BSD licensed), in-memory data structure store, used as a database, cache, and message broker. It supports data structures such as strings, hashes, lists, sets, sorted sets with range queries, bitmaps, hyperloglogs, and geospatial indexes with radius queries. Redis has built-in replication, Lua scripting, LRU eviction, transactions, and different levels of on-disk persistence. It provides high availability via Redis Sentinel and automatic partitioning with Redis Cluster.
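LRU eviction, mentioned above, means that when the cache is full the least recently used entry is dropped first. The following is a minimal exact-LRU sketch in plain Python to show the policy; it is not how Redis implements it (Redis uses an approximated, sampling-based LRU), and the class `LRUCache` is a hypothetical name for this example.

```python
from collections import OrderedDict


class LRUCache:
    """Fixed-capacity cache that evicts the least recently used key."""

    def __init__(self, capacity):
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.capacity = capacity
        self._items = OrderedDict()  # oldest entry first, newest last

    def get(self, key, default=None):
        if key not in self._items:
            return default
        self._items.move_to_end(key)  # reading counts as a use
        return self._items[key]

    def set(self, key, value):
        self._items[key] = value
        self._items.move_to_end(key)
        if len(self._items) > self.capacity:
            self._items.popitem(last=False)  # drop the least recently used


cache = LRUCache(2)
cache.set("a", 1)
cache.set("b", 2)
cache.get("a")     # "a" is now more recent than "b"
cache.set("c", 3)  # evicts "b"
print(cache.get("b"), cache.get("a"), cache.get("c"))
```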
  39. 39. Azure Data Factory
  40. 40. AZURE DATA FACTORY
      Azure Data Factory: a cloud-based data integration service that orchestrates and automates the movement and transformation of data.
      Data Factory is priced by the frequency of activities (high or low) and where the activities run (cloud or on-premises). A low-frequency activity occurs once a day or less; a high-frequency activity occurs more than once a day. Charges for copy activities depend on the source of the data and are calculated according to the data movement meters.
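The frequency rule above is easy to encode. This sketch only classifies an activity by how often it runs; the function name `activity_frequency` is hypothetical, and actual rates must come from the pricing page.

```python
def activity_frequency(runs_per_day):
    """Classify an activity as "low" or "high" frequency.

    Rule from the slide: once a day or less is low frequency,
    more than once a day is high frequency.
    """
    if runs_per_day < 0:
        raise ValueError("runs_per_day must not be negative")
    return "high" if runs_per_day > 1 else "low"


for runs in (0.5, 1, 24):
    print(runs, "runs/day ->", activity_frequency(runs))
```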
  41. 41. AZURE DATA FACTORY Pricing https://azure.microsoft.com/en-us/pricing/details/data-factory/
  42. 42. Azure Data Lake Store
  43. 43. AZURE DATA LAKE STORE
      Azure Data Lake Store is designed to be an enterprise-wide, hyperscale repository for big data analytic workloads. In the data lake, you can easily capture data of any size, type, and speed in a single place for the purposes of operational and exploratory analytics.
      • Built for Hadoop: a Hadoop Distributed File System for the cloud
      • Unlimited storage: no fixed limits on file size, account size, or the number of files
      • Performance tuned for big data: optimized for massive throughput to query and analyze any amount of data
      • Enterprise-grade security: Azure Active Directory authentication and role-based access control
      • All data: store data in its native format without prior transformation
  44. 44. AZURE DATA LAKE STORE
  45. 45. AZURE DATA LAKE STORE
  46. 46. AZURE DATA LAKE STORE
  47. 47. AZURE DATA LAKE ANALYTICS Azure U-SQL Job
  48. 48. AZURE DATA LAKE ANALYTICS
  49. 49. AZURE DATA LAKE ANALYTICS
  50. 50. AZURE DATA LAKE ANALYTICS Pricing https://azure.microsoft.com/en-us/pricing/details/data-lake-analytics/
  51. 51. PRICING CALCULATOR
      Discount areas:
      • Startup offers
      • Visual Studio licensed developers
      • Prepaid 12-month subscription
      https://azure.microsoft.com/en-us/pricing/calculator/
  52. 52. AZURE SQL GOVERNMENT
      Azure SQL Government
      • The cloud platform designed to meet US government demands
      • Physically and logically network-isolated instance of Azure
      • Dedicated to US government, with all data, applications, and hardware residing in the continental United States
      • Broad range of compliance certifications critical to US government
      • US datacenters located more than 500 miles apart, providing true geographic redundancy
      • Support for hybrid scenarios, as well as a vast array of services, programming languages, and tools
      • Part of the complete Microsoft Cloud for Government solution
      Compliant with
      • FedRAMP certification
      • DISA certification
      • Support to enable IRS 1075 compliance
      • Ability to issue HIPAA Business Associate Agreements
      • Criminal Justice Information Services (CJIS)-capable platform
  53. 53. The end. Thanks for listening!
  54. 54. RESOURCES USED
      • https://www.youtube.com/watch?v=AicqMIPpZKc - Hybrid Cloud Solutions with Microsoft Azure - For Architects (2015)
      • https://azure.microsoft.com/en-us/documentation/services/sql-database/ - Azure SQL Database documentation
      • https://www.youtube.com/watch?v=mi-lilKoYok - Elastic Scale Azure Databases (2016)
      • https://www.youtube.com/watch?v=N2N5TbWmCcU - Azure SQL Database for Business-Critical Cloud Applications (2016)
      • https://channel9.msdn.com/events/Ignite/Microsoft-Ignite-New-Zealand-2015/M378 - Elastic for SQL: shards, pools, stretch
      • https://azure.microsoft.com/en-us/documentation/videos/azurecon-2015-overview-of-azure-sql-data-warehouse/ - Overview of Azure SQL Data Warehouse
      • https://www.youtube.com/watch?v=mSDz6O0bhyc - Azure Data Lake Deep Dive
      • Azure documentation and videos: https://azure.microsoft.com/en-us/
