Design Considerations For Storing With Windows Azure
1. Design considerations for storing data in the cloud with Windows Azure Eric Nelson Microsoft UK Blog: http://geekswithblogs.net/iupdateable Twitter: http://twitter.com/ericnel and http://twitter.com/ukmsdn Podcast: http://bit.ly/msdnpodcast Newsletter: http://msdn.microsoft.com/en-gb/flash Slides, links and background “diary” posts can be found on my blog
9. Windows Azure PLATFORM 101. Just in case you had something better to do over the last 18 months
10. 3 Important Services: Windows Azure (Compute and Storage); SQL Azure (Storage); .NET Services (Connecting). 3 Critical Concepts: Computation (Web and Worker); Storage (Table, Blob, Relational); Messaging (Queues, Service Bus)
11. A simple site. [Diagram: a Browser sends a Request to the Web Tier, which uses the B/L Tier and the Database, and gets a Response.] “Wow! What a great site!”
12. Under load. [Diagram: many Browsers hit a single Web Tier, B/L Tier and Database.] “Server Busy”
13. Under load. [Diagram: many Browsers hit a single Web Tier, B/L Tier and Database.] “Timeout”
14. Solve using on-premise. [Diagram: Browsers reach several Web Tier instances through an NLB, which reach several B/L Tier instances through a second NLB; the Database is split into partitions p1, p2, p3.]
15. However… [Diagram: the same scaled-out tiers, NLBs and partitioned Database, now serving a single Browser.] “Not so great now…” “That took a lot of work - and money!” “Hmmm... Most of this stuff is sitting idle...”
16. Solve using the Cloud aka Windows Azure Platform. [Diagram: Browsers reach several Web Roles through an NLB, which reach several Worker Roles through a second NLB; AzureStorage is split into partitions p1, p2, p3.] Callouts: “You don’t see this bit” (three times), “or… Maybe you do”
17. Solve using the Cloud aka Windows Azure Platform. [Diagram: as on the previous slide, with SQLAzure added alongside AzureStorage.] Callouts: “You don’t see this bit” (three times), “Ok, you definitely do”
22. Blobs. Blobs are stored in Containers: 1 or more Containers per account; scoping is at container level (…/Container/blobpath). Blobs: capacity 50GB in CTP; metadata, accessed independently; private or public container access.
23. Put a Blob. [Diagram: the Client calls the REST API of Azure Blob Storage, which holds the Blob in a Container.] PutBlob: PUT http://account.blob.core.windows.net/containername/blobname
24. Get a Blob. [Diagram: the Client calls the REST API of Azure Blob Storage, which holds the Blob in a Container.] GetBlob: GET http://account.blob.core.windows.net/containername/blobname
25. Get part of a Blob. [Diagram: the Client calls the REST API of Azure Blob Storage, which holds the Blob in a Container.] GetBlob: GET http://account.blob.core.windows.net/containername/blobname with header Range: bytes=329300-730000
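A minimal sketch of the ranged GetBlob call from this slide, using only the Python standard library. It builds the request without sending it, and assumes a public container; a private container would also need a signed Authorization header, which is not shown. The helper name `build_range_request` is made up for illustration.

```python
import urllib.request


def build_range_request(url, first_byte, last_byte):
    """Build (but do not send) a GET request for an inclusive byte range."""
    request = urllib.request.Request(url, method="GET")
    # HTTP range syntax has no spaces around the hyphen.
    request.add_header("Range", f"bytes={first_byte}-{last_byte}")
    return request


# Placeholder account, container and blob names taken from the slide.
req = build_range_request(
    "http://account.blob.core.windows.net/containername/blobname",
    329300,
    730000,
)
print(req.get_method(), req.full_url)
print("Range:", req.get_header("Range"))
```

Passing `req` to `urllib.request.urlopen` would perform the call; a server that honours the range answers with only the requested bytes.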
26. Put a LARGE Blob. [Diagram: the Client calls the REST API of Azure Blob Storage for http://account.blob.core.windows.net/containername/blobname] Upload each block with PutBlock, e.g. PutBlock(blobname, blockid1, data) and PutBlock(blobname, blockid7, data), then commit with PutBlockList(blobname, blockid1, …, blockidN)
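The PutBlock / PutBlockList flow can be sketched as follows. This is an illustration under stated assumptions, not the storage client: `put_block` and `put_block_list` are hypothetical callables standing in for the two REST operations, the 4 MB block size comes from the editor's notes, and the fixed-width, base64-encoded block ids are one possible naming scheme.

```python
import base64

BLOCK_SIZE = 4 * 1024 * 1024  # 4 MB per block, per the editor's notes


def split_into_blocks(data, block_size=BLOCK_SIZE):
    """Return (block_id, chunk) pairs that cover data in order."""
    blocks = []
    for index, offset in enumerate(range(0, len(data), block_size)):
        # Fixed-width ids so that every id has the same length.
        raw_id = f"block-{index:06d}".encode("ascii")
        block_id = base64.b64encode(raw_id).decode("ascii")
        blocks.append((block_id, data[offset:offset + block_size]))
    return blocks


def upload_large_blob(data, put_block, put_block_list, block_size=BLOCK_SIZE):
    """Upload every block, then commit the ordered list of block ids."""
    blocks = split_into_blocks(data, block_size)
    for block_id, chunk in blocks:
        put_block(block_id, chunk)  # hypothetical stand-in for PutBlock
    put_block_list([block_id for block_id, _ in blocks])  # stand-in for PutBlockList
    return len(blocks)


# In-memory fakes so the sketch runs without a storage account.
staged = {}
committed = []
block_count = upload_large_blob(
    b"0123456789",
    put_block=lambda block_id, chunk: staged.__setitem__(block_id, chunk),
    put_block_list=committed.extend,
    block_size=4,  # tiny block size, only for the demo
)
reassembled = b"".join(staged[block_id] for block_id in committed)
print(block_count, reassembled)
```

Because blocks are only staged until the list is committed, a failed block can be retried on its own without restarting the whole upload.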
28. Introduction to Tables. Provides structured storage: massively scalable tables (TBs of data); self scaling; highly available; durable. Familiar and easy-to-use API, layered: .NET classes and LINQ; ADO.NET Data Services (.NET 3.5 SP1); REST, with any platform or language.
29. Not a Relational Database: no join, no group by, no order by, “no schema”.
30. Data Model. Table: a Table is a set of Entities (rows); an Entity is a set of Properties (columns). Entity: two “key” properties form the unique ID. PartitionKey: enables scale. RowKey: unique ID within a partition.
31. Key Example – Blog Posts. [Table residue: blog posts split across Partition 1 and Partition 2.] Getting all of dunnry’s blog posts is fast: single partition. Getting all posts after 2008-03-27 is slow: traverse all partitions.
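The difference between the two queries can be sketched with an in-memory stand-in for a table. The sketch assumes the PartitionKey is the author name and the RowKey is the post date; the second author and all post titles are invented for illustration.

```python
from collections import defaultdict

# PartitionKey -> {RowKey: entity}; a toy model, not the storage service.
table = defaultdict(dict)


def insert(author, posted, title):
    table[author][posted] = {"author": author, "posted": posted, "title": title}


def posts_by_author(author):
    """Touches exactly one partition."""
    return list(table.get(author, {}).values()), 1


def posts_after(date):
    """Has to walk every partition, because the date is not the PartitionKey."""
    hits = [
        entity
        for partition in table.values()
        for entity in partition.values()
        if entity["posted"] > date  # ISO dates compare correctly as strings
    ]
    return hits, len(table)


insert("dunnry", "2008-03-20", "First post")      # hypothetical titles
insert("dunnry", "2008-04-02", "Second post")
insert("someone_else", "2008-03-25", "Hello")     # hypothetical author
insert("someone_else", "2008-04-10", "Hello again")

dunnry_posts, partitions_for_author = posts_by_author("dunnry")
recent_posts, partitions_for_date = posts_after("2008-03-27")
print(len(dunnry_posts), partitions_for_author)
print(len(recent_posts), partitions_for_date)
```

With two authors the scan is cheap; the cost grows with the number of partitions, which is the point the slide makes.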
32. Query a Table. [Diagram: a Worker Role queries Azure Table Storage at http://account.table.core.windows.net] REST: GET http://account.table.core.windows.net/Customer?$filter=PartitionKey%20eq%20value LINQ: var customers = from o in context.CreateQuery<Customer>("Customer") where o.PartitionKey == value select o;
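As a rough sketch, the REST form of this query can be assembled with the standard library. The helper name `table_query_url` is made up, the string value is quoted in the OData style (single quotes, doubled when they appear inside the value), and authentication headers are left out.

```python
from urllib.parse import quote


def table_query_url(account, table_name, partition_key):
    """Build the query URL for all entities in one partition."""
    escaped = partition_key.replace("'", "''")  # OData-style quote escaping
    filter_expr = f"PartitionKey eq '{escaped}'"
    # quote() percent-encodes the spaces and quotes in the filter expression.
    return (
        f"http://{account}.table.core.windows.net/{table_name}"
        f"?$filter={quote(filter_expr)}"
    )


url = table_query_url("account", "Customer", "dunnry")
print(url)
```

Filtering on PartitionKey keeps the query inside a single partition, which matches the fast case on the previous slide.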
33. Choosing a Partition Key. Tradeoff between locality and scalability. Considerations: entity group transactions; query efficiency; scalability; flexible partitioning.
34. A Method of Choosing Keys. Pick potential keys (common query filters). Order keys by importance. If needed, include an additional unique key. Use the two most important keys as PK and RK. Consider concatenating to form keys.
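One way to read the method above as code: rank the candidate fields, use the first as the PartitionKey and concatenate the rest into the RowKey. The function `make_keys`, the separator and the sample order entity are all illustrative assumptions.

```python
def make_keys(entity, ranked_fields, separator="_"):
    """ranked_fields lists field names from most to least important."""
    partition_key = str(entity[ranked_fields[0]])
    # Remaining fields are concatenated so the RowKey stays unique.
    row_key = separator.join(str(entity[name]) for name in ranked_fields[1:])
    return partition_key, row_key


order = {"customer_id": "C042", "order_date": "2009-10-01", "order_id": "000017"}
pk, rk = make_keys(order, ["customer_id", "order_date", "order_id"])
print(pk, rk)
```

Here `order_id` plays the role of the additional unique key: two orders placed by the same customer on the same day still get distinct RowKeys.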
35. Non-key queries are scans. Improve performance by scoping: usually by partition key, but what about by table? 3 tables: top 1,000 popular items; top 10,000 popular items; everything. Now arbitrary “top 1,000” queries are fast. Better locality than clever partition keys. Write many is one approach.
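The "write many" idea can be sketched as a write path that copies each item into every table whose cut-off it meets. The table names and the `write_many` helper are hypothetical, and the sketch ignores what happens when an item's rank later drops out of a tier.

```python
# (table name, highest rank admitted); None means no limit.
TIERS = [("Top1000", 1_000), ("Top10000", 10_000), ("Everything", None)]


def write_many(tables, item, rank):
    """Store item in each tier it qualifies for and report which ones."""
    written = []
    for name, limit in TIERS:
        if limit is None or rank <= limit:
            tables.setdefault(name, {})[item["id"]] = item
            written.append(name)
    return written


tables = {}
hot = write_many(tables, {"id": "a"}, rank=12)
warm = write_many(tables, {"id": "b"}, rank=5_000)
cold = write_many(tables, {"id": "c"}, rank=250_000)
print(hot, warm, cold)
```

Writes cost more, but a query scoped to the smallest table scans at most 1,000 entities regardless of how large "Everything" grows.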
37. Lessons Learned: Azure Storage. Azure tables are *not* a relational database; requires a mind shift. Azure tables scale. Three 9s availability. Azure tables support exactly one key: PartitionKey + RowKey. Case matters. No foreign keys. No referential integrity. No stored procedures.
38. Lessons Learned: Azure Storage. Azure Storage Client Library: no longer just a “sample”. Azure storage is available via REST: not limited to Azure hosted apps; not limited to Microsoft platform or tools; getting the signature correct is the hard part.
39. Lessons Learned: Azure Storage - RESTful. REST is *not* TDS. Be prepared to parse: LINQ and XML classes help; sometimes, string parsing is the best choice. Azure storage names are picky. So are Azure key values: it’s possible to create an entity in a table and not be able to update or delete it.
40. Lessons Learned: Azure Storage - Roundtrips are expensive. Often better to pull back more than you need vs. multiple roundtrips. LINQ on results in memory is fast & flexible; foreach works well too. Sort and cache tables on the web tier.
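A small sketch of "pull back more than you need": one roundtrip fetches a whole partition, and every later view is computed in memory. `FakeOrdersTable` is an invented stand-in that only counts roundtrips; the rows are sample data.

```python
class FakeOrdersTable:
    """Stand-in for a remote table that counts how often it is queried."""

    def __init__(self, rows):
        self._rows = rows
        self.roundtrips = 0

    def query_partition(self, partition_key):
        self.roundtrips += 1
        return [row for row in self._rows if row["PartitionKey"] == partition_key]


orders_table = FakeOrdersTable([
    {"PartitionKey": "C042", "RowKey": "0001", "total": 40, "status": "open"},
    {"PartitionKey": "C042", "RowKey": "0002", "total": 15, "status": "shipped"},
    {"PartitionKey": "C042", "RowKey": "0003", "total": 90, "status": "open"},
    {"PartitionKey": "C077", "RowKey": "0001", "total": 5, "status": "open"},
])

# One roundtrip, then filter, sort and aggregate locally.
cached = orders_table.query_partition("C042")
open_orders = [order for order in cached if order["status"] == "open"]
by_total = sorted(cached, key=lambda order: order["total"], reverse=True)
grand_total = sum(order["total"] for order in cached)
print(orders_table.roundtrips, len(open_orders), by_total[0]["RowKey"], grand_total)
```

Issuing three separate filtered queries would have produced the same three views at the cost of three roundtrips.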
41. Lessons Learned: Azure Storage - Entity Group Transactions. Different Entity types in the same table. E.g. PK = CustomerId: Customer, Order and OrderDetails in the same table.
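To make the example concrete, here is a sketch of three entity types sharing one table and one PartitionKey, which is what allows them to be changed together in an entity group transaction. The RowKey scheme, the `Kind` property and the `same_entity_group` check are illustrative assumptions, not part of the storage API.

```python
def same_entity_group(entities):
    """True when every entity shares a single PartitionKey."""
    return len({entity["PartitionKey"] for entity in entities}) == 1


# PK = CustomerId for all three entity types, as on the slide.
customer_rows = [
    {"PartitionKey": "C042", "RowKey": "Customer", "Kind": "Customer", "Name": "Contoso"},
    {"PartitionKey": "C042", "RowKey": "Order_0001", "Kind": "Order", "Total": 40},
    {"PartitionKey": "C042", "RowKey": "OrderDetails_0001_01", "Kind": "OrderDetails",
     "Sku": "widget", "Quantity": 4},
]

other_customer_row = {"PartitionKey": "C077", "RowKey": "Customer", "Kind": "Customer",
                      "Name": "Fabrikam"}

batch_ok = same_entity_group(customer_rows)
batch_mixed = same_entity_group(customer_rows + [other_customer_row])
print(batch_ok, batch_mixed)
```

Since tables have no fixed schema, the differing property sets of Customer, Order and OrderDetails can sit side by side in the same table.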
52. Lessons Learned: SQL Azure. From the database “down” it’s just SQL Server. Well, almost … Many tools don’t work today. System catalog is different. Above the database is taken care of for you. You can’t really change anything.
53. Lessons Learned: SQL Azure. Tooling: SSMS partially works (“good enough”); cannot create a connection using the Visual Studio designer; other tools may work better; no BCP (currently). DDL: must be a clustered index on every table; no physical file placement; no indexed views; no “not for replication” constraint allowed; no extended properties; some index options missing (e.g. allow_row_locks, sort_in_tempdb ..); no set ansi_nulls on.
54. Lessons Learned: SQL Azure. Types: no spatial or hierarchy id; no Text/images support, use nvarchar(max); XML datatype and schema allowed but no XML index or schema collection. Security: no integrated security.
55. Lessons Learned: SQL Azure. Development: no CLR; local temp tables are allowed; global temp tables are not allowed; cannot alter database inside a connection; no UDDTs; no ROWGUIDCOL column property.
56. Lessons Learned: SQL Azure vs Windows Azure Tables. SQL Server is very familiar; SQL Azure *is* SQL Server in the cloud. Windows Azure Storage is…very different. Make the right choice: understand Azure storage; understand SQL Azure; understand they are totally different. You can use both.
57. Lessons Learned: SQL Azure vs Windows Azure Tables. SQL Azure is not always the best storage option. SQL Azure costs more but delivers a *lot* more functionality. SQL Azure is more limited on scale.
58. Lessons Learned: SQL Azure and Sharding. Can be done. Many 10GB databases. Not fun.
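A minimal sketch of what sharding across many small databases involves on the application side: a stable hash of a key picks the database. The database names, the shard count and the choice of customer id as the sharding key are all assumptions for illustration.

```python
import hashlib

# Hypothetical database names, one per shard.
SHARD_DATABASES = [f"orders_shard_{index:02d}" for index in range(8)]


def shard_for(customer_id, shard_count=len(SHARD_DATABASES)):
    """Map a customer id to a shard index with a process-independent hash."""
    # Python's built-in hash() is randomised per process, so use hashlib.
    digest = hashlib.sha256(customer_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % shard_count


def database_for(customer_id):
    return SHARD_DATABASES[shard_for(customer_id)]


first_lookup = database_for("C042")
second_lookup = database_for("C042")
print(first_lookup == second_lookup, first_lookup in SHARD_DATABASES)
```

The "not fun" part is everything this sketch leaves out: cross-shard queries, moving customers when the shard count changes, and keeping schemas in step across databases.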
60. Queues. Simple asynchronous dispatch queue. Create and delete queues. Message: retrieved at least once; max size 8kb. Operations: Enqueue, Dequeue, RemoveMessage.
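"Retrieved at least once" can be illustrated with a toy in-memory queue: a dequeued message is hidden for a while rather than deleted, and it reappears unless the consumer removes it. `SketchQueue`, its 30-second timeout and the explicit `now` arguments are assumptions made to keep the sketch deterministic; it does not call the real service.

```python
import itertools

MAX_MESSAGE_BYTES = 8 * 1024  # 8kb limit from the slide


class SketchQueue:
    def __init__(self, visibility_timeout=30.0):
        self._messages = {}  # message id -> [body, time it becomes visible]
        self._ids = itertools.count(1)
        self._timeout = visibility_timeout

    def enqueue(self, body, now=0.0):
        if len(body.encode("utf-8")) > MAX_MESSAGE_BYTES:
            raise ValueError("message larger than 8kb")
        message_id = next(self._ids)
        self._messages[message_id] = [body, now]
        return message_id

    def dequeue(self, now):
        """Return the first visible message and hide it; None if none is visible."""
        for message_id, record in self._messages.items():
            if record[1] <= now:
                record[1] = now + self._timeout
                return message_id, record[0]
        return None

    def remove_message(self, message_id):
        self._messages.pop(message_id, None)


queue = SketchQueue()
queue.enqueue("resize image 17", now=0.0)
first = queue.dequeue(now=0.0)      # a worker takes the message
hidden = queue.dequeue(now=10.0)    # still invisible to other workers
again = queue.dequeue(now=31.0)     # worker never removed it, so it comes back
queue.remove_message(again[0])      # processing finished this time
empty = queue.dequeue(now=100.0)
print(first, hidden, again, empty)
```

Because the same message can be delivered twice, consumers are safest when processing a message a second time has no extra effect.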
61. Using the Cloud for Communications. [Diagram: a Client talks over REST to an Azure Queue at http://app.queue.core.windows.net/]
62. Using the Cloud for Communications. [Diagram: Clients at Company 1 and Company 2 both talk over REST to the Azure Queue at http://app.queue.core.windows.net/]
63. Using the Cloud for Communications. [Diagram: as on the previous slide, with one link marked “x”.]
64. Using the Cloud for Communications. [Diagram: Clients at Company 1 and Company 2 talk over REST to the Azure Queue at http://app.queue.core.windows.net/, with a Web Role added.]
66. Windows Azure Platform Benefits. Windows Azure: High Level of Abstraction (Hardware, Server OS, Network Infrastructure, Web Server); Availability (Automated Service Management); Scalability (Instance & Partitions); Developer Experience (Familiar Developer Tools). SQL Azure: Higher Level of Abstraction (Hardware, Server OS, Network Infrastructure, Database Server); Availability (Automated Database Management & Replication); Scalability (Databases Partitioning); Developer Experience (Familiar SQL Environment).
67. Resources Slides, links and more http://geekswithblogs.net/iupdateable Azure Training Kit (August update) www.azure.com Sign up, links to resources etc http://www.azureadvantage.co.uk/ Rapid provisioning of Windows Azure
Editor's notes
name/value pairs (8kb total)
PutBlob = 64 MB max. Metadata = 8 KB per blob.
PutBlock = 4 MB max per block, to a maximum of 50 GB. BlockId = 64 bytes.
Partition Key: how data is partitioned. Row Key: unique in partition, defines sort. Goals: keep partitions small (increased scalability); specify partition key in common queries; query/sort on row key.
Each Table: PartitionKey (e.g. DocumentName) to ensure scalability; RowKey (e.g. version number); [fields] for data.
64kb per field
Use XML Serialization to write the results to local storage. It’s generally faster to hydrate from local storage. Not as fast as caching in memory.