Microsoft and
Openness
NoSQL on Azure

Heriyadi Janwar
Platform Lead
+ Linux


“Microsoft is playing quite nicely with Linux and other open source tools.”
-Robert McMillan, Wired Enterprise
+ Apache Hadoop
+ Java
+ PHP
+ Firefox
+ Drupal
+ Node.js
+ SAMBA


“A few years back, a patch submission from coders at Microsoft would have been amazing to the point of unthinkable, but the battles are mostly over and times have changed.”
-Chris Hertel, Samba team
Online Business Application

Attract Individual Consumers:
- Provide interesting service
- Provide mobility
- Provide social

Monetize Individual:
- Upsell service
     - VIP
     - Speed
     - Extra Capabilities

Monetize the Social:
- Improve individual experience
- Re-sell Aggregate Data (e.g., Advertisers)
Social Networking: the Business Problem
• Hundreds of millions of users

• Terabytes to petabytes of data

• Required (eventual) data
 consistency across users
Solution
• Shard/Partition user data across hundreds to
  thousands of SQL Databases
• Propagate data changes from one DB to other DBs
  using reliable, async Message Service




• Provide a caching layer for performance
• And also used for
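
The three parts of this solution can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the architecture of any customer named in the deck: `NUM_SHARDS`, `shard_for`, `ShardedStore`, and the in-memory dictionaries and queue are hypothetical stand-ins for the SQL databases, the reliable async message service, and the caching layer.

```python
import hashlib
from collections import deque

NUM_SHARDS = 4  # hypothetical; the slide talks about hundreds to thousands


def shard_for(user_id: str) -> int:
    """Map a user id to a shard with a stable hash (Python's hash() is salted per process)."""
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % NUM_SHARDS


class ShardedStore:
    def __init__(self):
        self.shards = [dict() for _ in range(NUM_SHARDS)]  # stand-ins for SQL databases
        self.queue = deque()  # stand-in for a reliable async message service
        self.cache = {}       # stand-in for the caching layer

    def post_status(self, author, followers, text):
        # The author's own shard is updated synchronously.
        statuses = self.shards[shard_for(author)].setdefault(("status", author), [])
        statuses.append(text)
        # Changes for other users (possibly on other shards) are only enqueued.
        for follower in followers:
            self.queue.append((follower, author, text))

    def drain(self):
        """Deliver queued changes; until this runs, follower feeds are stale."""
        delivered = 0
        while self.queue:
            follower, author, text = self.queue.popleft()
            feed = self.shards[shard_for(follower)].setdefault(("feed", follower), [])
            feed.append((author, text))
            self.cache.pop(("feed", follower), None)  # invalidate the cached feed
            delivered += 1
        return delivered

    def read_feed(self, user_id):
        key = ("feed", user_id)
        if key not in self.cache:  # cache miss: read from the owning shard
            self.cache[key] = list(self.shards[shard_for(user_id)].get(key, []))
        return self.cache[key]
```

A follower's feed is empty until `drain()` runs and then shows the new status, which is the "eventual" consistency the business problem asks for.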
Many LARGE SCALE customers using similar patterns

• Patterns
  • Sharding and reliable messaging
  • Sharding and fan-out query layer
  • Caching layer

• Customer Examples
  • Social Networking: Facebook, MySpace, etc.
  • Online electronic stores (cannot give names)
  • Travel reservation systems (e.g. Choice International)
  • MSN Casual Gaming
  • etc.
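
As a rough illustration of the "sharding and fan-out query layer" pattern, the sketch below sends the same filter to every shard in parallel and merges the partial results. The shard contents, `query_shard`, and `fan_out` are hypothetical names made up for this example; a real query layer would issue SQL against separate databases rather than filter in-memory lists.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical shard contents; each inner list stands in for one database.
SHARDS = [
    [{"user": "alice", "score": 70}, {"user": "dave", "score": 15}],
    [{"user": "bob", "score": 90}],
    [{"user": "carol", "score": 40}, {"user": "erin", "score": 85}],
]


def query_shard(rows, min_score):
    """Run the filter locally on one shard."""
    return [row for row in rows if row["score"] >= min_score]


def fan_out(min_score, limit):
    """Send the same query to every shard, then merge and rank the partial results."""
    with ThreadPoolExecutor(max_workers=len(SHARDS)) as pool:
        partials = pool.map(lambda rows: query_shard(rows, min_score), SHARDS)
        merged = [row for part in partials for row in part]
    merged.sort(key=lambda row: row["score"], reverse=True)
    return merged[:limit]
```

The merge step is where cross-shard ordering and limits are applied, because no single shard can see the global ranking.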
• Require high availability
• Be able to scale out:




• Be able to quickly grow and change:



Move better support for these patterns into the Data Platform!
• NoSQL = operational and developer agility at low CapEx and OpEx!

• Low Cost



• Processing Paradigms




• Data Model Paradigms




• Ranges from devices, through OLTP Web 2.0 applications, to Big Data analytics
Data Model                  Example Stores (apologies to the ones I did not list)

Simple Key-Value Pairs      Memcached, Redis, Dynamo, Voldemort, LevelDB, Azure Caching
Wide Sparse Column Sets     Hypertable, Bigtable, Cassandra, HBase, Hyperbase, Amazon DynamoDB, Windows Azure Tables, SQL Server/Azure sparse columns
BLOBs                       Amazon S3, Oracle Berkeley NoSQL, Windows Azure Blob Store, SQL Server RBS/FileTable
JSON Documents              MongoDB, Couchbase, Riak, RavenDB
Graph                       Neo4j, GraphDB, HyperGraphDB, Stig, Intellidimension
Objects and XML Documents   Versant, Oracle Berkeley NoSQL, MarkLogic, eXist-db, EMC HiveDB, SQL Server/Azure, Oracle, IBM DB2
Extended Relational         Oracle, EMC SQLFire, IBM DB2, MySQL, Postgres, SQL Server/Azure
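
To make the data model rows above more concrete, here is a small sketch of how one user record might be shaped under four of them. The record, key names, and the `friends_of` helper are invented for illustration and do not reflect the storage format of any product in the table.

```python
import json

# Hypothetical user record used for every shape below.
profile = {"id": 42, "name": "Ada", "city": "Jakarta", "friends": [7, 9]}

# Simple key-value pair: the store sees one opaque value per key.
key_value = {"user:42": json.dumps(profile)}

# Wide sparse column set: each row stores only the columns it actually has.
wide_column = {
    "user:42": {"name": "Ada", "city": "Jakarta", "friend:7": "", "friend:9": ""},
    "user:7": {"name": "Bo"},  # no "city" column is stored for this row
}

# JSON document: nested structure whose fields the store can index and query.
document = profile

# Graph: nodes plus labeled edges; relationships are first-class data.
nodes = {42: {"name": "Ada"}, 7: {"name": "Bo"}, 9: {"name": "Cy"}}
edges = [(42, "FRIEND", 7), (42, "FRIEND", 9)]


def friends_of(node_id):
    """Follow FRIEND edges that start at node_id."""
    return [dst for src, label, dst in edges if src == node_id and label == "FRIEND"]
```

The same information is present in each shape; what changes is how much of its structure the store can see and query.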
• You want:



• You can only get 2 of 3 (CAP Theorem)
• In Brave New World:
•   Performance and Elastic Scale on Demand
•   Automate management lifecycle (or fail)
•   Simple deployment lifecycle
•   No DB or OS Admin telling me what to do
•   Code First and revise quickly
•   Application-model first (before database)
•   Flexible open data models
•   You don’t know exactly what you are looking for
•   Lower Pain of adoption and maintenance
•   No DB or OS Admin telling me what to do
• Low CapEx, Low OpEx
• Built-in tunable High-Availability
• Data scale-out (Sharding)
• Processing scale-out (Map-Reduce, Fan-Out, tunable consistency)
• Flexible Data Models



• Integrate with BigData Analytics (e.g., Hadoop)


Many Relational Database Systems are incorporating these learnings!
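
The slide does not say how "tunable consistency" is implemented. One common approach, swapped in here purely as an illustration, is the quorum rule: with N replicas, W write acknowledgements, and R replicas consulted per read, a read is guaranteed to see the latest write when R + W > N. The `Replica` and `QuorumStore` classes below are hypothetical, and the choice of which replicas lag is a deliberately pessimistic simulation (assuming 1 <= W, R <= N).

```python
class Replica:
    def __init__(self):
        self.version = 0
        self.value = None


class QuorumStore:
    def __init__(self, n, w, r):
        self.replicas = [Replica() for _ in range(n)]
        self.w = w  # replicas that must acknowledge a write
        self.r = r  # replicas consulted on a read
        self.clock = 0

    def write(self, value):
        self.clock += 1
        # Simulation: only the first W replicas apply the write; the rest lag behind.
        for replica in self.replicas[: self.w]:
            replica.version = self.clock
            replica.value = value

    def read(self):
        # Simulation: read from the last R replicas, the worst case for overlap.
        contacted = self.replicas[-self.r :]
        newest = max(contacted, key=lambda replica: replica.version)
        return newest.value
```

With N=3, W=2, R=2 the write set and read set must share a replica, so the read returns the new value; with W=1, R=1 they can miss each other and the read is stale.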
• Provides Data Partitioning/Sharding at the Data Platform
• Enables applications to build elastic scale-out applications
• Provides non-blocking SPLIT/DROP for shards (MERGE to
  come later)
• Auto-connect to right shard based on sharding key value
• Provides SPLIT-resilient query mode
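
The routing and SPLIT ideas above can be sketched as a range-based shard map. This is a generic illustration, not the implementation or API of any product: `ShardMap`, `route`, and `split` are hypothetical names, sharding keys are assumed to be non-negative integers, and no data movement is modeled.

```python
import bisect


class ShardMap:
    """Range-based shard map: shard i covers keys from lows[i] up to the next lower bound."""

    def __init__(self):
        self.lows = [0]            # sorted lower bounds, one per shard
        self.names = ["shard_0"]   # shard name for each range

    def route(self, key):
        """Return the shard that owns a sharding key value."""
        if key < self.lows[0]:
            raise ValueError("key below the lowest range")
        index = bisect.bisect_right(self.lows, key) - 1
        return self.names[index]

    def split(self, at):
        """Split the range containing `at` so that keys >= at go to a new shard."""
        if at in self.lows:
            raise ValueError("a range already starts at this key")
        index = bisect.bisect_right(self.lows, at)
        self.lows.insert(index, at)
        self.names.insert(index, f"shard_{len(self.names)}")
```

Because callers always go through `route`, a split changes which shard a key maps to without the application having to track shard boundaries itself.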
• Flexible data is good, but:

• Procedural Scale-Out processing is good, but:



• Eventual Consistency is good, but:

• Simple Queries are good, but:




Many NoSQL Database Systems are starting to incorporate these learnings!
[Diagram: SQL or NoSQL Store]

OLTP Workloads
- Highly Available
- High Scale
- High Flexibility
- mostly touching 1 to low number of shards

SQL or NoSQL Store
- Primary Shards, each with Readable Replicas

Copy -> Traditional OLAP Workloads
- known schema
- Data warehouse, “Star joins”

Query -> Dynamic OLAP Workloads
- 3Vs (Volume, Velocity, Variety)
- Exploratory
- Scale-out queries, often using eventually consistent scale-out frameworks like Hadoop
http://www.windowsazure.com

Presentation                                                       Speaker                       Date and Time
Do We Have the Tools We Need to Navigate the New World of Data?    Dave Campbell                 2/29 9:00am PST
Onsite Interview *                                                 Tim O’Reilly, Dave Campbell   2/29 10:15am PST
Unleash Insights on All Data With Microsoft Big Data               Alexander Stojanovic          2/29 11:30am PST
Office Hours (Q&A session)                                         Dave Campbell                 2/29 1:30pm PST
Hadoop + Javascript: What We Learned                               Asad Khan                     2/29 2:20pm PST
Democratizing BI at Microsoft: 40,000 Users and Counting           Kirkland Barrett              3/1 10:40am PST
Data Marketplaces For Your Extended Enterprise                     Piyush Lumba                  3/1 2:20pm PST
• NoSQL and the Windows Azure Platform
   http://download.microsoft.com/download/9/E/9/9E9F240D-0EB6-472E-B4DE-6D9FCBB505DD/Windows%20Azure%20No%20SQL%20White%20Paper.pdf
   http://blogs.msdn.com/b/cbiyikoglu/archive/2011/03/03/nosql-genes-in-sql-azure-federations.aspx
Editor's notes

  1. < Note to Presenter: Please add your name and title as appropriate >
     Presenter Guidance: This presentation provides a sample script; however, we strongly recommend that you learn the script content as a way to tell the story and then rely on the key points to guide your discussion. Slide timing is approximate and should be considered a guide only.
     © 2012 Microsoft Corporation. All rights reserved. This presentation is for informational purposes only. Microsoft makes no warranties, express or implied, in this summary.
  2. <choose from slides 3 – 10 as alternative intro pictures>
     Timing: 1 minute
     Key Points: Microsoft has changed as a company and become more open.
     Script: Microsoft has changed as a company and become more open. The old debate – black or white; open source or commercial software; us versus them – is simply no longer relevant. Today, many customers manage mixed IT environments. And they have told us that what matters today is maximizing their existing IT investments while having the freedom to choose new solutions that best support their business goals. To meet these customer needs, Microsoft is committed to openness.
  3. Timing: 2 minutes
     Key Points: We do not compete against open source as a category; we increasingly work collaboratively with this community. You may be surprised to learn what Microsoft is doing with open source. More and more, customers, partners and the industry understand that the work we are doing with open source is about helping customers and enabling a rich and robust ecosystem of developers and partners. The following slides will provide some great examples.
     Script: You may be surprised to learn what Microsoft is doing with open source. More and more, customers, partners and the industry understand that the work we are doing with open source is about helping customers and enabling a rich and robust ecosystem of developers and partners. We enable open source on our platforms. We recognize that if we’re going to use open source, then we also have to give back, especially if we want open source developers to continue to think of Windows and Windows Phone as platforms for them to develop on. For example, Windows Azure supports a wide range of development languages, including Java, PHP and Node.js, so that developers can build applications using any language, tool, or framework of their choice, including open source. Let’s review the following slides for some more detailed examples.
  4. Timing: 2 minutes
     Key Points:
     - Device driver code contributions for Linux: enables better performance of Linux when virtualized with Hyper-V.
     - CoApp: you are developing apps for Linux? Why not make them work on Windows and open up more opportunities for your app to get adopted?
     - Windows Azure Virtual Machines enables customers to run both their existing Windows and Linux-based applications in the cloud. Compatible operating systems/images include CentOS, openSUSE, SUSE Linux Enterprise Server, Ubuntu, as well as Windows Server.
     Script: You may be surprised to learn what we are doing with Linux. We have learned a lot over the past decade. Embracing Linux on our platforms is a real business for us. For example, we work on a variety of interoperability initiatives with Linux vendors -- SUSE, Citrix, RedHat, CentOS -- to provide support for Linux as a “first-class guest” on Hyper-V. Another great example is CoApp, which is an open-source package management system for Windows. The goal of the CoApp project is to create a community of developers dedicated to creating a set of tools and processes that enable other open source developers to create and maintain their open source products with Windows as a build target. Further, with Windows Azure Virtual Machines, customers can run both their existing Windows and Linux-based applications in the cloud. Compatible operating systems/images include CentOS, openSUSE, SUSE Linux Enterprise Server, Ubuntu, and Windows Server, further illustrating Microsoft’s commitment to openness for customers and partners.
     FYI – Data sources and more information:
     - Robert McMillan, Wired Enterprise (March 2012): http://www.wired.com/wiredenterprise/2012/03/mr-linux/. Note: the quote is accurate, but the broader article is all about Linus and Linux.
  5. Timing: 2 minutes
     Key Points:
     - We’re committed to helping customers manage “big data”, working with the Apache Hadoop community to support Hadoop on Windows Server and Windows Azure.
     - Our Big Data solution is also integrated into the Microsoft BI tools such as SQL Server Analysis Services, Reporting Services and even PowerPivot and Excel. This enables you to do BI on all your data, including data in Hadoop.
     Script: Just ten years ago, most business data was locked up behind big applications. We are now entering an era when unlocking this data and its potential to drive new knowledge and insights is becoming a key success factor for many ventures. To embrace this “Big Data revolution”, we’ve launched customer previews of Apache Hadoop-based solutions for Windows Server and Windows Azure, which enable Hadoop apps to be deployed in hours instead of days. The most recent customer previews are called Windows Azure HDInsight and Microsoft HDInsight for Windows Server. Both solutions embrace enterprise-ready Apache Hadoop to enable most any user to begin viewing and truly analyzing Big Data, using such tools as Microsoft Excel, PowerPivot, and SQL Server Analysis Services. Regardless of the size or type of data, or where it’s stored, both HDInsight versions offer simple management via Microsoft System Center 2012, a shared codebase for platform consistency whether on Windows Server or Azure, and 100% compatibility with Hadoop. Customers such as Klout, Webtrends and the University of Dundee have been using the service to glean simple, actionable insights from complex data sets hosted in the cloud.
     FYI – Data Sources and more information:
     - “Opening Doors To Real Big Data Value: Hadoop On Windows Azure And Windows Server” (Oct 2012): http://blogs.technet.com/b/openness/archive/2012/10/24/opening-doors-to-real-big-data-value-hadoop-on-windows-azure-and-windows-server.aspx
     - “Openness Customer Spotlight: Klout Uses Microsoft BI and Hadoop to Bolster Big Data Insights” (Sept 2012): http://blogs.technet.com/b/openness/archive/2012/09/07/klout-uses-microsoft-bi-and-hadoop-to-bolster-big-data-insights.aspx
     - “Navigating the New World of Data” (Mar 2012): http://blogs.technet.com/b/openness/archive/2012/03/01/navigating-the-new-world-of-data.aspx
     - Kurt Mackie, Redmondmag.com quote (Oct 2011): http://redmondmag.com/articles/2011/10/12/hadoop-efforts-announced-at-pass.aspx?admgarea=BDNA
  6. Timing: 2 minutes
     Key Points:
     - Great Java experience on Windows Server and Windows Azure.
     - Partners like Gigaspaces are taking advantage of Java support to provide services to customers with existing Java-based enterprise applications.
     - The Windows Azure plug-in for Eclipse helps Eclipse users create and configure deployment packages of their Java applications for the Windows Azure cloud.
     Script: Customers and partners are taking advantage of the “first-class” Java experience on Windows Server and Windows Azure. For example, partners like Gigaspaces are now able to take advantage of Java support to provide services to customers with existing Java-based enterprise applications. Microsoft also continues to work on projects that foster interoperability with Java and Windows. For example, the Windows Azure SDK for Java includes a Windows Azure plug-in for Eclipse that provides templates and functionality that allow you to easily create, develop, test, and deploy Windows Azure applications using the Eclipse development environment. It is an open source project, whose source code is available under the Apache License 2.0 from the project’s site at http://sourceforge.net/projects/waplugin4ej/.
     FYI – Data Sources and more information:
     - Gigaspaces case study (Feb 2012): http://www.microsoft.com/casestudies/Windows-Azure/Gigaspaces/Solution-Provider-Streamlines-Java-Application-Deployment-in-the-Cloud/400000000081
  7. Timing: 2 minutes
     Key Points:
     - Great example of how far the Linux experience has evolved over the past several years – from no PHP experience on Windows to PHP running extremely well and with high performance on both Windows and Linux.
     - PHP releases now include support for both Windows and Linux.
     Script: Over the past several years, Microsoft and its partners have worked diligently with the PHP community to improve the experience PHP developers and users have on Windows Server and Windows Azure. Now the PHP community supports Windows right alongside Linux, including the recent release of PHP 5.4.0. René de Haas, CEO of a Dutch webhosting company called SoHosted, is a partner who has been instrumental in improving the PHP on Windows experience. According to René, “Between 2003 and 2012 we've seen the general opinion about Microsoft, Windows and PHP turn 180 degrees” due to the improvements made.
     FYI – Data Sources and more information:
     - “PHP 5.4 Available in Windows Azure Web Sites” (Nov 2012): http://blogs.technet.com/b/openness/archive/2012/11/27/php-5-4-available-in-windows-azure-web-sites.aspx
     - “Evolution of PHP on Windows” (Mar 2012), including SoHosted interview: http://blogs.technet.com/b/openness/archive/2012/03/01/evolution-of-php-on-windows.aspx
  8. Timing: 1 minute
     Key Points:
     - Firefox browser is well supported across cloud services (Office 365, SkyDrive, Bing, Skype).
     - Microsoft created a Firefox plug-in for Windows Media Player.
     - Mozilla has acknowledged how Microsoft’s commitment to HTML5 enables this support for Firefox and other modern browsers.
     Script: The Firefox browser is well supported across Microsoft’s cloud services like Office 365, SkyDrive, Bing, and Skype, and Microsoft created a Firefox plug-in for Windows Media Player. Those within the Mozilla community have acknowledged how Microsoft’s commitment to HTML5 enables this support for Firefox and other modern browsers.
     FYI – Data Sources and more information:
     - Blizzard quote reference: http://www.theregister.co.uk/2010/06/09/mozilla_man_on_apple_google_and_html5/
  9. Timing: 1 minute
     Key Points: Microsoft has worked with Drupal to improve interoperability, resulting in more choices for users.
     Script: Drupal is a popular open source content management system that powers many of the world's web sites. Microsoft has worked with Drupal to improve interoperability, resulting in more choices for users. The Screen Actors Guild recently migrated their Drupal site to Windows Azure. The SAG Awards, their biggest traffic day of the year, “went off with flying colors.”
     FYI – Data Sources and more information:
     - “Drupal + Windows Azure: A Winning Combination for SAG” (Feb 2012): http://blogs.technet.com/b/openness/archive/2012/02/29/drupal-windows-azure-a-winning-combination-for-sag.aspx
  10. Timing: 2 minutes Key Points: Node.js provides an end-to-end JavaScript experience for the development of a whole new class of real-time applications. With the work that we did to enable Node.js on Windows, not only did we support Windows, but the benchmarks for Linux also improved. Developers can also implement a Node.js application and deploy it to Windows Azure using Cloud9 IDE. Script: Node.js is a platform built on Chrome’s JavaScript runtime for easily building fast, scalable network applications. Microsoft’s support for Node.js on Windows Azure enables a new class of real-time applications. We also released the Windows Azure SDK for Node.js as open source, available on GitHub, and the Windows Azure Development Centers have great Node.js documentation, tutorials, samples and how-to guides to get you started with Node.js on Windows Azure. Also announced recently is support for Cloud9 IDE as a way to create Node.js applications and deploy to Windows Azure. FYI – Data Sources and more information: Scott Fulton, ReadWriteWeb quote (Dec 2011): http://www.readwriteweb.com/cloud/2011/12/windows-azure-adds-nodejs-supp.php
  11. Timing: 1 minute Key Points: Patches have been submitted to Samba. Great example of how the relationship between an open source solution and Microsoft can evolve. Script: In late 2011, a patch to the Samba code was submitted that enables Linux clients to better interoperate with Microsoft Windows in mixed source environments. Contributed under GPL2+, the patch was an individual contribution made by Microsoft’s Stephen Zarkos (Open Source Technical Center team) in line with Samba policies in place at the time. Efforts also continue to move forward with Microsoft and the Samba team working together to support the SMB protocol. The comments by Chris Hertel of the Samba team reflect how the relationship between key open source solutions and Microsoft has been evolving in the past several years. FYI – Data Sources and more information: “Driving Interoperability with the SMB Open Specifications” (Jun 2012): http://blogs.technet.com/b/openness/archive/2012/06/29/driving-interoperability-with-the-smb-open-specifications.aspx
  12. Timing: 3 minutes Key Points: The substantial growth of the Microsoft open source project community, Codeplex, which has tripled in size in the past two years, illustrates the momentum of Microsoft + Open Source. 9 of the top 10 most downloaded OSS projects run on Windows. In 2011 Microsoft launched WebMatrix -- a free, light-weight web development tool designed for quick website building and deployment. This tool puts open source tools at developers’ fingertips and these developers have downloaded more than one million open source web applications. Customers are benefitting from our work with open source solutions, including the more than 900 customers of the Microsoft-SUSE Alliance. Script: Our increased commitment to working with open source has sparked tremendous momentum and contributed to rapid growth of open source software on Windows – according to Sourceforge, 9 of the top 10 most downloaded OSS projects run on Windows today. (Side note: the complete project list is below; the only project that “isn’t supported on Windows” is the “Smart package of Microsoft's core fonts”, which doesn’t need to be supported because it obviously already runs on Windows.) Further, Codeplex, Microsoft’s open source project community, hosts more than 32,000 open source projects and has tripled its membership in just two years, from 300,000 members to more than 900,000 in 2012. Another great example is WebMatrix, a free, light-weight web development tool designed for quick website building and deployment. This tool puts open source tools at developers’ fingertips and these developers have downloaded more than one million open source web applications. Since its launch in 2011, there have been more than 1 million downloads. 
And customers as well as developers are benefitting directly from these efforts, including the more than 900 customers of the Microsoft-SUSE Alliance, which delivers interoperability solutions that help customers to get more out of their mixed Windows and Linux environments. FYI – Data Sources and more information: SUSE, Codeplex, and WebMatrix stats current as of Nov 2012. Sourceforge top projects site (http://sourceforge.net/top/), “Most downloads over all time” as of Nov 25, 2012: VLC media player; eMule; Azureus / Vuze; Ares Galaxy; 7-Zip; Smart package of Microsoft's core fonts (“not supported on Windows” by Sourceforge definition); FileZilla; PortableApps.com: Portable Software/USB; MinGW - Minimalist GNU for Windows; Notepad++ Plugin Manager
  13. <OPTIONAL SLIDE: Customize with local announcements as appropriate> Timing: 1 minute Key Points: MongoDB has been supported on Windows Azure for some time, but recently the setup, deployment, and development experience has been streamlined by the release of the MongoDB Installer for Windows Azure. In October, MongoLab released the preview of a MongoDB-as-a-Service offering through the Windows Azure Store. MongoLab is a full-featured MongoDB cloud database solution that completely automates the operational aspects of running MongoDB. Script: MongoDB is a very popular NoSQL database that is easy to learn if you have JavaScript (or Node.js) experience and is used in many high-volume web sites including Craigslist, FourSquare, Shutterfly, The New York Times, MTV, and others. People have been using MongoDB on Windows Azure for some time, but recently the setup, deployment, and development experience has been streamlined by the release of the MongoDB Installer for Windows Azure. It’s now easier than ever to get started with MongoDB on Windows Azure! Also, in October, MongoLab released the preview of a MongoDB-as-a-Service offering through the Windows Azure Store. MongoLab is a full-featured MongoDB cloud database solution that completely automates the operational aspects of running MongoDB. With the MongoLab cloud platform, developers can deploy and manage highly available databases for their applications and leverage automated backups, web-based tools, 24/7 monitoring, and expert support. FYI – Data Sources and more information: For more detail on the MongoDB Installer for Windows Azure: http://blogs.msdn.com/b/interoperability/archive/2012/07/09/mongodb-installer-for-windows-azure.aspx ; For more detail on the MongoLab service: https://www.windowsazure.com/en-us/store/service/?name=mongolab
  14. Timing: 3 minutes Key Points: Windows Azure is an open and flexible cloud platform. Developers can build applications using any language, tool or framework – including open source languages such as PHP, Java, and Node.js, and other open source tools. Our June 2012 technical preview release brought support for Linux on Windows Azure Virtual Machines and further support for multiple frameworks and popular open source applications through Windows Azure Web Sites. Script: As part of our cloud platform, interoperability is a design-time requirement. Windows Azure is an open and flexible cloud platform that enables customers to quickly build, deploy and manage applications across a global network of Microsoft-managed datacenters. To do it right we know we’ve got to be open. Developers can build applications using any language, tool or framework – including open source languages such as PHP, Java, and Node.js, and other open source tools – which means they can utilize familiar open source skills on Microsoft's cloud platform. Currently, features and services in Windows Azure are exposed using open REST protocols. Windows Azure client libraries are available for multiple programming languages and are released under an open source license and hosted on GitHub. As Microsoft continues to provide incremental improvements to Windows Azure, we remain committed to working with developer communities. Other recent interoperability enhancements include: Eclipse Plugin for Java, MongoDB support, code configuration for hosting Solr/Lucene, Hadoop services preview. Also, our June 2012 technical preview brought support for Linux images on Windows Azure Virtual Machines and further support for multiple frameworks and popular open source applications through Windows Azure Web Sites (note: see appendix slides for more detail on Virtual Machines and Web Sites).
  15. Timing: 1 minute Key Points: Windows Azure Web Sites enable developers to quickly and easily deploy sites with support for multiple frameworks and popular open source applications to a highly scalable cloud environment. Script: Windows Azure Web Sites allows you to build highly scalable websites on Windows Azure. You can quickly and easily deploy sites to a highly scalable cloud environment that allows you to start small and scale as traffic grows. Windows Azure Web Sites uses the languages and open source apps of your choice and supports deployment with Git, FTP, and TFS. You can easily integrate other services like MySQL, SQL Database, Caching, CDN, and Storage.
  16. In January 2011 Microsoft launched WebMatrix -- a free, light-weight web development tool designed for quick web site building and deployment. This tool puts open source tools at developers’ fingertips: Choose from a gallery of popular open source web applications to get a site up and running in a few clicks. Installs PHP & MySQL for necessary apps. Edit your code or database within WebMatrix. Utilizes NuGet to gain access to a community-driven gallery of ASP.NET “helpers” that give you small snippets of code to perform common tasks (bit.ly, Facebook integration, Twitter, etc.).
  17. Example MSN Casual Gaming: ~2 million users at launch; ~86 million service requests/day; 135 Windows Azure Data Services hosting VMs; ca. 18K connections in connection pools (this could grow with traffic); ca. 1200 SQL Azure requests/second spread across all partitions during peak load; ~90% reads vs. 10% writes (this varies per storage type); ~200 bytes of storage per user; ~20% of database storage is currently used, but expect this to grow; sharded over 400 SQL Azure databases.
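A rough sketch of what sharding at this scale looks like in code, using the figures quoted for MSN Casual Gaming (400 databases, about 1200 requests/second at peak, about 2 million users at roughly 200 bytes each). The notes do not say how users are assigned to databases, so the hash-based scheme below is an assumption, and the names `shard_for_user`, `per_shard_rate` and `total_user_bytes` are hypothetical helpers, not part of any SQL Azure API.

```python
import hashlib

NUM_SHARDS = 400          # SQL Azure databases quoted in the notes
PEAK_REQUESTS_PER_SEC = 1200
USERS_AT_LAUNCH = 2_000_000
BYTES_PER_USER = 200


def shard_for_user(user_id: str, num_shards: int = NUM_SHARDS) -> int:
    """Map a user id to a shard index with a stable hash (assumed scheme)."""
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    # Use the first 8 bytes as an unsigned integer, then fold into the shard range.
    return int.from_bytes(digest[:8], "big") % num_shards


def per_shard_rate(total_rps: float, num_shards: int = NUM_SHARDS) -> float:
    """Average request rate per shard if load were spread evenly."""
    return total_rps / num_shards


def total_user_bytes(users: int = USERS_AT_LAUNCH, per_user: int = BYTES_PER_USER) -> int:
    """Total per-user storage implied by the quoted figures."""
    return users * per_user


if __name__ == "__main__":
    print("shard for player-42:", shard_for_user("player-42"))
    print("avg requests/sec per shard:", per_shard_rate(PEAK_REQUESTS_PER_SEC))
    print("user data at launch (bytes):", total_user_bytes())
```

With an even spread, 1200 requests/second over 400 databases averages 3 requests/second per database, and 2 million users at 200 bytes each is about 400 MB in total. Real traffic is rarely this even, so these are averages only.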
  18. Note: Large companies invest resources in building these platforms instead of using existing relational platforms!
  19. No DB or OS Admin telling me what to do!
  20. Performance and Scale: Map/Reduce patterns; eventual consistency (trade-off due to CAP); sharding; caching; automate management. Lifecycle: elastic scale on demand (no need to pay for resources until needed); automatic fail-over; scalable schema version rollout; perf troubleshooting; auto alerting; auto load balancing; auto resourcing (e.g., auto splits based on policies); declarative policy-based management.
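The Map/Reduce and sharding items above can be illustrated with a small fan-out query: each shard computes a partial result (map), and the partials are merged into one answer (reduce). This is a minimal in-memory sketch, not the deck's implementation; the shard contents and the names `map_shard`, `reduce_counts` and `fan_out_count` are made up for illustration.

```python
from collections import Counter
from functools import reduce

# Hypothetical shard contents: each inner list stands in for one database.
SHARDS = [
    [{"user": "a", "game": "chess"}, {"user": "b", "game": "poker"}],
    [{"user": "c", "game": "chess"}],
    [],  # an empty shard must not break the query
]


def map_shard(rows):
    """Map step: count plays per game within a single shard."""
    return Counter(row["game"] for row in rows)


def reduce_counts(left, right):
    """Reduce step: merge two partial counts."""
    return left + right


def fan_out_count(shards):
    """Run the map step on every shard, then fold the partial results together."""
    partials = [map_shard(rows) for rows in shards]
    return reduce(reduce_counts, partials, Counter())


if __name__ == "__main__":
    print(fan_out_count(SHARDS))
```

In a real deployment the map step would run in parallel against each database. Under eventual consistency the merged result reflects whatever each shard had seen at query time.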
  21. Code first and revise quickly; working software over comprehensive documentation; responding to change over following a plan; application-model first (before database), which dictates the data model and queries; flexible data models; no a priori modeling: data first, schema later/open schema; key/value stores; reduced impedance mismatch: JSON, XML, YAML; you don’t know exactly what you are looking for; Map/Reduce for ad hoc analysis; provide search across all your data instead of just query; lower pain of adoption and maintenance; from code to deployment & “monetization” of data, services, apps and tenants; rich services out of the box; data and services mashup; easy troubleshooting of deployed apps; no DB or OS admin telling me what to do.
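To make "data first, schema later" and key/value stores concrete, here is a minimal sketch of an in-memory key/value store holding JSON documents whose fields differ from record to record. The `DocumentStore` class and the sample records are hypothetical and stand in for whatever NoSQL store an application would actually use.

```python
import json


class DocumentStore:
    """Toy key/value store: values are JSON documents with no fixed schema."""

    def __init__(self):
        self._data = {}

    def put(self, key, document):
        # Serialize to JSON so the stored value is an opaque string, as in a key/value store.
        self._data[key] = json.dumps(document)

    def get(self, key):
        return json.loads(self._data[key])

    def find(self, field, value):
        """Scan all documents and return keys whose `field` equals `value`."""
        return [
            key
            for key, raw in self._data.items()
            if json.loads(raw).get(field) == value
        ]


if __name__ == "__main__":
    store = DocumentStore()
    store.put("user:1", {"name": "Ana", "level": 7})
    # A later record adds a field; no schema migration is needed.
    store.put("user:2", {"name": "Budi", "level": 7, "badges": ["vip"]})
    store.put("user:3", {"name": "Citra"})
    print(store.find("level", 7))
    print(store.get("user:2"))
```

The trade-off is that `find` has to scan every document. Without a declared schema there is nothing to index on until the application decides which fields matter.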
  22. Low CapEx, low OpEx: SQL Azure and other Platform as a Service offerings; built-in high availability (tunable): SQL Azure has quorum-based built-in replicas; data scale-out (sharding): SQL Azure Federations; processing scale-out (Map-Reduce, fan-out, tunable consistency); flexible data models; JSON (& XML) support; sparse columns/column sets; integrate with BigData analytics (e.g., Hadoop).
  23. SharePoint – BI, Enterprise Search, Enterprise Content Management, Collaboration; Transform – ETL; Clean – Data Quality, Augmentation; Discover – Search, Meta-data, Classification, Information Catalog; Infer – Recommendation Engines, Machine Learning; Share – Publish, Collaborate; Govern – Lineage & Impact Analysis, Master Data Management; Marketplace – Private, Public, Bing Data, 3rd Party Data Sources, Models, Algorithms, APIs