Data Liberty
Alternatives to the shackles of limited scale in data solutions
Andy Cross
Windows Azure MVP
Elastacloud
Competition and Hashtag
• Tell everyone I’m awesome #ukmvpcloud
• Fill in the forms on your seats – for a chance to win a WINDOWS
PHONE!
Data value at scale requires technology choices;
* often prioritising data read traversal over operational characteristics
of create/update/delete
* embracing hybrid data platforms with varied technology partners
over homogeneous estates
* establishing alternative skillsets, augmented with entrenched
languages, trusting cloud over maintenance
* following robust engineering processes to provide rigour in a
deterministic world
Bravery leads to rewards;
* the winners will have data which shows them that they’ve won
* the commoditised query turns energy sucking data silos into profit
centres
* new data traversal mechanisms lead to new connotative data
expression
* everything you already know is relevant and valid; the constraints on
how it is applied are not
Today we’ve already heard about the position of big data in the market
and the usual
I’m going to give a tour of actually how to get some value out. By value
I mean a result to a query. It’s not going to be visualised.
But it’s a tour, so let’s start with the history.
IBM have been a leader in Big Data for years.
Wikimedia commons
We’re not as great as we’d hope; we’re often still bound by our ability
to marshal our IO.
Just as the speed of loading punchcards was historically a limiting
factor, we are now limited by our capacity to ingest data on individual
machines.
This leads to ideas such as DFS and data locality.
During the evolution of data we eventually moved to client/server and
this was a big step up from dBase et al of the time.
Fundamentally, however, the tabular, structured nature of data poses
many challenges; not least the long-term effects of normalisation, which
trades efficient storage in the short term against the compute
required later to reconstruct sets.
This eventually leads to such ideas as NoSQL document and entity
stores.
Modelling of data provides a consistent challenge. Our world is highly
connected and our brains are effective connectors of data. Real world
data fits poorly into highly structured data sets.
This leads to semi-structured and unstructured data formats and data
queryability through relationship traversal
The technologies shown today are primarily written in non-.NET and
non-Microsoft languages and frameworks. Every time we meet one, I’ll
show examples ONLY in the .NET and Microsoft stacks.
There are obviously challenges beyond language to running the
alternative stacks; but remember in the Cloud you aren’t responsible
for tuning a Linux cluster which has been running for 5 years. You
should provision for a duration that is bounded by the likelihood of the
cluster requiring routine maintenance.
Hadoop – KEY FACTS
Open Source; Apache Foundation.
Java.
Map Reduce framework for job distribution; Distributed File System for
file access.
In Windows Azure this is known as HDInsight.
Hadoop is O(n)
It exhibits linear performance; when the dataset doubles, the time taken to execute
the algorithm doubles.
Hadoop SDK
public class SwedishSessionsJob : HadoopJob<SwedishSessionsMapper, SessionsReducer>
{
    public override HadoopJobConfiguration Configure(ExecutorContext context)
    {
        // Read every gzipped session file; write results to /SwedishSessions/.
        var config = new HadoopJobConfiguration()
        {
            InputPath = "/AllSessions/*.gz",
            OutputFolder = "/SwedishSessions/"
        };
        return config;
    }
}
Jobs
public class SwedishSessionsMapper : MapperBase
{
    public override void Map(string inputLine, MapperContext context)
    {
        // Emit one ("SE", "1") pair per session line that mentions Sweden.
        if (inputLine.Contains("Country=Sweden"))
        {
            context.IncrementCounter("SwedishSession");
            context.EmitKeyValue("SE", "1");
        }
    }
}
Mapper
public class SessionsReducer : ReducerCombinerBase
{
    public override void Reduce(string key, IEnumerable<string> values, ReducerContext context)
    {
        // Count() needs System.Linq; the emitted value is a string.
        context.EmitKeyValue(key, values.Count().ToString());
    }
}
Reducer
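The mapper and reducer logic can be reasoned about without a cluster. The following is a minimal sketch in plain C# with LINQ, assuming nothing from the Hadoop SDK: `LocalMapReduce` and its `Map`, `Reduce` and `Run` methods are hypothetical names that only mimic the emit, group-by-key and count flow of the job.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical in-memory stand-in for the map -> shuffle -> reduce flow.
// It does not call the Hadoop SDK; it only mirrors the logic of the slides.
public static class LocalMapReduce
{
    // Map: emit ("SE", "1") for every input line that mentions Sweden.
    public static IEnumerable<KeyValuePair<string, string>> Map(string inputLine)
    {
        if (inputLine.Contains("Country=Sweden"))
        {
            yield return new KeyValuePair<string, string>("SE", "1");
        }
    }

    // Reduce: one output line per key, formatted as key, tab, count.
    public static string Reduce(string key, IEnumerable<string> values)
    {
        return key + "\t" + values.Count();
    }

    // Run: map every line, group the emitted pairs by key, reduce each group.
    public static List<string> Run(IEnumerable<string> lines)
    {
        return lines
            .SelectMany(line => Map(line))
            .GroupBy(pair => pair.Key, pair => pair.Value)
            .Select(group => Reduce(group.Key, group))
            .ToList();
    }
}
```

Calling `LocalMapReduce.Run` with a handful of session strings gives one tab-separated line per emitted key. Every input line is visited exactly once, which is the linear behaviour described for Hadoop.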
Testing Hadoop Queries
var inputData = "Country=Sweden&Name=Magnus";
var result =
    StreamingUnit.Execute<Jobs.SwedishJob>(new[]{inputData});
// Reducer output lines are tab-separated: key, then count.
Assert.AreEqual("SE\t1", result.ReducerResult.First());
* Tools are great but not friendly
HDInsight wins.
Automated provisioning and job execution services.
Transient clusters limit exposure to the poorly tooled* Java estate.
Persistence with Windows Azure Blob Storage as HDFS proxy known as Azure
Storage Vault (ASV).
Persistence in Windows Azure SQL Database for Hive Metastore.
JavaScript console.
NoSQL Document and Entity Stores
Examples in MongoDB and Windows Azure Table Storage.
What is a document database?
{
"_id" : ObjectId("51fccc57f82352d76653bdae"),
"Name" : {
"FirstName" : "Owen",
"LastName" : "Grzegorek"
},
"Company" : "Howard Miller Co",
"Address" : {
"Line1" : "15410 Minnetonka Industrial Rd",
"Line2" : "Minnetonka",
"Line3" : "Hennepin",
"Line4" : "MN",
"Line5" : "55345"
},
"ContactDetails" : {
"Phone" : "952-939-2973",
"Fax" : "952-939-4663",
"Email" : "owen@grzegorek.com",
"Web" : "http://www.owengrzegorek.com"
}
}
{
"Name" : "Richard Conway",
"Books Published" : "12",
"Specialises in" : "Data Science"
}
{
"Name" : "Andy Cross",
"Hometown" : "Blackpool"
}
{
"Name" : "Isaac Abraham",
"Age" : "33",
"Football Team" : "Tottenham",
"Icon" : (image not reproduced)
}
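The three documents above share only the "Name" field. A minimal sketch of what that means for a query, assuming a plain in-memory list of dictionaries rather than any MongoDB API: `SchemalessCollectionSketch`, `People` and `NamesWithField` are hypothetical names used for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical in-memory "collection"; not the MongoDB driver.
public static class SchemalessCollectionSketch
{
    // The three people from the slides; each document has its own set of fields.
    public static List<Dictionary<string, string>> People()
    {
        return new List<Dictionary<string, string>>
        {
            new Dictionary<string, string>
            {
                { "Name", "Richard Conway" },
                { "Books Published", "12" },
                { "Specialises in", "Data Science" }
            },
            new Dictionary<string, string>
            {
                { "Name", "Andy Cross" },
                { "Hometown", "Blackpool" }
            },
            new Dictionary<string, string>
            {
                { "Name", "Isaac Abraham" },
                { "Age", "33" },
                { "Football Team", "Tottenham" }
            }
        };
    }

    // Names of the documents that carry the given field at all.
    public static List<string> NamesWithField(
        IEnumerable<Dictionary<string, string>> documents, string field)
    {
        return documents
            .Where(document => document.ContainsKey(field))
            .Select(document => document["Name"])
            .ToList();
    }
}
```

A query for a field simply skips the documents that do not have it; no document has to be padded with empty columns to fit a shared schema.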
MongoDB Key Facts
• High Performance
• High Availability
• Easy Scalability
MongoDB is O(log n)
It exhibits logarithmic performance; when the dataset doubles, the time taken to
execute the algorithm increases by a fixed amount
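The "fixed amount" can be made concrete. The sketch below is not MongoDB code; it assumes the lookup is served by a sorted index and counts the probes a binary search needs, which is the textbook case of logarithmic growth. `IndexLookupSketch`, `MakeKeys` and `CountProbes` are hypothetical names.

```csharp
using System;

// Hypothetical illustration of logarithmic lookup cost; not a MongoDB API.
public static class IndexLookupSketch
{
    // Builds the sorted keys 0, 1, ..., count - 1.
    public static int[] MakeKeys(int count)
    {
        var keys = new int[count];
        for (int i = 0; i < count; i++)
        {
            keys[i] = i;
        }
        return keys;
    }

    // Returns how many keys a binary search inspects before it finds
    // the target or concludes that the target is absent.
    public static int CountProbes(int[] sortedKeys, int target)
    {
        int low = 0;
        int high = sortedKeys.Length - 1;
        int probes = 0;
        while (low <= high)
        {
            probes++;
            int mid = low + (high - low) / 2;
            if (sortedKeys[mid] == target)
            {
                return probes;
            }
            if (sortedKeys[mid] < target)
            {
                low = mid + 1;
            }
            else
            {
                high = mid - 1;
            }
        }
        return probes;
    }
}
```

Searching for a key larger than every stored key is a worst case: each probe halves the remaining range, so doubling the number of keys adds a single extra probe.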
Strengths of MongoDB
Mongo SDK
There are many different ways to
connect to MongoDB from a .NET
project.
Official
Wrapper
Alternative
Tool
C# implementations
If your data is regularly structured, you can use domain classes:
public class Book
{
    public string Author { get; set; }
    public string Title { get; set; }
}
// "books" is the name of the collection
var books = database.GetCollection<Book>("books");
Book book = new Book
{
    Author = "Ernest Hemingway",
    Title = "For Whom the Bell Tolls"
};
books.Insert(book);
C# implementations
If your data is irregularly structured or semi-structured, you can use a
BSON object model:
BsonDocument person = new BsonDocument {
    { "name", "John Doe" },
    { "address", new BsonDocument {
        { "street", "123 Main St." },
        { "city", "Centerville" },
        { "state", "PA" },
        { "zip", 12345 }
    }}
};
var people = database.GetCollection<BsonDocument>("people");
people.Insert(person);
Windows Azure Table Storage
Table Storage Concepts
Table Details
Entity Properties
No Fixed Schema
Querying
Purpose of the PartitionKey
Scalability
Partition: Range of entities with same partition key
value.
Partitions are fanned out based on load
They can be condensed when load decreases
Reads are load balanced against three replicas
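How the PartitionKey groups entities can be sketched in memory. This is an analogy under stated assumptions, not the storage service or its SDK: `BookRow` and `PartitionedTableSketch` are hypothetical types, and the only behaviour borrowed from Table Storage is that an entity is identified by its PartitionKey plus RowKey.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical entity; stands in for a TableEntity-derived class.
public sealed class BookRow
{
    public string PartitionKey { get; set; }
    public string RowKey { get; set; }
    public string Title { get; set; }
}

// Hypothetical in-memory table: one sorted bucket of rows per partition key.
public sealed class PartitionedTableSketch
{
    private readonly Dictionary<string, SortedDictionary<string, BookRow>> partitions =
        new Dictionary<string, SortedDictionary<string, BookRow>>();

    public void Insert(BookRow row)
    {
        SortedDictionary<string, BookRow> partition;
        if (!partitions.TryGetValue(row.PartitionKey, out partition))
        {
            partition = new SortedDictionary<string, BookRow>(StringComparer.Ordinal);
            partitions[row.PartitionKey] = partition;
        }
        // Throws if the (PartitionKey, RowKey) pair already exists.
        partition.Add(row.RowKey, row);
    }

    // Point lookup: both keys known, so only one partition is touched.
    public BookRow PointLookup(string partitionKey, string rowKey)
    {
        SortedDictionary<string, BookRow> partition;
        BookRow row;
        if (partitions.TryGetValue(partitionKey, out partition) &&
            partition.TryGetValue(rowKey, out row))
        {
            return row;
        }
        return null;
    }

    // Partition scan: every row that shares one partition key, in RowKey order.
    public List<BookRow> ScanPartition(string partitionKey)
    {
        SortedDictionary<string, BookRow> partition;
        if (partitions.TryGetValue(partitionKey, out partition))
        {
            return partition.Values.ToList();
        }
        return new List<BookRow>();
    }
}
```

Because each bucket is independent, buckets can be moved to different servers as load changes. A query that supplies the partition key touches one bucket; a query that omits it has to visit them all.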
C# Examples
public class Book : TableEntity
{
    public string Author { get; set; }
    public string Title { get; set; }
}
// Retrieve the storage account from the connection string.
CloudStorageAccount storageAccount = CloudStorageAccount.Parse(
    CloudConfigurationManager.GetSetting("StorageConnectionString"));
// Create the table client.
CloudTableClient tableClient = storageAccount.CreateCloudTableClient();
// Create the CloudTable object that represents the "books" table.
CloudTable table = tableClient.GetTableReference("books");
// Create a new book entity.
Book book = new Book() { Author = "Ernest Hemingway", Title = "For Whom The Bell Tolls" };
book.PartitionKey = "ErnestHemingway";
book.RowKey = "1";
// Create the TableOperation that inserts the book entity.
TableOperation insertOperation = TableOperation.Insert(book);
// Execute the insert operation.
table.Execute(insertOperation);
NoSQL Document and Entity store Wins
Semi-structured data first class citizen
Built in MapReduce
Operational and interactive
Massively scalable *if you get your partitions correct*
Graph Databases, Neo4j KEY FACTS
Open Source; Neo Technology
Java
Runs equally well on Windows or Linux. In Windows Azure there are
VMDepot images able to be deployed in a few simple steps. Additionally the
Azure Linux VMs are a good fit for this database engine.
There is an Open Source .NET SDK available through NuGet and actively
maintained primarily by an Australian company, Readify.
Neo4j is O(1)
It exhibits constant-time performance; that is, the algorithm takes the same time to
execute irrespective of the size of the dataset.
How O(1)?
• Graphs don’t have tables. They don’t have collections.
• They have nodes and relationships.
• Rather than having to select out a whole table, we can identify a point
on the graph
• A start point
• Follow the traversal of relationships from that point.
http://www.apcjones.com/arrows/#
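The start-point-then-traverse idea can be sketched with an adjacency list. This is a hypothetical in-memory graph, not the Neo4j engine or Neo4jClient: `PersonNode` and `SocialGraphSketch` are illustrative names, and the dictionary plays the role of an index that finds the start node.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical node type: relationships are held directly on the node.
public sealed class PersonNode
{
    public PersonNode(string name) { Name = name; }
    public string Name { get; }
    // Outgoing FRIENDS relationships.
    public List<PersonNode> Friends { get; } = new List<PersonNode>();
}

// Hypothetical in-memory graph with a name index for finding a start point.
public sealed class SocialGraphSketch
{
    private readonly Dictionary<string, PersonNode> peopleIdx =
        new Dictionary<string, PersonNode>();

    // Adds a directed FRIENDS relationship, creating nodes as needed.
    public void AddFriendship(string from, string to)
    {
        GetOrAdd(from).Friends.Add(GetOrAdd(to));
    }

    // Start at one indexed node, then follow FRIENDS twice.
    // The work depends on the start node's neighbourhood, not on graph size.
    public List<string> FriendsOfFriends(string startName)
    {
        PersonNode start;
        if (!peopleIdx.TryGetValue(startName, out start))
        {
            return new List<string>();
        }
        return start.Friends
            .SelectMany(friend => friend.Friends)
            .Select(friendOfFriend => friendOfFriend.Name)
            .Distinct()
            .OrderBy(name => name, StringComparer.Ordinal)
            .ToList();
    }

    private PersonNode GetOrAdd(string name)
    {
        PersonNode node;
        if (!peopleIdx.TryGetValue(name, out node))
        {
            node = new PersonNode(name);
            peopleIdx[name] = node;
        }
        return node;
    }
}
```

Adding people who are not connected to the start node leaves the traversal untouched, which is the sense in which the query cost is independent of the total dataset size. The sketch de-duplicates and sorts names for readability; a graph query would return one row per matched path.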
Things we can do
• Find all the things formed in Sweden
START sweden = node:countryIdx(country = "Sweden")
MATCH sweden<-[:FORMED_IN]-something
RETURN something;
• Find friends of friends
START magnus = node:peopleIdx(name = "magnus")
MATCH magnus-[:FRIENDS]->friend-[:FRIENDS]->friendoffriend
RETURN friendoffriend;
NEO4J Client
Open source Neo4j Client
C# examples
var query = neo4Jclient.Cypher
    .Start(new
    {
        sweden = Node.ByIndexLookup("countryIdx", "country", "sweden")
    })
    .Match("sweden-[:FRIENDS]->friend-[:FRIENDS]->friendoffriend")
    .Return<Node<Friend>>("friendoffriend");
Graph Database Wins
• Modelled domains match cognitive processes
• Optimised for traversal of relationships, which allows complex and “social”
queries to emerge
• LIKES of FRIENDS of COLLEAGUES
• O(1) performance characteristics due to ability to START queries at
arbitrary graph points.
Summary
• HDInsight brings Hadoop to Azure
• Suited to Data Volume, Variety, Variability etc
• MongoDB brings Document stores
• Suited to Data Volume, Operational concerns
• Table Storage brings Entity stores
• Suited to Data Volume, strong consistency requirements, low cost and TCO
• Neo4j brings Graph database
• Suited to data relationship traversal
Thanks
Questions

CNv6 Instructor Chapter 6 Quality of Service
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptx
 
What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
 

MVP Cloud OS Week Track 1 9 Sept: Data liberty

  • 1. Data Liberty Alternatives to the shackles of limited scale in data solutions Andy Cross Windows Azure MVP Elastacloud
  • 2. Competition and Hashtag • Tell everyone I’m awesome #ukmvpcloud • Fill in the forms on your seats – for a chance to win a WINDOWS PHONE!
  • 3. Data value at scale requires technology choices; * often prioritising data read traversal over operational characteristics of create/update/delete * embracing hybrid data platforms with varied technology partners over homogenous estates * establishing alternative skillsets, augmented with entrenched languages, trusting cloud over maintenance * following robust engineering processes to provide rigour in a deterministic world
  • 4. Bravery leads to rewards; * the winners will have data which shows them that they’ve won * the commoditised query turns energy sucking data silos into profit centres * new data traversal mechanisms lead to new connotative data expression * everything you already know is relevant and valid; the constraints on how it is applied are not
  • 5. Today we’ve already heard about the position of big data in the market and the usual I’m going to give a tour of how to actually get some value out. By value I mean a result to a query. It’s not going to be visualised. But it’s a tour, so let’s start with the history.
  • 6. IBM have been a leader in Big Data for years. Wikimedia commons
  • 7. We’re not as great as we’d hope; we’re often still bound by our ability to marshal our IO. Just as the speed of loading punchcards was historically a limiting factor, we are now limited by our capacity to ingest data on individual machines. This leads to ideas such as DFS and data locality.
  • 8.
  • 9. During the evolution of data we eventually moved to client/server, and this was a big step up from the dBase et al of the time. Fundamentally, however, the tabular, structured nature of data poses many challenges; not least the long-term effect of normalisation, which trades efficient storage in the short term for the compute that is later required to reconstruct sets. This eventually leads to such ideas as NoSQL document and entity stores.
  • 10.
  • 11. Modelling of data provides a consistent challenge. Our world is highly connected and our brains are effective connectors of data. Real world data fits poorly into highly structured data sets. This leads to semi-structured and unstructured data formats and data queryability through relationship traversal
  • 12. The technologies shown today are primarily written in non-.net and non-Microsoft languages and frameworks. Every time we do this, I’ll show examples ONLY in the .net and Microsoft stacks. There are obviously challenges beyond language to running the alternative stacks; but remember in the Cloud you aren’t responsible for tuning a Linux cluster which has been running for 5 years. You should provision for a duration that is bounded by the likelihood of the cluster requiring routine maintenance.
  • 13. Hadoop – KEY FACTS Open Source; Apache Foundation. Java. Map Reduce framework for job distribution; Distributed File System for file access. In Windows Azure this is known as HDInsight.
  • 14. Hadoop is O(n) It exhibits linear performance; when the dataset doubles, the time taken to execute the algorithm doubles.
  • 16. Jobs
      // Requires: using Microsoft.Hadoop.MapReduce;
      public class SwedishSessionsJob : HadoopJob<SwedishSessionsMapper, SessionsReducer>
      {
          public override HadoopJobConfiguration Configure(ExecutorContext context)
          {
              // Read every gzipped session file and write results to /SwedishSessions/.
              var config = new HadoopJobConfiguration()
              {
                  InputPath = "/AllSessions/*.gz",
                  OutputFolder = "/SwedishSessions/"
              };
              return config;
          }
      }
  • 17. Mapper
      public class SwedishSessionsMapper : MapperBase
      {
          public override void Map(string inputLine, MapperContext context)
          {
              // Emit one ("SE", "1") pair for every session line from Sweden.
              if (inputLine.Contains("Country=Sweden"))
              {
                  context.IncrementCounter("SwedishSession");
                  context.EmitKeyValue("SE", "1");
              }
          }
      }
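  • 18. Reducer
      // Requires: using System.Collections.Generic; using System.Linq;
      public class SessionsReducer : ReducerCombinerBase
      {
          public override void Reduce(string key, IEnumerable<string> values, ReducerContext context)
          {
              // Emit the number of values seen for this key, converted to a string for emission.
              context.EmitKeyValue(key, values.Count().ToString());
          }
      }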
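The job, mapper and reducer slides describe a map, group-by-key, reduce flow. As a rough illustration only, the same flow can be sketched in memory with LINQ and no Hadoop SDK at all. The class name `LocalMapReduce` and its methods are hypothetical names invented for this sketch; they are not part of HDInsight or the .NET SDK for Hadoop.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical in-memory stand-in for the map -> shuffle -> reduce flow.
public static class LocalMapReduce
{
    // Map step: emit ("SE", "1") for every line that describes a Swedish session.
    public static IEnumerable<KeyValuePair<string, string>> Map(string inputLine)
    {
        if (inputLine.Contains("Country=Sweden"))
        {
            yield return new KeyValuePair<string, string>("SE", "1");
        }
    }

    // Reduce step: count the values that were emitted for one key.
    public static KeyValuePair<string, string> Reduce(string key, IEnumerable<string> values)
    {
        return new KeyValuePair<string, string>(key, values.Count().ToString());
    }

    // Driver: map every line, group the pairs by key (the "shuffle"), then reduce each group.
    public static List<KeyValuePair<string, string>> Run(IEnumerable<string> lines)
    {
        return lines
            .SelectMany(line => Map(line))
            .GroupBy(pair => pair.Key, pair => pair.Value)
            .Select(group => Reduce(group.Key, group))
            .ToList();
    }
}
```

Given two Swedish session lines and one Norwegian one, `LocalMapReduce.Run` returns a single pair with key `SE` and value `2`. On a real cluster the map and reduce steps run on many machines next to the data; the sketch only shows the shape of the computation.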
  • 19. Testing Hadoop Queries
      var inputData = "Country=Sweden&Name=Magnus";
      var result = StreamingUnit.Execute<Jobs.SwedishJob>(new[] { inputData });
      // The reducer output is the key and the count separated by a tab.
      Assert.AreEqual("SE\t1", result.ReducerResult.First());
  • 20. HDInsight wins • Automated provisioning and job execution services. • Transient clusters limit exposure to the poorly tooled* Java estate. • Persistence with Windows Azure Blob Storage as an HDFS proxy, known as Azure Storage Vault (ASV). • Persistence in Windows Azure SQL Database for the Hive Metastore. • JavaScript console. * Tools are great but not friendly.
  • 21. NoSQL Document and Entity Stores Examples in MongoDB and Windows Azure Table Storage.
  • 22. What is a document database?
      {
        "_id" : ObjectId("51fccc57f82352d76653bdae"),
        "Name" : { "FirstName" : "Owen", "LastName" : "Grzegorek" },
        "Company" : "Howard Miller Co",
        "Address" : { "Line1" : "15410 Minnetonka Industrial Rd", "Line2" : "Minnetonka", "Line3" : "Hennepin", "Line4" : "MN", "Line5" : "55345" },
        "ContactDetails" : { "Phone" : "952-939-2973", "Fax" : "952-939-4663", "Email" : "owen@grzegorek.com", "Web" : "http://www.owengrzegorek.com" }
      }
      {
        "Name" : { "FirstName" : "Owen", "LastName" : "Grzegorek" },
        "Company" : "Howard Miller Co",
        "Address" : { "Line1" : "15410 Minnetonka Industrial Rd", "Line2" : "Minnetonka", "Line3" : "Hennepin", "Line4" : "MN", "Line5" : "55345" },
        "ContactDetails" : { "Phone" : "952-939-2973", "Fax" : "952-939-4663", "Email" : "owen@grzegorek.com", "Web" : "http://www.owengrzegorek.com" }
      }
      { "Name" : "Richard Conway", "Books Published" : "12", "Specialises in" : "Data Science" }
      { "Name" : "Andy Cross", "Hometown" : "Blackpool" }
      { "Name" : "Isaac Abraham", "Age" : "33", "Football Team" : "Tottenham", "Icon" : [image] }
  • 23. MongoDB Key Facts • High Performance • High Availability • Easy Scalability
  • 24. MongoDB is O(log n) It exhibits logarithmic performance; when the dataset doubles, the time taken to execute the algorithm increases by a fixed amount
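To make the difference between the linear and logarithmic claims concrete, here is a small sketch that counts worst-case steps for a linear scan and for a binary search. Binary search is used purely as a familiar stand-in for logarithmic behaviour; treating it as representative of an indexed lookup is an assumption of this sketch, and `GrowthDemo` is a hypothetical name.

```csharp
using System;

// Hypothetical step counters that contrast O(n) with O(log n) growth.
public static class GrowthDemo
{
    // Worst-case comparisons for a linear scan over n items: every item is checked.
    public static int LinearSteps(int n)
    {
        return n;
    }

    // Worst-case comparisons for a binary search over n sorted items:
    // the candidate range is halved until nothing is left.
    public static int BinarySearchSteps(int n)
    {
        int steps = 0;
        while (n > 0)
        {
            n /= 2;
            steps++;
        }
        return steps;
    }
}
```

Doubling the input from 1,000 to 2,000 items doubles `LinearSteps` from 1,000 to 2,000, while `BinarySearchSteps` only goes from 10 to 11: the "fixed amount" the slide refers to.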
  • 26. Mongo SDK There are many different ways to connect to MongoDB from a .net project. Official Wrapper Alternative Tool
  • 27. C# implementations If your data is regularly structured, you can use domain classes:
      public class Book
      {
          public string Author { get; set; }
          public string Title { get; set; }
      }

      // "books" is the name of the collection
      var books = database.GetCollection<Book>("books");
      Book book = new Book
      {
          Author = "Ernest Hemingway",
          Title = "For Whom the Bell Tolls"
      };
      books.Insert(book);
  • 28. C# implementations If your data is irregularly structured or semi-structured, you can use a BSON object model:
      BsonDocument person = new BsonDocument
      {
          { "name", "John Doe" },
          { "address", new BsonDocument
              {
                  { "street", "123 Main St." },
                  { "city", "Centerville" },
                  { "state", "PA" },
                  { "zip", 12345 }
              }
          }
      };
      var people = database.GetCollection<BsonDocument>("people");
      people.Insert(person);
  • 35. Purpose of the PartitionKey
  • 36. Scalability • Partition: a range of entities with the same partition key value. • Partitions are fanned out based on load. • They can be condensed when load decreases. • Reads are load balanced against three replicas.
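The role of the partition key can be pictured with a small in-memory sketch: entities that share a PartitionKey live together, and a point query needs both the PartitionKey and the RowKey. This is only an analogy under simplifying assumptions. The real service spreads partitions across servers and replicas, and `PartitionedTable` with its members is a hypothetical type, not part of the storage client library.

```csharp
using System.Collections.Generic;

// Hypothetical in-memory model of a table keyed by (PartitionKey, RowKey).
public class PartitionedTable
{
    // PartitionKey -> (RowKey -> entity payload).
    private readonly Dictionary<string, Dictionary<string, string>> partitions =
        new Dictionary<string, Dictionary<string, string>>();

    // Adds an entity to the partition named by its partition key, creating the partition if needed.
    public void Insert(string partitionKey, string rowKey, string value)
    {
        Dictionary<string, string> partition;
        if (!partitions.TryGetValue(partitionKey, out partition))
        {
            partition = new Dictionary<string, string>();
            partitions[partitionKey] = partition;
        }
        // Throws if this (PartitionKey, RowKey) pair already exists.
        partition.Add(rowKey, value);
    }

    // Looks up a single entity by both keys; returns null when it is not found.
    public string PointQuery(string partitionKey, string rowKey)
    {
        Dictionary<string, string> partition;
        string value;
        if (partitions.TryGetValue(partitionKey, out partition) &&
            partition.TryGetValue(rowKey, out value))
        {
            return value;
        }
        return null;
    }

    // Number of distinct partitions currently holding data.
    public int PartitionCount
    {
        get { return partitions.Count; }
    }

    // Number of entities stored under one partition key.
    public int EntitiesIn(string partitionKey)
    {
        Dictionary<string, string> partition;
        return partitions.TryGetValue(partitionKey, out partition) ? partition.Count : 0;
    }
}
```

If every entity is given the same partition key, everything lands in one partition and there is nothing to fan out, which is why the choice of key matters for scale.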
  • 37.
  • 38. C# Examples public class Book : TableEntity { public string Author { get; set; } public string Title { get; set; } }
  • 39.
      // Retrieve the storage account from the connection string.
      CloudStorageAccount storageAccount = CloudStorageAccount.Parse(
          CloudConfigurationManager.GetSetting("StorageConnectionString"));

      // Create the table client.
      CloudTableClient tableClient = storageAccount.CreateCloudTableClient();

      // Create the CloudTable object that represents the "books" table.
      CloudTable table = tableClient.GetTableReference("books");

      // Create a new book entity.
      Book book = new Book() { Author = "Ernest Hemingway", Title = "For Whom The Bell Tolls" };
      book.PartitionKey = "ErnestHemingway";
      book.RowKey = "1";

      // Create the TableOperation that inserts the book entity.
      TableOperation insertOperation = TableOperation.Insert(book);

      // Execute the insert operation.
      table.Execute(insertOperation);
  • 40. NoSQL Document and Entity Store Wins • Semi-structured data is a first-class citizen • Built-in MapReduce • Operational and interactive • Massively scalable *if you get your partitions correct*
  • 41. Graph Databases, Neo4j KEY FACTS • Open Source; Neo Technology. • Java. • Runs equally well on Windows or Linux. • In Windows Azure there are VMDepot images that can be deployed in a few simple steps. • Additionally, the Azure Linux VMs are a good fit for this database engine. • There is an Open Source .net SDK available through NuGet, actively maintained primarily by an Australian company, Readify.
  • 42. Neo4j is O(1) It exhibits constant-time performance; that is, the algorithm takes the same time to execute irrespective of the size of the dataset.
  • 43. How O(1)? • Graphs don’t have tables. They don’t have collections. • They have nodes and relationships. • Rather than having to select out a whole table, we can identify a point on the graph • A start point • Follow the traversal of relationships from that point.
  • 45. Things we can do
      • Find all the things formed in Sweden
          START sweden = node:countryIdx(country = "Sweden")
          MATCH sweden<-[:FORMED_IN]-something
          RETURN something;
      • Find friends of friends
          START magnus = node:peopleIdx(name = "magnus")
          MATCH magnus-[:FRIENDS]->friend-[:FRIENDS]->friendoffriend
          RETURN friendoffriend;
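The friends-of-friends query can be mimicked in memory to show why the work depends on the start node's neighbourhood and not on the size of the whole graph. This is a minimal sketch using an adjacency dictionary; `FriendGraph` and its members are hypothetical names and have nothing to do with how Neo4j stores data internally.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical in-memory directed graph of FRIENDS relationships.
public class FriendGraph
{
    // Person -> the people that person has an outgoing FRIENDS relationship to.
    private readonly Dictionary<string, List<string>> friends =
        new Dictionary<string, List<string>>();

    // Records a directed FRIENDS relationship from one person to another.
    public void AddFriend(string from, string to)
    {
        List<string> list;
        if (!friends.TryGetValue(from, out list))
        {
            list = new List<string>();
            friends[from] = list;
        }
        list.Add(to);
    }

    // Direct friends of one person; empty when the person has none.
    private IEnumerable<string> FriendsOf(string person)
    {
        List<string> list;
        return friends.TryGetValue(person, out list) ? list : Enumerable.Empty<string>();
    }

    // Two hops from the start node: start -> friend -> friend of friend.
    // Only the start node's neighbourhood is visited; the rest of the graph is never touched.
    public List<string> FriendsOfFriends(string start)
    {
        return FriendsOf(start)
            .SelectMany(friend => FriendsOf(friend))
            .Distinct()
            .ToList();
    }
}
```

The sketch removes duplicates with `Distinct`, which is a choice made here for readability; the Cypher query on the slide returns one row per matching path.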
  • 46. NEO4J Client Open source Neo4j Client
  • 47. C# examples
      // Friends of friends, starting from the node found through the index lookup.
      var query = neo4Jclient.Cypher
          .Start(new { magnus = Node.ByIndexLookup("peopleIdx", "name", "magnus") })
          .Match("magnus-[:FRIENDS]->friend-[:FRIENDS]->friendoffriend")
          .Return<Node<Friend>>("friendoffriend");
  • 48. Graph Database Wins • Modelled domains match cognitive processes • Optimisation for traversal of relationships allows complex and “social” queries to emerge • LIKES of FRIENDS of COLLEAGUES • O(1) performance characteristics due to the ability to START queries at arbitrary graph points.
  • 49. Summary • HDInsight brings Hadoop to Azure • Suited to Data Volume, Variety, Variability etc • MongoDB brings Document stores • Suited to Data Volume, Operational concerns • Table Storage brings Entity stores • Suited to Data Volume, strong consistency requirements, low cost and TCO • Neo4j brings Graph database • Suited to data relationship traversal