This document summarizes the differences between relational and NoSQL databases and presents Microsoft Azure Table Storage and MongoDB as NoSQL alternatives. It covers the fundamentals, design, and programming model of each, along with common usage scenarios and a cost evaluation.
NoSQL: Azure and MongoDB
1. NoSQL: Azure Table Storage & MongoDB
NoSQL alternatives and usage scenarios
@erincon Eladio Rincón
Relational Director
MCT, MVP SQL Server
@augurarte Miguel López
Cloud Director | mlopez@solidq.com
MCT, MCPD, MCTS, MCP
2. Agenda
Relational and NoSQL databases
Microsoft Azure Table Storage
– MSATS fundamentals
– NoSQL design with MSATS
– Programming model
MongoDB
– Fundamentals
– NoSQL design with MongoDB
– Programming model
Scenarios
Cost evaluation
3. NoSQL: Azure Table Storage & MongoDB
Relational and NoSQL databases
Azure Table Storage (MSATS)
MongoDB
Scenarios
Costs
8. NoSQL
Next Generation Databases mostly addressing some of the points: being non-relational, distributed, open-source and horizontally scalable.
http://NoSQL-database.org
9. NoSQL: evolution and characteristics
• Emerged in early 2009
• Exponential growth.
• Most important shared characteristics:
– Not schema-based
– Simple replication support
– Easy-to-use API
– Eventual consistency / BASE (not ACID)
– Huge amounts of data.
• "NoSQL" is usually expanded as "not only SQL" and is treated as an alias for the definition above.
10. NoSQL data models
http://www.slideshare.net/emileifrem/nosql-east-a-nosql-overview-and-the-benefits-of-graph-databases
11. Major NoSQL families
• Wide Column Store / Column Families
• Document Store
• Key Value / Tuple Store
• Graph Databases
• Multimodel Databases
• Object Databases
• Grid & Cloud Database Solutions
• XML Databases
• Multidimensional Databases
• Multivalue Databases
• Event Sourcing
• Network Model
14. Azure Table Storage
Windows Azure Table Service
Windows Azure tables
Designing Windows Azure Tables
Implementing Windows Azure Tables
Querying Windows Azure Tables
Accessing table storage
Table Storage Security
– Set container permissions.
– Use Shared Access Signatures (SAS)
– Stored Access Policies
15. Microsoft Azure storage accounts
The user creates a globally unique storage account name
Choose the primary location to host the storage account
Three geographies, two locations each:
– "North Central US", "South Central US"
– "North Europe", "West Europe"
– "Southeast Asia", "East Asia"
Table service endpoint:
http://myaccount.table.core.windows.net
17. MSATS characteristics
• Azure storage table
• A collection of entities (key/value pairs)
• An entity has a primary key and an arbitrary, schema-less set of properties.
– PartitionKey (partitioning / load distribution / scaling)
– RowKey (identifies a row among the related rows of a partition)
– Timestamp (read-only; maintained by the service as the last-modified time)
• A property is a typed name/value pair, similar to a column
• MSATS does not enforce schema creation or validation
• The developer must implement a concrete schema in the code that accesses the table
20. MSATS entities
Entities are the basic data items stored in a table.
Each entity is a collection of properties, which are name-value pairs.
Each entity has 3 fixed properties: PartitionKey, RowKey and Timestamp.
Mapped to the concepts of a conventional database system, an entity is analogous to a row and a property is analogous to a column.
An entity can have up to 255 properties in total; 3 of them are reserved system properties, which leaves 252 custom properties.
Maximum size of 1 entity = 1 MB of data.
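The entity limits above can be illustrated with a small in-memory model. This is only a sketch: `SketchEntity` and its members are hypothetical names, it is not the storage client API, and it checks only the 252-property limit (not the 1 MB size limit).

```csharp
using System;
using System.Collections.Generic;

// Hypothetical in-memory model of a Table service entity (not the storage client API).
public sealed class SketchEntity
{
    // 255 properties in total minus the 3 system properties.
    public const int MaxCustomProperties = 252;

    private readonly Dictionary<string, object> _properties =
        new Dictionary<string, object>(StringComparer.Ordinal);

    // The three fixed system properties.
    public string PartitionKey { get; }
    public string RowKey { get; }
    public DateTime Timestamp { get; } = DateTime.UtcNow; // set by the service in the real system

    public SketchEntity(string partitionKey, string rowKey)
    {
        PartitionKey = partitionKey ?? throw new ArgumentNullException(nameof(partitionKey));
        RowKey = rowKey ?? throw new ArgumentNullException(nameof(rowKey));
    }

    // Schema-less part: any entity may carry a different set of name/value pairs.
    public IReadOnlyDictionary<string, object> Properties => _properties;

    public void Set(string name, object value)
    {
        bool isNew = !_properties.ContainsKey(name);
        if (isNew && _properties.Count >= MaxCustomProperties)
            throw new InvalidOperationException("An entity holds at most 252 custom properties.");
        _properties[name] = value;
    }
}
```

Two entities in the same table can then hold different properties, for example one with only `Name` and another with `Name`, `RoleExec` and `LastOp`.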
21. Entities: property data types
• byte[]
• bool
• DateTime
• double
• Guid
• Int32/int
• Int64/long
• string
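As a quick reference, the .NET types listed above correspond to the following EDM type names used by WCF Data Services. The helper class `EdmTypeNames` is a hypothetical name introduced for this sketch; it is a lookup table, not part of any SDK.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical lookup from .NET property types to the EDM type names of the Table service.
public static class EdmTypeNames
{
    private static readonly Dictionary<Type, string> Map = new Dictionary<Type, string>
    {
        { typeof(byte[]),   "Edm.Binary"   },
        { typeof(bool),     "Edm.Boolean"  },
        { typeof(DateTime), "Edm.DateTime" },
        { typeof(double),   "Edm.Double"   },
        { typeof(Guid),     "Edm.Guid"     },
        { typeof(int),      "Edm.Int32"    },
        { typeof(long),     "Edm.Int64"    },
        { typeof(string),   "Edm.String"   },
    };

    // Returns false for types outside the supported subset (for example decimal).
    public static bool TryGet(Type clrType, out string edmName) =>
        Map.TryGetValue(clrType, out edmName);
}
```

A type outside this list, such as `decimal`, has to be converted by the application (for instance to `string` or `double`) before it is stored.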
23. PartitionKey (PK) characteristics
• Tables are partitioned to support load balancing across storage nodes.
• Entities are organized by partition.
• A partition is a consecutive range of entities that share the same partition key.
• Partition key: unique identifier of a partition within a table.
• The PK is the first part of an entity's primary key.
• The PK can be a string of up to 1 KB.
• Include the PartitionKey property in every insert, update, and delete.
24. RowKey characteristics
• The row key is a unique identifier for an entity within a given partition.
• Together, the PartitionKey and RowKey uniquely identify every entity within a table.
• The row key is a string value that may be up to 1 KB in size.
• You must include the RowKey property in every insert, update, and delete operation.
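One possible way to build the two keys is sketched below. The scheme is an assumption inferred from sample values that appear in the deck's JOBEXECUTION table, such as "2012102517USR91" and "jobid0121" (a UTC hour bucket plus a user id, and a zero-padded job id); `JobKeys` and its methods are hypothetical names, and the real application may build its keys differently.

```csharp
using System;
using System.Globalization;

// Hypothetical key builders; the format is inferred from sample data, not documented.
public static class JobKeys
{
    // UTC hour bucket (yyyyMMddHH) + "USR" + user id, e.g. "2012102517USR91".
    public static string BuildPartitionKey(DateTime utc, int userId) =>
        utc.ToString("yyyyMMddHH", CultureInfo.InvariantCulture)
        + "USR" + userId.ToString(CultureInfo.InvariantCulture);

    // "jobid" + job id padded to four digits, e.g. "jobid0121".
    public static string BuildRowKey(int jobId) =>
        "jobid" + jobId.ToString("D4", CultureInfo.InvariantCulture);
}
```

With keys of this shape, all jobs of one user in one hour land in the same partition, while the row key distinguishes the jobs inside it.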
25. MSATS is a schema-less database
Table JOBEXECUTION: every entity has PartitionKey (string), RowKey (string) and Timestamp (datetime); the remaining string properties vary from entity to entity.
PartitionKey    | RowKey    | Timestamp    | Other properties
2012102517USR91 | jobid0121 | 201210251715 | Name=User_91
2012102517USR10 | jobid0121 | 201210251735 | Status=Processing, Name=User_10
2012102517USR10 | jobid0125 | 201210251755 | RoleExec=IN_121, Name=User_10, LastOp=Decrypt
2012102517USR13 | jobid0002 | 201210251759 | Name=User_13
2012102517USR13 | jobid0001 | 201210251755 | Name=User_13, Status=Aborted
26. Querying the table service
[Figure: the same five JOBEXECUTION entities shown on slide 25.]
35. Managing tables in WATS
Use the Microsoft.WindowsAzure.StorageClient namespace. There are several ways of creating tables.
Initialize tables once, for example in the session or application start events.
Use CreateCloudTableClient to interact with WATS.
For example, to create a table called 'Products':
using Microsoft.WindowsAzure;
using Microsoft.WindowsAzure.StorageClient;

// "ProductStorage" names the configuration setting that holds the connection string.
CloudStorageAccount clAccount =
    CloudStorageAccount.FromConfigurationSetting("ProductStorage");
CloudTableClient tableClient = clAccount.CreateCloudTableClient();
// Create the table only if it is not already there.
tableClient.CreateTableIfNotExist("Products");
38. CRUD operations
Setting up the data context
In .NET, a context class is required to access WATS through WCF Data Services, together with the corresponding entity class (GE06LEntry).
using System.Linq;
using Microsoft.WindowsAzure.StorageClient;

namespace GE06L_Data {
    // Data context that exposes the GE06LEntry table as a queryable set.
    public class GE06LDataContext : TableServiceContext {
        public GE06LDataContext(string baseAddress,
            Microsoft.WindowsAzure.StorageCredentials credentials)
            : base(baseAddress, credentials) {}

        public IQueryable<GE06LEntry> GE06LEntry {
            get { return this.CreateQuery<GE06LEntry>("GE06LEntry"); }
        }
    }
}
39. Entity CRUD operations
List, Add, Update
// List (read) operation: returns the GE06LEntry entities in today's partition
public IEnumerable<GE06LEntry> GetGE06LEntries() {
    var results = from g in this.context.GE06LEntry
                  where g.PartitionKey == DateTime.UtcNow.ToString("MMddyyyy")
                  select g;
    return results;
}
// Add operation for GE06LEntry entities
public void AddGE06LEntry(GE06LEntry newItem) {
    this.context.AddObject("GE06LEntry", newItem);
    this.context.SaveChanges();
}
// Update operation: point lookup by PartitionKey and RowKey, then save
public void UpdateFileURL(string partitionKey, string rowKey,
                          string fileUrl) {
    var results = from g in this.context.GE06LEntry
                  where g.PartitionKey == partitionKey && g.RowKey == rowKey
                  select g;
    var entry = results.FirstOrDefault<GE06LEntry>();
    if (entry == null) return; // no matching entity, nothing to update
    entry.FileUrl = fileUrl;
    this.context.UpdateObject(entry);
    this.context.SaveChanges();
}
40. Querying MSATS
Single Entity (a.k.a. Point Queries)
var q = (
    from pais in context.CreateQuery<Pais>(tableName)
    where pais.PartitionKey == "Paises" &&
          pais.RowKey == "España"
    select pais);
42. Querying MSATS
Range Queries - Row Range Scan (within a single partition)
var q = (
    from movie in context.CreateQuery<Movie>(tableName)
    where
        movie.PartitionKey == "Action" &&
        movie.RowKey.CompareTo("Alien") >= 0 &&
        movie.RowKey.CompareTo("Terminator") <= 0 &&
        movie.IsFavorite
    select movie
);
43. Querying MSATS
Full Table Scan
// Four alternative queries; each one scans the whole table.
// 1) No filter at all
var q = (from movie in context.CreateQuery<Movie>(tableName)
         select movie);
// 2) Inequality on PartitionKey
var q = (from movie in context.CreateQuery<Movie>(tableName)
         where movie.PartitionKey.CompareTo("Action") != 0
         select movie);
// 3) Filter on a non-key property only
var q = (from movie in context.CreateQuery<Movie>(tableName)
         where movie.IsFavorite
         select movie);
// 4) RowKey range without a PartitionKey
var q = (from movie in context.CreateQuery<Movie>(tableName)
         where movie.RowKey.CompareTo("Sh") >= 0 &&
               movie.RowKey.CompareTo("Si") < 0
         select movie);
46. MongoDB
Motivation
• "Problems" in traditional RDBMSs?
– Transactions
– JOINs
• We do without them
– Transactions at document level
– JOINs are not available
48. MongoDB: another paradigm
• Schema-on-read philosophy
• The schema matters (even if it is loose)
• At some point you need to "model"
– Sooner or later
• Memory-mapped models
• "Beer with alcohol vs. alcohol-free beer", by Garrigós
• The x64 architecture is important!
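The schema-on-read idea can be sketched without a database: the stored documents differ in shape, and the reading code decides how to interpret a missing field. The example below uses `System.Text.Json` on plain JSON strings as a stand-in for stored documents; it does not use a MongoDB driver, and `SchemaOnRead.ReadString` is a hypothetical helper.

```csharp
using System;
using System.Text.Json;

// Hypothetical helper: the "schema" lives in the reader, not in the stored document.
public static class SchemaOnRead
{
    // Returns the string value of a field, or the fallback when the document lacks it.
    // Assumes the JSON root is an object.
    public static string ReadString(string json, string field, string fallback)
    {
        using (JsonDocument doc = JsonDocument.Parse(json))
        {
            JsonElement value;
            bool found = doc.RootElement.TryGetProperty(field, out value)
                         && value.ValueKind == JsonValueKind.String;
            return found ? value.GetString() : fallback;
        }
    }
}
```

For instance, reading `Status` from a document that only has `Name` yields the fallback chosen by the caller, so the modeling decision is made when the data is read rather than when it is written.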
49. MongoDB: key concepts
• Although we see JSON, data is stored in binary
– BSON
• JavaScript and JSON administration console
• Several SSMS-style GUIs
• Mongo drivers to ease development
– C# driver, with LINQ support
• Persistence models for mapping
– MongoRepository
• Sharding and replica sets
52. Scenarios
Highly scalable applications that need simple, fast queries
• UN: georeferenced data, computed information
• Energy company: metadata
• Big data: time-series/sensor data
• Social information / social networks
• Applications for locating mail, addresses, etc.
56. If you want to enjoy the best sessions from our mentors in Spain and Latin America, this is your chance.
http://summit.solidq.com
Editor's notes
Key-Value Stores: Dynamo DHT, Voldemort (used at LinkedIn)
Column Families: Google BigTable, Cassandra (Facebook)
Document DBs: CouchDB, MongoDB
Graph DBs: Neo4j
Note: Windows Azure Blob service has support for taking snapshots of blobs. The difference between a snapshot and a blob copy is that snapshots are read-only and the original blob maintains a relationship to its snapshots; blob copies, on the other hand, are editable. Once a snapshot has been taken of a blob, the source blob can no longer be deleted until all of its snapshots have been deleted.
Security in WATS
Once you have created a storage service account, you will receive two 512-bit secret keys, called the primary and secondary access keys. Either of these keys can be used to authenticate user requests to the storage system by creating an HMAC SHA256 signature for the request. The signature is passed with each request to authenticate it. Having two access keys allows you to regenerate keys by rotating between the primary and secondary key in your existing live applications.
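A minimal sketch of the signing step, assuming the account keys are handled as Base64 strings. The string that is signed here is a placeholder: the exact string-to-sign format is defined by the storage service and is not reproduced. `RequestSigning`, `Sign` and `IsValid` are hypothetical names.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

// Hypothetical helper illustrating HMAC SHA256 signing and two-key rotation.
public static class RequestSigning
{
    // Base64 HMAC-SHA256 signature of stringToSign, keyed with a Base64-encoded account key.
    public static string Sign(string base64Key, string stringToSign)
    {
        using (var hmac = new HMACSHA256(Convert.FromBase64String(base64Key)))
        {
            byte[] hash = hmac.ComputeHash(Encoding.UTF8.GetBytes(stringToSign));
            return Convert.ToBase64String(hash);
        }
    }

    // A signature made with either key is accepted, which is what makes rotation possible.
    public static bool IsValid(string signature, string stringToSign,
                               string primaryKey, string secondaryKey) =>
        signature == Sign(primaryKey, stringToSign) ||
        signature == Sign(secondaryKey, stringToSign);
}
```

During rotation, an application can switch to the secondary key while the primary one is regenerated, and requests signed with either key remain valid in the meantime.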
Property Types
The Table service supports a subset of data types defined by the WCF Data Services specification.
By default a property is created as type String, unless you specify a different type. To explicitly type a property, specify its data type by using the appropriate WCF Data Services type in the Atom feed for an Insert Entity (REST API) or Update Entity (REST API) operation.
This table contains a small subset of data for footrace event registrations. It presents a conceptual view of partitioning where the PartitionKey contains three different values comprised of the event's name and distance. In this example, there are two partition servers.
Server A contains registrations for the half-marathon and 10 Km distances
Server B contains only the full-marathon distances.
The RowKey values are shown to provide context but are not meaningful for this example.
Scalability
Because a partition is always served from a single partition server and each partition server can serve one or more partitions, the efficiency of serving entities is correlated with the health of the server. Servers that encounter high traffic for their partitions may not be able to sustain a high throughput. For example, in the figure in the slide, if there are many requests for "2011 New York City Marathon__Half", server A may become too hot. To increase the throughput of the server, the storage system load-balances the partitions to other servers. The result is that the traffic is distributed across many other servers. For optimal load balancing of traffic, you should use more partitions because it will allow the Windows Azure Table service to distribute the partitions to more partition servers.
Note: the TableServiceContext class is in the storage client API. This class derives from DataServiceContext in WCF Data Services and manages the credentials required to access your storage account as well as providing support for a retry policy for its operations.
Type A: Single Entity (a.k.a. Point Queries): A point query retrieves a single entity by specifying a single PartitionKey and RowKey using equality predicates.
var q = (from movie in context.CreateQuery<Movie>(tableName) where movie.PartitionKey == "Action" && movie.RowKey == "Terminator" select movie);
Type B: Row Range Scan: Query results in scanning a range of rows within a single partition
var q = (from movie in context.CreateQuery<Movie>(tableName) where movie.PartitionKey == "Action" && movie.RowKey.CompareTo("Alien") >= 0 && movie.RowKey.CompareTo("Terminator") <= 0 && movie.IsFavorite select movie);
Type C: Partition Range Scan: Query results in scanning a range of rows that may span across several partitions.
var q = (from movie in context.CreateQuery<Movie>(tableName) where movie.PartitionKey.CompareTo("Action") >= 0 && movie.PartitionKey.CompareTo("War") < 0 && movie.IsFavorite select movie);
Type D: Full Table Scan: Query results in scanning the entire table, i.e. all rows in all partitions in a table.
var q = (from movie in context.CreateQuery<Movie>(tableName) select movie);
var q = (from movie in context.CreateQuery<Movie>(tableName) where movie.PartitionKey.CompareTo("Action") != 0 select movie);
var q = (from movie in context.CreateQuery<Movie>(tableName) where movie.IsFavorite select movie);
var q = (from movie in context.CreateQuery<Movie>(tableName) where movie.RowKey.CompareTo("Sh") >= 0 && movie.RowKey.CompareTo("Si") < 0 select movie);
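The RowKey range from "Sh" (inclusive) to "Si" (exclusive) in the last query is a prefix search: it selects the keys that start with "Sh". The sketch below reproduces that idea in memory with LINQ to Objects, assuming keys are compared ordinally; `RowKeyRanges` and its methods are hypothetical names, and nothing here calls the Table service.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical in-memory illustration of a prefix search expressed as a key range.
public static class RowKeyRanges
{
    // Exclusive upper bound for all strings that start with prefix, e.g. "Sh" -> "Si".
    public static string PrefixUpperBound(string prefix)
    {
        if (string.IsNullOrEmpty(prefix))
            throw new ArgumentException("Prefix must be non-empty.", nameof(prefix));
        char last = prefix[prefix.Length - 1];
        if (last == char.MaxValue)
            throw new ArgumentException("Last character cannot be incremented.", nameof(prefix));
        return prefix.Substring(0, prefix.Length - 1) + (char)(last + 1);
    }

    // Keys k with prefix <= k < upper bound, compared ordinally and returned in key order.
    public static List<string> WithPrefix(IEnumerable<string> rowKeys, string prefix)
    {
        string upper = PrefixUpperBound(prefix);
        return rowKeys
            .Where(k => string.CompareOrdinal(k, prefix) >= 0 &&
                        string.CompareOrdinal(k, upper) < 0)
            .OrderBy(k => k, StringComparer.Ordinal)
            .ToList();
    }
}
```

In memory, `string.CompareOrdinal` is used instead of `CompareTo` because `CompareTo` on strings is culture-sensitive in .NET. Adding an equality filter on PartitionKey to such a range turns the full table scan into a row range scan inside one partition.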