Enviar búsqueda
Cargar
Migrating from Relational - Data Modeling and Access
•
1 recomendación
•
188 vistas
C
Clarence J M Tauro
Seguir
Denunciar
Compartir
Denunciar
Compartir
1 de 62
Descargar ahora
Descargar para leer sin conexión
Recomendados
Strata 2015 Data Preview: Spark, Data Visualization, YARN, and More
Strata 2015 Data Preview: Spark, Data Visualization, YARN, and More
Paco Nathan
Graph Analytics in Spark
Graph Analytics in Spark
Paco Nathan
Big Data Everywhere Chicago: SQL on Hadoop
Big Data Everywhere Chicago: SQL on Hadoop
BigDataEverywhere
Congressional PageRank: Graph Analytics of US Congress With Neo4j
Congressional PageRank: Graph Analytics of US Congress With Neo4j
William Lyon
Agile Document Models & Data Structures
Agile Document Models & Data Structures
Clarence J M Tauro
标示简写易读器(MarkView) 演示片
标示简写易读器(MarkView) 演示片
Shane Weng
NoSQL_Night
NoSQL_Night
Clarence J M Tauro
Couchbase Performance Benchmarking
Couchbase Performance Benchmarking
Renat Khasanshyn
Recomendados
Strata 2015 Data Preview: Spark, Data Visualization, YARN, and More
Strata 2015 Data Preview: Spark, Data Visualization, YARN, and More
Paco Nathan
Graph Analytics in Spark
Graph Analytics in Spark
Paco Nathan
Big Data Everywhere Chicago: SQL on Hadoop
Big Data Everywhere Chicago: SQL on Hadoop
BigDataEverywhere
Congressional PageRank: Graph Analytics of US Congress With Neo4j
Congressional PageRank: Graph Analytics of US Congress With Neo4j
William Lyon
Agile Document Models & Data Structures
Agile Document Models & Data Structures
Clarence J M Tauro
标示简写易读器(MarkView) 演示片
标示简写易读器(MarkView) 演示片
Shane Weng
NoSQL_Night
NoSQL_Night
Clarence J M Tauro
Couchbase Performance Benchmarking
Couchbase Performance Benchmarking
Renat Khasanshyn
Choosing the Right Graph Database to Succeed in Your Project
Choosing the Right Graph Database to Succeed in Your Project
Ontotext
Putting Apache Drill into Production
Putting Apache Drill into Production
MapR Technologies
How Spark is Enabling the New Wave of Converged Applications
How Spark is Enabling the New Wave of Converged Applications
MapR Technologies
Insight Platforms Accelerate Digital Transformation
Insight Platforms Accelerate Digital Transformation
MapR Technologies
Is Spark Replacing Hadoop
Is Spark Replacing Hadoop
MapR Technologies
Cassandra introduction 2016
Cassandra introduction 2016
Duyhai Doan
CEP - simplified streaming architecture - Strata Singapore 2016
CEP - simplified streaming architecture - Strata Singapore 2016
Mathieu Dumoulin
Slides: NoSQL Data Modeling Using JSON Documents – A Practical Approach
Slides: NoSQL Data Modeling Using JSON Documents – A Practical Approach
DATAVERSITY
Bloor Research & DataStax: How graph databases solve previously unsolvable bu...
Bloor Research & DataStax: How graph databases solve previously unsolvable bu...
DataStax
YARN webinar series: Using Scalding to write applications to Hadoop and YARN
YARN webinar series: Using Scalding to write applications to Hadoop and YARN
Hortonworks
Real-time Hadoop: The Ideal Messaging System for Hadoop
Real-time Hadoop: The Ideal Messaging System for Hadoop
DataWorks Summit/Hadoop Summit
Evolving Beyond the Data Lake: A Story of Wind and Rain
Evolving Beyond the Data Lake: A Story of Wind and Rain
MapR Technologies
2017 big data landscape and cutting edge innovations public
2017 big data landscape and cutting edge innovations public
Evans Ye
Architecting a next generation data platform
Architecting a next generation data platform
hadooparchbook
Real-World NoSQL Schema Design
Real-World NoSQL Schema Design
DataWorks Summit/Hadoop Summit
Streaming in the Extreme
Streaming in the Extreme
Julius Remigio, CBIP
How Spark is Enabling the New Wave of Converged Cloud Applications
How Spark is Enabling the New Wave of Converged Cloud Applications
MapR Technologies
The Why, When, and How of NoSQL - A Practical Approach
The Why, When, and How of NoSQL - A Practical Approach
DATAVERSITY
Ian Margetts - ASOS’ Journey to Continuous Deployment
Ian Margetts - ASOS’ Journey to Continuous Deployment
WinOps Conf
Resume Vardan Karapetian Updated
Resume Vardan Karapetian Updated
vkarapet
Más contenido relacionado
Similar a Migrating from Relational - Data Modeling and Access
Choosing the Right Graph Database to Succeed in Your Project
Choosing the Right Graph Database to Succeed in Your Project
Ontotext
Putting Apache Drill into Production
Putting Apache Drill into Production
MapR Technologies
How Spark is Enabling the New Wave of Converged Applications
How Spark is Enabling the New Wave of Converged Applications
MapR Technologies
Insight Platforms Accelerate Digital Transformation
Insight Platforms Accelerate Digital Transformation
MapR Technologies
Is Spark Replacing Hadoop
Is Spark Replacing Hadoop
MapR Technologies
Cassandra introduction 2016
Cassandra introduction 2016
Duyhai Doan
CEP - simplified streaming architecture - Strata Singapore 2016
CEP - simplified streaming architecture - Strata Singapore 2016
Mathieu Dumoulin
Slides: NoSQL Data Modeling Using JSON Documents – A Practical Approach
Slides: NoSQL Data Modeling Using JSON Documents – A Practical Approach
DATAVERSITY
Bloor Research & DataStax: How graph databases solve previously unsolvable bu...
Bloor Research & DataStax: How graph databases solve previously unsolvable bu...
DataStax
YARN webinar series: Using Scalding to write applications to Hadoop and YARN
YARN webinar series: Using Scalding to write applications to Hadoop and YARN
Hortonworks
Real-time Hadoop: The Ideal Messaging System for Hadoop
Real-time Hadoop: The Ideal Messaging System for Hadoop
DataWorks Summit/Hadoop Summit
Evolving Beyond the Data Lake: A Story of Wind and Rain
Evolving Beyond the Data Lake: A Story of Wind and Rain
MapR Technologies
2017 big data landscape and cutting edge innovations public
2017 big data landscape and cutting edge innovations public
Evans Ye
Architecting a next generation data platform
Architecting a next generation data platform
hadooparchbook
Real-World NoSQL Schema Design
Real-World NoSQL Schema Design
DataWorks Summit/Hadoop Summit
Streaming in the Extreme
Streaming in the Extreme
Julius Remigio, CBIP
How Spark is Enabling the New Wave of Converged Cloud Applications
How Spark is Enabling the New Wave of Converged Cloud Applications
MapR Technologies
The Why, When, and How of NoSQL - A Practical Approach
The Why, When, and How of NoSQL - A Practical Approach
DATAVERSITY
Ian Margetts - ASOS’ Journey to Continuous Deployment
Ian Margetts - ASOS’ Journey to Continuous Deployment
WinOps Conf
Resume Vardan Karapetian Updated
Resume Vardan Karapetian Updated
vkarapet
Similar a Migrating from Relational - Data Modeling and Access
(20)
Choosing the Right Graph Database to Succeed in Your Project
Choosing the Right Graph Database to Succeed in Your Project
Putting Apache Drill into Production
Putting Apache Drill into Production
How Spark is Enabling the New Wave of Converged Applications
How Spark is Enabling the New Wave of Converged Applications
Insight Platforms Accelerate Digital Transformation
Insight Platforms Accelerate Digital Transformation
Is Spark Replacing Hadoop
Is Spark Replacing Hadoop
Cassandra introduction 2016
Cassandra introduction 2016
CEP - simplified streaming architecture - Strata Singapore 2016
CEP - simplified streaming architecture - Strata Singapore 2016
Slides: NoSQL Data Modeling Using JSON Documents – A Practical Approach
Slides: NoSQL Data Modeling Using JSON Documents – A Practical Approach
Bloor Research & DataStax: How graph databases solve previously unsolvable bu...
Bloor Research & DataStax: How graph databases solve previously unsolvable bu...
YARN webinar series: Using Scalding to write applications to Hadoop and YARN
YARN webinar series: Using Scalding to write applications to Hadoop and YARN
Real-time Hadoop: The Ideal Messaging System for Hadoop
Real-time Hadoop: The Ideal Messaging System for Hadoop
Evolving Beyond the Data Lake: A Story of Wind and Rain
Evolving Beyond the Data Lake: A Story of Wind and Rain
2017 big data landscape and cutting edge innovations public
2017 big data landscape and cutting edge innovations public
Architecting a next generation data platform
Architecting a next generation data platform
Real-World NoSQL Schema Design
Real-World NoSQL Schema Design
Streaming in the Extreme
Streaming in the Extreme
How Spark is Enabling the New Wave of Converged Cloud Applications
How Spark is Enabling the New Wave of Converged Cloud Applications
The Why, When, and How of NoSQL - A Practical Approach
The Why, When, and How of NoSQL - A Practical Approach
Ian Margetts - ASOS’ Journey to Continuous Deployment
Ian Margetts - ASOS’ Journey to Continuous Deployment
Resume Vardan Karapetian Updated
Resume Vardan Karapetian Updated
Migrating from Relational - Data Modeling and Access
1.
©2016 Couchbase Inc. Migrating from relational data modeling and access Brant Burnett, Lead Developer, CenterEdge Clarence Tauro, Sr Trainer, Couchbase Marco Greco, Sr Engineer, N1QL R&D, Couchbase 1
2.
©2016 Couchbase Inc. ©2016 Couchbase Inc. Agenda • Practical considerations for data and application migration • Modeling in Couchbase •
Real life experience: Centeredge 2
3.
©2016 Couchbase Inc. 3 Practical Considerations
4.
©2016 Couchbase Inc. ©2016 Couchbase Inc. In this section • Nomenclature • Type and data model mapping •
Migrating data • Business logic • Monitoring 4
5.
©2016 Couchbase Inc. ©2016 Couchbase Inc. Nomenclature 5 Oracle Couchbase Database Bucket Table
Bucket Row Document Column Field
6.
©2016 Couchbase Inc. ©2016 Couchbase Inc. Type Mapping 6 Oracle (PL/SQL Synonyms) Couchbase Number, Binary_real, Binary_integer Smallint, Int, Dec, Decimal, Float, … Number Char, Nchar, Varchar2, Nvarchar2 Character, String
String Boolean Boolean Date, Timestamp Handled via String Interval (year to month, day to fracPon) Some support via _millis() funcPons
7.
©2016 Couchbase Inc. ©2016 Couchbase Inc. Modelling 7 CustomerID Name DOB CBC2016
Jane Smith 1990-01-30 CustomerID Type Cardnum Expiry CBC2016 visa 5827… 2019-03 CBC2016 master 6274… 2018-12 CustomerID ConnId Name CBC2016 XYZ987 Joe Smith CBC2016 SKR007 Sam Smith CustomerID item amt CBC2016 mac 2823.52 CBC2016 ipad2 623.52 CustomerID ConnId Name CBC2016 XYZ987 Joe Smith CBC2016 SKR007 Sam Smith Contacts Customer Billing ConnecPons Purchases { "Name" : "Jane Smith", "DOB" : "1990-01-30", "Billing" : [ { "type" : "visa", "cardnum" : "5827-2842-2847-3909", "expiry" : "2019-03" }, { "type" : "master", "cardnum" : "6274-2842-2847-3909", "expiry" : "2019-03" } ], "ConnecPons" : [ { "CustId" : "XYZ987", "Name" : "Joe Smith" }, { "CustId" : "PQR823", "Name" : "Dylan Smith" } { "CustId" : "PQR823", "Name" : "Dylan Smith" } ], "Purchases" : [ { "id":12, item: "mac", "amt": 2823.52 } { "id":19, item: "ipad2", "amt": 623.52 } ] } DocumentKey: CBC2016
8.
©2016 Couchbase Inc. ©2016 Couchbase Inc. Migration • Generalized process • Commercial tools •
Talend • Informatica • Open source • Couchbase java importer • Oracle2couchbase • SQSL • Importing from files 8
9.
©2016 Couchbase Inc. ©2016 Couchbase Inc. Migration Process • High level process to migrate data from RDBMS to Couchbase using N1QL • For each table •
Determine primary key columns • Describe table • For each row • Generate document key from primary key columns • Generate document from projection list description, column values • INSERT INTO <bucket> (key, value) ($1, $2) • Use key and document as placeholder values 9
10.
©2016 Couchbase Inc. ©2016 Couchbase Inc. Talend • Talend connector for Couchbase -- Talend 5.3 or later • http://developer.couchbase.com/documentation/server/4.5/ connectors/talend/talend.html •
Ingesting unstructured data • Couchbase view support • Seamless integration with Couchbase • tCouchbaseInput • Incoming data transformed into JSON documents and stored in Couchbase. • User defines the data fields to be transformed into JSON attributes • tCouchbaseOutput: uses the schema mapping to transform JSON documents into target data formats • ODBC/JDBC drivers (provided by Simba and CData) 10
11.
©2016 Couchbase Inc. ©2016 Couchbase Inc. Informatica 11 • Informatica Power Center • Needs ODBC driver •
Informatica Cloud • Needs JDBC driver • ODBC/JDBC drivers (provided by Simba and CData) • ETL & Data Integration • Load data from any Relational system into Couchbase • Export Couchbase data into RDBMS • Seamlessly integrate Couchbase into rest of the Data fabric
12.
©2016 Couchbase Inc. ©2016 Couchbase Inc. Couchbase java importer • Blog post by Laurent Doguin detailing journey from process to code • Java based, but principle applies to other languages •
Geared to Postgres but principle applies to other engines • Blog: http://blog.couchbase.com/2016/january/moving-sql-database-content-to-couchbase • Source code: https://github.com/ldoguin/couchbase-java-importer 12
13.
©2016 Couchbase Inc. ©2016 Couchbase Inc. Oracle2couchbase • Another blog post / opensource tool • By Manuel Hurtado •
Java based • Migrates from Oracle • Blog: http://blog.couchbase.com/2016/february/moving-data-from-oracle-to-couchbase • Source and binary: https://github.com/mahurtado/oracle2couchbase 13
14.
©2016 Couchbase Inc. ©2016 Couchbase Inc. SQSL • Client side SQL like scripting language developed by yours truly two decades ago • Several nifty features like •
Expansion • Data driven operation • On the fly aggregation and redirection • User defined routines • Have recently written data source for Couchbase and json library • Source: http://www.sqsl.org • Example: let fromconn="sample"; connect to fromconn source db2cli; connect to "couchbase://192.168.1.104:8091" source cb; select * from db2inst1.dept connection fromconn insert into default (key, value) values($1, $2) using json:key("::", columns), json:row2doc(displaylabels, columns); 14
15.
©2016 Couchbase Inc. ©2016 Couchbase Inc. Import / Export utilities • Upcoming version includes cbimport & cbexport • File based utilities •
Need to export RDBDMS data to file first • Load directly into data store bypassing N1QL 15
16.
©2016 Couchbase Inc. ©2016 Couchbase Inc. Business logic • DDL • Views •
Triggers • Procedures • Sequences • Joins • Transactions 16
17.
©2016 Couchbase Inc. ©2016 Couchbase Inc. Language comparison 17 Query Features SQL on RDBMS N1QL DML
§ SELECT, INSERT, UPDATE, DELETE, MERGE § SELECT, INSERT, UPDATE, DELETE, MERGE DDL § CREATE [INDEX, PROCEDURE TABLE, TYPE, VIEW…] § ALTER [TABLE, TYPE, …] § DROP [INDEX, PROCEDURE TABLE, TYPE, VIEW…] § CREATE [PRIMARY] INDEX § DROP [PRIMARY] INDEX Query OperaDons § Select, Join, Project, Subqueries § Strict Schema § Strict Type checking § Select, Join, Project, Subqueries ü Nest & Unnest ü Look Ma! No Type Mismatch Errors! § JSON keys act as columns Schema § Predetermined Columns ü Fully addressable JSON ü Flexible document structure Data Types § SQL Data types § Conversion FuncPons § JSON Data types § Conversion FuncPons Query Processing § INPUT: Sets of Tuples § OUPUT: Set of Tuples § INPUT: Sets of JSON § OUTPUT: Set of JSON
18.
©2016 Couchbase Inc. ©2016 Couchbase Inc. DDL • Only Create Index / Drop Index exist in N1QL • Everything else should be removed from the application •
Temporary tables • Materialize results • Store in memory, or • Insert materialized document in a keyspace using a designated “type”: field and a UUID() as key • DROP <temporary table> becomes DELETE FROM keyspace WHERE type=… and ID=<UUID> 18
19.
©2016 Couchbase Inc. ©2016 Couchbase Inc. Views • Access underlying keyspaces instead • Something akin to views can be obtained with •
View indexes • Functional indexes • CREATE INDEX … WHERE clauses 19
20.
©2016 Couchbase Inc. ©2016 Couchbase Inc. Statement blocks • Handled by the application: • Triggers •
Procedures 20
21.
©2016 Couchbase Inc. ©2016 Couchbase Inc. Sequences • The eventual persistence engine handles atomic increments • Special documents can be created with a counter and accessed atomically •
Can specify a delta on creation • Must be done from SDK • In python: • N1QL does not • Use UUID() instead 21
22.
©2016 Couchbase Inc. ©2016 Couchbase Inc. Joins • Two types of joins • Look up •
Index • Joins use the document key • Joining side can be an expression • Joined side is document key • Full expression joins not supported 22
23.
©2016 Couchbase Inc. ©2016 Couchbase Inc. Transactions • In two words: No need • Document modification is atomic •
Consistency can be specified at the REST call level or SDK • REST example • Add scan_consistency=[not_bounded|at_plus|request_plus|statement_plus] to REST parameters • C SDK example • Use lcb_n1p_setconsistency (…, [LCB_N1P_CONSISTENCY_NONE, LCB_N1P_CONSISTENCY_RYOW, LCB_N1P_CONSISTENCY_REQUEST, LCB_N1P_CONSISTENCY_STATEMENT]) when setting up request args 23
24.
©2016 Couchbase Inc. ©2016 Couchbase Inc. Monitoring • Oracle • ALTER [system|session] SET timed_statistics=true turns on timed statistics collection. •
V$SESSTAT, V$SYSSTAT, V$STATNAME dynamic performance views report timed statistics. • EXPLAIN PLAN explains a statement. • MySQL • SET profiling=1 turns on profiling • SHOW PROFILES displays available query profiles • SHOW PROFILE displays the profile for a specific query • EXPLAIN <statement> produces query plan • Couchbase • system:completed_requests virtual keyspace lists completed long running queries with timings and statistics • system:active_requests virtual keyspace lists active queries with timings and statistics • EXPLAIN <statement> explains request plan as a json document 24
25.
©2016 Couchbase Inc. 25 Modeling
26.
©2016 Couchbase Inc. ©2016 Couchbase Inc. What is Data Modeling? 26 • A data model is a conceptual representation of the data structures that are required by a database • The data structures include the data objects, the associations between data objects, and the rules which govern operations on the objects.
27.
©2016 Couchbase Inc. ©2016 Couchbase Inc. Conceptual Data Modeling • Define entities, attributes and their relationships • Entities: Main objects that are targets of your apps operates on •
Attributes: properties that your applications keep track of for the entity • Relationships: definition connections to other entities - 1-1, 1-many, many-many Airline Airport Landmark Route Passenger Flight
28.
©2016 Couchbase Inc. ©2016 Couchbase Inc. Physical Data Model • Phase II - Map entities, attributes and their relationships to containers provided by the underlying database solution RelaDonal Databases Couchbase Server Databases
Buckets Tables Documents with type designator apribute OR Compound Keys Rows Items (Key-Value or Key-Document) Columns AMributes Index Index
29.
©2016 Couchbase Inc. ©2016 Couchbase Inc. dean@couchbase.com { “name”:…, “flights”:[ {“_id”:“route_1000”, “flight”:…,}, {“_id”:”route_6421”, “flight”:…,} …], … } Physical Data Modeling route_1000 { “id”:”1000”, “airline”: “AF”, “sourceairport”:”TLV”, “desPnaPonairport”:”MRW”, …} airport_TLV { “id”:”126701”, “airportname”: “TLV”, “geo”:{ “lat”:…,“long”:…}, …} Flights
30.
©2016 Couchbase Inc. ©2016 Couchbase Inc. Data Modeling Approaches 30 NoSQL Relaxed NormalizaPon schema implied by structure fields may be empty, duplicate, or missing RelaDonal Required NormalizaPon schema enforced by db same fields in all records • Minimize data inconsistencies (one item = one locaPon) • Reduced update cost (no duplicated data) •
Preserve storage resources • OpPmized to planned/actual access paperns • Flexibly with soxware architecture • Supports clustered architecture • Reduced server overhead
31.
©2016 Couchbase Inc. ©2016 Couchbase Inc. JSON Design Choices 31 • Couchbase Server neither enforces nor validates for any particular document structure • Choices that impact JSON document design: –
Single Root Attributes – Objects vs. Arrays – Array Element Types – Timestamp Formats – Empty and Null Property Values – JSON Schema
32.
©2016 Couchbase Inc. ©2016 Couchbase Inc. Root Attributes vs. Embedded Attributes 32 • The choice of having a single root attribute or the “type” attribute embedded.
33.
©2016 Couchbase Inc. ©2016 Couchbase Inc. Objects vs. Arrays 33 • The choice of having an object type, or an array type
34.
©2016 Couchbase Inc. ©2016 Couchbase Inc. Array Element Types Array of strings Array of objects 34 • Array elements can be simple types, objects or arrays:
35.
©2016 Couchbase Inc. ©2016 Couchbase Inc. Timestamp Formats Array of Pme components String (ISO 8601) Number (Unix style) (Epoch) • Working and dealing with timestamps has been challenging ever since • When storing timestamps, you have at least 3 options: 16
36.
©2016 Couchbase Inc. ©2016 Couchbase Inc. Empty and Null Property Values 36 • Keep in mind that JSON supports optional properties • If a property has a null value, consider dropping it from the JSON, unless there's a good reason not to •
N1QL makes it easy to test for missing or null property values • Be sure your application code handles the case where a property value is missing SELECT * FROM couchmusic1 WHERE userprofile.address IS NULL; SELECT * FROM couchmusic1 WHERE userprofile.gender IS MISSING;
37.
©2016 Couchbase Inc. ©2016 Couchbase Inc. JSON Schema 37 • Couchbase Server pays absolutely no attention to the shape of your JSON documents so long as they are well-formed • There are times when it is useful to validate that a JSON document conforms to some expected shape •
JSON Schema is a JSON-based format for defining the structure of JSON data • There are implementations for most popular programming languages • Learn more here: http://json-schema.org
38.
©2016 Couchbase Inc. ©2016 Couchbase Inc. Data Nesting (aka Denormalization) 38 • Relational database design promotes separating data using normalization, which doesn’t scale • For NoSQL systems, we often avoid normalization so that we can scale •
Nesting allows related objects to be organized into a hierarchical tree structure where you can have multiple levels of grouping • Rule of thumb is to nest no more than 3 levels deep unless there is a very good reason to do so • You will often want to include a timestamp in the nested data
39.
©2016 Couchbase Inc. ©2016 Couchbase Inc. Example of Data Nesting • Playlist with owner attribute containing username of corresponding userprofile 39 Document Key: copilotmarks61569
40.
©2016 Couchbase Inc. ©2016 Couchbase Inc. Example of Data Nesting • Playlist with owner attribute containing a subset of the corresponding userprofile 40 * Note the inclusion of the updated apribute
41.
©2016 Couchbase Inc. ©2016 Couchbase Inc. Choices with JSON Key Design 41 • A key formed of attributes that exist in the real world: – Phone numbers –
Usernames – Social security numbers – Account numbers – SKU, UPC or QR codes – Device IDs
42.
©2016 Couchbase Inc. ©2016 Couchbase Inc. Surrogate Keys 42 • We often use surrogate keys when no obvious natural key exist • They are not derived from application data •
They can be generated values – 3305311F4A0FAAFEABD001D324906748B18FB24A (SHA-1) – 003C6F65-641A-4CGA-8E5E-41C947086CAE (UUID) • They can be sequential numbers (often implemented using the Counter feature of Couchbase Server) – 456789, 456790, 456791, …
43.
©2016 Couchbase Inc. ©2016 Couchbase Inc. Making Tough Choices 43 • We must also make trade-offs in data modeling: – Document size –
Atomicity – Complexity – Speed
44.
©2016 Couchbase Inc. ©2016 Couchbase Inc. Embed vs. Refer 44 • All of the previous trade-offs are usually rolled into a single decision – whether to embed or refer • When to embed? •
When to refer?
45.
©2016 Couchbase Inc. ©2016 Couchbase Inc. Want to know more on Data Modeling? 45 • Session tomorrow – “Agile Document Models and Data Structures” at 1:00PM
46.
©2016 Couchbase Inc. ©2016 Couchbase Inc. 46 Brant Burnett Software
Development Team Lead Couchbase Community Expert
47.
©2016 Couchbase Inc. ©2016 Couchbase Inc. About CenterEdge Software ü Point of
Sale ü Admissions & Ticketing ü Party, Group & Event Bookings ü Online Sales & Party Reservations ü Time Clock & Labor Management ü & More! 47
48.
©2016 Couchbase Inc. ©2016 Couchbase Inc. About CenterEdge Software • Celebrating 12
Year Anniversary • Team of 50 in Roxboro, NC • Sister company is Palace Pointe, a 100k sq. ft. Entertainment Venue for which we were developed as an in-house system • Over 600 facilties using our platform across the US and abroad • FEC’s, Waterparks, Trampoline Parks, Amusement Parks, Skating Rinks, Bowling Centers, Zoos & Museums 48
49.
©2016 Couchbase Inc. ©2016 Couchbase Inc. Why Couchbase For CenterEdge’s Newest Cloud Platform? • More scalable
and performant than traditional SQL in the cloud • Previous online store system uses 19 SQL servers, each hosting 30 stores • As each store is only on a single server, it doesn’t handle spikes in load efficiently • Servers can’t be scaled vertically without downtime for all 30 stores on that server • Schema-less JSON increases flexibility as your system evolves, leaving schema enforcement in your data access layer • Schema changes to large tables can result in downtime as data structure is updated across all records • We were already using Couchbase for our shopping carts as well as a SQL caching layer, with great success. Now we can simplify the architecture with a single data layer. 49
50.
©2016 Couchbase Inc. ©2016 Couchbase Inc. Couchbase Cloud Data Flow Architecture 50 Data Data Data IndexQuery Web Servers Remote Application Servers
51.
©2016 Couchbase Inc. ©2016 Couchbase Inc. Enforcing Schema • Since Couchbase doesn’t enforce schema like SQL, your data access layer should do so instead • At CenterEdge, each document type is only updated by a single service •
Within that service, schema is enforced by serializing data from consistent POCOs • Schema changes can be supported using customized JSON converters during deserialization • IS MISSING is a good way to recognize the difference in attributes that weren’t stored because the document was saved using the old schema • Where possible, try to predict possible schema needs in advance 51
52.
©2016 Couchbase Inc. ©2016 Couchbase Inc. Pay Attention To Document Modeling Up Front • Watch out for documents that get too large • Might hit 20MB document size limit •
High serialization/deserialization/networking performance penalties • Document contention as too many actions attempt to modify the document simultaneously • Watch out for data spread across too many related documents • Lack of atomic transactions across multiple writes can result in partial updates • Can add latency if documents must be read in a chained manner (i.e. each document contains the key to the next document) • Be sure to include document keys, or a way to construct them, where you may want to use N1QL JOIN or NEST operations • Should the document key be stored inside the document, too? • Increases data size, as the key is in the document and in the metadata • Requires that the data layer maintain consistency • Can make queries easier since you don’t need to use META() function to get the key 52
53.
©2016 Couchbase Inc. ©2016 Couchbase Inc. View Indexes vs. Global Secondary Indexes • Be sure to analyze what type of index is best for each workload • Views are great where pre-aggregating numbers is useful, such as reports, graphs, etc •
GSI is usually the best option for more generic queries, especially if when you’re just trying to collect a set of documents • Views don’t scale as cleanly, they can’t be scaled independently via Multi Dimensional Scaling • Views live on the data nodes, so they only scale as you add more data nodes • At CenterEdge, our new platform started on Couchbase Server 3.0, before Global Secondary Indexes were an option • We used views and lookup documents for most of our indexing needs • We have run into problems with too many views per bucket causing performance bottlenecks • We’re currently transitioning many of these views into Global Secondary Indexes 53
54.
©2016 Couchbase Inc. ©2016 Couchbase Inc. Efficient Indexing Is Especially Important For Couchbase 54 • Primary key scans in SQL have always been inefficient • Every record in the table would be read and checked for a match to the WHERE predicate •
For small tables, the performance penalty was negligible, and would usually go unnoticed • Primary key scans in Couchbase are usually much worse • In our experience with production-scale data, almost invariably results in queries timing out • Every record in the bucket is being read and checked for a match to the WHERE predicate • Can easily result in reading and parsing millions of JSON documents • Will also bust the in-memory cache on the data nodes if there is more data in the bucket than allocated memory • Design every query to be supported by a Global Secondary Index • Helps even if the index isn’t an exact match • A good design can vastly reduce the number of documents scanned, making it more like a SQL primary key scan
55.
©2016 Couchbase Inc. ©2016 Couchbase Inc. Efficient Indexing Is Especially Important For Couchbase 55 /* Use predicate to only index documents of a certain type */ CREATE INDEX `airport_sourceairport` ON `travel-sample` (`sourceairport`) WHERE `type` = 'airport' /* To index the same attribute across multiple document types, include type attribute first */ CREATE INDEX `def_type_id` ON `travel-sample` (`type`, `id`) /* A good practice is to create a fallback in case other indexes aren't used */ CREATE INDEX `def_type` ON `travel-sample` (`type`) If you’re using the “type” attribute as the logical equivalent of a table in SQL, most indexes will include this attribute.
56.
©2016 Couchbase Inc. ©2016 Couchbase Inc. How To Store and Index Date/Times • Date/Times are usually stored as ISO 8601 strings in JSON • Use STR_TO_MILLIS(x) in indexes and queries to work with ISO 8601 strings 56 /* STR_TO_MILLIS converts an ISO8601 string to a Unix numeric representation */ /* It also handles the time zone specifier */ SELECT `Extent1`.* FROM `beer-sample` as `Extent1` WHERE (`type` = 'beer') AND (STR_TO_MILLIS(`Extent1`.`updated`) <= STR_TO_MILLIS("2010-01-01T00:00:00Z")) /* STR_TO_MILLIS must also be used in the index, or the index cannot be used */ CREATE INDEX `beer_updated` ON `beer-sample` (STR_TO_MILLIS(`updated`)) WHERE `type` = 'beer'
57.
©2016 Couchbase Inc. ©2016 Couchbase Inc. Index Performance During Mutations 57 Airline SQL Table Airport SQL Table travel-sample Bucket Airline Indexes Airport Indexes Bucket Indexes Remember that GSI
indexes are similar to SQL indexes, but not the same
58.
©2016 Couchbase Inc. ©2016 Couchbase Inc. Training! • Don’t just assume you can switch to any NoSQL platform without some training • Performance profile is different, and the penalties can appear in different places •
Developers who know the pitfalls in advance can save you a lot of refactoring headaches later • N1QL does help reduce the learning curve significantly • For .Net development shops, look at Linq2Couchbase to make it even easier! • The operations department needs training, too! 58
59.
©2016 Couchbase Inc. 59 Marco Greco Senior Software Engineer marco.greco@couchbase.com Clarence J M Tauro, Ph.D. Senior Instructor clarence@couchbase.com Brant Burnett Lead Developer bburnett@centeredgesoftware.com
60.
©2016 Couchbase Inc. 60 Share your
opinion on Couchbase 1. Go here: http://gtnr.it/2eRxYWn 2. Create a profile 3. Provide feedback (~15 minutes)
61.
©2016 Couchbase Inc. 61 The Couchbase Connect16
mobile app Take our in-app survey!
62.
©2016 Couchbase Inc. Thank You! 62
Descargar ahora