©2016	Couchbase	Inc.	
Migrating	from	relational	
data	modeling	and	access	
Brant	Burnett,	Lead	Developer,	CenterEdge	
Clarence	Tauro,	Sr	Trainer,	Couchbase	
Marco	Greco,	Sr	Engineer,	N1QL	R&D,	Couchbase	
Agenda	
•  Practical	considerations	for	data	and	application	migration	
•  Modeling	in	Couchbase	
•  Real	life	experience:	Centeredge	
Practical	Considerations
In	this	section	
•  Nomenclature	
•  Type	and	data	model	mapping	
•  Migrating	data	
•  Business	logic	
•  Monitoring	
Nomenclature	
Oracle | Couchbase
Database | Bucket
Table | Bucket
Row | Document
Column | Field
Type Mapping
Oracle | PL/SQL synonyms | Couchbase
Number, Binary_real, Binary_integer | Smallint, Int, Dec, Decimal, Float, … | Number
Char, Nchar, Varchar2, Nvarchar2 | Character, String | String
Boolean | | Boolean
Date, Timestamp | | Handled via String
Interval (year to month, day to fraction) | | Some support via _millis() functions
Modelling
Customer
CustomerID | Name | DOB
CBC2016 | Jane Smith | 1990-01-30
Billing
CustomerID | Type | Cardnum | Expiry
CBC2016 | visa | 5827… | 2019-03
CBC2016 | master | 6274… | 2018-12
Connections
CustomerID | ConnId | Name
CBC2016 | XYZ987 | Joe Smith
CBC2016 | SKR007 | Sam Smith
Purchases
CustomerID | item | amt
CBC2016 | mac | 2823.52
CBC2016 | ipad2 | 623.52
Contacts
CustomerID | ConnId | Name
CBC2016 | XYZ987 | Joe Smith
CBC2016 | SKR007 | Sam Smith
{
    "Name" : "Jane Smith",
    "DOB"  : "1990-01-30",
    "Billing" : [
        {
            "type"    : "visa",
            "cardnum" : "5827-2842-2847-3909",
            "expiry"  : "2019-03"
        },
        {
            "type"    : "master",
            "cardnum" : "6274-2842-2847-3909",
            "expiry"  : "2019-03"
        }
    ],
    "Connections" : [
        {
            "CustId"   : "XYZ987",
            "Name"     : "Joe Smith"
        },
        {
            "CustId"   : "PQR823",
            "Name"     : "Dylan Smith"
        }
    ],
    "Purchases" : [
        { "id": 12, "item": "mac",   "amt": 2823.52 },
        { "id": 19, "item": "ipad2", "amt": 623.52 }
    ]
}
DocumentKey: CBC2016
Migration	
•  Generalized	process	
•  Commercial	tools	
•  Talend	
•  Informatica	
•  Open	source	
•  Couchbase	java	importer	
•  Oracle2couchbase	
•  SQSL	
•  Importing	from	files	
Migration	Process	
•  High	level	process	to	migrate	data	from	RDBMS	to	Couchbase	using	N1QL	
•  For	each	table	
•  Determine	primary	key	columns	
•  Describe	table	
•  For	each	row	
•  Generate	document	key	from	primary	key	columns	
•  Generate	document	from	projection	list	description,	column	values	
•  INSERT INTO <bucket> (KEY, VALUE) VALUES ($1, $2)
•  Use	key	and	document	as	placeholder	values	
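The per-row steps above can be sketched in Python. This is a minimal illustration, not one of the tools listed in this deck: the helper names, the "::" key separator, the added "type" field, and the choice to drop NULL columns are all assumptions made for the example. It only builds the statement and placeholder values; sending them to the query service is left to whichever client you use.

```python
from datetime import date, datetime
from decimal import Decimal

# Parameterized N1QL statement; $1 is the document key, $2 the document.
INSERT_TEMPLATE = "INSERT INTO `{bucket}` (KEY, VALUE) VALUES ($1, $2)"


def make_key(table, pk_columns, row, separator="::"):
    """Build a document key from the table name and the primary key values."""
    return separator.join([table] + [str(row[col]) for col in pk_columns])


def to_json_value(value):
    """Map column values that JSON cannot represent natively."""
    if isinstance(value, Decimal):
        return float(value)
    if isinstance(value, (date, datetime)):
        return value.isoformat()  # dates are handled via strings
    return value


def make_document(table, row):
    """Turn one row (column name -> value) into a JSON-ready document."""
    # Assumption: NULL columns are omitted rather than stored as null.
    doc = {col: to_json_value(val) for col, val in row.items() if val is not None}
    # Assumption: a "type" field records the source table (overwrites a column named "type").
    doc["type"] = table
    return doc


def migration_requests(bucket, table, pk_columns, rows):
    """Yield (statement, [key, document]) pairs, one per source row."""
    statement = INSERT_TEMPLATE.format(bucket=bucket)
    for row in rows:
        yield statement, [make_key(table, pk_columns, row), make_document(table, row)]


rows = [{"deptno": 10, "deptname": "SALES", "mgrno": None, "opened": date(2001, 5, 17)}]
generated = list(migration_requests("default", "dept", ["deptno"], rows))
```

For a table with a composite primary key, pass every key column in `pk_columns`; the key then contains each value in order, joined by the separator.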
Talend	
•  Talend	connector	for	Couchbase	--	Talend	5.3	or	later	
•  http://developer.couchbase.com/documentation/server/4.5/connectors/talend/talend.html
•  Ingesting	unstructured	data	
•  Couchbase	view	support	
•  Seamless	integration	with	Couchbase	
•  tCouchbaseInput	
•  Incoming	data	transformed	into	JSON	documents	and		stored	in	
Couchbase.	
•  User	defines	the	data	fields	to	be	transformed	into	JSON	attributes	
•  tCouchbaseOutput:	uses	the	schema	mapping	to	transform	
JSON	documents	into	target	data	formats	
•  ODBC/JDBC	drivers	(provided	by	Simba	and	CData)	
Informatica	
•  Informatica	Power	Center	
•  Needs	ODBC	driver	
	
•  Informatica	Cloud	
•  Needs	JDBC	driver	
	
•  ODBC/JDBC	drivers	(provided	by	Simba	and	CData)	
	
•  ETL	&	Data	Integration	
•  Load	data	from	any	Relational	system	into	Couchbase	
•  Export	Couchbase	data	into	RDBMS	
•  Seamlessly	integrate	Couchbase	into	rest	of	the	Data	fabric
Couchbase	java	importer	
•  Blog	post	by	Laurent	Doguin	detailing	journey	from	process	to	code	
•  Java	based,	but	principle	applies	to	other	languages	
•  Geared	to	Postgres	but	principle	applies	to	other	engines	
•  Blog:	http://blog.couchbase.com/2016/january/moving-sql-database-content-to-couchbase	
•  Source	code:	https://github.com/ldoguin/couchbase-java-importer	
Oracle2couchbase	
•  Another	blog	post	/	opensource	tool	
•  By	Manuel	Hurtado	
•  Java	based	
•  Migrates	from	Oracle	
•  Blog:	http://blog.couchbase.com/2016/february/moving-data-from-oracle-to-couchbase	
•  Source	and	binary:	https://github.com/mahurtado/oracle2couchbase	
SQSL	
•  Client	side	SQL	like	scripting	language	developed	by	yours	truly	two	decades	ago	
•  Several	nifty	features	like	
•  Expansion	
•  Data	driven	operation	
•  On	the	fly	aggregation	and	redirection	
•  User	defined	routines	
•  Have	recently	written	data	source	for	Couchbase	and	json	library	
•  Source:	http://www.sqsl.org	
•  Example:	
let	fromconn="sample";	
connect	to	fromconn	source	db2cli;	
connect	to	"couchbase://192.168.1.104:8091"	source	cb;	
select	*	from	db2inst1.dept	connection	fromconn	
		insert	into	default	(key,	value)	values($1,	$2)	
								using	json:key("::",	columns),	json:row2doc(displaylabels,	columns);	
Import	/	Export	utilities	
•  Upcoming	version	includes	cbimport	&	cbexport	
•  File	based	utilities	
•  Need to export RDBMS data to file first
•  Load	directly	into	data	store	bypassing	N1QL	
Business	logic	
•  DDL	
•  Views	
•  Triggers	
•  Procedures	
•  Sequences	
•  Joins	
•  Transactions	
Language	comparison	
Query Features | SQL on RDBMS | N1QL
DML | SELECT, INSERT, UPDATE, DELETE, MERGE | SELECT, INSERT, UPDATE, DELETE, MERGE
DDL | CREATE [INDEX, PROCEDURE, TABLE, TYPE, VIEW…]; ALTER [TABLE, TYPE, …]; DROP [INDEX, PROCEDURE, TABLE, TYPE, VIEW…] | CREATE [PRIMARY] INDEX; DROP [PRIMARY] INDEX
Query Operations | Select, Join, Project, Subqueries; Strict Schema; Strict Type checking | Select, Join, Project, Subqueries; Nest & Unnest; Look Ma! No Type Mismatch Errors!; JSON keys act as columns
Schema | Predetermined Columns | Fully addressable JSON; Flexible document structure
Data Types | SQL Data types; Conversion Functions | JSON Data types; Conversion Functions
Query Processing | INPUT: Sets of Tuples; OUTPUT: Set of Tuples | INPUT: Sets of JSON; OUTPUT: Set of JSON
DDL	
•  Only	Create	Index	/	Drop	Index	exist	in	N1QL	
•  Everything	else	should	be	removed	from	the	application	
•  Temporary	tables	
•  Materialize	results	
•  Store	in	memory,	or	
•  Insert	materialized	document	in	a	keyspace	using	a	designated	“type”:	field	and	a	UUID()	as	key	
•  DROP	<temporary	table>	becomes	DELETE	FROM	keyspace	WHERE	type=…	and	ID=<UUID>	
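The temporary-table replacement described above can be sketched as follows. The function names, the "temp_result" type value, and the "rows" field are hypothetical; the sketch only prepares the key, the document, and the clean-up statement, and assumes the UUID is also stored in the document as "id" so that the DELETE predicate can match it.

```python
import uuid


def materialize(rows, result_type="temp_result"):
    """Package a materialized result set as one document keyed by a UUID."""
    key = str(uuid.uuid4())
    document = {
        "type": result_type,  # designated "type" field marking temporary results
        "id": key,            # the UUID repeated inside the document for filtering
        "rows": list(rows),
    }
    return key, document


def drop_statement(keyspace):
    """N1QL counterpart of DROP <temporary table>; $1 is the type, $2 the UUID."""
    return "DELETE FROM `{}` WHERE type = $1 AND id = $2".format(keyspace)


key, document = materialize([{"item": "mac", "amt": 2823.52}])
cleanup = (drop_statement("default"), [document["type"], key])
```

Keeping the result in application memory instead avoids the extra write and delete, at the cost of not being able to query the result with N1QL.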
Views	
•  Access	underlying	keyspaces	instead	
•  Something	akin	to	views	can	be	obtained	with	
•  View	indexes	
•  Functional	indexes	
•  CREATE	INDEX	…	WHERE	clauses	
Statement	blocks	
•  Handled	by	the	application:	
•  Triggers	
•  Procedures	
Sequences	
•  The	eventual	persistence	engine	handles	atomic	increments	
•  Special	documents	can	be	created	with	a	counter	and	accessed	atomically	
•  Can	specify	a	delta	on	creation	
•  Must	be	done	from	SDK	
•  In	python:	
	
	
•  N1QL does not provide atomic counters
•  Use	UUID()	instead	
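A minimal sketch of the counter semantics described above, using an in-memory stand-in rather than a real SDK call. The class and method names are hypothetical, and the behaviour modelled here (the first call creates the counter with the initial value, later calls add the delta atomically) is an assumption to check against the documentation of the SDK you use.

```python
import threading


class CounterStore:
    """In-memory stand-in for counter documents (not a Couchbase API)."""

    def __init__(self):
        self._values = {}
        self._lock = threading.Lock()

    def counter(self, key, delta=1, initial=None):
        """Atomically create or increment the counter stored under key."""
        with self._lock:
            if key not in self._values:
                if initial is None:
                    raise KeyError(key)
                # The first call creates the counter; the delta is not applied.
                self._values[key] = initial
            else:
                self._values[key] += delta
            return self._values[key]


store = CounterStore()
issued = []
issued_lock = threading.Lock()


def worker():
    # Each worker draws 100 sequence numbers from the shared counter.
    for _ in range(100):
        value = store.counter("seq::order", delta=1, initial=0)
        with issued_lock:
            issued.append(value)


threads = [threading.Thread(target=worker) for _ in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Because every increment happens under one lock, no two callers receive the same number, which is the property a sequence replacement needs.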
Joins	
•  Two	types	of	joins	
•  Look	up	
•  Index	
•  Joins	use	the	document	key	
•  Joining	side	can	be	an	expression	
•  Joined	side	is	document	key	
•  Full	expression	joins	not	supported	
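The lookup join described above can be pictured as a key fetch driven by an expression on the joining side. The sketch below models that idea in plain Python with dictionaries; it is an illustration of the semantics, not how the query engine is implemented, and the sample documents and function names are made up for the example.

```python
def lookup_join(left_docs, right_by_key, key_expr):
    """Inner lookup join: key_expr(left) must produce the right document's key."""
    for left in left_docs:
        right = right_by_key.get(key_expr(left))  # fetch by document key only
        if right is not None:
            yield {"left": left, "right": right}


# Joined side: documents addressed by their document key.
airports = {
    "airport_TLV": {"airportname": "TLV"},
    "airport_MRW": {"airportname": "MRW"},
}

# Joining side: the key is computed by an expression over the document.
routes = [
    {"id": "1000", "sourceairport": "TLV"},
    {"id": "2000", "sourceairport": "XXX"},  # no matching airport document
]

joined = list(lookup_join(routes, airports, lambda r: "airport_" + r["sourceairport"]))
```

The restriction in the last bullet follows from this shape: the joined side can only be reached through its document key, so a predicate comparing two arbitrary fields has nothing to look up.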
Transactions	
•  In	two	words:	No	need	
•  Document	modification	is	atomic	
•  Consistency	can	be	specified	at	the	REST	call	level	or	SDK	
•  REST	example	
•  Add scan_consistency=[not_bounded|at_plus|request_plus|statement_plus] to REST parameters
•  C SDK example
•  Use lcb_n1p_setconsistency(…, [LCB_N1P_CONSISTENCY_NONE, LCB_N1P_CONSISTENCY_RYOW, LCB_N1P_CONSISTENCY_REQUEST, LCB_N1P_CONSISTENCY_STATEMENT]) when setting up request args
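As an illustration of the REST option, the sketch below builds (but does not send) a query request with a scan_consistency parameter, using only the Python standard library. The host name is a placeholder, and port 8093 with the /query/service path is assumed to be the query service endpoint of your deployment; the helper name is hypothetical.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Values listed on the slide.
SCAN_CONSISTENCY_LEVELS = ("not_bounded", "at_plus", "request_plus", "statement_plus")


def build_query_request(host, statement, scan_consistency="not_bounded"):
    """Prepare a form-encoded POST carrying the statement and consistency level."""
    if scan_consistency not in SCAN_CONSISTENCY_LEVELS:
        raise ValueError("unknown scan_consistency: " + scan_consistency)
    body = urlencode({
        "statement": statement,
        "scan_consistency": scan_consistency,
    }).encode("utf-8")
    # Assumed endpoint: query service on port 8093.
    return Request("http://{}:8093/query/service".format(host), data=body, method="POST")


request = build_query_request("cb.example.com", "SELECT 1", "request_plus")
```

Passing the prepared request to `urllib.request.urlopen` would send it; authentication and error handling are omitted here.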
	
Monitoring	
•  Oracle	
•  ALTER	[system|session]	SET	timed_statistics=true	turns	on	timed	statistics	collection.	
•  V$SESSTAT,	V$SYSSTAT,	V$STATNAME	dynamic	performance	views	report	timed	statistics.	
•  EXPLAIN	PLAN	explains	a	statement.	
•  MySQL	
•  SET	profiling=1	turns	on	profiling	
•  SHOW	PROFILES	displays	available	query	profiles	
•  SHOW	PROFILE	displays	the	profile	for	a	specific	query	
•  EXPLAIN	<statement>	produces	query	plan	
•  Couchbase	
•  system:completed_requests virtual keyspace lists completed long running queries with timings and statistics
•  system:active_requests	virtual	keyspace		lists	active	queries	with	timings	and	statistics	
•  EXPLAIN	<statement>	explains	request	plan	as	a	json	document	
	
Modeling
What	is	Data	Modeling?	
•  A	data	model	is	a	conceptual	representation	of	the	data	structures	that	are	required	by	a	
database	
•  The	data	structures	include	the	data	objects,	the	associations	between	data	objects,	and	the	
rules	which	govern	operations	on	the	objects.
Conceptual	Data	Modeling	
•  Define	entities,	attributes	and	their	relationships	
•  Entities: the main objects that your applications operate on
•  Attributes: properties that your applications keep track of for the entity
•  Relationships: connections to other entities (1-1, 1-many, many-many)
[Diagram: entities Airline, Airport, Landmark, Route, Passenger, Flight]
Physical	Data	Model	
•  Phase	II	-	Map	entities,	attributes	and	their	relationships	to	containers	provided	by	the	
underlying	database	solution	
Relational Databases | Couchbase Server
Databases | Buckets
Tables | Documents with type designator attribute OR Compound Keys
Rows | Items (Key-Value or Key-Document)
Columns | Attributes
Index | Index
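The two ways of standing in for a table, a type designator attribute or a compound key, can be sketched as below. The helper names are hypothetical, and the "_" separator is an assumption borrowed from keys such as route_1000 and airport_TLV used elsewhere in this deck.

```python
def compound_key(doc_type, identifier, separator="_"):
    """Encode the 'table' in the document key, e.g. airport_TLV."""
    return "{}{}{}".format(doc_type, separator, identifier)


def with_type(doc_type, attributes):
    """Encode the 'table' inside the document as a type designator attribute."""
    doc = dict(attributes)
    doc["type"] = doc_type
    return doc


def type_from_key(key, separator="_"):
    """Recover the 'table' from a compound key without reading the document."""
    return key.split(separator, 1)[0]


key = compound_key("airport", "TLV")
document = with_type("airport", {"airportname": "TLV"})
```

The two approaches are often combined: the type attribute lets indexes and queries filter by type, while the compound key lets the application fetch a document directly when it already knows the identifier.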
Physical Data Modeling
Document Key: dean@couchbase.com
{
  "name": …,
  "flights": [
    { "_id": "route_1000", "flight": … },
    { "_id": "route_6421", "flight": … },
    …
  ],
  …
}
Document Key: route_1000
{
  "id": "1000",
  "airline": "AF",
  "sourceairport": "TLV",
  "destinationairport": "MRW",
  …
}
Document Key: airport_TLV
{
  "id": "126701",
  "airportname": "TLV",
  "geo": { "lat": …, "long": … },
  …
}
Data Modeling Approaches
Relational
Required Normalization
schema enforced by db
same fields in all records
•  Minimize data inconsistencies (one item = one location)
•  Reduced update cost (no duplicated data)
•  Preserve storage resources
NoSQL
Relaxed Normalization
schema implied by structure
fields may be empty, duplicate, or missing
•  Optimized to planned/actual access patterns
•  Flexibility with software architecture
•  Supports clustered architecture
•  Reduced server overhead
JSON Design Choices
•  Couchbase	Server	neither	enforces	nor	validates	for	any	particular	document	structure	
•  Choices	that	impact	JSON	document	design:	
–  Single	Root	Attributes	
–  Objects	vs.	Arrays	
–  Array	Element	Types	
–  Timestamp	Formats	
–  Empty	and	Null	Property	Values	
–  JSON	Schema
Root Attributes vs. Embedded Attributes
•  The	choice	of	having	a	single	root	attribute	or	the	“type”	attribute	embedded.
Objects vs. Arrays
•  The	choice	of	having	an	object	type,	or	an	array	type
Array Element Types
•  Array elements can be simple types, objects or arrays:
–  Array of strings
–  Array of objects
Timestamp	Formats	
•  Working with timestamps has always been challenging
•  When storing timestamps, you have at least 3 options:
–  Array of time components
–  String (ISO 8601)
–  Number (Unix style) (Epoch)
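To make the three options concrete, the sketch below encodes one instant each way and converts it back. The helper names and the sample instant are made up for the example, and the array layout (year, month, day, hour, minute, second, in UTC) is one possible convention rather than a fixed rule.

```python
from datetime import datetime, timezone

moment = datetime(2016, 11, 7, 9, 30, 0, tzinfo=timezone.utc)

# Option 1: array of time components (assumed order: Y, M, D, h, m, s in UTC).
as_components = [moment.year, moment.month, moment.day,
                 moment.hour, moment.minute, moment.second]

# Option 2: ISO 8601 string, readable and sortable when the offset is uniform.
as_iso_string = moment.isoformat()

# Option 3: Unix epoch number (whole seconds here), compact and easy to compare.
as_epoch_seconds = int(moment.timestamp())


def from_components(parts):
    return datetime(*parts, tzinfo=timezone.utc)


def from_iso_string(text):
    return datetime.fromisoformat(text)


def from_epoch_seconds(seconds):
    return datetime.fromtimestamp(seconds, tz=timezone.utc)
```

All three round-trip to the same instant; they differ in readability, size, and how easily they can be compared or indexed.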
Empty and Null Property Values
•  Keep	in	mind	that	JSON	supports	optional	properties	
•  If	a	property	has	a	null	value,	consider	dropping	it	from	the	JSON,	unless	there's	a	good	reason	
not	to	
•  N1QL	makes	it	easy	to	test	for	missing	or	null	property	values	
•  Be	sure	your	application	code	handles	the	case	where	a	property	value	is	missing	
SELECT * FROM couchmusic1 WHERE userprofile.address IS NULL;
SELECT * FROM couchmusic1 WHERE userprofile.gender IS MISSING;
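On the application side, the same distinction shows up as "key absent" versus "key present with a null value". The sketch below classifies a dotted path in a decoded JSON document; the function name is hypothetical, and treating any non-object step along the path as MISSING is a simplification that may not match N1QL's own rules for paths through null values.

```python
import json


def classify(document, path):
    """Return 'MISSING', 'NULL', or 'VALUE' for a dotted path in a document."""
    current = document
    for name in path.split("."):
        if not isinstance(current, dict) or name not in current:
            return "MISSING"  # the property is not stored at all
        current = current[name]
    return "NULL" if current is None else "VALUE"


doc = json.loads('{"userprofile": {"address": null, "email": "a@example.com"}}')

address_state = classify(doc, "userprofile.address")  # stored as null
gender_state = classify(doc, "userprofile.gender")    # never stored
email_state = classify(doc, "userprofile.email")      # stored with a value
```

Note that `dict.get` alone cannot tell the first two cases apart, since it returns `None` for both.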
JSON Schema
•  Couchbase	Server	pays	absolutely	no	attention	to	the	shape	of	your	JSON	documents	so	long	
as	they	are	well-formed	
•  There	are	times	when	it	is	useful	to	validate	that	a	JSON	document	conforms	to	some	
expected	shape	
•  JSON	Schema	is	a	JSON-based	format	for	defining	the	structure	of	JSON	data	
•  There	are	implementations	for	most	popular	programming	languages	
•  Learn	more	here:	http://json-schema.org
Data Nesting (aka Denormalization)
•  Relational	database	design	promotes	separating	data	using	normalization,	which	doesn’t	scale	
•  For	NoSQL	systems,	we	often	avoid	normalization	so	that	we	can	scale	
•  Nesting	allows	related	objects	to	be	organized	into	a	hierarchical	tree	structure	where	you	can	
have	multiple	levels	of	grouping	
•  Rule	of	thumb	is	to	nest	no	more	than	3	levels	deep	unless	there	is	a	very	good	reason	to	do	so	
•  You	will	often	want	to	include	a	timestamp	in	the	nested	data
Example	of	Data	Nesting	
•  Playlist	with	owner	attribute	containing	username	of	corresponding	userprofile	
Document	Key:		copilotmarks61569
Example	of	Data	Nesting	
•  Playlist	with	owner	attribute	containing	a	subset	of	the	corresponding	userprofile	
	
* Note the inclusion of the updated attribute
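The two playlist variants can be sketched as follows. Apart from the owner and updated attributes named on the slides, every field name and value here is hypothetical, as are the helper names.

```python
def playlist_with_owner_reference(playlist, userprofile):
    """Variant 1: owner holds only the username of the corresponding userprofile."""
    doc = dict(playlist)
    doc["owner"] = userprofile["username"]
    return doc


def playlist_with_owner_subset(playlist, userprofile, fields, updated):
    """Variant 2: owner embeds a subset of the userprofile plus an updated stamp."""
    doc = dict(playlist)
    owner = {name: userprofile[name] for name in fields}
    owner["updated"] = updated  # records when the nested copy was taken
    doc["owner"] = owner
    return doc


profile = {"username": "copilotmarks61569", "firstName": "Mark", "email": "m@example.com"}
playlist = {"type": "playlist", "name": "Road trip"}

referenced = playlist_with_owner_reference(playlist, profile)
embedded = playlist_with_owner_subset(
    playlist, profile, ["username", "firstName"], "2016-11-07T09:30:00Z"
)
```

The updated stamp lets a reader judge how stale the embedded copy might be compared with the userprofile document itself.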
Choices with JSON Key Design
•  A	key	formed	of	attributes	that	exist	in	the	real	world:	
–  Phone	numbers	
–  Usernames	
–  Social	security	numbers	
–  Account	numbers	
–  SKU,	UPC	or	QR	codes	
–  Device	IDs
Surrogate Keys
•  We often use surrogate keys when no obvious natural key exists
•  They	are	not	derived	from	application	data	
•  They	can	be	generated	values	
–  3305311F4A0FAAFEABD001D324906748B18FB24A	(SHA-1)	
–  003C6F65-641A-4CGA-8E5E-41C947086CAE	(UUID)	
•  They	can	be	sequential	numbers	(often	implemented	using	the	Counter	feature	of	Couchbase	
Server)	
–  456789,	456790,	456791,	…
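The kinds of surrogate key listed above can each be generated with the Python standard library, as sketched below. The function names are hypothetical; hashing random bytes is an assumption that keeps the SHA-1 value independent of application data; and the local counter only stands in for a server-side counter, since it is not shared between processes.

```python
import hashlib
import itertools
import os
import uuid


def uuid_key():
    """Random (version 4) UUID, 36 characters including hyphens."""
    return str(uuid.uuid4()).upper()


def sha1_key():
    """SHA-1 digest of random bytes, 40 hexadecimal characters."""
    return hashlib.sha1(os.urandom(32)).hexdigest().upper()


def sequential_keys(start):
    """Local stand-in for a counter: yields start, start + 1, ..."""
    return itertools.count(start)


sequence = sequential_keys(456789)
first_three = [next(sequence) for _ in range(3)]
example_uuid = uuid_key()
example_sha1 = sha1_key()
```

Random keys need no coordination between writers, while sequential keys need one shared, atomic source of numbers.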
Making Tough Choices
•  We	must	also	make	trade-offs	in	data	modeling:	
–  Document	size	
–  Atomicity	
–  Complexity	
–  Speed
Embed vs. Refer
•  All	of	the	previous	trade-offs	are	usually	rolled	into	a	single	decision	–	whether	to	embed	or	
refer	
•  When	to	embed?	
•  When	to	refer?
Want to know more on Data Modeling?
•  Session	tomorrow	–	“Agile	Document	Models	and	Data	Structures”	at	1:00PM
Brant Burnett
Software Development Team Lead
Couchbase Community Expert
About	CenterEdge	Software	
ü  Point of Sale
ü  Admissions & Ticketing
ü  Party, Group & Event Bookings
ü  Online Sales & Party Reservations
ü  Time Clock & Labor Management
ü  & More!	
About	CenterEdge	Software	
•  Celebrating 12 Year Anniversary
•  Team of 50 in Roxboro, NC
•  Sister company is Palace Pointe, a 100k sq. ft. Entertainment Venue for which we were
developed as an in-house system
•  Over 600 facilities using our platform across the US and abroad
•  FEC’s, Waterparks, Trampoline Parks, Amusement Parks, Skating Rinks, Bowling
Centers, Zoos & Museums
Why	Couchbase	For	CenterEdge’s	Newest	Cloud	Platform?	
•  More scalable and performant than traditional SQL in the cloud
•  Previous online store system uses 19 SQL servers, each hosting 30 stores
•  As each store is only on a single server, it doesn’t handle spikes in load efficiently
•  Servers can’t be scaled vertically without downtime for all 30 stores on that server
•  Schema-less JSON increases flexibility as your system evolves, leaving schema
enforcement in your data access layer
•  Schema changes to large tables can result in downtime as data structure is updated across
all records
•  We were already using Couchbase for our shopping carts as well as a SQL caching
layer, with great success. Now we can simplify the architecture with a single data layer.
Couchbase	Cloud	Data	Flow	Architecture	
[Diagram: Web Servers and Remote Application Servers; Couchbase cluster nodes labeled Data, Data, Data, Index, Query]
Enforcing	Schema	
•  Since	Couchbase	doesn’t	enforce	schema	like	SQL,	your	data	access	layer	should	do	so	instead	
•  At	CenterEdge,	each	document	type	is	only	updated	by	a	single	service	
•  Within	that	service,	schema	is	enforced	by	serializing	data	from	consistent	POCOs	
•  Schema	changes	can	be	supported	using	customized	JSON	converters	during	deserialization	
•  IS	MISSING	is	a	good	way	to	recognize	the	difference	in	attributes	that	weren’t	stored	because	
the	document	was	saved	using	the	old	schema	
•  Where	possible,	try	to	predict	possible	schema	needs	in	advance	
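The idea of enforcing schema in the data access layer, and of tolerating documents saved under an older schema, can be sketched in Python with a dataclass playing the role of a POCO. This is an illustration only, not CenterEdge's code: the Customer type, its fields, and the default for the newer attribute are all hypothetical.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class Customer:
    name: str
    email: str
    loyalty_tier: str  # hypothetical attribute added in a later schema version


def customer_from_json(text):
    """Deserialize, supplying a default when the document predates loyalty_tier."""
    raw = json.loads(text)
    return Customer(
        name=raw["name"],    # required: a KeyError flags a malformed document
        email=raw["email"],
        loyalty_tier=raw.get("loyalty_tier", "none"),  # missing in old documents
    )


def customer_to_json(customer):
    """Serialize from the single consistent type, so every write has the same shape."""
    return json.dumps(asdict(customer), sort_keys=True)


old_document = '{"name": "Jane Smith", "email": "jane@example.com"}'
upgraded = customer_from_json(old_document)
rewritten = customer_to_json(upgraded)
```

Documents are upgraded lazily: an old document gains the new attribute the next time the owning service reads and rewrites it.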
Pay	Attention	To	Document	Modeling	Up	Front	
•  Watch	out	for	documents	that	get	too	large	
•  Might	hit	20MB	document	size	limit	
•  High	serialization/deserialization/networking	performance	penalties	
•  Document	contention	as	too	many	actions	attempt	to	modify	the	document	simultaneously	
•  Watch	out	for	data	spread	across	too	many	related	documents	
•  Lack	of	atomic	transactions	across	multiple	writes	can	result	in	partial	updates	
•  Can	add	latency	if	documents	must	be	read	in	a	chained	manner	(i.e.	each	document	contains	the	key	
to	the	next	document)	
•  Be	sure	to	include	document	keys,	or	a	way	to	construct	them,	where	you	may	want	to	use	N1QL	
JOIN	or	NEST	operations	
•  Should	the	document	key	be	stored	inside	the	document,	too?	
•  Increases	data	size,	as	the	key	is	in	the	document	and	in	the	metadata	
•  Requires	that	the	data	layer	maintain	consistency	
•  Can	make	queries	easier	since	you	don’t	need	to	use	META()	function	to	get	the	key	
View	Indexes	vs.	Global	Secondary	Indexes	
•  Be	sure	to	analyze	what	type	of	index	is	best	for	each	workload	
•  Views	are	great	where	pre-aggregating	numbers	is	useful,	such	as	reports,	graphs,	etc	
•  GSI is usually the best option for more generic queries, especially when you’re just trying to collect a set of documents
•  Views don’t scale as cleanly; they can’t be scaled independently via Multi Dimensional Scaling
•  Views	live	on	the	data	nodes,	so	they	only	scale	as	you	add	more	data	nodes	
•  At	CenterEdge,	our	new	platform	started	on	Couchbase	Server	3.0,	before	Global	Secondary	
Indexes	were	an	option	
•  We	used	views	and	lookup	documents	for	most	of	our	indexing	needs	
•  We	have	run	into	problems	with	too	many	views	per	bucket	causing	performance	bottlenecks	
•  We’re	currently	transitioning	many	of	these	views	into	Global	Secondary	Indexes	
Efficient	Indexing	Is	Especially	Important	For	Couchbase	
•  Primary	key	scans	in	SQL	have	always	been	inefficient	
•  Every	record	in	the	table	would	be	read	and	checked	for	a	match	to	the	WHERE	predicate	
•  For	small	tables,	the	performance	penalty	was	negligible,	and	would	usually	go	unnoticed	
•  Primary	key	scans	in	Couchbase	are	usually	much	worse	
•  In our experience with production-scale data, this almost invariably results in queries timing out
•  Every	record	in	the	bucket	is	being	read	and	checked	for	a	match	to	the	WHERE	predicate	
•  Can	easily	result	in	reading	and	parsing	millions	of	JSON	documents	
•  Will	also	bust	the	in-memory	cache	on	the	data	nodes	if	there	is	more	data	in	the	bucket	than	allocated	
memory	
•  Design	every	query	to	be	supported	by	a	Global	Secondary	Index	
•  Helps	even	if	the	index	isn’t	an	exact	match	
•  	A	good	design	can	vastly	reduce	the	number	of	documents	scanned,	making	it	more	like	a	SQL	
primary	key	scan
Efficient Indexing Is Especially Important For Couchbase
/*	Use	predicate	to	only	index	documents	of	a	certain	type	*/	
CREATE	INDEX	`airport_sourceairport`	ON	`travel-sample`	(`sourceairport`)	
WHERE	`type`	=	'airport'	
/*	To	index	the	same	attribute	across	multiple	document	types,	include	type	attribute	first	*/	
CREATE	INDEX	`def_type_id`	ON	`travel-sample`	(`type`,	`id`)	
/*	A	good	practice	is	to	create	a	fallback	in	case	other	indexes	aren't	used	*/	
CREATE	INDEX	`def_type`	ON	`travel-sample`	(`type`)	
If	you’re	using	the	“type”	attribute	as	the	logical	equivalent	of	a	table	in	SQL,	most	indexes	will	
include	this	attribute.
How	To	Store	and	Index	Date/Times	
•  Date/Times	are	usually	stored	as	ISO	8601	strings	in	JSON	
•  Use	STR_TO_MILLIS(x)	in	indexes	and	queries	to	work	with	ISO	8601	strings	
/*	STR_TO_MILLIS	converts	an	ISO8601	string	to	a	Unix	numeric	representation	*/	
/*	It	also	handles	the	time	zone	specifier	*/	
SELECT	`Extent1`.*	FROM	`beer-sample`	as	`Extent1`	
WHERE	(`type`	=	'beer')	
AND	(STR_TO_MILLIS(`Extent1`.`updated`)	<=	STR_TO_MILLIS("2010-01-01T00:00:00Z"))	
	
	
/*	STR_TO_MILLIS	must	also	be	used	in	the	index,	or	the	index	cannot	be	used	*/	
CREATE	INDEX	`beer_updated`	ON	`beer-sample`	(STR_TO_MILLIS(`updated`))	
WHERE	`type`	=	'beer'
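To see why the comparison is done on milliseconds rather than on the raw strings, the sketch below mimics the conversion in Python. It is an approximation written for illustration, not the N1QL implementation; in particular, treating a string without a time zone as UTC is an assumption.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def str_to_millis(text):
    """Convert an ISO 8601 string to milliseconds since the Unix epoch."""
    if text.endswith("Z"):
        text = text[:-1] + "+00:00"  # older fromisoformat versions reject "Z"
    parsed = datetime.fromisoformat(text)
    if parsed.tzinfo is None:
        parsed = parsed.replace(tzinfo=timezone.utc)  # assumption: naive means UTC
    return (parsed - EPOCH) // timedelta(milliseconds=1)


cutoff = "2010-01-01T00:00:00Z"
same_instant = "2010-01-01T02:00:00+02:00"  # same moment, written with an offset

numeric_match = str_to_millis(same_instant) <= str_to_millis(cutoff)
string_match = same_instant <= cutoff
```

Here `numeric_match` is true while `string_match` is false: the two strings name the same instant, but a plain string comparison is misled by the different offsets.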
Index	Performance	During	Mutations	
[Diagram: Airline SQL Table with Airline Indexes; Airport SQL Table with Airport Indexes; travel-sample Bucket with Bucket Indexes]
Remember that GSI indexes are similar to SQL indexes, but not the same
Training!	
•  Don’t	just	assume	you	can	switch	to	any	NoSQL	platform	without	some	training	
•  Performance	profile	is	different,	and	the	penalties	can	appear	in	different	places	
•  Developers	who	know	the	pitfalls	in	advance	can	save	you	a	lot	of	refactoring	headaches	later	
•  N1QL	does	help	reduce	the	learning	curve	significantly	
•  For	.Net	development	shops,	look	at	Linq2Couchbase	to	make	it	even	easier!	
•  The	operations	department	needs	training,	too!	
Marco	Greco	
Senior	Software	Engineer	
marco.greco@couchbase.com	
	
	
Clarence	J	M	Tauro,	Ph.D.	
Senior	Instructor	
clarence@couchbase.com	
	
	
Brant	Burnett	
Lead	Developer	
bburnett@centeredgesoftware.com
Share your opinion on
Couchbase
1.  Go here: http://gtnr.it/2eRxYWn
2.  Create a profile
3.  Provide feedback (~15 minutes)
The Couchbase
Connect16 mobile app
Take our in-app survey!
Thank	You!	