2. Hands on with the App Engine Datastore
Ikai Lan
May 9th, 2011
Thursday, May 26, 2011
3. About the speaker
• Ikai Lan - Developer Programs Engineer, Developer Relations
• Twitter: @ikai
• Google Profile: http://profiles.google.com/ikai.lan
5. Goals of this talk
• Understand a bit of how the datastore works under the hood
• Have a conceptual background for the persistence codelab
6. Understanding the datastore
• The underlying Bigtable
• Indexing and queries
• Complex queries
• Entity groups
• Underlying infrastructure
7. Datastore layers
            Complex   Entity group   Queries on   Key range   Get and set
            queries   transactions   properties   scan        by key
Datastore   ✓         ✓              ✓            ✓           ✓
Megastore             ✓              ✓            ✓           ✓
Bigtable                                          ✓           ✓
9. What does a Bigtable row look like?
Source: http://static.googleusercontent.com/external_content/untrusted_dlcp/labs.google.com/en/us/papers/bigtable-osdi06.pdf
10. Bigtable API
• “Give me the column ‘name’ at key 123”
• “Set the column ‘name’ at key 123 to ‘ikai’”
• “Give me all columns where the key is greater than 100 and less
than 200”
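These three requests map naturally onto a sorted map. A minimal sketch in plain Java, assuming a hypothetical BigtableSketch class that models one table as a TreeMap of row key to columns. It illustrates the access pattern only and is not the Bigtable client API:

```java
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;

/** Hypothetical in-memory stand-in for a Bigtable table: rows sorted by key. */
final class BigtableSketch {
    // Row key -> (column name -> value). TreeMap keeps the row keys sorted.
    private final NavigableMap<Long, Map<String, String>> rows = new TreeMap<>();

    /** "Set the column 'name' at key 123 to 'ikai'". */
    void set(long key, String column, String value) {
        rows.computeIfAbsent(key, k -> new TreeMap<>()).put(column, value);
    }

    /** "Give me the column 'name' at key 123"; null if row or column is absent. */
    String get(long key, String column) {
        Map<String, String> row = rows.get(key);
        return row == null ? null : row.get(column);
    }

    /** "Give me all columns where the key is greater than 100 and less than 200". */
    NavigableMap<Long, Map<String, String>> scan(long lowExclusive, long highExclusive) {
        return rows.subMap(lowExclusive, false, highExclusive, false);
    }

    public static void main(String[] args) {
        BigtableSketch table = new BigtableSketch();
        table.set(123, "name", "ikai");
        table.set(150, "name", "max");
        table.set(250, "name", "zed");
        System.out.println(table.get(123, "name"));
        System.out.println(table.scan(100, 200).keySet());
    }
}
```

Because the rows are kept sorted by key, the range scan is just a slice of the map; nothing in this model can answer "which rows have name = ikai" without reading every row.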
12. Megastore API
• “Give me all rows where the column ‘name’ equals ‘ikai’”
• “Transactionally write an update to this group of entities”
• “Do a cross datacenter write of this data such that reads will be
strongly consistent” (High Replication Datastore)
• Megastore paper: http://www.cidrdb.org/cidr2011/Papers/CIDR11_Paper32.pdf
14. App Engine Datastore API
• “Give me all Users for my app where the name equals ‘ikai’,
company equals ‘Google’, and sort them by the ‘awesome’
column, descending”
17. Let’s save an Entity with the low-level Java API
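import com.google.appengine.api.datastore.DatastoreService;
import com.google.appengine.api.datastore.DatastoreServiceFactory;
import com.google.appengine.api.datastore.Entity;

DatastoreService datastore = DatastoreServiceFactory
    .getDatastoreService();
// Kind "User", key name "ikai@google.com"
Entity ikai = new Entity("User", "ikai@google.com");
ikai.setProperty("firstName", "ikai");
ikai.setProperty("company", "google");
// Stored with the entity, but no index row is written for it
ikai.setUnindexedProperty("biography",
    "Ikai is a great man, a great, great man.");
datastore.put(ikai);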
18. Get an instance of the DatastoreService
// Fetch a client instance
DatastoreService datastore = DatastoreServiceFactory
    .getDatastoreService();
19. Instantiate a new Entity
// Set the Entity kind ("User")
Entity ikai = new Entity("User", "ikai@google.com");
20. Instantiate a new Entity
// Set a unique key ("ikai@google.com")
Entity ikai = new Entity("User", "ikai@google.com");
21. Set indexed properties
// First argument is the property name; second argument is the property value
ikai.setProperty("firstName", "ikai");
ikai.setProperty("company", "google");
22. Set unindexed properties
// This property will be saved, but we will not run queries against it
ikai.setUnindexedProperty("biography",
    "Ikai is a great man, a great, great man.");
23. Commit the entity to the datastore
// Save the thing!
datastore.put(ikai);
24. What happens when we save?
• Make the write RPC
– Write the entity
– Write the indexes
• Success!
25. What actually gets written?
Entities table
Bigtable key                  Value
AppId:User:ikai@google.com    (Protobuf serialized entity - includes firstName, company and biography values)
Indexes table
Bigtable key                                 Value
AppId:User:firstName:ikai:ikai@google.com    (Empty)
AppId:User:company:google:ikai@google.com    (Empty)
Read more: http://code.google.com/appengine/articles/storage_breakdown.html
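The rows above can be derived mechanically from the entity. A rough sketch in plain Java, assuming a hypothetical IndexRowSketch helper and the simplified colon-separated key layout shown on the slide (the real key encoding is more involved):

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper that mimics the key layout shown on the slide. */
final class IndexRowSketch {
    /** Entities-table key: AppId:Kind:keyName. */
    static String entityKey(String appId, String kind, String keyName) {
        return appId + ":" + kind + ":" + keyName;
    }

    /** One index row per indexed property: AppId:Kind:property:value:keyName. */
    static List<String> indexKeys(String appId, String kind, String keyName,
                                  Map<String, String> indexedProperties) {
        List<String> keys = new ArrayList<>();
        for (Map.Entry<String, String> p : indexedProperties.entrySet()) {
            keys.add(appId + ":" + kind + ":" + p.getKey() + ":" + p.getValue()
                    + ":" + keyName);
        }
        return keys;
    }

    public static void main(String[] args) {
        Map<String, String> indexed = new LinkedHashMap<>();
        indexed.put("firstName", "ikai");
        indexed.put("company", "google");
        // "biography" is unindexed, so it is left out and produces no index row.
        System.out.println(entityKey("AppId", "User", "ikai@google.com"));
        for (String key : indexKeys("AppId", "User", "ikai@google.com", indexed)) {
            System.out.println(key);
        }
    }
}
```

The index rows carry no value of their own: everything a query needs (property, value, entity key) is packed into the sorted row key.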
26. Now let’s run a query
• If we have the key, we can fetch it right away by key
• What if we don’t? We need indexes.
27. Let’s run a query
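import java.util.List;
import com.google.appengine.api.datastore.DatastoreService;
import com.google.appengine.api.datastore.DatastoreServiceFactory;
import com.google.appengine.api.datastore.Entity;
import com.google.appengine.api.datastore.FetchOptions;
import com.google.appengine.api.datastore.Query;
import com.google.appengine.api.datastore.Query.FilterOperator;

DatastoreService datastore = DatastoreServiceFactory
    .getDatastoreService();
Query queryByName = new Query("User");
queryByName.addFilter("firstName",
    FilterOperator.EQUAL, "ikai");
List<Entity> results = datastore.prepare(
    queryByName).asList(
        FetchOptions.Builder.withDefaults());
// Roughly equivalent to:
// SELECT * FROM User WHERE firstName = 'ikai';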
28. Step 1: Query the indexes table
Scan the indexes table for values >= AppId:User:firstName:
Indexes table
Bigtable key                                 Value
AppId:User:firstName:ikai:ikai@google.com    (Empty)
AppId:User:company:google:ikai@google.com    (Empty)
29. Step 2: Start extracting keys
Indexes table
Bigtable key                                 Value
AppId:User:firstName:ikai:ikai@google.com    (Empty)
That gets us this row - extract the key ikai@google.com
30. Step 3: Batch get the entities themselves
Entities table
Bigtable key                  Value
AppId:User:ikai@google.com    (Protobuf serialized entity - includes firstName, company and biography values)
Now let’s go back to the entities table and fetch that key. Success!
31. Key takeaways
• This isn’t a relational database
– There are no full table scans
– Indexes MUST exist for every property we want to query
– Natively, we can only query on exact matches or with startsWith queries
– Don’t index what we never need to query on
• Get by key = one step. Query on property value = 2 steps
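The one-step versus two-step difference can be sketched with two sorted collections. This assumes a hypothetical QuerySketch class and the simplified string keys from the earlier slides; the real datastore performs these steps server side:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableSet;
import java.util.TreeMap;
import java.util.TreeSet;

/** Hypothetical model of the entities table and the indexes table. */
final class QuerySketch {
    final TreeMap<String, String> entities = new TreeMap<>();
    final NavigableSet<String> indexes = new TreeSet<>();

    /** Get by key: one lookup in the entities table. */
    String getByKey(String kind, String keyName) {
        return entities.get("AppId:" + kind + ":" + keyName);
    }

    /** Query on a property value: index prefix scan, then fetch each entity. */
    List<String> queryByProperty(String kind, String property, String value) {
        String prefix = "AppId:" + kind + ":" + property + ":" + value + ":";
        List<String> results = new ArrayList<>();
        // Step 1: scan index rows >= prefix until the prefix stops matching.
        for (String row : indexes.tailSet(prefix, true)) {
            if (!row.startsWith(prefix)) {
                break;
            }
            String keyName = row.substring(prefix.length());
            // Step 2: go back to the entities table with the extracted key.
            results.add(getByKey(kind, keyName));
        }
        return results;
    }

    public static void main(String[] args) {
        QuerySketch store = new QuerySketch();
        store.entities.put("AppId:User:ikai@google.com", "(serialized entity)");
        store.indexes.add("AppId:User:firstName:ikai:ikai@google.com");
        store.indexes.add("AppId:User:company:google:ikai@google.com");
        System.out.println(store.getByKey("User", "ikai@google.com"));
        System.out.println(store.queryByProperty("User", "firstName", "ikai"));
    }
}
```

A property without an index row simply cannot be found by this scan, which is why unindexed properties are not queryable.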
32. Let’s run a more complex query!
DatastoreService datastore = DatastoreServiceFactory
    .getDatastoreService();
Query queryByName = new Query("User");
// Two equality filters on two different properties
queryByName.addFilter("firstName",
    FilterOperator.EQUAL, "ikai");
queryByName.addFilter("company",
    FilterOperator.EQUAL, "google");
List<Entity> results = datastore.prepare(
    queryByName).asList(
        FetchOptions.Builder.withDefaults());
// Roughly equivalent to:
// SELECT * FROM User WHERE firstName = 'ikai'
// AND company = 'google';
33. Query resolution strategies
• This query can be resolved using built in indexes
– Zig zag merge join - we’ll cover this example
• Can be optimized using composite indexes
34. Zig zag across multiple indexes
Begin by scanning indexes >= AppId:User:company:google
Bigtable key
AppId:User:company:acme:alfred@acme.com
AppId:User:company:google:david@google.com
AppId:User:company:google:ikai@google.com
AppId:User:company:google:max@google.com
AppId:User:company:megacorp:zed@megacorp.com
Bigtable key
AppId:User:firstName:alfred:alfred@acme.com
AppId:User:firstName:ikai:ikai@acme.com
AppId:User:firstName:ikai:ikai@google.com
AppId:User:firstName:ikai:ikai@megacorp.com
AppId:User:firstName:zed:zed@megacorp.com
35. Zig zag across multiple indexes
There’s at least a partial match, so we “jump” to the next index
36. Zig zag across multiple indexes
Move to the next index. Start a scan for keys >= AppId:User:firstName:ikai:david@google.com
37. Zig zag across multiple indexes
Okay, so that’s a twist. The first value that matches has key ikai@google.com! Does this value exist in the first index?
38. Zig zag across multiple indexes
Let’s advance the original cursor to >= AppId:User:company:google:ikai@google.com
39. Zig zag across multiple indexes
Alright! We found a match. Let’s add the key to our in-memory list and go back to the first index
40. Zig zag across multiple indexes
Let’s move on to see if there are any more matches. Let’s start at max@google.com
41. Zig zag across multiple indexes
Are there any keys >= AppId:User:firstName:ikai:max@google.com?
42. Zig zag across multiple indexes
No. We’re at the end of our index scans. Let’s do a batch get of our list of keys: [ ‘ikai@google.com’ ]
43. Batch get the entities themselves
Now let’s go back to the entities table and
fetch that key. Success!
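The walk across the two indexes can be sketched as a small loop. This is a simplified two-index illustration in plain Java, assuming a hypothetical ZigZagSketch class and the string row keys from the slides; it is not the datastore’s actual implementation:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.NavigableSet;
import java.util.TreeSet;

/** Hypothetical two-index zig zag merge join over sorted index rows. */
final class ZigZagSketch {
    /**
     * Finds the first row under prefix whose entity key is >= from
     * (or > from when inclusive is false) and returns that entity key,
     * or null when the scan runs past the end of the prefix.
     */
    static String seek(NavigableSet<String> index, String prefix,
                       String from, boolean inclusive) {
        String target = prefix + from;
        String row = inclusive ? index.ceiling(target) : index.higher(target);
        if (row == null || !row.startsWith(prefix)) {
            return null;
        }
        return row.substring(prefix.length());
    }

    /** Returns the entity keys present under both prefixes. */
    static List<String> join(NavigableSet<String> index,
                             String firstPrefix, String secondPrefix) {
        List<String> matches = new ArrayList<>();
        String candidate = seek(index, firstPrefix, "", true);
        while (candidate != null) {
            // Jump to the other index and look for a key >= the candidate.
            String other = seek(index, secondPrefix, candidate, true);
            if (other == null) {
                break; // end of the second index: no more matches possible
            }
            if (other.equals(candidate)) {
                matches.add(candidate);
                // Move the first cursor strictly past the match.
                candidate = seek(index, firstPrefix, candidate, false);
            } else {
                // Advance the first cursor to the key the other index found.
                candidate = seek(index, firstPrefix, other, true);
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        NavigableSet<String> index = new TreeSet<>(Arrays.asList(
            "AppId:User:company:acme:alfred@acme.com",
            "AppId:User:company:google:david@google.com",
            "AppId:User:company:google:ikai@google.com",
            "AppId:User:company:google:max@google.com",
            "AppId:User:company:megacorp:zed@megacorp.com",
            "AppId:User:firstName:alfred:alfred@acme.com",
            "AppId:User:firstName:ikai:ikai@acme.com",
            "AppId:User:firstName:ikai:ikai@google.com",
            "AppId:User:firstName:ikai:ikai@megacorp.com",
            "AppId:User:firstName:zed:zed@megacorp.com"));
        // prints [ikai@google.com]
        System.out.println(join(index,
            "AppId:User:company:google:", "AppId:User:firstName:ikai:"));
    }
}
```

Each pass through the loop costs at least one index seek, so the total work grows with the number of jumps rather than with the number of results.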
44. Let’s change the shape of the data
• Zig zag performance is HIGHLY dependent on the shape of the
data
• Let’s go ahead and muck with the data a bit
46. Same query, sparsely distributed matches
Begin by scanning indexes >= AppId:User:company:google
Bigtable key
AppId:User:company:acme:alfred@acme.com
AppId:User:company:google:david@google.com
AppId:User:company:google:ikai@google.com
AppId:User:company:google:max@google.com
AppId:User:company:megacorp:zed@megacorp.com
Bigtable key
AppId:User:firstName:alfred:alfred@acme.com
AppId:User:firstName:ikai:ikai@acme.com
AppId:User:firstName:igor:ikai@google.com
AppId:User:firstName:ikai:ikai@megacorp.com
AppId:User:firstName:zed:zed@megacorp.com
47. Same query, sparsely distributed matches
Move to the next index. Start a scan for keys >= AppId:User:firstName:ikai:david@google.com
48. Same query, sparsely distributed matches
Oh ... no matches. Let’s move back to the first index and move the cursor down
50. Same query, sparsely distributed matches
Move to the next index. Start a scan for keys >= AppId:User:firstName:ikai:ikai@google.com
51. Same query, sparsely distributed matches
Oh ... no matches here either. Let’s go back to the first index.
52. Same query, sparsely distributed matches
... if these indexes were huge, we could be here for a while!
53. What happens in this case?
• If we traverse too many indexes, the datastore throws a
NeedIndexException
• We’ll want to build a composite index
54. Composite index
Bigtable key
AppId:User:company:acme:firstName:alfred:alfred@acme.com
AppId:User:company:google:firstName:david:david@google.com
AppId:User:company:google:firstName:ikai:ikai@google.com
AppId:User:company:google:firstName:max:max@google.com
AppId:User:company:megacorp:firstName:zed:zed@megacorp.com
55. Composite index
Search for all keys >= AppId:User:company:google:firstName:ikai
56. Composite index
Well, that was much faster, wasn’t it?
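Why it is faster: with both property values packed into one row key, the query becomes a single prefix scan with no jumping between indexes. A minimal sketch in plain Java, assuming a hypothetical CompositeIndexSketch class and the simplified key layout from the slide:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.NavigableSet;
import java.util.TreeSet;

/** Hypothetical composite index on (company, firstName) for kind User. */
final class CompositeIndexSketch {
    /** One contiguous scan returns every entity key matching both values. */
    static List<String> scan(NavigableSet<String> compositeIndex,
                             String company, String firstName) {
        String prefix = "AppId:User:company:" + company
                + ":firstName:" + firstName + ":";
        List<String> keys = new ArrayList<>();
        for (String row : compositeIndex.tailSet(prefix, true)) {
            if (!row.startsWith(prefix)) {
                break; // left the matching range, so the scan is done
            }
            keys.add(row.substring(prefix.length()));
        }
        return keys;
    }

    public static void main(String[] args) {
        NavigableSet<String> index = new TreeSet<>(Arrays.asList(
            "AppId:User:company:acme:firstName:alfred:alfred@acme.com",
            "AppId:User:company:google:firstName:david:david@google.com",
            "AppId:User:company:google:firstName:ikai:ikai@google.com",
            "AppId:User:company:google:firstName:max:max@google.com",
            "AppId:User:company:megacorp:firstName:zed:zed@megacorp.com"));
        // prints [ikai@google.com]
        System.out.println(scan(index, "google", "ikai"));
    }
}
```

The scan touches only matching rows plus one row past the end of the range, regardless of how sparse the matches are in the single-property indexes.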
57. Composite index tradeoffs
• Created at entity save time - incurs additional datastore CPU
and storage quota
• You can only create 200 composite indexes
• You need to know the possible queries ahead of time!
58. Complex Queries takeaways
• This isn’t a relational database
– There are no full table scans
– Indexes MUST exist for every property we want to query
• Performance depends on the shape of the data
• Worst case scenario: your query matches are highly sparse
• Build composite indexes when you need them
61. Why entity groups?
• We can perform transactions within this group - but not outside
• Data locality - data are stored “near” each other
• Strongly consistent queries when using High Replication
datastore within this entity group
62. Entity groups and transactions
• A hierarchical structuring of your data into Megastore’s unit of
atomicity
• Allows for transactional behavior - but only within a single entity
group
• Key unit of consistency when using High Replication datastore
63. Example: Data for a blog hosting service
• A User has many Blogs
• A Blog has many Entries
• An Entry has many Comments
64. Example: Data for a blog hosting service
This can be structured as an entity group (tree structure)!
65. Structure this data as an entity group
User (entity group root)
  Blog, Blog
    Entry, Entry, Entry
      Comment, Comment, Comment
66. How are entity groups stored?
Entities table
Bigtable key                                                 Value
AppId:User:ikai@google.com                                   (Protobuf serialized User)
AppId:User:ikai@google.com/Blog:123                          (Protobuf serialized Blog)
AppId:User:ikai@google.com/Blog:123/Entry:456                (Protobuf serialized Entry)
AppId:User:ikai@google.com/Blog:123/Entry:789                (Protobuf serialized Entry)
AppId:User:ikai@google.com/Blog:123/Entry:456/Comment:111    (Protobuf serialized Comment)
AppId:User:ikai@google.com/Blog:123/Entry:456/Comment:222    (Protobuf serialized Comment)
AppId:User:ikai@google.com/Blog:123/Entry:789/Comment:333    (Protobuf serialized Comment)
Read more: http://code.google.com/appengine/docs/python/datastore/entities.html
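The key layout in this table can be sketched in a few lines. This assumes a hypothetical EntityGroupKeySketch class and the simplified "/"-separated path format from the slide; real keys are structured objects rather than plain strings:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableSet;
import java.util.TreeSet;

/** Hypothetical model of ancestor paths embedded in entity keys. */
final class EntityGroupKeySketch {
    /** A child key is the parent key plus "/Kind:id". */
    static String child(String parentKey, String kind, String id) {
        return parentKey + "/" + kind + ":" + id;
    }

    /** Every member of the group is the root itself or starts with root + "/". */
    static List<String> group(NavigableSet<String> entities, String rootKey) {
        List<String> rows = new ArrayList<>();
        if (entities.contains(rootKey)) {
            rows.add(rootKey);
        }
        String prefix = rootKey + "/";
        // Descendants are contiguous in sorted order, so one scan finds them.
        for (String key : entities.tailSet(prefix, true)) {
            if (!key.startsWith(prefix)) {
                break;
            }
            rows.add(key);
        }
        return rows;
    }

    public static void main(String[] args) {
        String root = "AppId:User:ikai@google.com";
        String blog = child(root, "Blog", "123");
        String entry = child(blog, "Entry", "456");
        NavigableSet<String> entities = new TreeSet<>();
        entities.add(root);
        entities.add(blog);
        entities.add(entry);
        entities.add(child(entry, "Comment", "111"));
        entities.add("AppId:User:max@google.com"); // a different entity group
        for (String key : group(entities, root)) {
            System.out.println(key);
        }
    }
}
```

Because descendants share the root key as a prefix, the whole group sits in one contiguous key range, which is the data locality mentioned earlier.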
67. How are entity groups stored?
Entity groups have a single root entity: AppId:User:ikai@google.com
68. How are entity groups stored?
Child entities embed the entire ancestry in their keys, e.g. AppId:User:ikai@google.com/Blog:123/Entry:456/Comment:111
69. Let’s write an entity group transactionally
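import java.util.Arrays;
import com.google.appengine.api.datastore.DatastoreService;
import com.google.appengine.api.datastore.DatastoreServiceFactory;
import com.google.appengine.api.datastore.Entity;
import com.google.appengine.api.datastore.Transaction;

DatastoreService datastore = DatastoreServiceFactory
    .getDatastoreService();
Entity ikai = new Entity("User", "ikai@google.com");
Entity blog = new Entity("Blog", "ikaisays.com",
    ikai.getKey());
Entity entry = new Entity("Entry", "datastore-intro",
    blog.getKey());
// Auto assign an ID
Entity comment = new Entity("Comment", entry.getKey());
Transaction tx = datastore.beginTransaction();
try {
    // Arrays.asList is a helper for clarity: one batch put
    datastore.put(Arrays.asList(ikai, blog, entry, comment));
    tx.commit();
} finally {
    // Roll back if the commit did not complete
    if (tx.isActive()) {
        tx.rollback();
    }
}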
70. Let’s write an entity group transactionally
// Create the root entity
Entity ikai = new Entity("User", "ikai@google.com");
71. Let’s write an entity group transactionally
// This is the first child entity - notice the third
// argument, which specifies the parent entity key
Entity blog = new Entity("Blog", "ikaisays.com",
    ikai.getKey());
72. Let’s write an entity group transactionally
// The next deeper entity sets the blog as the parent
Entity entry = new Entity("Entry", "datastore-intro",
    blog.getKey());
73. Let’s write an entity group transactionally
// We can also opt to not provide a key name and
// just use a parent key for a new entity (the ID is auto assigned)
Entity comment = new Entity("Comment", entry.getKey());
74. Let’s write an entity group transactionally
// Start a new transaction
Transaction tx = datastore.beginTransaction();
75. Let’s write an entity group transactionally
// Put the entities in parallel
datastore.put(Arrays.asList(ikai, blog, entry, comment));
76. Let’s write an entity group transactionally
// Actually commit the changes
tx.commit();
77. Step 1: Commit
Commit -> Changes to entities visible -> Changes to entities and indexes visible
• Roll the timestamp forward on the root entity
78. Step 2: Entity visible
• On read, check for the most recent timestamp on the root entity
• This is the version we want since it represents a complete write
79. Step 3: Indexes updated
• Indexes are written - now we can query for this entity with the new properties
80. Entity group and transactions takeaways
• Structure data into hierarchical trees
– Large enough to be useful, small enough to maximize
transactional throughput
• Transactions need an entity group root - roughly 1 transaction/second per entity group
– If you write N entities that are all part of 1 entity group, it counts as
1 write
• Optimistic locking used - can be expensive with a lot of
contention
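To see why contention is expensive under optimistic locking, here is a toy model in plain Java. It assumes a hypothetical OptimisticGroupSketch class in which a version counter stands in for the timestamp on the root entity; it is an illustration of the idea, not how Megastore implements it:

```java
import java.util.ConcurrentModificationException;
import java.util.function.IntUnaryOperator;

/** Hypothetical entity group guarded by a version on its root. */
final class OptimisticGroupSketch {
    private long version = 0; // stands in for the root entity's timestamp
    private int value = 0;    // stands in for the data in the group

    synchronized long readVersion() {
        return version;
    }

    synchronized int readValue() {
        return value;
    }

    /** Succeeds only if nobody has committed since seenVersion was read. */
    synchronized boolean commit(long seenVersion, int newValue) {
        if (seenVersion != version) {
            return false; // someone else rolled the version forward first
        }
        value = newValue;
        version++;
        return true;
    }

    /** Read, compute, try to commit; redo all the work on every conflict. */
    int update(IntUnaryOperator change, int maxAttempts) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            long seen = readVersion();
            int next = change.applyAsInt(readValue());
            if (commit(seen, next)) {
                return attempt; // number of attempts it took
            }
        }
        throw new ConcurrentModificationException("too much contention");
    }

    public static void main(String[] args) {
        OptimisticGroupSketch group = new OptimisticGroupSketch();
        int attempts = group.update(v -> v + 1, 3);
        System.out.println("committed after " + attempts + " attempt(s)");
    }
}
```

No locks are held while the change is computed, so uncontended writes are cheap; under contention each losing writer repeats its read and its computation, which is where the cost comes from.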
81. General datastore tips
• Denormalize as much as possible
– As much as possible, treat datastore as a key-value store
(Dictionary or Map like structure)
– Move large reporting to offline processing. This lets you avoid
unnecessary indexes
• Use entity groups for your data
• Build composite indexes where you need them - “need” depends
on shape of your data