21. Example: Speaker Entities
    Key       Path  Kind     ID  First Name  Last Name
    Speaker1  -     Speaker  1   Rod         Johnson

    Key       Path  Kind     ID  First Name  Last Name  Middle Name  Suffix
    Speaker1  -     Speaker  2   Guy         Steele     L            Jr.
34. Impossible Indexes
    SELECT * from Speaker WHERE lastname < 'Steele' and firstname > 'Gregory'
    ...not in subsequent rows!
    key       lastname  firstname
    Speaker3  Fox       Pamela
    Speaker4  Hohpe     Gregory
    Speaker1  Johnson   Ron
    Speaker2  Steele    Guy
35. Impossible Indexes
    SELECT * from Speaker WHERE lastname > 'Fox' ORDER BY firstname
    ...not in the correct order!
    key       lastname  firstname
    Speaker3  Fox       Pamela
    Speaker4  Hohpe     Gregory
    Speaker1  Johnson   Ron
    Speaker2  Steele    Guy
38. More Properties
    from google.appengine.ext import db

    class Talk(db.Model):
        title = db.StringProperty(required=True)
        abstract = db.TextProperty(required=True)
        speaker = db.ReferenceProperty(Speaker)
        tags = db.StringListProperty()

    pamela = Speaker.all().filter('firstname = ', 'Pamela').get()
    # db.Model takes property values as keyword arguments.
    talk = Talk(title='Writing Apps the Googley Way', abstract='Bla bla bla',
                speaker=pamela, tags=['App Engine', 'Python'])
    talk.put()
    talk = Talk(title='Wonders of the Onesie', abstract='Bluh bluh bluh',
                speaker=pamela, tags=['Pajamas', 'Onesies'])
    talk.put()
39. Back-References
    pamela = Speaker.all().filter('firstname = ', 'Pamela').get()
    for talk in pamela.talk_set:
        print talk.title

    SELECT * from Talk WHERE speaker = Speaker3
    key    speaker
    Talk6  Speaker2
    Talk1  Speaker3
    Talk2  Speaker3
    Talk5  Speaker4
40. Searching List Properties
    talks = Talk.all().filter('tags = ', 'Python').fetch(10)

    SELECT * from Talk WHERE tags = 'Python'
    LIMIT!
    key    tags
    Talk1  App Engine
    Talk2  Pajamas
    Talk1  Python
    Talk2  Onesies
41. Entity Groups
    pamela = Speaker.all().filter('firstname = ', 'Pamela').get()
    # Passing parent=pamela puts both talks in pamela's entity group.
    talk1 = Talk(title='Writing Apps the Googley Way', abstract='Bla bla bla',
                 speaker=pamela, tags=['App Engine', 'Python'], parent=pamela)
    talk2 = Talk(title='Wonders of the Onesie', abstract='Bluh bluh bluh',
                 speaker=pamela, tags=['Pajamas', 'Onesies'], parent=pamela)
    db.put([talk1, talk2])  # a batch put takes a list of entities

    def update_talks():
        talk1.title = 'Writing Apps the Microsoft Way'
        talk2.title = 'Wonders of the Windows'
        db.put([talk1, talk2])

    db.run_in_transaction(update_talks)
*This is not actually an App Engine site. But for the sake of demonstration, and to have something to talk about that you guys are familiar with, let’s use it as an example. It has data, like each speaker and their talks, and it has users who want to register for the conference.
App Engine gets a request from a client or from cron, figures out which app it maps to, and decides whether the request corresponds to static or dynamic content.
- If static: it serves the file from the static servers, which allows caching and lower latency.
- Otherwise: it selects the server that will respond the fastest, fires up the app, sends it the request, gets the response, and gives the response to the user.
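The request flow above can be pictured with a small pure-Python sketch. It is only an illustration of the static/dynamic decision and the "fastest server" choice, not App Engine's actual routing code; `route`, `static_files`, `app_servers`, and the latency numbers are all hypothetical.

```python
def route(path, static_files, app_servers):
    """Return (source, body) for a request path.

    static_files: dict mapping path -> file contents
    app_servers:  dict mapping server name -> (estimated_latency_ms, handler)
    """
    # Static content never reaches the application code.
    if path in static_files:
        return 'static', static_files[path]
    # Dynamic content: pick the server expected to respond the fastest.
    name = min(app_servers, key=lambda n: app_servers[n][0])
    _, handler = app_servers[name]
    return name, handler(path)


def app_handler(path):
    # Stand-in for the application producing a response.
    return 'dynamic response for ' + path


static_files = {'/logo.png': b'PNG'}
app_servers = {
    'server-a': (40, app_handler),
    'server-b': (15, app_handler),
}

static_result = route('/logo.png', static_files, app_servers)
dynamic_result = route('/talks', static_files, app_servers)
```

Here `static_result` comes straight from `static_files`, while `dynamic_result` is produced by the handler on `server-b`, the entry with the lowest assumed latency.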
- Runs either a Python interpreter or JVM 6
- Doesn’t retain state (like global variables)
- Can read its own files, but can’t write any files or read other apps’ files
- Can’t access networking facilities or hardware
- Doesn’t expose OS details
- Runs multiple apps on the same hardware
- Limits clock time, CPU usage, and memory usage of apps
- Gives each app 30 seconds to respond to each request
- Enforces isolation
Similar to an object store / object database. A datastore is made up of entities, and each entity has a kind, a key, and properties. The key is unique for each entity, is set on creation, and never changes. It provides a fast way to retrieve that entity. The kind is used mostly when querying the datastore, as most queries only return results of a particular kind. The properties can vary for each entity of a kind, because the underlying datastore is schemaless. You’ll often use the API to enforce a schema, however, for better application logic. Properties are optional; you don’t have to have any at all.
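To make the entity model concrete, here is a minimal pure-Python sketch. It does not use the App Engine API; the `Entity` class and the `require` helper are hypothetical names chosen for illustration, and the property values are made up.

```python
class Entity(object):
    """Hypothetical stand-in for a datastore entity: kind + key + properties."""

    def __init__(self, kind, key, **properties):
        self.kind = kind                     # used mostly when querying
        self.key = key                       # unique, set on creation
        self.properties = dict(properties)   # schemaless: any names allowed


def require(entity, *names):
    """Schema check done in application code, not by the datastore."""
    missing = [n for n in names if n not in entity.properties]
    if missing:
        raise ValueError('missing properties: %s' % ', '.join(missing))
    return entity


rod = Entity('Speaker', 'Speaker1', firstname='Rod', lastname='Johnson')
guy = Entity('Speaker', 'Speaker2', firstname='Guy', lastname='Steele',
             middlename='L', suffix='Jr.')

# Both are valid entities of the same kind despite different property sets.
require(rod, 'firstname', 'lastname')
require(guy, 'firstname', 'lastname')
```

The point of the sketch is that the two `Speaker` entities share a kind but not a property list; any schema lives in `require`, which stands in for what a model class would enforce.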
We could actually have more properties on Guy to store the additional parts of his name. App Engine would have no problems with this.
But we usually like to enforce a schema in code, like so. When we save the entities to the datastore using put(), the datastore automatically assigns a key. We could also set the keys ourselves, but then we have to make sure they are unique.
We know the key_name of the entity, as we specified it when we created it. That isn't the same as the full key, however. Here we do a batch put() to save time. A batch put is subject to limits on the total size and number of entities.
This is a transaction! By default, only one entity is in a transaction at a time. We'll see later how to have multiple entities in a transaction.
In App Engine, every query must be answered by an existing index or it will return an error. So the datastore must know ahead of time the types of questions that you will ask. App Engine has a weak query engine compared to other DBs. Those other DBs don't perform at web speeds with large amounts of data spread across multiple machines, however. Let's look at some example queries and indexes.
When performing a query, the datastore finds the index, finds the first matching row, and returns entities until it reaches the first non-matching row.
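The scan just described can be sketched in plain Python. This is a simplified model rather than the datastore's implementation: the index is an in-memory sorted list, and `LASTNAME_INDEX` and `scan` are hypothetical names.

```python
import bisect

# Hypothetical single-property index on Speaker.lastname:
# rows of (property value, entity key), kept in sorted order.
LASTNAME_INDEX = sorted([
    ('Fox', 'Speaker3'),
    ('Hohpe', 'Speaker4'),
    ('Johnson', 'Speaker1'),
    ('Steele', 'Speaker2'),
])


def scan(index, first_value, matches):
    """Seek to the first candidate row, then read rows until one fails."""
    values = [value for value, _ in index]
    start = bisect.bisect_left(values, first_value)
    keys = []
    for value, key in index[start:]:
        if not matches(value):
            break  # the first non-matching row ends the scan
        keys.append(key)
    return keys


# SELECT * FROM Speaker WHERE lastname >= 'Hohpe' AND lastname < 'Steele'
result = scan(LASTNAME_INDEX, 'Hohpe', lambda v: 'Hohpe' <= v < 'Steele')
# result == ['Speaker4', 'Speaker1']
```

Because the scan stops at the first non-matching row, a query is only answerable when all of its results sit next to each other in some index.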
Filtering Or Sorting On a Property Requires That the Property Exists
A query filter condition or sort order for a property also implies a condition that the entity have a value for the property. A datastore entity is not required to have a value for a property that other entities of the same kind have. A filter on a property can only match an entity with a value for the property. Entities without a value for a property used in a filter or sort order are omitted from the index built for the query.
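A small sketch shows why an entity without the property can never be returned: it simply has no row in the index. The `build_index` helper and the sample data are assumptions made for illustration, not part of any App Engine API.

```python
def build_index(entities, prop):
    """Sorted (value, key) rows, only for entities that have `prop`."""
    return sorted(
        (props[prop], key)
        for key, props in entities.items()
        if prop in props
    )


SPEAKERS = {
    'Speaker1': {'firstname': 'Rod', 'lastname': 'Johnson'},
    'Speaker2': {'firstname': 'Guy', 'lastname': 'Steele', 'suffix': 'Jr.'},
}

suffix_index = build_index(SPEAKERS, 'suffix')
lastname_index = build_index(SPEAKERS, 'lastname')
# suffix_index == [('Jr.', 'Speaker2')]
```

Any query that filters or sorts on `suffix` reads only `suffix_index`, so `Speaker1` is invisible to it even though it is a perfectly valid `Speaker`.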
This index can also answer this query.
App Engine will always return either the full entities or the keys only, but never partial entities. The size of the result set is subject to a limit, so you need to be careful not to put too much information in one entity. You can spread info across multiple entities for an object if necessary.
Custom indexes are needed because building every possible index would take a huge amount of space and time, an app won't use most of them, and more indexes mean slower entity updates. These queries require custom indexes:
- queries with multiple sort orders
- queries with a sort order on keys in descending order
- queries with one or more inequality filters on a property and one or more equality filters over other properties
- queries with inequality filters and ancestor filters
Inequality Filters Are Allowed On One Property Only
A query may only use inequality filters (<, <=, >=, >, !=) on one property across all of its filters. This makes geo queries difficult, as they typically compare both latitude and longitude in the same query.
Properties In Inequality Filters Must Be Sorted Before Other Sort Orders
If a query has both a filter with an inequality comparison and one or more sort orders, the query must include a sort order for the property used in the inequality, and the sort order must appear before sort orders on other properties.
This query is not valid, because it uses an inequality filter and does not order by the filtered property:
    SELECT * FROM Person WHERE birth_year >= :min_year ORDER BY last_name # ERROR
Similarly, this query is not valid because it does not order by the filtered property before ordering by other properties:
    SELECT * FROM Person WHERE birth_year >= :min_year ORDER BY last_name, birth_year # ERROR
This query is valid:
    SELECT * FROM Person WHERE birth_year >= :min_year ORDER BY birth_year, last_name
To get all results that match an inequality filter, a query scans the index table for the first matching row, then returns all consecutive results until it finds a row that doesn't match. For the consecutive rows to represent the complete result set, the rows must be ordered by the inequality filter before other sort orders.
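The two inequality rules can be expressed as a small checker. This is a sketch of the rules as stated in these notes, not the datastore's own validation code; `check_query` and its argument shapes are hypothetical.

```python
INEQUALITY_OPS = ('<', '<=', '>', '>=', '!=')


def check_query(filters, orders):
    """Return None if the query obeys both rules, else an error message.

    filters: list of (property, operator, value) tuples
    orders:  list of property names, in sort order
    """
    inequality_props = set(p for p, op, _ in filters if op in INEQUALITY_OPS)
    # Rule 1: inequality filters may touch one property only.
    if len(inequality_props) > 1:
        return 'inequality filters on more than one property'
    # Rule 2: if there are sort orders, the inequality property comes first.
    if inequality_props and orders:
        prop = next(iter(inequality_props))
        if orders[0] != prop:
            return 'first sort order must be %s' % prop
    return None


by_birth_year = [('birth_year', '>=', 1970)]
bad_order = check_query(by_birth_year, ['last_name'])
bad_order_2 = check_query(by_birth_year, ['last_name', 'birth_year'])
good_order = check_query(by_birth_year, ['birth_year', 'last_name'])
geo = check_query([('lat', '>', 37.0), ('lon', '<', -122.0)], [])
```

The three `birth_year` calls mirror the three GQL queries above: the first two return an error message and the third returns `None`. The `geo` call shows why a bounding-box query on two properties is rejected.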
You can specify an offset, but it is slow. The datastore still has to go to the first result, then count forward until it reaches the one you want. The offset is limited to 1000. For large data sets, this is not a good way to paginate.
Query cursors allow an app to perform a query and retrieve a batch of results, then fetch additional results for the same query in a subsequent web request without the overhead of a query offset. After the app fetches some results for a query, it can ask for an encoded string that represents the location in the result set after the last result fetched (the "cursor"). The app can use the cursor to fetch additional results starting from that point at a later time. A cursor is a base64-encoded string that represents the next starting position of a query after a fetch operation. The app can store the cursor in the datastore or memcache, or in a task queue task payload. A future request handler can perform the same query and include the cursor with the query to tell the datastore to start returning results from the location represented by the cursor. A cursor can only be used by the app that performed the original query, and can only be used to continue the same query.
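The difference between offset paging and cursor paging can be sketched without the datastore. Everything here is a simplified stand-in: `RESULTS` plays the role of an ordered result set, the cursor simply encodes the last row returned, and `fetch_with_offset` and `fetch_with_cursor` are hypothetical helpers, not App Engine calls.

```python
import base64

# Stand-in for an ordered result set of 25 rows.
RESULTS = ['Talk%02d' % n for n in range(1, 26)]


def fetch_with_offset(offset, limit):
    """Offset paging: skipped rows are still walked one by one."""
    rows_touched = 0
    page = []
    for row in RESULTS:
        rows_touched += 1
        if rows_touched <= offset:
            continue
        page.append(row)
        if len(page) == limit:
            break
    return page, rows_touched


def fetch_with_cursor(cursor, limit):
    """Cursor paging: resume right after the last row already returned."""
    start = 0
    if cursor is not None:
        last_row = base64.urlsafe_b64decode(cursor.encode('ascii')).decode('ascii')
        # A real index would seek directly; list.index is just for the sketch.
        start = RESULTS.index(last_row) + 1
    page = RESULTS[start:start + limit]
    next_cursor = None
    if page:
        encoded = base64.urlsafe_b64encode(page[-1].encode('ascii'))
        next_cursor = encoded.decode('ascii')
    return page, next_cursor


offset_page, rows_touched = fetch_with_offset(10, 10)
page1, cursor1 = fetch_with_cursor(None, 10)
page2, cursor2 = fetch_with_cursor(cursor1, 10)
```

Both approaches return the same second page, but the offset version touches 20 rows to deliver 10, and that cost grows with the page number. The cursor is an opaque string, so it can be handed to the client or stored between requests.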
http://blog.notdot.net/2010/04/High-concurrency-counters-without-sharding We could do sharding, but that's a lot of work, and it takes time to add up the counter.
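For context, a sharded counter looks roughly like the sketch below. It models shards as entries in a plain dictionary rather than datastore entities, and `NUM_SHARDS`, `increment`, and `total` are hypothetical names; it is not the approach from the linked post.

```python
import random

NUM_SHARDS = 5  # assumed shard count, for illustration only

# One small counter per shard instead of a single hot counter.
shards = dict(('registrations-%d' % i, 0) for i in range(NUM_SHARDS))


def increment(counter_shards, rng=random):
    """Write to one randomly chosen shard so concurrent writers rarely collide."""
    name = rng.choice(sorted(counter_shards))
    counter_shards[name] += 1


def total(counter_shards):
    """Reading the counter means visiting every shard and summing."""
    return sum(counter_shards.values())


for _ in range(100):
    increment(shards)

registrations = total(shards)
```

The sketch shows both costs mentioned above: every write needs shard-selection logic, and every read has to add up all the shards.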
App Engine's motto is that it's easy to build, easy to maintain, and easy to scale. I think you may find that parts of your app are harder to build on App Engine than on other platforms, because you're not used to doing things the scalable App Engine way, and you'll have to do some rethinking. But once you do start thinking that way, and perhaps also experiment with other scalable platforms, you should find it easier and easier. App Engine also continues to come out with features that let developers do more in their webapps and build webapps for different audiences. So, try it out, see how you like it, and keep it in mind for your next project.