Google Cloud Computing for Java Developers: Platform and Monetization was a presentation given by Chris Schalk at TheEdge 2010 conference in Tel Aviv, Israel on December 16, 2010. The presentation introduced Google App Engine and other Google cloud technologies, discussed monetizing applications, and provided an overview of the Google Prediction API and BigQuery.
1. Google Cloud Computing for Java Developers: Platform and Monetization
Chris Schalk, Google Developer Advocate
TheEdge 2010, Tel Aviv, Israel
Dec 16th, 2010
2. Google Cloud Platform Technologies at a Glance
Existing:
• Google App Engine
• Google App Engine for Business (new)
New:
• Google Storage
• Prediction API
• Google BigQuery
3. Agenda
• Part I - Intro to App Engine
• App Engine Details
• Development Tools
• App Engine for Business
• Apps Monetization – Apps Marketplace
• Part II – Google’s new cloud technologies
• Google Storage
• Prediction API
• BigQuery
4. Part I – Intro to App Engine
Topics covered
• App Engine a PaaS
• App Engine usage/customers
• App Engine Technical Details
13. Cloud Development in a Box
• Downloadable SDK
• Application runtimes
• Java, Python
• Local development tools
• Eclipse plugin, AppEngine Launcher
• Specialized application services
• Cloud based dashboard
• Ready to scale
• Built in fault tolerance, load balancing
14. Specialized Services
• Memcache, Datastore, URL Fetch
• Mail, XMPP, Task Queue
• Images, Blobstore, User Service
24. Two+ years in review
• Apr 2008: Python launch
• May 2008: Memcache, Images API
• Jul 2008: Logs export
• Aug 2008: Batch write/delete
• Oct 2008: HTTPS support
• Dec 2008: Status dashboard, quota details
• Feb 2009: Billing, larger files
• Apr 2009: Java launch, DB import, cron support, SDC
• May 2009: Key-only queries
• Jun 2009: Task queues
• Aug 2009: Kindless queries
• Sep 2009: XMPP
• Oct 2009: Incoming email
• Dec 2009: Blobstore
• Feb 2010: Datastore cursors, Appstats
• Mar 2010: Read policies, IPv6
• May 2010: App Engine for Business
• Jun 2010: Task queue increases, Python pre-compilation…
• Jul 2010: Mapper API
• Aug 2010: Multi-tenancy, high-performance image serving, custom error pages
• Oct 2010: Instances Console, Delete Kind/App Data
25. App Engine 1.4 Release New Features
1. Channel API
Allows for Server Push (Comet) to browser
– http://code.google.com/appengine/docs/java/channel/
2. Always On
3. Warm Up Requests
– Enabled by default for Java apps
– Can turn off in appengine-web.xml via: <warmup-requests-enabled>false</warmup-requests-enabled>
26. App Engine 1.4 Release New Features
4. Hard Limit Updates
– No more 30 second limit for background work -> up to 10 minutes
– Response size limits for URLFetch have been raised from 1MB to 32MB
– Memcache batch get/put can now also do up to 32MB requests
– Image API request and response size limits have been raised from 1MB to 32MB
– Mail API outgoing attachments have been increased from 1MB to 10MB
27. Other Upcoming Features
…but you can try out early versions now!
1. Mapper API
First component of App Engine’s MapReduce toolkit
• http://code.google.com/p/appengine-mapreduce/
– Large scale data manipulation
– Examples include:
• Report generation
• Computing statistics and metrics …
– Java Example:
• http://ikaisays.com/2010/07/09/using-the-java-mapper-framework-for-app-engine/
• Google “sqlreduce”
– http://code.google.com/p/fredsa/source/browse/trunk/?r=115#trunk%2Fsqlreduce
2. Matcher API
– Matcher allows an app to register a set of queries to match against a stream of documents. For every document presented, Matcher will return the ids of all the registered queries that match the document.
– Trusted tester program announced in App Engine forum
– Java support coming, but still Python only for now
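The Matcher idea described above (register queries once, then get back the ids of every query that a new document satisfies) can be illustrated with a small in-memory sketch. This is a conceptual toy only and not the App Engine Matcher API: the class name ToyMatcher, its methods, and the "query = single keyword" simplification are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Conceptual illustration only: this is NOT the App Engine Matcher API.
final class ToyMatcher {
    // Registered queries: query id -> keyword that must occur in a document (assumed query model).
    private final Map<String, String> queries = new LinkedHashMap<>();

    void register(String queryId, String keyword) {
        queries.put(queryId, keyword.toLowerCase(Locale.ROOT));
    }

    // Returns the ids of all registered queries that match the given document.
    List<String> match(String document) {
        String text = document.toLowerCase(Locale.ROOT);
        List<String> ids = new ArrayList<>();
        for (Map.Entry<String, String> query : queries.entrySet()) {
            if (text.contains(query.getValue())) {
                ids.add(query.getKey());
            }
        }
        return ids;
    }

    public static void main(String[] args) {
        ToyMatcher matcher = new ToyMatcher();
        matcher.register("q-cloud", "cloud");
        matcher.register("q-java", "java");
        // Each incoming document is checked against every registered query.
        System.out.println(matcher.match("Cloud computing for Java developers"));
        System.out.println(matcher.match("Python only for now"));
    }
}
```

The point of the pattern is the inversion: queries are stored and documents stream past them, rather than documents being stored and queried later.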
28. Introducing App Engine for Business
App Engine for Business
Same scalable cloud platform, but designed for the Enterprise
29. Google App Engine for Business Details
• Enterprise application management
– Centralized domain console (preview available)
• Enterprise reliability and support
– 99.9% Service Level Agreement
– Direct support
• Hosted SQL
– Relational SQL database in the cloud (preview available)
• SSL on your domain
• Extremely Secure by default
– Integrated Single Sign On (SSO)
• Pricing that makes sense
– Apps cost $8 per user, up to $1000 max per month
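As a worked example of the pricing line above, here is a minimal sketch of the arithmetic. It assumes the $8 rate and the $1000 cap both apply per application per month, which the slide does not state explicitly; the class and method names are hypothetical.

```java
// Sketch of the App Engine for Business pricing rule, under the stated assumptions.
final class BusinessPricing {
    static final int DOLLARS_PER_USER = 8;       // from the slide
    static final int MONTHLY_CAP_DOLLARS = 1000; // from the slide

    // Monthly cost of one application for the given number of users (assumed per-app cap).
    static int monthlyCostDollars(int users) {
        if (users < 0) {
            throw new IllegalArgumentException("users must be non-negative");
        }
        return (int) Math.min((long) users * DOLLARS_PER_USER, MONTHLY_CAP_DOLLARS);
    }

    public static void main(String[] args) {
        System.out.println(monthlyCostDollars(50));  // prints 400
        System.out.println(monthlyCostDollars(500)); // prints 1000 (cap reached)
    }
}
```

Under these assumptions the cap is reached at 125 users, after which additional users do not raise the monthly cost.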
30. Enterprise App Development with Google
• Buy from others: Google Apps Marketplace
• Buy from Google: Google Apps for Business
• Build your own: Google App Engine for Business
Enterprise Application Platform
Enterprise Firewall
Enterprise Data, Authentication, Enterprise Services, User Management
31. App Engine for Business Roadmap
• Enterprise Administration Console: Preview (signups available)
• Direct Support: Preview (signups available)
• Hosted SQL: Preview (signups available)
• Service Level Agreement: Available Q4 2010 (draft published)
• Enterprise billing: Available Q4 2010
• Custom Domain SSL: Limited release EOY 2010
32. App Engine Resources
Get started with App Engine
• http://code.google.com/appengine
Read up on App Engine for Business and become a trusted tester
• http://code.google.com/appengine/business
• bit.ly/gae4btt <- sign up!
40. What Is Google Storage?
• Store your data in Google's cloud
o any format, any amount, any time
• You control access to your data
o private, shared, or public
• Access via Google APIs or 3rd party tools/libraries
41. Sample Use Cases
Static content hosting
e.g. static html, images, music, video
Backup and recovery
e.g. personal data, business records
Sharing
e.g. share data with your customers
Data storage for applications
e.g. used as storage backend for Android, AppEngine, Cloud based apps
Storage for Computation
e.g. BigQuery, Prediction API
42. Google Storage Benefits
High Performance and Scalability
Backed by Google infrastructure
Strong Security and Privacy
Control access to your data
Easy to Use
Get started fast with Google & 3rd party tools
43. Google Storage Technical Details
• RESTful API
o Verbs: GET, PUT, POST, HEAD, DELETE
o Resources: identified by URI
o Compatible with S3
• Buckets
o Flat containers
• Objects
o Any type
o Size: 100 GB / object
• Access Control for Google Accounts
o For individuals and groups
• Two Ways to Authenticate Requests
o Sign request using access keys
o Web browser login
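To make "sign request using access keys" more concrete, the sketch below shows one way such a signature could be computed. Because the slide describes the API as S3-compatible, the sketch assumes an S3-style scheme: a canonical string built from the request is signed with HMAC-SHA1 using the secret key, and the result is sent in an Authorization header. The exact canonical string layout, the "GOOG1" header prefix, and the key values are assumptions for illustration; the Google Storage documentation is the authority on the real format.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.util.Base64;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

// Sketch of S3-style request signing; header and canonical-string details are assumptions.
final class StorageRequestSigner {
    // Assumed canonical string: verb, Content-MD5, Content-Type, date, resource.
    static String stringToSign(String verb, String contentMd5, String contentType,
                               String date, String resource) {
        return verb + "\n" + contentMd5 + "\n" + contentType + "\n" + date + "\n" + resource;
    }

    // Base64-encoded HMAC-SHA1 of the message, keyed with the secret key.
    static String sign(String secretKey, String message) {
        try {
            Mac mac = Mac.getInstance("HmacSHA1");
            mac.init(new SecretKeySpec(secretKey.getBytes(StandardCharsets.UTF_8), "HmacSHA1"));
            return Base64.getEncoder().encodeToString(
                    mac.doFinal(message.getBytes(StandardCharsets.UTF_8)));
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("HMAC-SHA1 unavailable", e);
        }
    }

    public static void main(String[] args) {
        // Hypothetical credentials and object path.
        String accessKey = "EXAMPLE_ACCESS_KEY";
        String secretKey = "example-secret";
        String toSign = stringToSign("GET", "", "", "Thu, 16 Dec 2010 10:00:00 GMT",
                "/yourbucket/data.csv");
        // Assumed header format: "<scheme> <access key>:<signature>".
        System.out.println("Authorization: GOOG1 " + accessKey + ":" + sign(secretKey, toSign));
    }
}
```

The secret key never travels with the request; only the access key and the signature do, so the server can recompute the signature and compare.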
44. Performance and Scalability
• Objects of any type and 100 GB / Object
• Unlimited numbers of objects, 1000s of buckets
• All data replicated to multiple US data centers
• Utilizes Google's worldwide network for data delivery
• Only you can use bucket names with your domain names
• Read-your-writes data consistency
• Range Get
46. Google Storage - Availability
• Preview in US currently
o 100GB free storage and network from Google per account
o Sign up for waitlist at http://code.google.com/apis/storage/
• Note: Non US preview available on case-by-case basis
• http://bit.ly/dKm770 (for Storage, BigQuery, Prediction)
47. Google Storage - Pricing
o Storage
$0.17/GB/Month
o Network
Upload - $0.10/GB
Download
$0.15/GB Americas / EMEA
$0.30/GB APAC
o Requests
PUT, POST, LIST - $0.01 / 1000 Requests
GET, HEAD - $0.01 / 10000 Requests
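A short worked example of the price list above. The rates are taken from the slide; the usage figures are made up for illustration, the class and method names are hypothetical, and the sketch ignores any free preview quota.

```java
import java.util.Locale;

// Sketch: estimate a monthly Google Storage bill from the slide's published rates.
final class StorageCostEstimate {
    static final double STORAGE_PER_GB_MONTH = 0.17;
    static final double UPLOAD_PER_GB = 0.10;
    static final double DOWNLOAD_PER_GB_AMERICAS_EMEA = 0.15;
    static final double DOWNLOAD_PER_GB_APAC = 0.30;
    static final double PER_1000_WRITE_REQUESTS = 0.01;  // PUT, POST, LIST
    static final double PER_10000_READ_REQUESTS = 0.01;  // GET, HEAD

    static double monthlyCostDollars(double storedGb, double uploadGb,
                                     double downloadGbAmericasEmea, double downloadGbApac,
                                     long writeRequests, long readRequests) {
        double storage = storedGb * STORAGE_PER_GB_MONTH;
        double network = uploadGb * UPLOAD_PER_GB
                + downloadGbAmericasEmea * DOWNLOAD_PER_GB_AMERICAS_EMEA
                + downloadGbApac * DOWNLOAD_PER_GB_APAC;
        double requests = writeRequests / 1000.0 * PER_1000_WRITE_REQUESTS
                + readRequests / 10000.0 * PER_10000_READ_REQUESTS;
        return storage + network + requests;
    }

    public static void main(String[] args) {
        // Illustrative usage: 100 GB stored, 10 GB uploaded, 50 GB downloaded (Americas/EMEA),
        // 100,000 write requests and 1,000,000 read requests.
        double cost = monthlyCostDollars(100, 10, 50, 0, 100_000, 1_000_000);
        System.out.println(String.format(Locale.ROOT, "$%.2f", cost)); // prints $27.50
    }
}
```

In this example storage ($17.00) and download bandwidth ($7.50) dominate; request charges add only $2.00.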
48. Demo
• Tools:
o GS Manager
o GSUtil
• Upload / Download
50. Introducing the Google Prediction API
• Google's sophisticated machine learning technology
• Available as an on-demand RESTful HTTP web service
51. How does it work?
The Prediction API finds relevant features in the sample data during training.
"english" The quick brown fox jumped over the lazy dog.
"english" To err is human, but to really foul things up you need a computer.
"spanish" No hay mal que por bien no venga.
"spanish" La tercera es la vencida.
The Prediction API later searches for those features during prediction.
? To be or not to be, that is the question.
? La fe mueve montañas.
52. A virtually endless number of applications...
• Customer Sentiment, Transaction Risk, Species Identification, Message Routing, Diagnostics
• Churn Prediction, Legal Docket Classification, Suspicious Activity, Work Roster Assignment, Inappropriate Content
• Recommend Products, Political Bias, Uplift Marketing, Email Filtering, Career Counselling
... and many more ...
53. Using the Prediction API
A simple three step process...
1. Upload: Upload your training data to Google Storage
2. Train: Build a model from your data
3. Predict: Make new predictions
54. Step 1: Upload
Upload your training data to Google Storage
• Training data: outputs and input features
• Data format: comma separated value format (CSV)
"english","To err is human, but to really ..."
"spanish","No hay mal que por bien no venga."
...
Upload to Google Storage
gsutil cp ${data} gs://yourbucket/${data}
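The training file itself can be produced with a few lines of Java. The sketch below formats rows in the label-first, quoted layout shown on the slide. It assumes embedded double quotes are escaped by doubling them, the usual CSV convention, which the slide does not confirm; the class and method names are hypothetical.

```java
// Sketch: format Prediction API training rows as "label","text" CSV lines.
final class TrainingCsv {
    // Wraps a field in double quotes; embedded quotes are doubled (assumed CSV convention).
    static String quote(String field) {
        return "\"" + field.replace("\"", "\"\"") + "\"";
    }

    // One training example: the output label first, then the input text.
    static String row(String label, String text) {
        return quote(label) + "," + quote(text);
    }

    public static void main(String[] args) {
        System.out.println(row("english", "To err is human, but to really foul things up you need a computer."));
        System.out.println(row("spanish", "No hay mal que por bien no venga."));
    }
}
```

Quoting every field keeps commas inside the sample text from being read as column separators.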
55. Step 2: Train
Create a new model by training on data
To train a model:
POST prediction/v1.1/training?data=mybucket%2Fmydata
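The request paths on this slide carry the bucket and object name in URL-encoded form, with the "/" between them written as %2F. A minimal sketch of building those paths with the standard library follows; the class and method names are hypothetical, and the host, authentication, and request body are deliberately left out.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Sketch: build the Prediction API v1.1 training paths shown on the slide.
final class PredictionPaths {
    // URL-encodes "bucket/object" so that the slash becomes %2F.
    static String encodedData(String bucket, String object) {
        try {
            return URLEncoder.encode(bucket + "/" + object, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always supported", e);
        }
    }

    // Path used with POST to start training.
    static String trainingRequestPath(String bucket, String object) {
        return "prediction/v1.1/training?data=" + encodedData(bucket, object);
    }

    // Path used with GET to check whether training has finished.
    static String trainingStatusPath(String bucket, String object) {
        return "prediction/v1.1/training/" + encodedData(bucket, object);
    }

    public static void main(String[] args) {
        System.out.println(trainingRequestPath("mybucket", "mydata")); // prints prediction/v1.1/training?data=mybucket%2Fmydata
        System.out.println(trainingStatusPath("mybucket", "mydata"));  // prints prediction/v1.1/training/mybucket%2Fmydata
    }
}
```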
Training runs asynchronously. To see if it has finished:
GET prediction/v1.1/training/mybucket%2Fmydata
{"data":{
"data":"mybucket/mydata",
"modelinfo":"estimated accuracy: 0.xx"}}
59. Prediction API Capabilities
Data
• Input Features: numeric or unstructured text
• Output: up to hundreds of discrete categories
Training
• Many machine learning techniques
• Automatically selected
• Performed asynchronously
Access from many platforms:
• Web app from Google App Engine
• Apps Script (e.g. from Google Spreadsheet)
• Desktop app
60. Prediction API v1.1 - features
• Updated Syntax
• Multi-category prediction
o Tag entry with multiple labels
• Continuous Output
o Finer grained prediction rankings based on multiple labels
• Mixed Inputs
o Both numeric and text inputs are now supported
Can combine continuous output with mixed inputs
61. Prediction API Demos
• Creating training data – recipes.csv
• Simple REST access
• Training the prediction engine
• Start predicting!
• A Java Web example
63. Introducing Google BigQuery
• Google's large data adhoc analysis technology
o Analyze massive amounts of data in seconds
• Simple SQL-like query language
• Flexible access
o REST APIs, JSON-RPC, Google Apps Script
65. Many Use Cases ...
• Interactive Tools
• Spam
• Trends Detection
• Web Dashboards
• Network Optimization
66. Key Capabilities of BigQuery
• Scalable: Billions of rows
• Fast: Response in seconds
• Simple: Queries in SQL
• Web Service
o REST
o JSON-RPC
o Google Apps Script
68. Writing Queries
Compact subset of SQL
o SELECT ... FROM ...
WHERE ...
GROUP BY ... ORDER BY ...
LIMIT ...;
Common functions
o Math, String, Time, ...
Statistical approximations
o TOP
o COUNT DISTINCT
69. BigQuery via REST
GET /bigquery/v1/tables/{table name}
GET /bigquery/v1/query?q={query}
Sample JSON Reply:
{
"results": {
"fields": { [
{"id":"COUNT(*)","type":"uint64"}, ... ]
},
"rows": [
{"f":[{"v":"2949"}, ...]},
{"f":[{"v":"5387"}, ...]}, ... ]
}
}
Also supports JSON-RPC
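Putting the query syntax and the REST endpoint together, the sketch below assembles a query from its clauses and places it in the q parameter of the query path. The table and field names are hypothetical, the helper names are made up, and the host and authentication are omitted. URLEncoder applies form encoding (spaces become "+"), which is assumed here to be acceptable for the query string.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Sketch: compose a BigQuery SQL-like query and the v1 REST path that carries it.
final class BigQueryRequest {
    // Builds SELECT ... FROM ... [WHERE ...] [GROUP BY ...] [ORDER BY ...] [LIMIT ...];
    // Optional clauses are skipped when null (or when limit is not positive).
    static String buildQuery(String fields, String table, String where,
                             String groupBy, String orderBy, int limit) {
        StringBuilder sql = new StringBuilder("SELECT ").append(fields)
                .append(" FROM ").append(table);
        if (where != null) sql.append(" WHERE ").append(where);
        if (groupBy != null) sql.append(" GROUP BY ").append(groupBy);
        if (orderBy != null) sql.append(" ORDER BY ").append(orderBy);
        if (limit > 0) sql.append(" LIMIT ").append(limit);
        return sql.append(';').toString();
    }

    // Path for GET /bigquery/v1/query?q={query}, with the query URL-encoded.
    static String queryPath(String sql) {
        try {
            return "/bigquery/v1/query?q=" + URLEncoder.encode(sql, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always supported", e);
        }
    }

    public static void main(String[] args) {
        // Hypothetical table "mytable" with a "title" column.
        String sql = buildQuery("title, COUNT(*)", "mytable", null, "title", null, 5);
        System.out.println(sql);
        System.out.println(queryPath(sql));
    }
}
```

Because the query is built by string concatenation, clause values should come from trusted code rather than from end-user input.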
70. Security and Privacy
Standard Google Authentication
• Client Login
• OAuth
• AuthSub
HTTPS support
• protects your credentials
• protects your data
Relies on Google Storage to manage access
71. Large Data Analysis Example
Wikimedia Revision History
Wikimedia Revision history data from: http://download.wikimedia.org/enwiki/latest/enwiki-latest-pages-meta-history.xml.7z
75. Further info available at:
• Google Storage for Developers
o http://code.google.com/apis/storage
• Prediction API
o http://code.google.com/apis/predict
• BigQuery
o http://code.google.com/apis/bigquery
76. Recap
• Google App Engine
o Google’s PaaS cloud development platform
• Google App Engine for Business
o New enterprise version of App Engine
• Google Storage
o New high speed data storage on Google Cloud
• Prediction API
o New machine learning technology able to predict outcomes based on sample data
• BigQuery
o New service for interactive analysis of very large data sets using SQL