18. OpenSearch
OpenSearch-Geo & Time
keyword - a free-form string to filter search.
Applied to title, descriptions, tags, categories.
Can further filter by using "field:keyword", for example keyword=title:dogs
location - toponymic name (e.g. Boston or France) to search within
lat, lon - latitude and longitude of the center of a search. Use with distance
distance - the radius, in miles, to search around the center
bbox - the bounding box string to search within. Order is "west, south, east, north"
limit - maximum number of results
page - page number to get the results of
http://mapufacture.com/search.atom?location=Ann+Arbor
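A minimal sketch of assembling such a query from the parameters listed above. The helper name `build_search_url` and the `ALLOWED_PARAMS` list are illustrative assumptions, not part of the Mapufacture API. Only the endpoint and the parameter names come from the slide.

```ruby
require "uri"

# Endpoint taken from the slide; everything else here is a hypothetical helper.
SEARCH_ENDPOINT = "http://mapufacture.com/search.atom"
ALLOWED_PARAMS = [:keyword, :location, :lat, :lon, :distance, :bbox, :limit, :page].freeze

# Builds a search URL, rejecting parameters the slide does not list.
def build_search_url(params)
  unknown = params.keys - ALLOWED_PARAMS
  raise ArgumentError, "unknown parameter(s): #{unknown.join(', ')}" unless unknown.empty?

  # encode_www_form escapes values and turns spaces into "+".
  "#{SEARCH_ENDPOINT}?#{URI.encode_www_form(params)}"
end

puts build_search_url(location: "Ann Arbor")
puts build_search_url(keyword: "title:dogs", lat: 42.28, lon: -83.74, distance: 5, limit: 10)
```

The first call reproduces the example URL on the slide. The second combines a field-scoped keyword with a center point and radius.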
20. Platform
• Ruby on Rails
• PostgreSQL with PostGIS
• Memcached
• Lucene and Solr
• Apache
• Mongrel
21. Grow only as fast as necessary
i.e. don’t be a Mentarbator
http://flickr.com/photos/mattdm/153703472/
22. Started with the little things...
• memcached
• database indexes
• clean code
• ruby-prof, bleakhouse
http://cfis.savagexi.com/articles/2007/07/10/how-to-profile-your-rails-application
• Still can be slow
"Find me all the items about Jazz bands playing between next Friday and Sunday within 5 miles of my house"
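To see why a query like this stays expensive, it helps to spell out that it combines three filters: text, time window, and radius. The sketch below is a naive in-memory scan in plain Ruby, assuming a hypothetical `Item` struct. It is not how the site answers the query (the platform slide lists PostGIS and Lucene/Solr for that); it only shows the work involved per item.

```ruby
EARTH_RADIUS_MILES = 3958.8

# Great-circle distance between two lat/lon points, in miles (haversine formula).
def haversine_miles(lat1, lon1, lat2, lon2)
  to_rad = Math::PI / 180
  dlat = (lat2 - lat1) * to_rad
  dlon = (lon2 - lon1) * to_rad
  a = Math.sin(dlat / 2)**2 +
      Math.cos(lat1 * to_rad) * Math.cos(lat2 * to_rad) * Math.sin(dlon / 2)**2
  2 * EARTH_RADIUS_MILES * Math.asin(Math.sqrt(a))
end

# Hypothetical item shape, for illustration only.
Item = Struct.new(:title, :lat, :lon, :starts_at)

# Keeps items that match the keyword, start inside the time window,
# and lie within `miles` of the center point [lat, lon].
def matching_items(items, keyword:, center:, miles:, from:, to:)
  items.select do |item|
    item.title.downcase.include?(keyword.downcase) &&
      item.starts_at.between?(from, to) &&
      haversine_miles(center[0], center[1], item.lat, item.lon) <= miles
  end
end
```

Every item is tested against all three conditions, so the cost grows linearly with the number of items. Spatial and full-text indexes exist to avoid exactly this scan.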
23. Shared Server
[Diagram labels: Internet, FastCGI, Rails, MySQL]
• Shared server
• FastCGI
• MySQL
• Cheap
24. Dedicated Server
[Diagram labels: Internet, Apache, Mongrel, Rails, Solr, Lucene, PostgreSQL]
• Dedicated server
• Mongrel
• PostgreSQL
• Not Cheap
25. Decisions
• Scale the aggregator
• Job queues
• Don’t trust instances
• Cache feeds and data sources
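One way to read "job queues" together with "don't trust instances": a job is removed from the queue only after a worker finishes it, so work held by an instance that dies becomes available again. The sketch below is an in-memory stand-in with a visibility timeout, loosely modeled on how SQS behaves. The class `SketchQueue` and its methods are hypothetical and are not an AWS client.

```ruby
# In-memory stand-in for a queue with a visibility timeout (illustrative only).
class SketchQueue
  def initialize(visibility_timeout: 30)
    @visibility_timeout = visibility_timeout # seconds a received job stays hidden
    @messages = {}                           # id => { body:, visible_at: }
    @next_id = 0
  end

  # Adds a job and returns its id.
  def send_message(body)
    @next_id += 1
    @messages[@next_id] = { body: body, visible_at: Time.at(0) }
    @next_id
  end

  # Returns [id, body] for the first visible job and hides it for the timeout,
  # or returns nil when nothing is visible.
  def receive_message(now: Time.now)
    id, message = @messages.find { |_, m| m[:visible_at] <= now }
    return nil unless id

    message[:visible_at] = now + @visibility_timeout
    [id, message[:body]]
  end

  # Called by the worker only after the job succeeded.
  def delete_message(id)
    @messages.delete(id)
  end

  def size
    @messages.size
  end
end
```

A worker that crashes never calls `delete_message`, so after the timeout another instance receives the same job. The trade-off is that jobs can run more than once, so they should be safe to repeat.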
26. [Diagram labels: Dedicated Server (Apache, Mongrel, Rails, PostgreSQL, Solr, Lucene), Fast Update, SQS, S3, two EC2 instances (each SSH, Ruby)]
27. TLA’s Decoded
• EC2 - Elastic Compute Cloud
more computing than you can shake a stick at
• S3 - Simple Storage Service
a place to keep your stuff
• SQS - Simple Queue Service
easier than DRb
• AWS - Alexa Web Services
who’s the most popular?
28. How to use EC2
http://flickr.com/photos/davebluedevil/17508904/
29. Easy to Start
1. Sign up for Amazon Web Services
http://google.com/search?q=amazon+web+services
2. Get the EC2 Command line tools
http://google.com/search?q=ec2+tools
3. Choose a base AMI (aka OS image)
something nice and stable, like Debian Etch
4. Install-fest
5. Store to S3
30. Debian Etch AMI
Our Install
apt-get install -y apache2 irb rdoc gcc make memcached \
  build-essential libgeos-c1 postgresql-8.1-postgis \
  postgresql-client postgis subversion graphicsmagick \
  postgresql-8.1-plruby postgresql-dev libmagick9-dev \
  sun-java5-jre sun-java5-jdk
install ruby and rubygems
gem install -y mongrel mongrel_cluster rake rails \
  fakeweb hpricot mofo gruff graticule geonames coderay \
  clusterer feedtools postgres fastercsv
configure postgresql with postgis
set up apache for proxying
31. Debian Etch AMI
Our Install (same package and gem list as the previous slide)
Do once, but never again
54. Where we’re going
[Diagram labels: Dedicated Server (Load Balance, Apache, Mongrel, Rails, PostgreSQL, Lucene), User-Edits, High-Read, Replication, EC2 (DB, Index)]
55. Where we’re going
[Diagram labels: three EC2 instances (DB, Index), Dedicated Server (Load Balance, PostgreSQL, Lucene), SQS, Aggregators (three groups), Batch?]
58. ActiveDelegate
# database.yml
login: &login
  adapter: postgresql
  host: localhost
  port: 5432

production:
  database: mapufacture_local
  <<: *login

# NOTICE THE NEXT ENTRY/KEY
master_database:
  database: mapufacture
  <<: *login

# app/models/master_database.rb
class MasterDatabase < ActiveRecord::Base
  handles_connection_for :master_database
end

# app/models/animal.rb
class Animal < ActiveRecord::Base
  delegates_connection_to :master_database,
    :on => [:create, :save, :destroy]
end
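The idea behind the configuration above, as far as the slide shows it, is that writes (`:create`, `:save`, `:destroy`) go to the master database while everything else uses the local one. The plain-Ruby sketch below illustrates only that routing decision. `ConnectionRouter` is a hypothetical class and does not reflect how ActiveDelegate is implemented internally.

```ruby
# Hypothetical router: picks a connection name based on the kind of operation.
class ConnectionRouter
  def initialize(local:, master:, write_ops: [:create, :save, :destroy])
    @local = local
    @master = master
    @write_ops = write_ops
  end

  # Write operations go to the master; all other operations stay local.
  def connection_for(operation)
    @write_ops.include?(operation) ? @master : @local
  end
end

# Database names taken from the database.yml on the slide.
router = ConnectionRouter.new(local: "mapufacture_local", master: "mapufacture")
puts router.connection_for(:find)
puts router.connection_for(:save)
```

Reads stay on the nearby copy and only the listed write operations pay the cost of reaching the master.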
59. Costs
EC2
Size  Hour   Day     Month  Specs
S     $0.10  $2.40   $72    1.7 GB memory, ~1x 1Ghz Xeon, 160 GB storage
M     $0.40  $9.60   $288   7.5 GB memory, ~2x 2Ghz Xeon, 850 GB storage
L     $0.80  $19.20  $576   15 GB memory, ~4x 2Ghz Xeon, 1.7 TB storage
Bandwidth
Upload: $0.10 per GB
Download: $0.18 per GB - first 10 TB / month
Requests
$0.01 per 1,000 PUT or LIST requests
$0.01 per 10,000 GET and all other requests*
S3
Storage: $0.15 per GB / Month
Upload: $0.10 per GB
Download: $0.18 per GB - first 10 TB / month
Requests
$0.01 per 1,000 PUT or LIST requests
$0.01 per 10,000 GET and all other requests*
SQS
Messages
$0.10 per 1,000 messages sent
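A rough worked example of the arithmetic behind these figures, using only the prices quoted on the slide. The function names are illustrative, a month is assumed to be 30 days (which is how $0.10 per hour becomes $72), and bandwidth and request charges are left out.

```ruby
# Prices copied from the slide; they are historical and will not match current AWS pricing.
EC2_HOURLY_USD = { small: 0.10, medium: 0.40, large: 0.80 }.freeze
S3_STORAGE_USD_PER_GB_MONTH = 0.15
SQS_USD_PER_1000_MESSAGES = 0.10

# Cost of running instances around the clock for `days` days.
def ec2_monthly_cost(size, instances: 1, days: 30)
  (EC2_HOURLY_USD.fetch(size) * 24 * days * instances).round(2)
end

# Storage-only S3 cost for one month.
def s3_storage_monthly_cost(gigabytes)
  (gigabytes * S3_STORAGE_USD_PER_GB_MONTH).round(2)
end

# SQS cost for a number of messages sent.
def sqs_cost(messages_sent)
  (messages_sent / 1000.0 * SQS_USD_PER_1000_MESSAGES).round(2)
end

# Example: two small instances, 50 GB in S3, 100,000 queue messages.
total = ec2_monthly_cost(:small, instances: 2) +
        s3_storage_monthly_cost(50) +
        sqs_cost(100_000)
puts format("$%.2f", total)
```

With these inputs, the two instances account for most of the bill, with storage and queue traffic adding comparatively little.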