SlideShare a Scribd company logo
1 of 16
Download to read offline
Share what you know




   Sam Kimbrel                               sam@snapguide.com
   Software Engineer

Monday, April 1, 13
What is Snapguide?
                               • 1.5 million uniques/month
                               • ~2000 reqs/min across app
                                 and web

                               • Python (Pyramid/uWSGI/
                                 nginx)

                               • MySQL/Redis
                               • Built primarily on AWS: EC2,
                                 RDS, S3, SQS, SNS,
                                 CloudSearch, CloudFront




                                            daniel@snapguide.com • confidential do not distribute



Monday, April 1, 13
daniel@snapguide.com • confidential do not distribute



Monday, April 1, 13
daniel@snapguide.com • confidential do not distribute



Monday, April 1, 13
daniel@snapguide.com • confidential do not distribute



Monday, April 1, 13
Snapguide on CloudSearch
           • Beta trial users after mentioning Solr on the phone
                 (seriously!)

           • Primary data set: guides
           • Facets: guide topic, “featured” boolean, visibility/ACL
                 flags

           • “autocomplete” search (more later)




                                                           daniel@snapguide.com • confidential do not distribute



Monday, April 1, 13
{
      "lang": "en",
      "fields": {
          "step_count": "14",
          "author_external_id": "qS878yliQ4mxg_9uHt2AZg",
          "author": "Claire Hesseltine",
          "items": [
              "Preheat oven to 325 degrees Fahrenheit.",
              ...
          ],
          "title": "Make Brown Butter Sea Salt Cookies",
          "featured": 1,
          "summary": "The brown butter adds a nutty, caramel-like taste
  to these delicious cookies.",
          "topic": [
              "desserts"
          ],
          "main_image_uuid": "43d201c8fd4b4833b83d3f95d112f1c1",
          "like_count": 761,

                      "public": "true"
             },
             "version": 1364333310,
             "type": "add",
             "id": "9eabff97e32c4244a8205da3fba442e9"
  }                                                     daniel@snapguide.com • confidential do not distribute



Monday, April 1, 13
Queries
           • Guide text search:
           q=cookies
           • Guide search with topic:
           q=cookies&facet=topic&bq=topic:‘desserts’
           • “Typeahead”/suggestion search:
           bq=(or ‘paper flower’ ‘paper flower*’)




                                              daniel@snapguide.com • confidential do not distribute



Monday, April 1, 13
Result Ranking
           • Use “Compare Rank Expressions”
           • text_relevance is your friend
           • Goals:
                • Boost popular/featured guides
                • Make title/summary matches worth more than item
                      (supplies, step text) matches




                                                        daniel@snapguide.com • confidential do not distribute



Monday, April 1, 13
min(
 cs.text_relevance(
  {"weights":
   {"title":2.5, "author": 1.5, "items":
   0.1, "summary": 1.5},
  "default_weight":1}),
 1000)
+ min(200, like_count / 10)
+ 100*featured


                             daniel@snapguide.com • confidential do not distribute



Monday, April 1, 13
Offline index updates
           • Extracting guide data to update document is slow
           • Remove update from online web request process
           • Internal-only API endpoints
           • SQS
           • queue_consumer daemon




                                                       daniel@snapguide.com • confidential do not distribute



Monday, April 1, 13
Offline index updates

                       Web server           SQS




                                      Queue consumer
                       Snapguide
                       DB/Redis




                                         Web server
                                    (dedicated to queues)   CloudSearch




                                                            daniel@snapguide.com • confidential do not distribute



Monday, April 1, 13
Performance
                      SSL is painful




                                           daniel@snapguide.com • confidential do not distribute



Monday, April 1, 13
Performance




         but physical proximity (us-west-1) is
                       awesome



                                                 daniel@snapguide.com • confidential do not distribute



Monday, April 1, 13
Future work
           • Add more domains (users, new features)
           • Search-based suggestion engine
           • Improved ranking/scoring — crawl our social graph




                                                       daniel@snapguide.com • confidential do not distribute



Monday, April 1, 13
Questions?




    www.snapguide.com

Monday, April 1, 13

More Related Content

What's hot

Customizing the custom loop wordcamp 2012
Customizing the custom loop   wordcamp 2012Customizing the custom loop   wordcamp 2012
Customizing the custom loop wordcamp 2012Alexander Sapountzis
 
6 reasons Jubilee could be a Rubyist's new best friend
6 reasons Jubilee could be a Rubyist's new best friend6 reasons Jubilee could be a Rubyist's new best friend
6 reasons Jubilee could be a Rubyist's new best friendForrest Chang
 
Less is more: Getting Real About Content and Features
Less is more: Getting Real About Content and Features Less is more: Getting Real About Content and Features
Less is more: Getting Real About Content and Features Charlie Morris
 
Social media and internet marketing
Social media and internet marketingSocial media and internet marketing
Social media and internet marketingAidan Crawford
 
Getting Started with Axure Widget Libraries
Getting Started with Axure Widget LibrariesGetting Started with Axure Widget Libraries
Getting Started with Axure Widget LibrariesJayne Spottswood
 
The Coding Designer's Survival Kit - Capital Camp
The Coding Designer's Survival Kit - Capital CampThe Coding Designer's Survival Kit - Capital Camp
The Coding Designer's Survival Kit - Capital Campcanarymason
 

What's hot (7)

Irb Tips and Tricks
Irb Tips and TricksIrb Tips and Tricks
Irb Tips and Tricks
 
Customizing the custom loop wordcamp 2012
Customizing the custom loop   wordcamp 2012Customizing the custom loop   wordcamp 2012
Customizing the custom loop wordcamp 2012
 
6 reasons Jubilee could be a Rubyist's new best friend
6 reasons Jubilee could be a Rubyist's new best friend6 reasons Jubilee could be a Rubyist's new best friend
6 reasons Jubilee could be a Rubyist's new best friend
 
Less is more: Getting Real About Content and Features
Less is more: Getting Real About Content and Features Less is more: Getting Real About Content and Features
Less is more: Getting Real About Content and Features
 
Social media and internet marketing
Social media and internet marketingSocial media and internet marketing
Social media and internet marketing
 
Getting Started with Axure Widget Libraries
Getting Started with Axure Widget LibrariesGetting Started with Axure Widget Libraries
Getting Started with Axure Widget Libraries
 
The Coding Designer's Survival Kit - Capital Camp
The Coding Designer's Survival Kit - Capital CampThe Coding Designer's Survival Kit - Capital Camp
The Coding Designer's Survival Kit - Capital Camp
 

Viewers also liked

EDU2.0 and Amazon CloudSearch
EDU2.0 and Amazon CloudSearchEDU2.0 and Amazon CloudSearch
EDU2.0 and Amazon CloudSearchMichael Bohlig
 
Amazon CloudSearch - Relevance, Ranking, Tuning and Analytics
Amazon CloudSearch - Relevance, Ranking, Tuning and AnalyticsAmazon CloudSearch - Relevance, Ranking, Tuning and Analytics
Amazon CloudSearch - Relevance, Ranking, Tuning and AnalyticsMichael Bohlig
 
Building great search – how to overcome common challenges jon handler, aws
Building great search – how to overcome common challenges   jon handler, awsBuilding great search – how to overcome common challenges   jon handler, aws
Building great search – how to overcome common challenges jon handler, awsAmazon Web Services
 
AWS Webcast - Build a Scalable Search Engine with the New Amazon CloudSearch
AWS Webcast - Build a Scalable Search Engine with the New Amazon CloudSearchAWS Webcast - Build a Scalable Search Engine with the New Amazon CloudSearch
AWS Webcast - Build a Scalable Search Engine with the New Amazon CloudSearchAmazon Web Services
 
(ARC309) Building and Scaling Amazon Cloud Drive to Millions of Users | AWS r...
(ARC309) Building and Scaling Amazon Cloud Drive to Millions of Users | AWS r...(ARC309) Building and Scaling Amazon Cloud Drive to Millions of Users | AWS r...
(ARC309) Building and Scaling Amazon Cloud Drive to Millions of Users | AWS r...Amazon Web Services
 
AWS October Webinar Series - Introducing Amazon Elasticsearch Service
AWS October Webinar Series - Introducing Amazon Elasticsearch ServiceAWS October Webinar Series - Introducing Amazon Elasticsearch Service
AWS October Webinar Series - Introducing Amazon Elasticsearch ServiceAmazon Web Services
 
AWS Black Belt Techシリーズ Amazon CloudSearch
AWS Black Belt Techシリーズ Amazon CloudSearchAWS Black Belt Techシリーズ Amazon CloudSearch
AWS Black Belt Techシリーズ Amazon CloudSearchAmazon Web Services Japan
 
(BDT209) Launch: Amazon Elasticsearch For Real-Time Data Analytics
(BDT209) Launch: Amazon Elasticsearch For Real-Time Data Analytics(BDT209) Launch: Amazon Elasticsearch For Real-Time Data Analytics
(BDT209) Launch: Amazon Elasticsearch For Real-Time Data AnalyticsAmazon Web Services
 

Viewers also liked (9)

EDU2.0 and Amazon CloudSearch
EDU2.0 and Amazon CloudSearchEDU2.0 and Amazon CloudSearch
EDU2.0 and Amazon CloudSearch
 
Amazon CloudSearch - Relevance, Ranking, Tuning and Analytics
Amazon CloudSearch - Relevance, Ranking, Tuning and AnalyticsAmazon CloudSearch - Relevance, Ranking, Tuning and Analytics
Amazon CloudSearch - Relevance, Ranking, Tuning and Analytics
 
Building great search – how to overcome common challenges jon handler, aws
Building great search – how to overcome common challenges   jon handler, awsBuilding great search – how to overcome common challenges   jon handler, aws
Building great search – how to overcome common challenges jon handler, aws
 
AWS Webcast - Build a Scalable Search Engine with the New Amazon CloudSearch
AWS Webcast - Build a Scalable Search Engine with the New Amazon CloudSearchAWS Webcast - Build a Scalable Search Engine with the New Amazon CloudSearch
AWS Webcast - Build a Scalable Search Engine with the New Amazon CloudSearch
 
30 Must Read CIO Bloggers
30 Must Read CIO Bloggers30 Must Read CIO Bloggers
30 Must Read CIO Bloggers
 
(ARC309) Building and Scaling Amazon Cloud Drive to Millions of Users | AWS r...
(ARC309) Building and Scaling Amazon Cloud Drive to Millions of Users | AWS r...(ARC309) Building and Scaling Amazon Cloud Drive to Millions of Users | AWS r...
(ARC309) Building and Scaling Amazon Cloud Drive to Millions of Users | AWS r...
 
AWS October Webinar Series - Introducing Amazon Elasticsearch Service
AWS October Webinar Series - Introducing Amazon Elasticsearch ServiceAWS October Webinar Series - Introducing Amazon Elasticsearch Service
AWS October Webinar Series - Introducing Amazon Elasticsearch Service
 
AWS Black Belt Techシリーズ Amazon CloudSearch
AWS Black Belt Techシリーズ Amazon CloudSearchAWS Black Belt Techシリーズ Amazon CloudSearch
AWS Black Belt Techシリーズ Amazon CloudSearch
 
(BDT209) Launch: Amazon Elasticsearch For Real-Time Data Analytics
(BDT209) Launch: Amazon Elasticsearch For Real-Time Data Analytics(BDT209) Launch: Amazon Elasticsearch For Real-Time Data Analytics
(BDT209) Launch: Amazon Elasticsearch For Real-Time Data Analytics
 

Similar to Snapguide - Amazon Cloudsearch

AppEngine Performance Tuning
AppEngine Performance TuningAppEngine Performance Tuning
AppEngine Performance TuningDavid Chen
 
Emergency Toolkit Presentation
Emergency Toolkit PresentationEmergency Toolkit Presentation
Emergency Toolkit PresentationRich Benner
 
Lightweight Documentation: An Agile Approach
Lightweight Documentation: An Agile ApproachLightweight Documentation: An Agile Approach
Lightweight Documentation: An Agile ApproachStephen Ritchie
 
Scaling Pinterest
Scaling PinterestScaling Pinterest
Scaling PinterestC4Media
 
OpenStack Doc Overview for Boot Camp
OpenStack Doc Overview for Boot CampOpenStack Doc Overview for Boot Camp
OpenStack Doc Overview for Boot CampAnne Gentle
 
dataviz on d3.js + elasticsearch
dataviz on d3.js + elasticsearchdataviz on d3.js + elasticsearch
dataviz on d3.js + elasticsearchMathieu Elie
 
Design Systems at Scale
Design Systems at ScaleDesign Systems at Scale
Design Systems at ScaleSarah Federman
 
Untangling the web week1
Untangling the web week1Untangling the web week1
Untangling the web week1Derek Jacoby
 
Play Architecture, Implementation, Shiny Objects, and a Proposal
Play Architecture, Implementation, Shiny Objects, and a ProposalPlay Architecture, Implementation, Shiny Objects, and a Proposal
Play Architecture, Implementation, Shiny Objects, and a ProposalMike Slinn
 
Graph in Apache Cassandra. The World’s Most Scalable Graph Database
Graph in Apache Cassandra. The World’s Most Scalable Graph DatabaseGraph in Apache Cassandra. The World’s Most Scalable Graph Database
Graph in Apache Cassandra. The World’s Most Scalable Graph DatabaseConnected Data World
 
Designing and Implementing Search Solutions
Designing and Implementing Search SolutionsDesigning and Implementing Search Solutions
Designing and Implementing Search SolutionsFindwise
 
Presentation
PresentationPresentation
Presentationmmarchani
 
CSC 8101 Non Relational Databases
CSC 8101 Non Relational DatabasesCSC 8101 Non Relational Databases
CSC 8101 Non Relational Databasessjwoodman
 
Exploring Data Preparation and Visualization Tools for Urban Forestry
Exploring Data Preparation and Visualization Tools for Urban ForestryExploring Data Preparation and Visualization Tools for Urban Forestry
Exploring Data Preparation and Visualization Tools for Urban ForestryAzavea
 
SVApps presentation
SVApps presentationSVApps presentation
SVApps presentationllkronus
 
Windycityrails page performance
Windycityrails page performanceWindycityrails page performance
Windycityrails page performanceJohn McCaffrey
 
Seasprint2012ploneconferencereportout
Seasprint2012ploneconferencereportoutSeasprint2012ploneconferencereportout
Seasprint2012ploneconferencereportoutableeb
 
Cassandra Day 2014: Interactive Analytics with Cassandra and Spark
Cassandra Day 2014: Interactive Analytics with Cassandra and SparkCassandra Day 2014: Interactive Analytics with Cassandra and Spark
Cassandra Day 2014: Interactive Analytics with Cassandra and SparkEvan Chan
 
Modeling Data in MongoDB
Modeling Data in MongoDBModeling Data in MongoDB
Modeling Data in MongoDBlehresman
 

Similar to Snapguide - Amazon Cloudsearch (20)

AppEngine Performance Tuning
AppEngine Performance TuningAppEngine Performance Tuning
AppEngine Performance Tuning
 
Emergency Toolkit Presentation
Emergency Toolkit PresentationEmergency Toolkit Presentation
Emergency Toolkit Presentation
 
Lightweight Documentation: An Agile Approach
Lightweight Documentation: An Agile ApproachLightweight Documentation: An Agile Approach
Lightweight Documentation: An Agile Approach
 
Scaling Pinterest
Scaling PinterestScaling Pinterest
Scaling Pinterest
 
OpenStack Doc Overview for Boot Camp
OpenStack Doc Overview for Boot CampOpenStack Doc Overview for Boot Camp
OpenStack Doc Overview for Boot Camp
 
dataviz on d3.js + elasticsearch
dataviz on d3.js + elasticsearchdataviz on d3.js + elasticsearch
dataviz on d3.js + elasticsearch
 
Design Systems at Scale
Design Systems at ScaleDesign Systems at Scale
Design Systems at Scale
 
Icon Fonts FTW
Icon Fonts FTWIcon Fonts FTW
Icon Fonts FTW
 
Untangling the web week1
Untangling the web week1Untangling the web week1
Untangling the web week1
 
Play Architecture, Implementation, Shiny Objects, and a Proposal
Play Architecture, Implementation, Shiny Objects, and a ProposalPlay Architecture, Implementation, Shiny Objects, and a Proposal
Play Architecture, Implementation, Shiny Objects, and a Proposal
 
Graph in Apache Cassandra. The World’s Most Scalable Graph Database
Graph in Apache Cassandra. The World’s Most Scalable Graph DatabaseGraph in Apache Cassandra. The World’s Most Scalable Graph Database
Graph in Apache Cassandra. The World’s Most Scalable Graph Database
 
Designing and Implementing Search Solutions
Designing and Implementing Search SolutionsDesigning and Implementing Search Solutions
Designing and Implementing Search Solutions
 
Presentation
PresentationPresentation
Presentation
 
CSC 8101 Non Relational Databases
CSC 8101 Non Relational DatabasesCSC 8101 Non Relational Databases
CSC 8101 Non Relational Databases
 
Exploring Data Preparation and Visualization Tools for Urban Forestry
Exploring Data Preparation and Visualization Tools for Urban ForestryExploring Data Preparation and Visualization Tools for Urban Forestry
Exploring Data Preparation and Visualization Tools for Urban Forestry
 
SVApps presentation
SVApps presentationSVApps presentation
SVApps presentation
 
Windycityrails page performance
Windycityrails page performanceWindycityrails page performance
Windycityrails page performance
 
Seasprint2012ploneconferencereportout
Seasprint2012ploneconferencereportoutSeasprint2012ploneconferencereportout
Seasprint2012ploneconferencereportout
 
Cassandra Day 2014: Interactive Analytics with Cassandra and Spark
Cassandra Day 2014: Interactive Analytics with Cassandra and SparkCassandra Day 2014: Interactive Analytics with Cassandra and Spark
Cassandra Day 2014: Interactive Analytics with Cassandra and Spark
 
Modeling Data in MongoDB
Modeling Data in MongoDBModeling Data in MongoDB
Modeling Data in MongoDB
 

More from Michael Bohlig

Amazon Cloudsearch Session With Elsevier: re:Invent 2013
Amazon Cloudsearch Session With Elsevier: re:Invent 2013 Amazon Cloudsearch Session With Elsevier: re:Invent 2013
Amazon Cloudsearch Session With Elsevier: re:Invent 2013 Michael Bohlig
 
Dzone Webinar: Search Patterns with Amazon CloudSearch
Dzone Webinar: Search Patterns with Amazon CloudSearchDzone Webinar: Search Patterns with Amazon CloudSearch
Dzone Webinar: Search Patterns with Amazon CloudSearchMichael Bohlig
 
Delivering Better Search For WordPress - AWS Webcast
Delivering Better Search For WordPress - AWS WebcastDelivering Better Search For WordPress - AWS Webcast
Delivering Better Search For WordPress - AWS WebcastMichael Bohlig
 
Using Amazon CloudSearch With Databases - CloudSearch Meetup 061913
Using Amazon CloudSearch With Databases - CloudSearch Meetup 061913Using Amazon CloudSearch With Databases - CloudSearch Meetup 061913
Using Amazon CloudSearch With Databases - CloudSearch Meetup 061913Michael Bohlig
 
Building Great Mobile Search with Productsy and Amazon CloudSearch
Building Great Mobile Search with Productsy and Amazon CloudSearchBuilding Great Mobile Search with Productsy and Amazon CloudSearch
Building Great Mobile Search with Productsy and Amazon CloudSearchMichael Bohlig
 
Amazon Redshift - Bay Area CloudSearch Meetup June 19, 2013
Amazon Redshift - Bay Area CloudSearch Meetup June 19, 2013Amazon Redshift - Bay Area CloudSearch Meetup June 19, 2013
Amazon Redshift - Bay Area CloudSearch Meetup June 19, 2013Michael Bohlig
 
Amazon CloudSearch User Talk - Naked Wines
Amazon CloudSearch User Talk - Naked Wines Amazon CloudSearch User Talk - Naked Wines
Amazon CloudSearch User Talk - Naked Wines Michael Bohlig
 
DynamoDB and Amazon Cloudsearch
DynamoDB and Amazon CloudsearchDynamoDB and Amazon Cloudsearch
DynamoDB and Amazon CloudsearchMichael Bohlig
 
Tuning Search Requests - Amazon CloudSearch
Tuning Search Requests - Amazon CloudSearchTuning Search Requests - Amazon CloudSearch
Tuning Search Requests - Amazon CloudSearchMichael Bohlig
 
Coursera amazon cloudsearch presentation
Coursera amazon cloudsearch presentation Coursera amazon cloudsearch presentation
Coursera amazon cloudsearch presentation Michael Bohlig
 
Geospatial Search With Amazon CloudSearch
Geospatial Search With Amazon CloudSearch Geospatial Search With Amazon CloudSearch
Geospatial Search With Amazon CloudSearch Michael Bohlig
 

More from Michael Bohlig (11)

Amazon Cloudsearch Session With Elsevier: re:Invent 2013
Amazon Cloudsearch Session With Elsevier: re:Invent 2013 Amazon Cloudsearch Session With Elsevier: re:Invent 2013
Amazon Cloudsearch Session With Elsevier: re:Invent 2013
 
Dzone Webinar: Search Patterns with Amazon CloudSearch
Dzone Webinar: Search Patterns with Amazon CloudSearchDzone Webinar: Search Patterns with Amazon CloudSearch
Dzone Webinar: Search Patterns with Amazon CloudSearch
 
Delivering Better Search For WordPress - AWS Webcast
Delivering Better Search For WordPress - AWS WebcastDelivering Better Search For WordPress - AWS Webcast
Delivering Better Search For WordPress - AWS Webcast
 
Using Amazon CloudSearch With Databases - CloudSearch Meetup 061913
Using Amazon CloudSearch With Databases - CloudSearch Meetup 061913Using Amazon CloudSearch With Databases - CloudSearch Meetup 061913
Using Amazon CloudSearch With Databases - CloudSearch Meetup 061913
 
Building Great Mobile Search with Productsy and Amazon CloudSearch
Building Great Mobile Search with Productsy and Amazon CloudSearchBuilding Great Mobile Search with Productsy and Amazon CloudSearch
Building Great Mobile Search with Productsy and Amazon CloudSearch
 
Amazon Redshift - Bay Area CloudSearch Meetup June 19, 2013
Amazon Redshift - Bay Area CloudSearch Meetup June 19, 2013Amazon Redshift - Bay Area CloudSearch Meetup June 19, 2013
Amazon Redshift - Bay Area CloudSearch Meetup June 19, 2013
 
Amazon CloudSearch User Talk - Naked Wines
Amazon CloudSearch User Talk - Naked Wines Amazon CloudSearch User Talk - Naked Wines
Amazon CloudSearch User Talk - Naked Wines
 
DynamoDB and Amazon Cloudsearch
DynamoDB and Amazon CloudsearchDynamoDB and Amazon Cloudsearch
DynamoDB and Amazon Cloudsearch
 
Tuning Search Requests - Amazon CloudSearch
Tuning Search Requests - Amazon CloudSearchTuning Search Requests - Amazon CloudSearch
Tuning Search Requests - Amazon CloudSearch
 
Coursera amazon cloudsearch presentation
Coursera amazon cloudsearch presentation Coursera amazon cloudsearch presentation
Coursera amazon cloudsearch presentation
 
Geospatial Search With Amazon CloudSearch
Geospatial Search With Amazon CloudSearch Geospatial Search With Amazon CloudSearch
Geospatial Search With Amazon CloudSearch
 

Recently uploaded

DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsSergiu Bodiu
 
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)Wonjun Hwang
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Commit University
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupFlorian Wilhelm
 
Artificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxArtificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxhariprasad279825
 
Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Manik S Magar
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii SoldatenkoFwdays
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Mark Simos
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenHervé Boutemy
 
Install Stable Diffusion in windows machine
Install Stable Diffusion in windows machineInstall Stable Diffusion in windows machine
Install Stable Diffusion in windows machinePadma Pradeep
 
Search Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdfSearch Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdfRankYa
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfAddepto
 
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsKotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticscarlostorres15106
 
Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Scott Keck-Warren
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024Lorenzo Miniero
 
SAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxSAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxNavinnSomaal
 
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks..."LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...Fwdays
 
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationBeyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationSafe Software
 
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr LapshynFwdays
 

Recently uploaded (20)

DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platforms
 
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project Setup
 
Artificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxArtificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptx
 
Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache Maven
 
Install Stable Diffusion in windows machine
Install Stable Diffusion in windows machineInstall Stable Diffusion in windows machine
Install Stable Diffusion in windows machine
 
DMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special EditionDMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special Edition
 
Search Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdfSearch Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdf
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdf
 
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsKotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
 
Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024
 
SAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxSAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptx
 
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks..."LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
 
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationBeyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
 
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
 

Snapguide - Amazon Cloudsearch

  • 1. Share what you know Sam Kimbrel sam@snapguide.com Software Engineer Monday, April 1, 13
  • 2. What is Snapguide? • 1.5 million uniques/month • ~2000 reqs/min across app and web • Python (Pyramid/uWSGI/ nginx) • MySQL/Redis • Built primarily on AWS: EC2, RDS, S3, SQS, SNS, CloudSearch, CloudFront daniel@snapguide.com • confidential do not distribute Monday, April 1, 13
  • 3. daniel@snapguide.com • confidential do not distribute Monday, April 1, 13
  • 4. daniel@snapguide.com • confidential do not distribute Monday, April 1, 13
  • 5. daniel@snapguide.com • confidential do not distribute Monday, April 1, 13
  • 6. Snapguide on CloudSearch • Beta trial users after mentioning Solr on the phone (seriously!) • Primary data set: guides • Facets: guide topic, “featured” boolean, visibility/ACL flags • “autocomplete” search (more later) daniel@snapguide.com • confidential do not distribute Monday, April 1, 13
  • 7. { "lang": "en", "fields": { "step_count": "14", "author_external_id": "qS878yliQ4mxg_9uHt2AZg", "author": "Claire Hesseltine", "items": [ "Preheat oven to 325 degrees Fahrenheit.", ... ], "title": "Make Brown Butter Sea Salt Cookies", "featured": 1, "summary": "The brown butter adds a nutty, caramel-like taste to these delicious cookies.", "topic": [ "desserts" ], "main_image_uuid": "43d201c8fd4b4833b83d3f95d112f1c1", "like_count": 761, "public": "true" }, "version": 1364333310, "type": "add", "id": "9eabff97e32c4244a8205da3fba442e9" } daniel@snapguide.com • confidential do not distribute Monday, April 1, 13
  • 8. Queries • Guide text search: q=cookies • Guide search with topic: q=cookies&facet=topic&bq=topic:‘desserts’ • “Typeahead”/suggestion search: bq=(or ‘paper flower’ ‘paper flower*’) daniel@snapguide.com • confidential do not distribute Monday, April 1, 13
  • 9. Result Ranking • Use “Compare Rank Expressions” • text_relevance is your friend • Goals: • Boost popular/featured guides • Make title/summary matches worth more than item (supplies, step text) matches daniel@snapguide.com • confidential do not distribute Monday, April 1, 13
  • 10. min( cs.text_relevance( {"weights": {"title":2.5, "author": 1.5, "items": 0.1, "summary": 1.5}, "default_weight":1}), 1000) + min(200, like_count / 10) + 100*featured daniel@snapguide.com • confidential do not distribute Monday, April 1, 13
  • 11. Offline index updates • Extracting guide data to update document is slow • Remove update from online web request process • Internal-only API endpoints • SQS • queue_consumer daemon daniel@snapguide.com • confidential do not distribute Monday, April 1, 13
  • 12. Offline index updates Web server SQS Queue consumer Snapguide DB/Redis Web server (dedicated to queues) CloudSearch daniel@snapguide.com • confidential do not distribute Monday, April 1, 13
  • 13. Performance SSL is painful daniel@snapguide.com • confidential do not distribute Monday, April 1, 13
  • 14. Performance but physical proximity (us-west-1) is awesome daniel@snapguide.com • confidential do not distribute Monday, April 1, 13
  • 15. Future work • Add more domains (users, new features) • Search-based suggestion engine • Improved ranking/scoring — crawl our social graph daniel@snapguide.com • confidential do not distribute Monday, April 1, 13
  • 16. Questions? www.snapguide.com Monday, April 1, 13