SlideShare una empresa de Scribd logo
1 de 24
Using Solr in Online Travel to Improve User Experience Sudhakar Karegowdra, Esteban Donato Travelocity, May 25TH 2011{ sudhakar.karegowdra, esteban.donato}@travelocity.com
What We Will Cover Travelocity Speakers Background Merchandising & Solr Challenges Solution Sizing and performance data Take Away Location Resolution & Solr Challenges Solution Sizing and performance data Take Away Q&A 3
First Online Travel Agency(OTA) Launched in 1996 Grown to 3,000 employees and is one of the largest travel agencies worldwide Headquartered in Dallas/Fort Worth with satellite offices in San Francisco, New York, London, Singapore, Bangalore, Buenos Aires to name a few In 2004, the Roaming Gnome became the centerpiece of marketing efforts and has become an international pop icon Owned by Sabre Holdings - sister companies include Travelocity Business, IgoUgo.com, lastminute.com, Zuji among others 4
Speakers Background ,[object Object]
Principal ArchitectTravelocity.com ,[object Object]
13 + years
Solr/ Lucene 3 years
Implementing Hadoop, Pig and Hive for Data warehouse.
Topic : MerchandisingEsteban Donato Lead Architect     Travelocity.com My experience 10 + years Solr 2 years  Analyzing Mahout and Carrot2 for document clustering engine. Topic : Location Resolution 5
6 Merchandising  By Sudhakar Karegowdra
The Challenge Market Drivers Build Landing Pages with Faceted Navigation Enable Content Segmentation and delivery Support Roll out of Promotions  Roll up Data to a higher level  E.g., All 5 star hotels in California to bring all the 5 Star hotels from SFO,LAX, SAN etc., Faster time to market new Ideas Rapidly scale to accommodate global brands with disparate data sources 7
The Challenge Traditional Database approach Higher time to market Specialized skill set to design and optimize database structures and queries Aggregation of data and changing of structures quite complex Building Faceted navigation capabilities needs complex logic leading to high maintenance cost 8
Solution - Overview  Data from various sources aggregated and ingested into Solr  Core per Locale and Product Type   Wrapper service to combine some data across product cores and manage configuration rules Solr’s built in Search and Faceting to power the navigation 9
Solution – Architecture View 10 UI Widgets Mobile Services/Business Logic Solr Slaves (Multi Core) Solr Master (Multi Core) Offer Management Tool Oracle ETL Products Deals ……
Solution - Achievements Millions of unique Long Tail Landing Pages E.g., http://www.travelocity.com/hotel-d4980-nevada-las-vegas-hotels_5-star_business-center_green Faster search across products  E.g., Beach Deals under $500 Segmented Content delivery through tagging  Scaled well to distribute the content to different brands, partners and advertisers Opened up for other innovative applications Deals on Map, Deals on Mobile, Wizards etc., 11
Solution – Road Ahead Migration to Solr 3.1  Geo spatial search CSV out put format Query boosting by Search pattern Near Real time Updates Deal and user behavior mining in Hadoop – MapReduce and Solr to Serve the Content Move Slaves to Cloud  12
Sizing & Performance  Index Stats  Number of Cores : 25 Number of Documents : ~ 1 Million Records Response Requests : 70 tps  Average response time : 0.005 seconds (5 ms) Software Versions Solr Version 1.4.0 filterCache size : 30000 Tomcat – 5.5.9 JDK1.6 13
Take Away Semi Structured Storage in Solr helps aggregate disparate sources easily Remember Dynamic fields  Multiple Cores to manage multiple locale data Solr is a great enabler of “Innovations” 14
15 Location Resolution By Esteban Donato
The Challenge How to develop a global location resolution service? Flexibility to changes General enough to cover everyone needs Multi language Performance and scalability  Configurable by site 16
Architecture of the solution 17 Solr Slave Auto-complete Resolution ,[object Object]
Multi-core: each core represents a language
Remote Streaming indexing
CSV formatSolr Master Location DB Batch Job Management Tool ,[object Object]

Más contenido relacionado

Más de lucenerevolution

Building Client-side Search Applications with Solr
Building Client-side Search Applications with SolrBuilding Client-side Search Applications with Solr
Building Client-side Search Applications with Solrlucenerevolution
 
Integrate Solr with real-time stream processing applications
Integrate Solr with real-time stream processing applicationsIntegrate Solr with real-time stream processing applications
Integrate Solr with real-time stream processing applicationslucenerevolution
 
Scaling Solr with SolrCloud
Scaling Solr with SolrCloudScaling Solr with SolrCloud
Scaling Solr with SolrCloudlucenerevolution
 
Administering and Monitoring SolrCloud Clusters
Administering and Monitoring SolrCloud ClustersAdministering and Monitoring SolrCloud Clusters
Administering and Monitoring SolrCloud Clusterslucenerevolution
 
Implementing a Custom Search Syntax using Solr, Lucene, and Parboiled
Implementing a Custom Search Syntax using Solr, Lucene, and ParboiledImplementing a Custom Search Syntax using Solr, Lucene, and Parboiled
Implementing a Custom Search Syntax using Solr, Lucene, and Parboiledlucenerevolution
 
Using Solr to Search and Analyze Logs
Using Solr to Search and Analyze Logs Using Solr to Search and Analyze Logs
Using Solr to Search and Analyze Logs lucenerevolution
 
Enhancing relevancy through personalization & semantic search
Enhancing relevancy through personalization & semantic searchEnhancing relevancy through personalization & semantic search
Enhancing relevancy through personalization & semantic searchlucenerevolution
 
Real-time Inverted Search in the Cloud Using Lucene and Storm
Real-time Inverted Search in the Cloud Using Lucene and StormReal-time Inverted Search in the Cloud Using Lucene and Storm
Real-time Inverted Search in the Cloud Using Lucene and Stormlucenerevolution
 
Solr's Admin UI - Where does the data come from?
Solr's Admin UI - Where does the data come from?Solr's Admin UI - Where does the data come from?
Solr's Admin UI - Where does the data come from?lucenerevolution
 
Schemaless Solr and the Solr Schema REST API
Schemaless Solr and the Solr Schema REST APISchemaless Solr and the Solr Schema REST API
Schemaless Solr and the Solr Schema REST APIlucenerevolution
 
High Performance JSON Search and Relational Faceted Browsing with Lucene
High Performance JSON Search and Relational Faceted Browsing with LuceneHigh Performance JSON Search and Relational Faceted Browsing with Lucene
High Performance JSON Search and Relational Faceted Browsing with Lucenelucenerevolution
 
Text Classification with Lucene/Solr, Apache Hadoop and LibSVM
Text Classification with Lucene/Solr, Apache Hadoop and LibSVMText Classification with Lucene/Solr, Apache Hadoop and LibSVM
Text Classification with Lucene/Solr, Apache Hadoop and LibSVMlucenerevolution
 
Faceted Search with Lucene
Faceted Search with LuceneFaceted Search with Lucene
Faceted Search with Lucenelucenerevolution
 
Recent Additions to Lucene Arsenal
Recent Additions to Lucene ArsenalRecent Additions to Lucene Arsenal
Recent Additions to Lucene Arsenallucenerevolution
 
Turning search upside down
Turning search upside downTurning search upside down
Turning search upside downlucenerevolution
 
Spellchecking in Trovit: Implementing a Contextual Multi-language Spellchecke...
Spellchecking in Trovit: Implementing a Contextual Multi-language Spellchecke...Spellchecking in Trovit: Implementing a Contextual Multi-language Spellchecke...
Spellchecking in Trovit: Implementing a Contextual Multi-language Spellchecke...lucenerevolution
 
Shrinking the haystack wes caldwell - final
Shrinking the haystack   wes caldwell - finalShrinking the haystack   wes caldwell - final
Shrinking the haystack wes caldwell - finallucenerevolution
 
The First Class Integration of Solr with Hadoop
The First Class Integration of Solr with HadoopThe First Class Integration of Solr with Hadoop
The First Class Integration of Solr with Hadooplucenerevolution
 
A Novel methodology for handling Document Level Security in Search Based Appl...
A Novel methodology for handling Document Level Security in Search Based Appl...A Novel methodology for handling Document Level Security in Search Based Appl...
A Novel methodology for handling Document Level Security in Search Based Appl...lucenerevolution
 
How Lucene Powers the LinkedIn Segmentation and Targeting Platform
How Lucene Powers the LinkedIn Segmentation and Targeting PlatformHow Lucene Powers the LinkedIn Segmentation and Targeting Platform
How Lucene Powers the LinkedIn Segmentation and Targeting Platformlucenerevolution
 

Más de lucenerevolution (20)

Building Client-side Search Applications with Solr
Building Client-side Search Applications with SolrBuilding Client-side Search Applications with Solr
Building Client-side Search Applications with Solr
 
Integrate Solr with real-time stream processing applications
Integrate Solr with real-time stream processing applicationsIntegrate Solr with real-time stream processing applications
Integrate Solr with real-time stream processing applications
 
Scaling Solr with SolrCloud
Scaling Solr with SolrCloudScaling Solr with SolrCloud
Scaling Solr with SolrCloud
 
Administering and Monitoring SolrCloud Clusters
Administering and Monitoring SolrCloud ClustersAdministering and Monitoring SolrCloud Clusters
Administering and Monitoring SolrCloud Clusters
 
Implementing a Custom Search Syntax using Solr, Lucene, and Parboiled
Implementing a Custom Search Syntax using Solr, Lucene, and ParboiledImplementing a Custom Search Syntax using Solr, Lucene, and Parboiled
Implementing a Custom Search Syntax using Solr, Lucene, and Parboiled
 
Using Solr to Search and Analyze Logs
Using Solr to Search and Analyze Logs Using Solr to Search and Analyze Logs
Using Solr to Search and Analyze Logs
 
Enhancing relevancy through personalization & semantic search
Enhancing relevancy through personalization & semantic searchEnhancing relevancy through personalization & semantic search
Enhancing relevancy through personalization & semantic search
 
Real-time Inverted Search in the Cloud Using Lucene and Storm
Real-time Inverted Search in the Cloud Using Lucene and StormReal-time Inverted Search in the Cloud Using Lucene and Storm
Real-time Inverted Search in the Cloud Using Lucene and Storm
 
Solr's Admin UI - Where does the data come from?
Solr's Admin UI - Where does the data come from?Solr's Admin UI - Where does the data come from?
Solr's Admin UI - Where does the data come from?
 
Schemaless Solr and the Solr Schema REST API
Schemaless Solr and the Solr Schema REST APISchemaless Solr and the Solr Schema REST API
Schemaless Solr and the Solr Schema REST API
 
High Performance JSON Search and Relational Faceted Browsing with Lucene
High Performance JSON Search and Relational Faceted Browsing with LuceneHigh Performance JSON Search and Relational Faceted Browsing with Lucene
High Performance JSON Search and Relational Faceted Browsing with Lucene
 
Text Classification with Lucene/Solr, Apache Hadoop and LibSVM
Text Classification with Lucene/Solr, Apache Hadoop and LibSVMText Classification with Lucene/Solr, Apache Hadoop and LibSVM
Text Classification with Lucene/Solr, Apache Hadoop and LibSVM
 
Faceted Search with Lucene
Faceted Search with LuceneFaceted Search with Lucene
Faceted Search with Lucene
 
Recent Additions to Lucene Arsenal
Recent Additions to Lucene ArsenalRecent Additions to Lucene Arsenal
Recent Additions to Lucene Arsenal
 
Turning search upside down
Turning search upside downTurning search upside down
Turning search upside down
 
Spellchecking in Trovit: Implementing a Contextual Multi-language Spellchecke...
Spellchecking in Trovit: Implementing a Contextual Multi-language Spellchecke...Spellchecking in Trovit: Implementing a Contextual Multi-language Spellchecke...
Spellchecking in Trovit: Implementing a Contextual Multi-language Spellchecke...
 
Shrinking the haystack wes caldwell - final
Shrinking the haystack   wes caldwell - finalShrinking the haystack   wes caldwell - final
Shrinking the haystack wes caldwell - final
 
The First Class Integration of Solr with Hadoop
The First Class Integration of Solr with HadoopThe First Class Integration of Solr with Hadoop
The First Class Integration of Solr with Hadoop
 
A Novel methodology for handling Document Level Security in Search Based Appl...
A Novel methodology for handling Document Level Security in Search Based Appl...A Novel methodology for handling Document Level Security in Search Based Appl...
A Novel methodology for handling Document Level Security in Search Based Appl...
 
How Lucene Powers the LinkedIn Segmentation and Targeting Platform
How Lucene Powers the LinkedIn Segmentation and Targeting PlatformHow Lucene Powers the LinkedIn Segmentation and Targeting Platform
How Lucene Powers the LinkedIn Segmentation and Targeting Platform
 

Último

Platformless Horizons for Digital Adaptability
Platformless Horizons for Digital AdaptabilityPlatformless Horizons for Digital Adaptability
Platformless Horizons for Digital AdaptabilityWSO2
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FMESafe Software
 
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...Angeliki Cooney
 
DBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor PresentationDBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor PresentationDropbox
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoffsammart93
 
Corporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptxCorporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptxRustici Software
 
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWEREMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWERMadyBayot
 
Exploring Multimodal Embeddings with Milvus
Exploring Multimodal Embeddings with MilvusExploring Multimodal Embeddings with Milvus
Exploring Multimodal Embeddings with MilvusZilliz
 
Vector Search -An Introduction in Oracle Database 23ai.pptx
Vector Search -An Introduction in Oracle Database 23ai.pptxVector Search -An Introduction in Oracle Database 23ai.pptx
Vector Search -An Introduction in Oracle Database 23ai.pptxRemote DBA Services
 
Six Myths about Ontologies: The Basics of Formal Ontology
Six Myths about Ontologies: The Basics of Formal OntologySix Myths about Ontologies: The Basics of Formal Ontology
Six Myths about Ontologies: The Basics of Formal Ontologyjohnbeverley2021
 
Elevate Developer Efficiency & build GenAI Application with Amazon Q​
Elevate Developer Efficiency & build GenAI Application with Amazon Q​Elevate Developer Efficiency & build GenAI Application with Amazon Q​
Elevate Developer Efficiency & build GenAI Application with Amazon Q​Bhuvaneswari Subramani
 
ICT role in 21st century education and its challenges
ICT role in 21st century education and its challengesICT role in 21st century education and its challenges
ICT role in 21st century education and its challengesrafiqahmad00786416
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherRemote DBA Services
 
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...apidays
 
WSO2's API Vision: Unifying Control, Empowering Developers
WSO2's API Vision: Unifying Control, Empowering DevelopersWSO2's API Vision: Unifying Control, Empowering Developers
WSO2's API Vision: Unifying Control, Empowering DevelopersWSO2
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAndrey Devyatkin
 
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodPolkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodJuan lago vázquez
 
[BuildWithAI] Introduction to Gemini.pdf
[BuildWithAI] Introduction to Gemini.pdf[BuildWithAI] Introduction to Gemini.pdf
[BuildWithAI] Introduction to Gemini.pdfSandro Moreira
 

Último (20)

Platformless Horizons for Digital Adaptability
Platformless Horizons for Digital AdaptabilityPlatformless Horizons for Digital Adaptability
Platformless Horizons for Digital Adaptability
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
 
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
 
DBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor PresentationDBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor Presentation
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
 
Corporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptxCorporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptx
 
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWEREMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
 
Understanding the FAA Part 107 License ..
Understanding the FAA Part 107 License ..Understanding the FAA Part 107 License ..
Understanding the FAA Part 107 License ..
 
Exploring Multimodal Embeddings with Milvus
Exploring Multimodal Embeddings with MilvusExploring Multimodal Embeddings with Milvus
Exploring Multimodal Embeddings with Milvus
 
Vector Search -An Introduction in Oracle Database 23ai.pptx
Vector Search -An Introduction in Oracle Database 23ai.pptxVector Search -An Introduction in Oracle Database 23ai.pptx
Vector Search -An Introduction in Oracle Database 23ai.pptx
 
Six Myths about Ontologies: The Basics of Formal Ontology
Six Myths about Ontologies: The Basics of Formal OntologySix Myths about Ontologies: The Basics of Formal Ontology
Six Myths about Ontologies: The Basics of Formal Ontology
 
Elevate Developer Efficiency & build GenAI Application with Amazon Q​
Elevate Developer Efficiency & build GenAI Application with Amazon Q​Elevate Developer Efficiency & build GenAI Application with Amazon Q​
Elevate Developer Efficiency & build GenAI Application with Amazon Q​
 
ICT role in 21st century education and its challenges
ICT role in 21st century education and its challengesICT role in 21st century education and its challenges
ICT role in 21st century education and its challenges
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a Fresher
 
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
 
WSO2's API Vision: Unifying Control, Empowering Developers
WSO2's API Vision: Unifying Control, Empowering DevelopersWSO2's API Vision: Unifying Control, Empowering Developers
WSO2's API Vision: Unifying Control, Empowering Developers
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of Terraform
 
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodPolkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
 
[BuildWithAI] Introduction to Gemini.pdf
[BuildWithAI] Introduction to Gemini.pdf[BuildWithAI] Introduction to Gemini.pdf
[BuildWithAI] Introduction to Gemini.pdf
 

Using Solr in Online Travel Shopping to Improve User Experience - By Esteban Donato and Sudhakara Karegowdra

  • 1. Using Solr in Online Travel to Improve User Experience Sudhakar Karegowdra, Esteban Donato Travelocity, May 25TH 2011{ sudhakar.karegowdra, esteban.donato}@travelocity.com
  • 2. What We Will Cover Travelocity Speakers Background Merchandising & Solr Challenges Solution Sizing and performance data Take Away Location Resolution & Solr Challenges Solution Sizing and performance data Take Away Q&A 3
  • 3. First Online Travel Agency(OTA) Launched in 1996 Grown to 3,000 employees and is one of the largest travel agencies worldwide Headquartered in Dallas/Fort Worth with satellite offices in San Francisco, New York, London, Singapore, Bangalore, Buenos Aires to name a few In 2004, the Roaming Gnome became the centerpiece of marketing efforts and has become an international pop icon Owned by Sabre Holdings - sister companies include Travelocity Business, IgoUgo.com, lastminute.com, Zuji among others 4
  • 4.
  • 5.
  • 8. Implementing Hadoop, Pig and Hive for Data warehouse.
  • 9. Topic : MerchandisingEsteban Donato Lead Architect Travelocity.com My experience 10 + years Solr 2 years Analyzing Mahout and Carrot2 for document clustering engine. Topic : Location Resolution 5
  • 10. 6 Merchandising By Sudhakar Karegowdra
  • 11. The Challenge Market Drivers Build Landing Pages with Faceted Navigation Enable Content Segmentation and delivery Support Roll out of Promotions Roll up Data to a higher level E.g., All 5 star hotels in California to bring all the 5 Star hotels from SFO,LAX, SAN etc., Faster time to market new Ideas Rapidly scale to accommodate global brands with disparate data sources 7
  • 12. The Challenge Traditional Database approach Higher time to market Specialized skill set to design and optimize database structures and queries Aggregation of data and changing of structures quite complex Building Faceted navigation capabilities needs complex logic leading to high maintenance cost 8
  • 13. Solution - Overview Data from various sources aggregated and ingested into Solr Core per Locale and Product Type Wrapper service to combine some data across product cores and manage configuration rules Solr’s built in Search and Faceting to power the navigation 9
  • 14. Solution – Architecture View 10 UI Widgets Mobile Services/Business Logic Solr Slaves (Multi Core) Solr Master (Multi Core) Offer Management Tool Oracle ETL Products Deals ……
  • 15. Solution - Achievements Millions of unique Long Tail Landing Pages E.g., http://www.travelocity.com/hotel-d4980-nevada-las-vegas-hotels_5-star_business-center_green Faster search across products E.g., Beach Deals under $500 Segmented Content delivery through tagging Scaled well to distribute the content to different brands, partners and advertisers Opened up for other innovative applications Deals on Map, Deals on Mobile, Wizards etc., 11
  • 16. Solution – Road Ahead Migration to Solr 3.1 Geo spatial search CSV out put format Query boosting by Search pattern Near Real time Updates Deal and user behavior mining in Hadoop – MapReduce and Solr to Serve the Content Move Slaves to Cloud 12
  • 17. Sizing & Performance Index Stats Number of Cores : 25 Number of Documents : ~ 1 Million Records Response Requests : 70 tps Average response time : 0.005 seconds (5 ms) Software Versions Solr Version 1.4.0 filterCache size : 30000 Tomcat – 5.5.9 JDK1.6 13
  • 18. Take Away Semi Structured Storage in Solr helps aggregate disparate sources easily Remember Dynamic fields Multiple Cores to manage multiple locale data Solr is a great enabler of “Innovations” 14
  • 19. 15 Location Resolution By Esteban Donato
  • 20. The Challenge How to develop a global location resolution service? Flexibility to changes General enough to cover everyone needs Multi language Performance and scalability Configurable by site 16
  • 21.
  • 22. Multi-core: each core represents a language
  • 24.
  • 25.
  • 26. Solr schema <dynamicField name="RANK*" type="int" required="false" indexed="true" stored="true" /> <field name="GLS_FULL_SEARCH" type="glsSearchField" required="false" indexed="true" stored="false" multiValued="true"/> <fieldType name="glsSearchField" class="solr.TextField" positionIncrementGap="100“> <analyzer> <tokenizer class="solr.PatternTokenizerFactory" pattern="[/ ]+" /> <filter class="solr.LowerCaseFilterFactory" /> <filter class="solr.TrimFilterFactory" /> <filter class="solr.ISOLatin1AccentFilterFactory" /> <filter class="solr.RemoveDuplicatesTokenFilterFactory" /> <filter class="solr.PatternReplaceFilterFactory" pattern="[,.]" replacement="" replace="all"/> </analyzer> </fieldType> 19
  • 27. Resolution System has to resolve the location requested by the users. Contemplates aliases. Big Apple => New York Contemplates ambiguities. Contemplates misspellings. Lomdon => London NGramDistance algorithm. How to combine distance with relevancy Error suggesting the correct location when it is a prefix. Lond => London 20
  • 28. Spellchecker configuration <fieldType name=" spellcheckType " class="solr.TextField" positionIncrementGap="100“> <analyzer> <tokenizerclass="solr.KeywordTokenizerFactory” /> <filter class="solr.LowerCaseFilterFactory" /> <filter class="solr.TrimFilterFactory" /> <filter class="solr.ISOLatin1AccentFilterFactory" /> <filter class="solr.RemoveDuplicatesTokenFilterFactory" /> <filter class="solr.PatternReplaceFilterFactory" pattern="[,.]" replacement="" replace="all"/> </analyzer> </fieldType> 21
  • 29. Sizing & Performance 4 cores with ~ 500,000 documents indexed each Response times Auto-complete: 15ms, 20 TPS Resolution: 10ms, 2 TPS Cache configuration queryResultCache: maxSize=1024 documentCache, maxSize=1024 fieldValueCache  & filterCache  disabled 22
  • 30. Wrap Up Performance always as top priority Develop simple but robust services Provide a simple API 23
  • 32. Contact Esteban Donato Esteban.donato@travelocity.com Twitter: @eddonato Sudhakar Karegowdra Sudhakar.karegowdra@travelocity.com Twitter: @skaregowdra https://www.facebook.com/travelocity Twitter: @travelocityand @RoamingGnome 25