Using Solr in Online Travel to
 Improve User Experience

        Sudhakar Karegowdra, Esteban Donato
               Travelocity, May 25th, 2011
{ sudhakar.karegowdra, esteban.donato}@travelocity.com
What We Will Cover
§  Travelocity
§  Speakers Background
§  Merchandising & Solr
   •    Challenges
   •    Solution
   •    Sizing and performance data
   •    Take Away
§  Location Resolution & Solr
   •    Challenges
   •    Solution
   •    Sizing and performance data
   •    Take Away
§  Q&A

Travelocity
§  First online travel agency (OTA), launched in 1996
§  Grown to 3,000 employees and is one of the largest
    travel agencies worldwide
§  Headquartered in Dallas/Fort Worth with satellite
    offices in San Francisco, New York, London,
    Singapore, Bangalore, and Buenos Aires, to name a few
§  In 2004, the Roaming Gnome became the
    centerpiece of marketing efforts and has since become an
    international pop icon
§  Owned by Sabre Holdings; sister companies include
    Travelocity Business, IgoUgo.com, lastminute.com,
    and Zuji, among others




Speakers Background

§  Sudhakar Karegowdra
  •  Principal Architect, Travelocity.com
  •  Experience
       –  13+ years
       –  Solr/Lucene: 3 years
       –  Implementing Hadoop, Pig, and Hive for the data warehouse
  •  Topic: Merchandising

§  Esteban Donato
  •  Lead Architect, Travelocity.com
  •  Experience
       –  10+ years
       –  Solr: 2 years
       –  Analyzing Mahout and Carrot2 for a document clustering engine
  •  Topic: Location Resolution


Merchandising
By Sudhakar Karegowdra




The Challenge
§  Market Drivers
   •    Build landing pages with faceted navigation
   •    Enable content segmentation and delivery
   •    Support rollout of promotions
   •    Roll up data to a higher level
         §  E.g., "all 5-star hotels in California" should bring back the
             5-star hotels from SFO, LAX, SAN, etc.
   •    Faster time to market for new ideas
   •    Rapidly scale to accommodate global brands
        with disparate data sources



The Challenge
§  Traditional Database approach
  •  Longer time to market
  •  Specialized skill set needed to design and optimize
     database structures and queries
  •  Aggregating data and changing structures is
     quite complex
  •  Building faceted navigation requires
     complex logic, leading to high maintenance cost




Solution - Overview
§  Data from various sources aggregated and
    ingested into Solr
   •  Core per Locale and Product Type

§  Wrapper service to combine some data across
    product cores and manage configuration rules

§  Solr’s built-in search and faceting power the
    navigation
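
The faceted navigation described above can be sketched as a single Solr select request. This is a minimal illustration, not the Travelocity implementation: the core name (hotels_en_US) and the field names (state, star_rating, city, amenity) are hypothetical, and only standard Solr request parameters (q, fq, rows, wt, facet, facet.field, facet.mincount) are used.

```python
from urllib.parse import urlencode


def build_facet_query(base_url, filters, facet_fields, rows=10):
    """Build a Solr select URL that filters results and requests facet counts.

    Values are passed through as-is; a real service would also escape
    Solr query syntax characters in user-supplied values.
    """
    params = [
        ("q", "*:*"),              # match everything, narrow with filter queries
        ("rows", str(rows)),
        ("wt", "json"),
        ("facet", "true"),
        ("facet.mincount", "1"),   # hide facet values with no matching documents
    ]
    params += [("fq", f"{field}:{value}") for field, value in filters]
    params += [("facet.field", field) for field in facet_fields]
    return f"{base_url}/select?{urlencode(params)}"


# Hypothetical core per locale and product type, as in the overview above.
url = build_facet_query(
    "http://localhost:8983/solr/hotels_en_US",
    filters=[("state", "CA"), ("star_rating", "5")],
    facet_fields=["city", "amenity"],
)
print(url)
```

Each filter becomes its own fq parameter, so Solr can cache the filters independently, and the facet counts returned for city and amenity would drive the next level of navigation links.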



Solution – Architecture View
[Diagram, top to bottom: UI, Widgets, and Mobile clients; Services/Business Logic; Solr Slaves (Multi Core); Solr Master (Multi Core); Offer Management Tool, Oracle, and ETL; source data: Deals, Products, ……]

Solution - Achievements
§  Millions of unique long-tail landing pages
      §  E.g.,
          http://www.travelocity.com/hotel-d4980-nevada-las-vegas-hotels_5-star_business-center_green
§  Faster search across products
      §  E.g., beach deals under $500
§  Segmented content delivery through tagging
§  Scaled well to distribute content to different
    brands, partners, and advertisers
§  Opened the door to other innovative applications
      §  Deals on Map, Deals on Mobile, Wizards, etc.


Solution – Road Ahead
§  Migration to Solr 3.1
   •  Geospatial search
   •  CSV output format
§  Query boosting by search pattern
§  Near-real-time updates
§  Deal and user-behavior mining in Hadoop
    MapReduce, with Solr serving the content
§  Move slaves to the cloud



Sizing & Performance
§  Index Stats
      §  Number of Cores : 25
      §  Number of Documents : ~ 1 Million Records
§  Response
      §  Requests : 70 tps
      §  Average response time : 0.005 seconds (5 ms)
§  Software Versions
      §  Solr Version 1.4.0
          –  filterCache size : 30000
      §  Tomcat – 5.5.9
      §  JDK1.6



Take Away
§  Semi-structured storage in Solr makes it easy to
    aggregate disparate sources
      •  Remember dynamic fields


§  Multiple Cores to manage multiple locale data

§  Solr is a great enabler of “Innovations”
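
As a rough illustration of the dynamic-fields point above, the sketch below renames the fields of records from different sources so that each name matches a dynamicField pattern. It assumes the schema declares suffix patterns such as *_s, *_i, *_f, and *_b (the convention used in Solr's example schema); the record contents and the to_solr_doc helper are hypothetical.

```python
def to_solr_doc(record, id_field="id"):
    """Rename each source field with a type suffix matching a dynamicField pattern."""
    # Assumed suffix convention: *_b boolean, *_i int, *_f float, *_s string.
    suffix_for = {bool: "_b", int: "_i", float: "_f", str: "_s"}
    doc = {"id": str(record[id_field])}
    for name, value in record.items():
        if name == id_field or value is None:
            continue
        # type() rather than isinstance(), so that True is not treated as an int.
        suffix = suffix_for.get(type(value))
        if suffix is None:
            raise TypeError(f"no dynamic field mapping for {name!r}")
        doc[name + suffix] = value
    return doc


# Two made-up sources with different shapes land in the same index
# without a schema change.
hotel = to_solr_doc({"id": 4980, "name": "Example Hotel", "stars": 5,
                     "price": 129.0, "green": True})
deal = to_solr_doc({"id": "d-17", "theme": "beach", "max_price": 500})
print(hotel)
print(deal)
```

Because the suffix decides the field type, a new data source can introduce new attributes without anyone editing the schema.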




Location Resolution
    By Esteban Donato




The Challenge
§  How to develop a global location resolution
    service?
§  Flexibility to change
§  General enough to cover everyone's needs
§  Multi-language
§  Performance and scalability
§  Configurable by site




Architecture of the solution

§  Master/Slave architecture
§  Multi-core: each core represents a language
§  Remote Streaming indexing
§  CSV format
§  SolrJ client: binary format
§  Solr response cache

[Diagram components: Auto-complete and Resolution services, Solr Slave, Solr Master, Batch Job, Location DB, Management Tool]




Auto-complete
§  The system has to suggest options as users
    type their desired location
§  Examples: “san” => San Francisco, “veg” =>
    Las Vegas
§  Relevancy: not all locations are equally
    important. “par” => “Paris, France”; “Parana,
    Argentina”
§  Users can search by various fields: location
    code, location name, city code, city name,
    state/province code, state/province name,
    country code, country name.

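
The auto-complete behavior above (prefix matching across several fields, ordered by importance) can be sketched in memory as follows. This is only an illustration of the idea, not the Solr-backed service: the sample locations, the rank values, and the convention that a higher rank means a more important location are all made up for the example.

```python
import re

# Hypothetical sample data; "rank" is an invented importance score.
LOCATIONS = [
    {"name": "Paris, France", "code": "PAR", "rank": 95},
    {"name": "Parana, Argentina", "code": "PRA", "rank": 20},
    {"name": "San Francisco, CA", "code": "SFO", "rank": 90},
    {"name": "Las Vegas, NV", "code": "LAS", "rank": 88},
]


def tokens(text):
    """Lowercase, drop commas and periods, split on whitespace, slashes, hyphens."""
    cleaned = re.sub(r"[,.]", "", text.lower())
    return [tok for tok in re.split(r"[\s/\-]+", cleaned) if tok]


def suggest(prefix, locations=LOCATIONS, limit=5):
    """Return names whose name or code has a token starting with the prefix."""
    prefix = prefix.strip().lower()
    if not prefix:
        return []
    hits = [
        loc for loc in locations
        if any(tok.startswith(prefix)
               for field in ("name", "code")
               for tok in tokens(loc[field]))
    ]
    # More important locations first.
    hits.sort(key=lambda loc: loc["rank"], reverse=True)
    return [loc["name"] for loc in hits[:limit]]


print(suggest("par"))
print(suggest("veg"))
```

Matching on any token rather than on the start of the whole string is what lets "veg" find Las Vegas.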
Solr schema
<dynamicField name="RANK*" type="int" required="false" indexed="true" stored="true" />
<field name="GLS_FULL_SEARCH" type="glsSearchField" required="false" indexed="true" stored="false" multiValued="true" />
<fieldType name="glsSearchField" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
     <tokenizer class="solr.PatternTokenizerFactory" pattern="[/\-\t ]+" />
     <filter class="solr.LowerCaseFilterFactory" />
     <filter class="solr.TrimFilterFactory" />
     <filter class="solr.ISOLatin1AccentFilterFactory" />
     <filter class="solr.RemoveDuplicatesTokenFilterFactory" />
     <filter class="solr.PatternReplaceFilterFactory" pattern="[,.]" replacement="" replace="all"/>
  </analyzer>
</fieldType>


Resolution
§  The system has to resolve the location requested
    by the user.
§  Handles aliases. Big Apple => New York
§  Handles ambiguities.
§  Handles misspellings. Lomdon => London
     §  NGramDistance algorithm.
     §  How to combine distance with relevancy
     §  Errors when suggesting the correct location for an input that is a prefix.
         Lond => London
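
One possible way to combine string distance with relevancy is sketched below. It does not reproduce Lucene's NGramDistance; a simpler character-bigram Dice coefficient is swapped in so the example stays self-contained. The candidate list, the rank values (0 to 100), and the 0.8 weight are assumptions chosen for illustration.

```python
def bigrams(word):
    """Set of character bigrams, padded so first and last letters count."""
    padded = f" {word.lower()} "
    return {padded[i:i + 2] for i in range(len(padded) - 1)}


def similarity(a, b):
    """Dice coefficient over character bigrams: 1.0 means identical bigram sets."""
    ga, gb = bigrams(a), bigrams(b)
    return 2 * len(ga & gb) / (len(ga) + len(gb))


def resolve(query, candidates, weight=0.8):
    """Pick the candidate with the best blend of similarity and rank.

    `weight` is the share given to string similarity; the remainder goes
    to the (hypothetical) 0-100 importance rank of the location.
    """
    def score(item):
        name, rank = item
        return weight * similarity(query, name) + (1 - weight) * rank / 100

    return max(candidates, key=score)[0]


# Hypothetical candidates as (name, rank) pairs.
CANDIDATES = [("London", 90), ("Lisbon", 70), ("Londrina", 30)]

print(resolve("Lomdon", CANDIDATES))
print(resolve("Lond", CANDIDATES))
```

Blending in the rank is one way to break ties when a short prefix such as "Lond" is similarly close to several names; the right weight would have to be tuned against real queries.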




Spellchecker configuration
<fieldType name="spellcheckType" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
     <tokenizer class="solr.KeywordTokenizerFactory" />
     <filter class="solr.LowerCaseFilterFactory" />
     <filter class="solr.TrimFilterFactory" />
     <filter class="solr.ISOLatin1AccentFilterFactory" />
     <filter class="solr.RemoveDuplicatesTokenFilterFactory" />
     <filter class="solr.PatternReplaceFilterFactory" pattern="[,.]" replacement="" replace="all"/>
  </analyzer>
</fieldType>




Sizing & Performance
§  4 cores with ~ 500,000 documents indexed
    each
§  Response times
  •  Auto-complete: 15ms, 20 TPS
  •  Resolution: 10ms, 2 TPS
§  Cache configuration
  •  queryResultCache: maxSize=1024
  •  documentCache: maxSize=1024
  •  fieldValueCache & filterCache disabled



Wrap Up
§  Performance is always the top priority
§  Develop simple but robust services
§  Provide a simple API




Q&A




Contact
§  Esteban Donato
  •  Esteban.donato@travelocity.com
  •  Twitter: @eddonato
§  Sudhakar Karegowdra
  •  Sudhakar.karegowdra@travelocity.com
  •  Twitter: @skaregowdra



  https://www.facebook.com/travelocity
  Twitter: @travelocity and
  @RoamingGnome

Testing tools and AI - ideas what to try with some tool examplesKari Kakkonen
 
So einfach geht modernes Roaming fuer Notes und Nomad.pdf
So einfach geht modernes Roaming fuer Notes und Nomad.pdfSo einfach geht modernes Roaming fuer Notes und Nomad.pdf
So einfach geht modernes Roaming fuer Notes und Nomad.pdfpanagenda
 
Time Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directionsTime Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directionsNathaniel Shimoni
 
4. Cobus Valentine- Cybersecurity Threats and Solutions for the Public Sector
4. Cobus Valentine- Cybersecurity Threats and Solutions for the Public Sector4. Cobus Valentine- Cybersecurity Threats and Solutions for the Public Sector
4. Cobus Valentine- Cybersecurity Threats and Solutions for the Public Sectoritnewsafrica
 
Long journey of Ruby standard library at RubyConf AU 2024
Long journey of Ruby standard library at RubyConf AU 2024Long journey of Ruby standard library at RubyConf AU 2024
Long journey of Ruby standard library at RubyConf AU 2024Hiroshi SHIBATA
 
JET Technology Labs White Paper for Virtualized Security and Encryption Techn...
JET Technology Labs White Paper for Virtualized Security and Encryption Techn...JET Technology Labs White Paper for Virtualized Security and Encryption Techn...
JET Technology Labs White Paper for Virtualized Security and Encryption Techn...amber724300
 
Potential of AI (Generative AI) in Business: Learnings and Insights
Potential of AI (Generative AI) in Business: Learnings and InsightsPotential of AI (Generative AI) in Business: Learnings and Insights
Potential of AI (Generative AI) in Business: Learnings and InsightsRavi Sanghani
 
Landscape Catalogue 2024 Australia-1.pdf
Landscape Catalogue 2024 Australia-1.pdfLandscape Catalogue 2024 Australia-1.pdf
Landscape Catalogue 2024 Australia-1.pdfAarwolf Industries LLC
 
Irene Moetsana-Moeng: Stakeholders in Cybersecurity: Collaborative Defence fo...
Irene Moetsana-Moeng: Stakeholders in Cybersecurity: Collaborative Defence fo...Irene Moetsana-Moeng: Stakeholders in Cybersecurity: Collaborative Defence fo...
Irene Moetsana-Moeng: Stakeholders in Cybersecurity: Collaborative Defence fo...itnewsafrica
 
A Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software DevelopersA Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software DevelopersNicole Novielli
 
Tampa BSides - The No BS SOC (slides from April 6, 2024 talk)
Tampa BSides - The No BS SOC (slides from April 6, 2024 talk)Tampa BSides - The No BS SOC (slides from April 6, 2024 talk)
Tampa BSides - The No BS SOC (slides from April 6, 2024 talk)Mark Simos
 
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24Mark Goldstein
 
Kuma Meshes Part I - The basics - A tutorial
Kuma Meshes Part I - The basics - A tutorialKuma Meshes Part I - The basics - A tutorial
Kuma Meshes Part I - The basics - A tutorialJoão Esperancinha
 
Top 10 Hubspot Development Companies in 2024
Top 10 Hubspot Development Companies in 2024Top 10 Hubspot Development Companies in 2024
Top 10 Hubspot Development Companies in 2024TopCSSGallery
 
Abdul Kader Baba- Managing Cybersecurity Risks and Compliance Requirements i...
Abdul Kader Baba- Managing Cybersecurity Risks  and Compliance Requirements i...Abdul Kader Baba- Managing Cybersecurity Risks  and Compliance Requirements i...
Abdul Kader Baba- Managing Cybersecurity Risks and Compliance Requirements i...itnewsafrica
 

Último (20)

How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyesHow to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
 
All These Sophisticated Attacks, Can We Really Detect Them - PDF
All These Sophisticated Attacks, Can We Really Detect Them - PDFAll These Sophisticated Attacks, Can We Really Detect Them - PDF
All These Sophisticated Attacks, Can We Really Detect Them - PDF
 
Glenn Lazarus- Why Your Observability Strategy Needs Security Observability
Glenn Lazarus- Why Your Observability Strategy Needs Security ObservabilityGlenn Lazarus- Why Your Observability Strategy Needs Security Observability
Glenn Lazarus- Why Your Observability Strategy Needs Security Observability
 
Emixa Mendix Meetup 11 April 2024 about Mendix Native development
Emixa Mendix Meetup 11 April 2024 about Mendix Native developmentEmixa Mendix Meetup 11 April 2024 about Mendix Native development
Emixa Mendix Meetup 11 April 2024 about Mendix Native development
 
Decarbonising Buildings: Making a net-zero built environment a reality
Decarbonising Buildings: Making a net-zero built environment a realityDecarbonising Buildings: Making a net-zero built environment a reality
Decarbonising Buildings: Making a net-zero built environment a reality
 
Testing tools and AI - ideas what to try with some tool examples
Testing tools and AI - ideas what to try with some tool examplesTesting tools and AI - ideas what to try with some tool examples
Testing tools and AI - ideas what to try with some tool examples
 
So einfach geht modernes Roaming fuer Notes und Nomad.pdf
So einfach geht modernes Roaming fuer Notes und Nomad.pdfSo einfach geht modernes Roaming fuer Notes und Nomad.pdf
So einfach geht modernes Roaming fuer Notes und Nomad.pdf
 
Time Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directionsTime Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directions
 
4. Cobus Valentine- Cybersecurity Threats and Solutions for the Public Sector
4. Cobus Valentine- Cybersecurity Threats and Solutions for the Public Sector4. Cobus Valentine- Cybersecurity Threats and Solutions for the Public Sector
4. Cobus Valentine- Cybersecurity Threats and Solutions for the Public Sector
 
Long journey of Ruby standard library at RubyConf AU 2024
Long journey of Ruby standard library at RubyConf AU 2024Long journey of Ruby standard library at RubyConf AU 2024
Long journey of Ruby standard library at RubyConf AU 2024
 
JET Technology Labs White Paper for Virtualized Security and Encryption Techn...
JET Technology Labs White Paper for Virtualized Security and Encryption Techn...JET Technology Labs White Paper for Virtualized Security and Encryption Techn...
JET Technology Labs White Paper for Virtualized Security and Encryption Techn...
 
Potential of AI (Generative AI) in Business: Learnings and Insights
Potential of AI (Generative AI) in Business: Learnings and InsightsPotential of AI (Generative AI) in Business: Learnings and Insights
Potential of AI (Generative AI) in Business: Learnings and Insights
 
Landscape Catalogue 2024 Australia-1.pdf
Landscape Catalogue 2024 Australia-1.pdfLandscape Catalogue 2024 Australia-1.pdf
Landscape Catalogue 2024 Australia-1.pdf
 
Irene Moetsana-Moeng: Stakeholders in Cybersecurity: Collaborative Defence fo...
Irene Moetsana-Moeng: Stakeholders in Cybersecurity: Collaborative Defence fo...Irene Moetsana-Moeng: Stakeholders in Cybersecurity: Collaborative Defence fo...
Irene Moetsana-Moeng: Stakeholders in Cybersecurity: Collaborative Defence fo...
 
A Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software DevelopersA Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software Developers
 
Tampa BSides - The No BS SOC (slides from April 6, 2024 talk)
Tampa BSides - The No BS SOC (slides from April 6, 2024 talk)Tampa BSides - The No BS SOC (slides from April 6, 2024 talk)
Tampa BSides - The No BS SOC (slides from April 6, 2024 talk)
 
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
 
Kuma Meshes Part I - The basics - A tutorial
Kuma Meshes Part I - The basics - A tutorialKuma Meshes Part I - The basics - A tutorial
Kuma Meshes Part I - The basics - A tutorial
 
Top 10 Hubspot Development Companies in 2024
Top 10 Hubspot Development Companies in 2024Top 10 Hubspot Development Companies in 2024
Top 10 Hubspot Development Companies in 2024
 
Abdul Kader Baba- Managing Cybersecurity Risks and Compliance Requirements i...
Abdul Kader Baba- Managing Cybersecurity Risks  and Compliance Requirements i...Abdul Kader Baba- Managing Cybersecurity Risks  and Compliance Requirements i...
Abdul Kader Baba- Managing Cybersecurity Risks and Compliance Requirements i...
 

Using Solr in Online Travel Shopping to Improve User Experience

  • 1. Using Solr in Online Travel to Improve User Experience
       Sudhakar Karegowdra, Esteban Donato
       Travelocity, May 25th, 2011
       {sudhakar.karegowdra, esteban.donato}@travelocity.com
  • 2. What We Will Cover
       §  Travelocity
       §  Speakers Background
       §  Merchandising & Solr
          •  Challenges
          •  Solution
          •  Sizing and performance data
          •  Take Away
       §  Location Resolution & Solr
          •  Challenges
          •  Solution
          •  Sizing and performance data
          •  Take Away
       §  Q&A
  • 3.
       §  First Online Travel Agency (OTA), launched in 1996
       §  Grown to 3,000 employees and is one of the largest travel agencies worldwide
       §  Headquartered in Dallas/Fort Worth with satellite offices in San Francisco, New York, London, Singapore, Bangalore and Buenos Aires, to name a few
       §  In 2004, the Roaming Gnome became the centerpiece of marketing efforts and has become an international pop icon
       §  Owned by Sabre Holdings; sister companies include Travelocity Business, IgoUgo.com, lastminute.com and Zuji, among others
  • 4. Speakers Background
       §  Sudhakar Karegowdra
          •  Principal Architect, Travelocity.com
          •  Experience
             –  13+ years
             –  Solr/Lucene: 3 years
             –  Implementing Hadoop, Pig and Hive for data warehouse
          •  Topic: Merchandising
       §  Esteban Donato
          •  Lead Architect, Travelocity.com
          •  Experience
             –  10+ years
             –  Solr: 2 years
             –  Analyzing Mahout and Carrot2 for document clustering engine
          •  Topic: Location Resolution
  • 6. The Challenge
       §  Market Drivers
          •  Build landing pages with faceted navigation
          •  Enable content segmentation and delivery
          •  Support rollout of promotions
          •  Roll up data to a higher level
             –  E.g., "all 5-star hotels in California" brings together the 5-star hotels from SFO, LAX, SAN, etc.
          •  Faster time to market for new ideas
          •  Rapidly scale to accommodate global brands with disparate data sources
  • 7. The Challenge
       §  Traditional database approach
          •  Higher time to market
          •  Specialized skill set needed to design and optimize database structures and queries
          •  Aggregating data and changing structures is quite complex
          •  Building faceted navigation needs complex logic, leading to high maintenance cost
  • 8. Solution - Overview
       §  Data from various sources aggregated and ingested into Solr
          •  Core per locale and product type
       §  Wrapper service to combine some data across product cores and manage configuration rules
       §  Solr's built-in search and faceting to power the navigation
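The "core per locale and product type" layout combined with Solr's built-in faceting can be sketched as a small URL builder. This is a minimal sketch, not the Travelocity service: the core naming scheme, the host, and the field names (`state`, `star_rating`, `city`, `amenity`) are assumptions made for illustration. The request parameters (`q`, `fq`, `facet`, `facet.field`, `facet.mincount`, `rows`, `wt`) are standard Solr select parameters.

```python
from urllib.parse import urlencode


def core_name(locale, product):
    # Hypothetical naming scheme: one core per product type and locale.
    return f"{product}_{locale}".lower()


def facet_query_url(base_url, locale, product, filters, facet_fields, rows=10):
    """Build a Solr select URL with filter queries and facet counts."""
    params = [
        ("q", "*:*"),              # match everything; narrowing is done by fq
        ("rows", str(rows)),
        ("wt", "json"),
        ("facet", "true"),
        ("facet.mincount", "1"),   # hide facet values with no matches
    ]
    # One fq per selected navigation filter (sorted for a stable URL).
    for field, value in sorted(filters.items()):
        params.append(("fq", f'{field}:"{value}"'))
    # One facet.field per navigation dimension shown on the page.
    for field in facet_fields:
        params.append(("facet.field", field))
    return f"{base_url}/{core_name(locale, product)}/select?{urlencode(params)}"


if __name__ == "__main__":
    print(facet_query_url(
        "http://localhost:8983/solr", "en_US", "hotel",
        {"state": "CA", "star_rating": "5"}, ["city", "amenity"],
    ))
```

A wrapper service like the one described on the slide could call a builder of this kind once per product core and merge the responses.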
  • 9. Solution – Architecture View
       [Architecture diagram. Components: UI, Widgets, Mobile; Services/Business Logic; Solr Slaves (Multi Core); Solr Master (Multi Core); Offer Management Tool; Oracle; ETL; Deals; Products; ……]
  • 10. Solution - Achievements
       §  Millions of unique long-tail landing pages
          •  E.g., http://www.travelocity.com/hotel-d4980-nevada-las-vegas-hotels_5-star_business-center_green
       §  Faster search across products
          •  E.g., beach deals under $500
       §  Segmented content delivery through tagging
       §  Scaled well to distribute the content to different brands, partners and advertisers
       §  Opened up for other innovative applications
          •  Deals on Map, Deals on Mobile, Wizards, etc.
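The example landing-page URL above suggests how a long-tail page can map onto facet filters. The sketch below assumes, purely for illustration, that underscores separate the location part of the path from the facet tags; the slides do not document the actual URL scheme, and `parse_landing_slug` is a hypothetical helper.

```python
from urllib.parse import urlparse


def parse_landing_slug(url):
    """Split a landing-page path into a location part and facet tags.

    Assumption (not confirmed by the slides): '_' separates the location
    portion of the path from each facet tag.
    """
    slug = urlparse(url).path.strip("/")
    location_part, *tags = slug.split("_")
    return {"location": location_part, "tags": tags}


if __name__ == "__main__":
    page = parse_landing_slug(
        "http://www.travelocity.com/"
        "hotel-d4980-nevada-las-vegas-hotels_5-star_business-center_green"
    )
    print(page["location"])
    print(page["tags"])
```

Each tag could then be turned into one Solr filter query, which is what makes millions of distinct pages cheap to serve from a single index.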
  • 11. Solution – Road Ahead
       §  Migration to Solr 3.1
          •  Geospatial search
          •  CSV output format
       §  Query boosting by search pattern
       §  Near-real-time updates
       §  Deal and user behavior mining in Hadoop (MapReduce), with Solr serving the content
       §  Move slaves to the cloud
  • 12. Sizing & Performance
       §  Index stats
          •  Number of cores: 25
          •  Number of documents: ~1 million records
       §  Response
          •  Requests: 70 tps
          •  Average response time: 0.005 seconds (5 ms)
       §  Software versions
          •  Solr 1.4.0
             –  filterCache size: 30000
          •  Tomcat 5.5.9
          •  JDK 1.6
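As a quick sanity check on these figures, Little's law (average requests in flight = throughput × average response time) gives a rough sense of the concurrency the servers see. This is a back-of-the-envelope sketch using only the two numbers quoted on the slide. It says nothing about peaks or about how load is spread across slaves.

```python
def avg_in_flight(throughput_tps, avg_response_s):
    """Little's law: mean number of requests being served at any instant."""
    return throughput_tps * avg_response_s


if __name__ == "__main__":
    # 70 requests per second at 5 ms each, as quoted on the slide.
    merchandising = avg_in_flight(70, 0.005)
    print(f"{merchandising:.2f} requests in flight on average")
```

With well under one request in flight on average, the reported load leaves considerable headroom at the measured response time.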
  • 13. Take Away
       §  Semi-structured storage in Solr helps aggregate disparate sources easily
          •  Remember dynamic fields
       §  Multiple cores to manage multiple locale data
       §  Solr is a great enabler of "innovations"
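The dynamic-fields advice can be illustrated with a small mapper that turns records from different sources into Solr documents without declaring each field up front. It assumes the suffix conventions of the stock Solr example schema (`*_s` string, `*_i` int, `*_f` float, `*_b` boolean). The speakers' own schema is not shown for the merchandising cores, and the sample records are invented.

```python
def to_dynamic_doc(doc_id, record):
    """Map an arbitrary source record onto dynamic-field names by value type."""
    doc = {"id": doc_id}
    for key, value in record.items():
        if isinstance(value, bool):      # check bool first: bool is a subclass of int
            suffix = "_b"
        elif isinstance(value, int):
            suffix = "_i"
        elif isinstance(value, float):
            suffix = "_f"
        else:                            # fall back to a string field
            suffix, value = "_s", str(value)
        doc[key + suffix] = value
    return doc


if __name__ == "__main__":
    # Two sources with different attributes land in the same index.
    print(to_dynamic_doc("hotel-1", {"name": "Hotel A", "stars": 5, "pool": True}))
    print(to_dynamic_doc("deal-7", {"title": "Beach deal", "price": 499.0}))
```

Because the schema matches on suffix, a new source with new attributes needs no schema change, which is the aggregation benefit the slide points to.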
  • 14. Location Resolution
       By Esteban Donato
  • 15. The Challenge
       §  How to develop a global location resolution service?
          •  Flexible to change
          •  General enough to cover everyone's needs
          •  Multi-language
          •  Performance and scalability
          •  Configurable by site
  • 16. Architecture of the solution
       [Architecture diagram. Components: Auto-complete, Resolution, Solr Slave, Solr Master, Management Tool, Batch Job, Location DB]
       §  Master/Slave architecture
       §  Multi-core: each core represents a language
       §  Remote Streaming indexing
          •  CSV format
       §  SolrJ client
          •  Binary format
       §  Solr response cache
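The remote-streaming CSV indexing step can be sketched as the request a batch job might send to the master. The `/update/csv` handler and the `stream.file`, `stream.contentType` and `commit` parameters are standard in Solr 1.4, and remote streaming must be enabled in `solrconfig.xml`. The host, core name and file path are placeholders, not values from the talk.

```python
from urllib.parse import urlencode


def csv_stream_update_url(base_url, core, csv_path, commit=True):
    """Build the URL asking Solr to read a CSV file from its own file system."""
    params = {
        "stream.file": csv_path,   # path as seen by the Solr server
        "stream.contentType": "text/plain;charset=utf-8",
        "commit": "true" if commit else "false",
    }
    return f"{base_url}/{core}/update/csv?{urlencode(params)}"


if __name__ == "__main__":
    # Hypothetical core per language, e.g. an English locations core.
    print(csv_stream_update_url(
        "http://localhost:8983/solr", "locations_en", "/data/locations_en.csv"
    ))
```

Having Solr read the file directly avoids pushing the whole CSV export through an HTTP request body.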
  • 17. Auto-complete
       §  The system has to suggest options as users type their desired location
          •  Examples: "san" => San Francisco, "veg" => Las Vegas
       §  Relevancy: not all locations are equally important
          •  "par" => "Paris, France"; "Parana, Argentina"
       §  Users can search by various fields: location code, location name, city code, city name, state/province code, state/province name, country code, country name
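The behaviour described above, prefix matching over several fields with results ordered by importance, can be imitated in memory. This is an illustrative sketch only: the location list, the search terms and the rank values are invented, and the real service answers these queries from Solr rather than a Python list.

```python
# Hypothetical sample data: higher rank means a more important location.
LOCATIONS = [
    {"name": "Parana, Argentina", "terms": ["parana", "argentina"], "rank": 10},
    {"name": "Paris, France", "terms": ["paris", "france"], "rank": 100},
    {"name": "Las Vegas, Nevada", "terms": ["las", "vegas", "nevada"], "rank": 80},
    {"name": "San Francisco, California",
     "terms": ["san", "francisco", "california", "sfo"], "rank": 90},
]


def suggest(prefix, limit=5):
    """Return location names with any term starting with the typed prefix."""
    typed = prefix.strip().lower()
    if not typed:
        return []
    matches = [
        loc for loc in LOCATIONS
        if any(term.startswith(typed) for term in loc["terms"])
    ]
    # Most important locations first; name breaks ties deterministically.
    matches.sort(key=lambda loc: (-loc["rank"], loc["name"]))
    return [loc["name"] for loc in matches[:limit]]


if __name__ == "__main__":
    for typed in ("san", "veg", "par"):
        print(typed, "=>", suggest(typed))
```

Matching on any term is what lets "veg" find Las Vegas even though the name starts with a different word.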
  • 18. Solr schema
       <dynamicField name="RANK*" type="int" required="false" indexed="true" stored="true" />
       <field name="GLS_FULL_SEARCH" type="glsSearchField" required="false" indexed="true" stored="false" multiValued="true" />
       <fieldType name="glsSearchField" class="solr.TextField" positionIncrementGap="100">
         <analyzer>
           <!-- Split on slash, hyphen, tab and space. The escapes are restored here on the assumption that they were lost in the flattened slide text, which reads [/-t ]+ -->
           <tokenizer class="solr.PatternTokenizerFactory" pattern="[/\-\t ]+" />
           <filter class="solr.LowerCaseFilterFactory" />
           <filter class="solr.TrimFilterFactory" />
           <!-- Fold accented Latin-1 characters to their unaccented form -->
           <filter class="solr.ISOLatin1AccentFilterFactory" />
           <filter class="solr.RemoveDuplicatesTokenFilterFactory" />
           <!-- Drop commas and periods from each token -->
           <filter class="solr.PatternReplaceFilterFactory" pattern="[,.]" replacement="" replace="all"/>
         </analyzer>
       </fieldType>
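To see what this analyzer chain does to an input string, the steps can be approximated in a few lines. The sketch assumes the tokenizer pattern is meant to split on slash, hyphen, tab and space. Accent folding uses Unicode decomposition, which only approximates ISOLatin1AccentFilterFactory (it does not expand ligatures such as "æ"). The duplicate-removal filter is left out because it only affects tokens that share a position.

```python
import re
import unicodedata


def fold_accents(token):
    # Decompose characters and drop combining marks: "ã" becomes "a".
    decomposed = unicodedata.normalize("NFD", token)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def analyze(text):
    """Approximate the glsSearchField analysis chain on one field value."""
    tokens = []
    for raw in re.split(r"[/\-\t ]+", text):   # assumed tokenizer pattern
        token = raw.lower().strip()            # lower-case and trim
        token = fold_accents(token)            # accent folding (approximation)
        token = re.sub(r"[,.]", "", token)     # strip commas and periods
        if token:
            tokens.append(token)
    return tokens


if __name__ == "__main__":
    print(analyze("São Paulo, Brazil"))
    print(analyze("Dallas/Fort-Worth"))
```

Running user input through the same steps at query time is what lets "sao paulo" match an indexed "São Paulo".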
  • 19. Resolution
       §  The system has to resolve the location requested by the users
       §  Contemplates aliases: Big Apple => New York
       §  Contemplates ambiguities
       §  Contemplates misspellings: Lomdon => London
       §  NGramDistance algorithm
       §  How to combine distance with relevancy
       §  Error suggesting the correct location when it is a prefix: Lond => London
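The resolution steps listed here (alias lookup, misspelling tolerance, and blending string distance with relevancy) can be sketched as follows. This is not Lucene's NGramDistance: it substitutes a simpler bigram Dice coefficient as a stand-in. The alias table, candidate list, relevancy values, the 0.7 weighting and the 0.5 cut-off are all assumptions chosen for illustration, since the slide poses the combination as an open question.

```python
ALIASES = {"big apple": "New York"}                              # hypothetical alias table
CANDIDATES = {"London": 1.0, "New York": 0.95, "Londrina": 0.3}  # name -> relevancy in [0, 1]


def bigrams(text):
    padded = f" {text.lower()} "   # pad so first and last letters count
    return {padded[i:i + 2] for i in range(len(padded) - 1)}


def similarity(a, b):
    """Dice coefficient over character bigrams (stand-in for NGramDistance)."""
    ga, gb = bigrams(a), bigrams(b)
    return 2 * len(ga & gb) / (len(ga) + len(gb))


def resolve(query, alpha=0.7, min_similarity=0.5):
    """Return the best-matching location name, or None if nothing is close."""
    key = query.strip().lower()
    if key in ALIASES:             # aliases win outright
        return ALIASES[key]
    best_name, best_score = None, 0.0
    for name, relevancy in CANDIDATES.items():
        sim = similarity(key, name)
        if sim < min_similarity:   # too different to be a misspelling
            continue
        score = alpha * sim + (1 - alpha) * relevancy
        if score > best_score:
            best_name, best_score = name, score
    return best_name


if __name__ == "__main__":
    for typed in ("Big Apple", "Lomdon", "Lond"):
        print(typed, "=>", resolve(typed))
```

The weighted sum is one of several plausible ways to combine the two signals. Tuning `alpha` trades tolerance for typos against a bias toward well-known locations.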
  • 20. Spellchecker configuration
       <fieldType name="spellcheckType" class="solr.TextField" positionIncrementGap="100">
         <analyzer>
           <!-- Keep the whole value as a single token so multi-word names are checked as one unit -->
           <tokenizer class="solr.KeywordTokenizerFactory" />
           <filter class="solr.LowerCaseFilterFactory" />
           <filter class="solr.TrimFilterFactory" />
           <filter class="solr.ISOLatin1AccentFilterFactory" />
           <filter class="solr.RemoveDuplicatesTokenFilterFactory" />
           <filter class="solr.PatternReplaceFilterFactory" pattern="[,.]" replacement="" replace="all"/>
         </analyzer>
       </fieldType>
  • 21. Sizing & Performance
       §  4 cores with ~500,000 documents indexed each
       §  Response times
          •  Auto-complete: 15 ms, 20 TPS
          •  Resolution: 10 ms, 2 TPS
       §  Cache configuration
          •  queryResultCache: maxSize=1024
          •  documentCache: maxSize=1024
          •  fieldValueCache & filterCache disabled
  • 22. Wrap Up
       §  Performance always as top priority
       §  Develop simple but robust services
       §  Provide a simple API
  • 23. Q&A
  • 24. Contact
       §  Esteban Donato
          •  Esteban.donato@travelocity.com
          •  Twitter: @eddonato
       §  Sudhakar Karegowdra
          •  Sudhakar.karegowdra@travelocity.com
          •  Twitter: @skaregowdra
       https://www.facebook.com/travelocity
       Twitter: @travelocity and @RoamingGnome