CASE STUDY: SPAREBANK1 GRUPPEN
            Sébastien Muller
Customer Requirement

 "Better portal search"
Project background

 • SpareBank1 Gruppen
        • 19 individual bank portals and 1 front page (forside)
 • Boost 25 umbrella project
        • "Semantic" URLs:
        https://www2.sparebank1.no/9898/3_privat?_nfpb=true&_nfls=false&_pageLabel=page_privat_innhold&pId=1233149354625&_
        • New search GUI
 • CMS with no easy way of telling which bank has published what
        • Mass duplication
        • Access to other portal-specific articles
        • Web crawlers
What is better search?
      At the very least:
      • Relevant hits
      • Faceting
      • Query completion
      • Spell checking and suggestions
      • Basic search analytics
Relevant hits

     •   Relevancy = "... The quality of results returned from a query ..."
               •   Based on hits in fields generated from document processing
     •   Clean and metadata-rich index
               •   Pushed from CMS or extracted by crawlers
Crawling and Indexing

     • Clean and metadata-rich index
     • OpenPipeline
             • Ignore irrelevant articles
             • Extract article text content
             • Detect duplicates
             • Facet data
             • Populate index fields, including *_qc and *_sp fields
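
The processing stages listed above can be sketched as a chain of small steps. This is a minimal illustration in Python, not OpenPipeline's actual API. The input document shape (`url`, `bank`, `title`, `body`), the function names, and the reading of `*_qc` as query-completion fields and `*_sp` as spell-check fields are all assumptions made for the example.

```python
import hashlib
import re
from typing import Iterable, Iterator

TAG_RE = re.compile(r"<[^>]+>")


def extract_text(html: str) -> str:
    """Strip tags and collapse whitespace (a crude stand-in for real extraction)."""
    return " ".join(TAG_RE.sub(" ", html).split())


def fingerprint(text: str) -> str:
    """Hash of the lower-cased text, used here to spot exact duplicates."""
    return hashlib.sha1(text.lower().encode("utf-8")).hexdigest()


def process(raw_docs: Iterable[dict]) -> Iterator[dict]:
    """Yield index-ready documents, skipping empty and duplicate articles."""
    seen = set()
    for raw in raw_docs:
        text = extract_text(raw.get("body", ""))
        if not text:                   # ignore irrelevant (here: empty) articles
            continue
        digest = fingerprint(text)
        if digest in seen:             # detect duplicates
            continue
        seen.add(digest)
        yield {
            "id": raw["url"],
            "content": text,
            "bank": raw["bank"],       # facet data
            "title_qc": raw["title"],  # hypothetical query-completion field
            "content_sp": text,        # hypothetical spell-check field
        }


docs = [
    {"url": "a", "bank": "4702", "title": "Lån", "body": "<p>Boliglån  info</p>"},
    {"url": "b", "bank": "9898", "title": "Lån", "body": "<p>boliglån info</p>"},
    {"url": "c", "bank": "9898", "title": "Tom", "body": "<div></div>"},
]
indexed = list(process(docs))
print(len(indexed))  # 1: "b" duplicates "a", "c" has no text
```

A real pipeline would use a more forgiving duplicate test than an exact hash, since the same article published by two banks may differ in surrounding markup.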
Crawling and Indexing

     •   Crawlers will be as smart as you make them
                 • Very rigid logic
                 • Heavily reliant on article quality
                 • Don’t blame the crawler

     https://www2.sparebank1.no/portal/4702/3_privat?_nfpb=true&_nfls=false&_pageLabel=page_privat_innhold&pId=1233149354625&_nfls=false

     https://www2.sparebank1.no/portal/9898/3_privat?_nfpb=true&_nfls=false&_pageLabel=page_privat_innhold&pId=1233149354625&_nfls=false
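
The two URLs above differ only in the number in the path (4702 and 9898) and carry the same `pId`, which is how one article ends up reachable under several bank portals. A minimal sketch of grouping crawled URLs by article follows. It assumes that `pId` identifies the article and that the all-digit path segment identifies the bank portal; both are inferred from the example URLs rather than stated in the slides.

```python
from collections import defaultdict
from urllib.parse import parse_qs, urlsplit


def article_key(url: str) -> str:
    """Return the pId query parameter, assumed here to identify the article."""
    return parse_qs(urlsplit(url).query)["pId"][0]


def portal_id(url: str) -> str:
    """Return the first all-digit path segment, assumed to be the bank portal."""
    for segment in urlsplit(url).path.split("/"):
        if segment.isdigit():
            return segment
    raise ValueError(f"no portal id in {url!r}")


def group_by_article(urls):
    """Map each article key to the set of portals it was crawled from."""
    groups = defaultdict(set)
    for url in urls:
        groups[article_key(url)].add(portal_id(url))
    return dict(groups)


QUERY = "_nfpb=true&_nfls=false&_pageLabel=page_privat_innhold&pId=1233149354625&_nfls=false"
urls = [
    f"https://www2.sparebank1.no/portal/4702/3_privat?{QUERY}",
    f"https://www2.sparebank1.no/portal/9898/3_privat?{QUERY}",
]
groups = group_by_article(urls)
print(groups)  # one article key mapped to both portal ids
```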
Relevant hits

     Scoring model
     <!-- Fields searched and their relative boosts, written as field^boost -->
     <bean id="qf"
           class="com.findwise.jellyfish.solr.querymodifier.dismax.StaticQueryFieldSetter">
       <property name="queryFields">
         <list value-type="java.lang.String">
           <value>keyword^4</value>
           <value>content1^8</value>
           <value>content2^3</value>
           <value>content3^2</value>
           <value>stem1^1.5</value>
           <value>stem2^1.2</value>
           <value>stem3</value>
         </list>
       </property>
     </bean>
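
For readers more used to plain Solr, the boosts in the bean above correspond to what a DisMax `qf` request parameter would contain. The sketch below builds that string in Python. That `StaticQueryFieldSetter` ends up setting `qf` in this way is an assumption based on its name and package, not something the slides confirm; `build_qf` and `dismax_params` are hypothetical helper names.

```python
from urllib.parse import urlencode

# Field names and boosts copied from the bean; None means "no explicit boost".
QUERY_FIELDS = [
    ("keyword", 4),
    ("content1", 8),
    ("content2", 3),
    ("content3", 2),
    ("stem1", 1.5),
    ("stem2", 1.2),
    ("stem3", None),
]


def build_qf(fields):
    """Join fields into Solr's space-separated field^boost syntax."""
    parts = []
    for name, boost in fields:
        parts.append(name if boost is None else f"{name}^{boost}")
    return " ".join(parts)


def dismax_params(user_query, fields=QUERY_FIELDS):
    """Request parameters for a DisMax query over the boosted fields."""
    return {"defType": "dismax", "q": user_query, "qf": build_qf(fields)}


print(build_qf(QUERY_FIELDS))
# keyword^4 content1^8 content2^3 content3^2 stem1^1.5 stem2^1.2 stem3
print(urlencode(dismax_params("boliglån")))
```

A hit in `content1` therefore weighs eight times as much as a hit in the unboosted `stem3` field, before Solr's other scoring factors are applied.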
Relevant hits

     • Spell checker
                • Request handler for each bank
                • Index based

     • Stop words
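
An index-based spell checker draws its suggestions from terms that actually occur in the index, so keeping one dictionary per bank means a user is only offered words that exist in that bank's content. The sketch below illustrates the idea with Python's `difflib`. It is not how Solr's spell checking is implemented, and the class name, bank ids and terms are made up for the example.

```python
from collections import defaultdict
from difflib import get_close_matches


class BankSpellChecker:
    """Keeps a separate vocabulary of indexed terms for each bank."""

    def __init__(self):
        self._terms = defaultdict(set)

    def index(self, bank, text):
        """Add the whitespace-separated terms of a document to a bank's vocabulary."""
        self._terms[bank].update(text.lower().split())

    def suggest(self, bank, word, n=3):
        """Return up to n close terms from the bank's own vocabulary."""
        vocabulary = self._terms.get(bank, set())
        word = word.lower()
        if word in vocabulary:  # correctly spelled for this bank
            return []
        return get_close_matches(word, sorted(vocabulary), n=n, cutoff=0.8)


checker = BankSpellChecker()
checker.index("4702", "boliglån rente sparekonto")
checker.index("9898", "forsikring rente")

print(checker.suggest("4702", "boliglan"))  # ['boliglån']
print(checker.suggest("9898", "boliglan"))  # []: the term is not in this bank's index
```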
Result
System Architecture

     • Solr is incredibly flexible
              • Master/slave
     • Security constraints
              • Search services available publicly
              • Search analytics available internally but limited
              • Indexing
Quality Assurance

     • Crawler-friendly content modifications
             • Edit
             • Delete
             • Add
             • Share
             • Risk analysis, etc.
Lessons Learnt

     • Scope creep
     • Garbage in, garbage out
     • Documentation is only useful if it gets read

Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodJuan lago vázquez
 
FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024The Digital Insurer
 
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdfRising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdfOrbitshub
 
Artificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyArtificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyKhushali Kathiriya
 
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ..."I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...Zilliz
 

Último (20)

Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
[BuildWithAI] Introduction to Gemini.pdf
[BuildWithAI] Introduction to Gemini.pdf[BuildWithAI] Introduction to Gemini.pdf
[BuildWithAI] Introduction to Gemini.pdf
 
Corporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptxCorporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptx
 
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 
MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
 
Manulife - Insurer Transformation Award 2024
Manulife - Insurer Transformation Award 2024Manulife - Insurer Transformation Award 2024
Manulife - Insurer Transformation Award 2024
 
Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire business
 
CNIC Information System with Pakdata Cf In Pakistan
CNIC Information System with Pakdata Cf In PakistanCNIC Information System with Pakdata Cf In Pakistan
CNIC Information System with Pakdata Cf In Pakistan
 
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of Terraform
 
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWEREMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
 
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
 
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
 
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodPolkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
 
FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024
 
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdfRising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
 
Artificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyArtificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : Uncertainty
 
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ..."I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
 

Enterprise Search Case Study: SpareBank1 Gruppen

  • 1. CASE STUDY: SPAREBANK1 GRUPPEN Sébastien Muller
  • 3. Project background • SpareBank1 Gruppen • 19 individual bank portals and 1 forside (front page) • Boost 25 umbrella project • "Semantic" URLs: https://www2.sparebank1.no/9898/3_privat?_nfpb=true&_nfls=false&_pageLabel=page_privat_innhold&pId=1233149354625&_ • New search GUI • CMS with no easy way of telling which bank has published what • Mass duplications • Access to other portal specific articles • Webcrawlers
  • 4. What is better search? At the very least: • Relevant hits • Faceting • Query completion • Spelling check and suggestions • Basic search analytics
  • 5. Relevant hits • Relevancy = "... The quality of results returned from a query..." • Based on hits in fields generated from document processing • Clean and meta-data rich index • Pushed from CMS or extracted by crawlers
  • 8. Crawling and Indexing • Clean and meta-data rich index • OpenPipeline • Ignore irrelevant articles • Extract article text contents • Detect duplicates • Facet data • Populate index fields including *_qc and *_sp fields
  • 9. Crawling and Indexing • Crawlers will be as smart as you make them • Very rigid logic • Heavily reliant on article quality • Don't blame the crawler • https://www2.sparebank1.no/portal/4702/3_privat?_nfpb=true&_nfls=false&_pageLabel=page_privat_innhold&pId=1233149354625&_nfls=false • https://www2.sparebank1.no/portal/9898/3_privat?_nfpb=true&_nfls=false&_pageLabel=page_privat_innhold&pId=1233149354625&_nfls=false
  • 10. Relevant hits • Scoring model: <bean id="qf" class="com.findwise.jellyfish.solr.querymodifier.dismax.StaticQueryFieldSetter"> <property name="queryFields"> <list value-type="java.lang.String"> <value>keyword^4</value> <value>content1^8</value> <value>content2^3</value> <value>content3^2</value> <value>stem1^1.5</value> <value>stem2^1.2</value> <value>stem3</value> </list> </property> </bean>
  • 11. Relevant hits • Spell checker • Request handler for each bank • Index based • Stop words
  • 14. System Architecture • Solr is incredibly flexible • Master/slave • Security constraints • Search services available publicly • Search analytics available internally but limited • Indexing
  • 18. Quality Assurance • Crawler friendly content modifications • Edit • Delete • Add • Share • Risk analysis etc.
  • 19. Lessons Learnt • Scope creep • Garbage in, garbage out • Documentation is only useful if it gets read
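
The boosts in the scoring-model slide use the same `field^boost` syntax as Solr's DisMax `qf` parameter. As a rough sketch only: the helper names `build_qf` and `dismax_params` below are made up for illustration and are not part of Jellyfish or Solr, and the project itself set these values through the Spring bean shown on the slide rather than in client code.

```python
from urllib.parse import urlencode

# Boosts copied from the slide's StaticQueryFieldSetter bean.
# stem3 has no explicit boost on the slide, so none is emitted here.
QUERY_FIELDS = [
    ("keyword", 4),
    ("content1", 8),
    ("content2", 3),
    ("content3", 2),
    ("stem1", 1.5),
    ("stem2", 1.2),
    ("stem3", None),
]


def build_qf(fields):
    """Render (field, boost) pairs in Solr's `field^boost` syntax."""
    return " ".join(
        name if boost is None else f"{name}^{boost}" for name, boost in fields
    )


def dismax_params(user_query, fields=QUERY_FIELDS):
    """Return a URL-encoded query string for a DisMax request."""
    return urlencode(
        {"q": user_query, "defType": "dismax", "qf": build_qf(fields)}
    )
```

Calling `dismax_params("forsikring")` gives a query string that could be appended to a Solr select URL. A hit in `content1` (the title) then outweighs the same hit in `content3` (the body) by the ratio of their boosts.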

Editor's notes

  1. No query completion, spellcheck, duplicate detection or contextual search. Basically useless.
  2. This is what we learnt at the beginning: 20 "distinct" portals, a bank selector, semantic URLs. In one bank portal, for example, about 50% of its 1.5k docs were duplicates. Group publications are made available via the CMS, but individual banks are under no obligation to publish the article on their portal, and there's no indication as to whether or not they have. We had to use webcrawlers rather than pushing new content from the CMS directly to the indexing service (will come back to that).
  3. For us, search at the very least means: high-speed queries (yay Solr) and high-speed indexing (yay Solr, boo crawlers -> up to 2 hours for 20k docs on the test server). Basic search analytics = query list with hit count and "no hit" count, average queries per time period etc. This allows SpareBank1 to see what people search for most and what they most often fail to find. More advanced = click-through information, used to tune the relevancy model. Also pagination, look and feel etc.
  4. Full definition on the Solr wiki -> at a very basic level, you get what you search for. SB1's existing search did not return relevant results. An online portal used by the general public != application search -> queries will not be very "focused", i.e. people look for general keywords rather than a specific user/file/ID etc. These were the only "reliable" (NOT ALWAYS GUARANTEED) bits of "meta data" we could get from the articles -> subtitles used <b> instead of <hx>. How do you determine facets based on that? How do you determine for which bank an article is targeted?
  5. Precision = "Percent of documents returned that are relevant" -> 0%. Recall = "Percent of relevant documents returned" -> 0%.
  6. Recall is low. Precision is low. Zero indication (on other hits) as to why they were returned. Further reveals a poor or non-existent relevancy/scoring model.
  7. Couldn't rely on the CMS: too many articles/documents without relevant/necessary meta data -> webcrawlers! 1 crawler per bank -> 20 crawlers. Using regex to: drop all articles without the bank id in the URL; drop all .css etc. Extracted from the HTML: <title>, <abstract>, <body>, <b> -> used later to build the relevancy model. Duplicate detection based on a hash of the text content. Facets taken from the URL with a regex looking for the first tag after the bank id -> each bank had subtly different facets. qc based on titles; sp based on title, body and description. *_ for dynamic fields (Solr schema) -> context-specific qc, results and spelling suggestions.
  8. If there was no <abstract>, then that wouldn't be added to the index and wouldn't show up in the results page. If an article included a link that met the rules, regardless of the validity/relevance of the content to the particular bank, it would be crawled and indexed. Rogue articles! SpareBank1 were convinced there was something wrong with our implementation of the search because they were getting results from other banks. Based on the crawler's log, we found that the bank in question had a page that linked (using https://www2.sparebank1.no/irrelevant_bank_id/article) to the "rogue" article. https://www2.sparebank1.no/portal/4702/3_privat?_nfpb=true&_nfls=false&_pageLabel=page_privat_innhold&pId=1233149354625&_nfls=false https://www2.sparebank1.no/portal/9898/3_privat?_nfpb=true&_nfls=false&_pageLabel=page_privat_innhold&pId=1233149354625&_nfls=false -> Can also see the difference in facets.
  9. Title über alles. keyword = collection/facet; content1 = title; content2 = description; content3 = body; stem1 = title; stem2 = sub_title; stem3 = body.
  10. Additional request handler for each bank's spell checker, to do a filtered search for matches against the misspelt query.
  11. Demo search GUI (changed a lot since then): facets are now only by section/collection, and each bank has its own individual GUI -> which they now use internally to find which bank has published what. Find "rogue" articles that shouldn't appear on other bank portals, e.g. Russland for SB1 Nord-Norge showing up elsewhere even after crawler URL rules filtering.
  12. Went for regex query completion, allowing for in-word matches rather than only beginning-of-word matches. "Forsikring" (insurance) was a good example, as it's very rarely at the beginning of the word/phrase in Norwegian.
  13. Started very basic and got increasingly complex. Index size/document count was never an issue, so we didn't need sharding.
  14. From this… with all 3 services on 1 Tomcat/machine. Very naïve, little idea of the scope of the project, i.e. 20 banks etc.
  15. To this, with each service on a different Tomcat/machine. Upon realising that we'd need a crawler per bank, each potentially indexing and writing to Solr simultaneously, we split the Solr instances to optimize indexing and search time.
  16. The end product! Users do a search on the portal servers, which are in the DMZ. Searches are logged to the search statistic servers, which store everything in a database behind an internal firewall. The portal Solr search slave servers check for replication updates at regular intervals. The indexing master servers crawl daily and push updates to the portal servers. Each master can have several slaves. If one master dies, switch to another.
  17. I.e. upon making a change in the CMS and publishing it, how well did the crawlers respond, and what did it take to make the new content searchable? New documents weren't showing up despite being published or shared. For NEW articles to be found, they needed to be linked to (with a link that would adhere to the crawler URL rules) from an existing crawlable page.
  18. Exactly what I learnt while studying, i.e. what not to do. Lovely little plan that they agreed on and then ignored from the first week onwards. Our fault for not getting them to be more specific about "better search". Their fault for not telling us about their environment requirements, e.g. Linux installation location, RPMs etc. Their fault for not telling us about their documentation requirements (took about 2 months). Ours for not asking? My fault for suggesting more and more shiny features. Search will only ever be as good as the underlying information management system and the content owners'/authors' practices; should we address that? Start from the bottom up with a search-centric information management system and best practices/SOPs. Frequently getting emails and phone calls from different people asking the same questions, the answers to which are all in the documentation… but no one wants to read that (or write it).
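
The crawler rules from note 7 (drop URLs without the bank id, take the facet from the first path segment after it, detect duplicates by hashing the text content) could be sketched roughly as follows. This is an illustration under assumptions, not the project's OpenPipeline configuration: the `/portal/<bank id>/<section>` pattern is inferred from the two example URLs, the list of skipped extensions is a guess at what "drop all .css etc" covered, and the whitespace and case normalisation before hashing is an added assumption.

```python
import hashlib
import re

# Assumed URL shape, inferred from the example URLs in the notes.
URL_RULE = re.compile(
    r"^https://www2\.sparebank1\.no/portal/(?P<bank>\d+)/(?P<facet>[^/?#]+)"
)
# Hypothetical list of static resources a crawler would skip.
SKIP_EXTENSIONS = re.compile(r"\.(css|js|png|gif|jpg)(\?|$)", re.IGNORECASE)


def classify_url(url, bank_id):
    """Return the facet for a URL this bank's crawler keeps, else None."""
    if SKIP_EXTENSIONS.search(url):
        return None
    match = URL_RULE.match(url)
    if match is None or match.group("bank") != bank_id:
        return None
    return match.group("facet")


def content_hash(text):
    """Hash whitespace-normalised, lower-cased text content."""
    normalised = " ".join(text.split()).lower()
    return hashlib.sha1(normalised.encode("utf-8")).hexdigest()


def drop_duplicates(documents):
    """Keep the first (url, text) pair seen for each content hash."""
    seen = set()
    unique = []
    for url, text in documents:
        digest = content_hash(text)
        if digest not in seen:
            seen.add(digest)
            unique.append((url, text))
    return unique
```

A rule like this accepts any well-formed link that carries the right bank id, which is how the "rogue" articles in note 8 got indexed: the URL passed even though the content belonged to another bank.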