SlideShare una empresa de Scribd logo
1 de 29
Intro to Solr
DrupalCon
Portland
Andrew Riley
Director of Drupal Development
@andrewmriley
Agenda
Search?
Why
Solr?
Searching
Behind
the
Scenes
Search?
What is Search?
Search (v): to go or look through (a place, area, etc.)
carefully in order to find something missing or lost: I
searched the desk for the letter.
Source: http://dictionary.reference.com/browse/search
@Mediacurren
t
Why Users Search
• Navigation doesn't make sense
• It can be faster
• Lots of data
• Frequent data changes
• Might just be looking for something
@Mediacurren
t
Search Problems
• Search accuracy
• Too much data
• Slow response
• Wrong results
@Mediacurren
t
Why
Solr?
History
Solr was initially created in 2004 as an in-house
project for CNET. It was open sourced in 2006 and
donated to the Apache Software Foundation.
@Mediacurren
t
Lucene
• Solr is a layer on top of Lucene
• Lucene is a library
• Solr stores files in Lucene format
*http://wiki.apache.org/solr/SolrPerformanceData
@Mediacurren
t
Speed
Search speed is important!
@Mediacurren
t
Speed
Source: Web Performance Today http://j.mp/12h8wLZ
@Mediacurren
t
Speed
• Important!
• It scales well
• No database required
• Clustering & Sharding
• Netflix runs 1.2MM q/day on 4 servers*
*http://wiki.apache.org/solr/SolrPerformanceData
@Mediacurren
t
Natural Results
• Stemming: Blogging vs. Blog
• Stop Word Removal: The
• Synonyms: Tissue vs Kleenex
• Highly Configurable
@Mediacurren
t
Drupal Search
• Not stemmed by default
• Queries the database
• Stores tokenized words in a single large
table
• Much slower to index
@Mediacurren
t
VS
@Mediacurren
t
Searching
Ordering
• Score
• Comes from Lucene
• Not "out of 100"
• Bigger score first
More Info: http://lucene.apache.org/core/3_6_1/scoring.html
???
201
200
199
184
@Mediacurren
t
Facets
• Users do the work
• Fixes too much data
• Native to Solr
• Requires the Facet API
module
• Shopping Sites
@Mediacurren
t
Behind the
Scenes
Index?
• Index contains Documents
• Documents have Fields
• Fields have Terms
• ~2 minutes for updates
• Uses Lucene syntax
@Mediacurren
t
Tokenizing
• Splits words and numbers
"this" "is" "blogging"
• Excludes Stopwords
"this" "blogging"
• Handles Stemming (if enabled)
"this" "blog"
• Very configurable
@Mediacurren
t
Bias
• Adjusts the order of search results
• Works on: Content Type, Fields,
Comments, Promoted to Home Page and
more
• Can be dynamic with custom modules.
@Mediacurren
t
Recap
Modules
• Apache Solr (apachesolr)
• Facet API (facetapi)
• Chaos tool suite (ctools)
@Mediacurren
t
Overall
• Search is becoming more and more
important
• You want to control your search results
• If you don't provide a good search
experience, somebody else will.
• Solr doesn't have to be complex.
• Solr is fast and scales.
@Mediacurren
t
Thank You!
Questions?
@Mediacurrent Mediacurrent.com
andrew.riley@mediacurrent.com
@andrewmriley
slideshare.net/mediacurre
nt

Más contenido relacionado

La actualidad más candente

Library Mashups & APIs
Library Mashups & APIsLibrary Mashups & APIs
Library Mashups & APIs
librarywebchic
 
We Don't Need Roads: A Developers Look Into SQL Server Indexes
We Don't Need Roads: A Developers Look Into SQL Server IndexesWe Don't Need Roads: A Developers Look Into SQL Server Indexes
We Don't Need Roads: A Developers Look Into SQL Server Indexes
Richie Rump
 

La actualidad más candente (10)

State-of-the-Art Drupal Search with Apache Solr
State-of-the-Art Drupal Search with Apache SolrState-of-the-Art Drupal Search with Apache Solr
State-of-the-Art Drupal Search with Apache Solr
 
Apache Lucene 4
Apache Lucene 4Apache Lucene 4
Apache Lucene 4
 
Content Strategy for WordPress: Case Study
Content Strategy for WordPress: Case StudyContent Strategy for WordPress: Case Study
Content Strategy for WordPress: Case Study
 
Library Mashups & APIs
Library Mashups & APIsLibrary Mashups & APIs
Library Mashups & APIs
 
Beyond google oct10
Beyond google oct10Beyond google oct10
Beyond google oct10
 
Database Developers: the most important developers on earth?
Database Developers: the most important developers on earth?Database Developers: the most important developers on earth?
Database Developers: the most important developers on earth?
 
OpenSearchLab and the Lucene Ecosystem
OpenSearchLab and the Lucene EcosystemOpenSearchLab and the Lucene Ecosystem
OpenSearchLab and the Lucene Ecosystem
 
DataSploit - Tool Demo at Null Bangalore - March Meet.
DataSploit - Tool Demo at Null Bangalore - March Meet. DataSploit - Tool Demo at Null Bangalore - March Meet.
DataSploit - Tool Demo at Null Bangalore - March Meet.
 
Fun! with the Twitter API
Fun! with the Twitter APIFun! with the Twitter API
Fun! with the Twitter API
 
We Don't Need Roads: A Developers Look Into SQL Server Indexes
We Don't Need Roads: A Developers Look Into SQL Server IndexesWe Don't Need Roads: A Developers Look Into SQL Server Indexes
We Don't Need Roads: A Developers Look Into SQL Server Indexes
 

Destacado

Elegant CSS Design In Drupal: LESS vs Sass
Elegant CSS Design In Drupal: LESS vs SassElegant CSS Design In Drupal: LESS vs Sass
Elegant CSS Design In Drupal: LESS vs Sass
Mediacurrent
 

Destacado (6)

Drupal CEO Panel Discussion
Drupal CEO Panel DiscussionDrupal CEO Panel Discussion
Drupal CEO Panel Discussion
 
Features everywhere
Features everywhere Features everywhere
Features everywhere
 
FileMaker-Drupal Synchronization
FileMaker-Drupal SynchronizationFileMaker-Drupal Synchronization
FileMaker-Drupal Synchronization
 
Elegant CSS Design In Drupal: LESS vs Sass
Elegant CSS Design In Drupal: LESS vs SassElegant CSS Design In Drupal: LESS vs Sass
Elegant CSS Design In Drupal: LESS vs Sass
 
Learning the Craft of Marketing and Selling Drupal Services
Learning the Craft of Marketing and Selling Drupal ServicesLearning the Craft of Marketing and Selling Drupal Services
Learning the Craft of Marketing and Selling Drupal Services
 
Contrib First
Contrib FirstContrib First
Contrib First
 

Similar a Intro to Solr in Drupal

Lucene BootCamp
Lucene BootCampLucene BootCamp
Lucene BootCamp
GokulD
 
Solr Recipes Workshop
Solr Recipes WorkshopSolr Recipes Workshop
Solr Recipes Workshop
Erik Hatcher
 
Introduction to Solr
Introduction to SolrIntroduction to Solr
Introduction to Solr
Erik Hatcher
 
Practical Machine Learning for Smarter Search with Spark+Solr
Practical Machine Learning for Smarter Search with Spark+SolrPractical Machine Learning for Smarter Search with Spark+Solr
Practical Machine Learning for Smarter Search with Spark+Solr
Jake Mannix
 
Full Text Search with Lucene
Full Text Search with LuceneFull Text Search with Lucene
Full Text Search with Lucene
WO Community
 
Challenges of Simple Documents: When Basic isn't so Basic - Cassandra Targett...
Challenges of Simple Documents: When Basic isn't so Basic - Cassandra Targett...Challenges of Simple Documents: When Basic isn't so Basic - Cassandra Targett...
Challenges of Simple Documents: When Basic isn't so Basic - Cassandra Targett...
Lucidworks
 
SPConnections Amsterdam: Beyond the Search Center - Application or Solution? ...
SPConnections Amsterdam: Beyond the Search Center - Application or Solution? ...SPConnections Amsterdam: Beyond the Search Center - Application or Solution? ...
SPConnections Amsterdam: Beyond the Search Center - Application or Solution? ...
Agnes Molnar
 
Search enabled applications with lucene.net
Search enabled applications with lucene.netSearch enabled applications with lucene.net
Search enabled applications with lucene.net
Willem Meints
 

Similar a Intro to Solr in Drupal (20)

Sitecore search absolute basics
Sitecore search absolute basicsSitecore search absolute basics
Sitecore search absolute basics
 
Sugblr sitecore search - absolute basics
Sugblr sitecore search - absolute basicsSugblr sitecore search - absolute basics
Sugblr sitecore search - absolute basics
 
Lucene BootCamp
Lucene BootCampLucene BootCamp
Lucene BootCamp
 
Solr Recipes Workshop
Solr Recipes WorkshopSolr Recipes Workshop
Solr Recipes Workshop
 
Introduction to Solr
Introduction to SolrIntroduction to Solr
Introduction to Solr
 
Practical Machine Learning for Smarter Search with Solr and Spark
Practical Machine Learning for Smarter Search with Solr and SparkPractical Machine Learning for Smarter Search with Solr and Spark
Practical Machine Learning for Smarter Search with Solr and Spark
 
Practical Machine Learning for Smarter Search with Spark+Solr
Practical Machine Learning for Smarter Search with Spark+SolrPractical Machine Learning for Smarter Search with Spark+Solr
Practical Machine Learning for Smarter Search with Spark+Solr
 
#DWCAU 2019 The Evolution of Finding Stuff
#DWCAU 2019 The Evolution of Finding Stuff#DWCAU 2019 The Evolution of Finding Stuff
#DWCAU 2019 The Evolution of Finding Stuff
 
Solr for Data Science
Solr for Data ScienceSolr for Data Science
Solr for Data Science
 
Scaling Recommendations, Semantic Search, & Data Analytics with solr
Scaling Recommendations, Semantic Search, & Data Analytics with solrScaling Recommendations, Semantic Search, & Data Analytics with solr
Scaling Recommendations, Semantic Search, & Data Analytics with solr
 
Discovery Interfaces
Discovery InterfacesDiscovery Interfaces
Discovery Interfaces
 
Solr
SolrSolr
Solr
 
Full Text Search with Lucene
Full Text Search with LuceneFull Text Search with Lucene
Full Text Search with Lucene
 
Building Search Engines - Lucene, SolR and Elasticsearch
Building Search Engines - Lucene, SolR and ElasticsearchBuilding Search Engines - Lucene, SolR and Elasticsearch
Building Search Engines - Lucene, SolR and Elasticsearch
 
Building Search Engines - Lucene, SolR and Elasticsearch
Building Search Engines - Lucene, SolR and ElasticsearchBuilding Search Engines - Lucene, SolR and Elasticsearch
Building Search Engines - Lucene, SolR and Elasticsearch
 
Lectio Praecursoria: Search Interfaces on the Web: Querying and Characterizin...
Lectio Praecursoria: Search Interfaces on the Web: Querying and Characterizin...Lectio Praecursoria: Search Interfaces on the Web: Querying and Characterizin...
Lectio Praecursoria: Search Interfaces on the Web: Querying and Characterizin...
 
Challenges of Simple Documents: When Basic isn't so Basic - Cassandra Targett...
Challenges of Simple Documents: When Basic isn't so Basic - Cassandra Targett...Challenges of Simple Documents: When Basic isn't so Basic - Cassandra Targett...
Challenges of Simple Documents: When Basic isn't so Basic - Cassandra Targett...
 
SPConnections Amsterdam: Beyond the Search Center - Application or Solution? ...
SPConnections Amsterdam: Beyond the Search Center - Application or Solution? ...SPConnections Amsterdam: Beyond the Search Center - Application or Solution? ...
SPConnections Amsterdam: Beyond the Search Center - Application or Solution? ...
 
Search enabled applications with lucene.net
Search enabled applications with lucene.netSearch enabled applications with lucene.net
Search enabled applications with lucene.net
 
Intro to Apache Solr for Drupal
Intro to Apache Solr for DrupalIntro to Apache Solr for Drupal
Intro to Apache Solr for Drupal
 

Más de Mediacurrent

Más de Mediacurrent (20)

Penn State News: Pivoting to Decoupled Drupal with Gatsby
Penn State News: Pivoting to Decoupled Drupal with GatsbyPenn State News: Pivoting to Decoupled Drupal with Gatsby
Penn State News: Pivoting to Decoupled Drupal with Gatsby
 
Evolving How We Measure Digital Success in Higher Ed
Evolving How We Measure Digital Success in Higher EdEvolving How We Measure Digital Success in Higher Ed
Evolving How We Measure Digital Success in Higher Ed
 
Penn State scales static Drupal to new heights
Penn State scales static Drupal to new heightsPenn State scales static Drupal to new heights
Penn State scales static Drupal to new heights
 
Delivering Meaningful Digital Experiences in Higher Ed
Delivering Meaningful Digital Experiences in Higher EdDelivering Meaningful Digital Experiences in Higher Ed
Delivering Meaningful Digital Experiences in Higher Ed
 
Content Strategy: Building Connections with Your Audience
Content Strategy: Building Connections with Your AudienceContent Strategy: Building Connections with Your Audience
Content Strategy: Building Connections with Your Audience
 
Decoupled Drupal and Gatsby in the Real World
Decoupled Drupal and Gatsby in the Real WorldDecoupled Drupal and Gatsby in the Real World
Decoupled Drupal and Gatsby in the Real World
 
A Better Way to Build and Manage Sites with Rain for Drupal 9
A Better Way to Build and Manage Sites with Rain for Drupal 9A Better Way to Build and Manage Sites with Rain for Drupal 9
A Better Way to Build and Manage Sites with Rain for Drupal 9
 
Drupal Security: What You Need to Know
Drupal Security: What You Need to KnowDrupal Security: What You Need to Know
Drupal Security: What You Need to Know
 
Leveraging Design Systems to Streamline Web Projects
Leveraging Design Systems to Streamline Web ProjectsLeveraging Design Systems to Streamline Web Projects
Leveraging Design Systems to Streamline Web Projects
 
Reimagining Your Higher Ed Web Strategy
Reimagining Your Higher Ed Web StrategyReimagining Your Higher Ed Web Strategy
Reimagining Your Higher Ed Web Strategy
 
How to Digitally Transform Higher Ed with Drupal
How to Digitally Transform Higher Ed with DrupalHow to Digitally Transform Higher Ed with Drupal
How to Digitally Transform Higher Ed with Drupal
 
Is my website accessible? Common mistakes (and how to fix them)
Is my website accessible? Common mistakes (and how to fix them)Is my website accessible? Common mistakes (and how to fix them)
Is my website accessible? Common mistakes (and how to fix them)
 
Managing Images In Large Scale Drupal 8 & 9 Websites
Managing Images In Large Scale Drupal 8 & 9 WebsitesManaging Images In Large Scale Drupal 8 & 9 Websites
Managing Images In Large Scale Drupal 8 & 9 Websites
 
Paragraphs v Layout Builder - The Final Showdown
Paragraphs v Layout Builder - The Final ShowdownParagraphs v Layout Builder - The Final Showdown
Paragraphs v Layout Builder - The Final Showdown
 
MagMutual.com: On the JAMStack with Gatsby and Drupal 8
 MagMutual.com: On the JAMStack with Gatsby and Drupal 8 MagMutual.com: On the JAMStack with Gatsby and Drupal 8
MagMutual.com: On the JAMStack with Gatsby and Drupal 8
 
Creating an Organizational Culture of Giving Back to Drupal
Creating an Organizational Culture of Giving Back to DrupalCreating an Organizational Culture of Giving Back to Drupal
Creating an Organizational Culture of Giving Back to Drupal
 
Level Up Your Team: Front-End Development Best Practices
Level Up Your Team: Front-End Development Best PracticesLevel Up Your Team: Front-End Development Best Practices
Level Up Your Team: Front-End Development Best Practices
 
Best Practices for Moving to Drupal 9
Best Practices for Moving to Drupal 9Best Practices for Moving to Drupal 9
Best Practices for Moving to Drupal 9
 
How to Prove Marketing ROI: Overcoming Digital Marketing Challenges
How to Prove Marketing ROI: Overcoming Digital Marketing ChallengesHow to Prove Marketing ROI: Overcoming Digital Marketing Challenges
How to Prove Marketing ROI: Overcoming Digital Marketing Challenges
 
Prepare Your Drupal 9 Action Plan
Prepare Your Drupal 9 Action Plan Prepare Your Drupal 9 Action Plan
Prepare Your Drupal 9 Action Plan
 

Último

Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire business
panagenda
 

Último (20)

Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
 
Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire business
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a Fresher
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
 
Manulife - Insurer Innovation Award 2024
Manulife - Insurer Innovation Award 2024Manulife - Insurer Innovation Award 2024
Manulife - Insurer Innovation Award 2024
 
HTML Injection Attacks: Impact and Mitigation Strategies
HTML Injection Attacks: Impact and Mitigation StrategiesHTML Injection Attacks: Impact and Mitigation Strategies
HTML Injection Attacks: Impact and Mitigation Strategies
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CV
 
Deploy with confidence: VMware Cloud Foundation 5.1 on next gen Dell PowerEdg...
Deploy with confidence: VMware Cloud Foundation 5.1 on next gen Dell PowerEdg...Deploy with confidence: VMware Cloud Foundation 5.1 on next gen Dell PowerEdg...
Deploy with confidence: VMware Cloud Foundation 5.1 on next gen Dell PowerEdg...
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
Artificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyArtificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : Uncertainty
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 

Intro to Solr in Drupal

Notas del editor

  1. In this example Walmart found that conversion rates were directly affected by site load times. While this example is for sites it still applies to search.