SlideShare una empresa de Scribd logo
1 de 27
Descargar para leer sin conexión
Enterprise Search in
Plone using Solr
Calvin Hendryx-Parker
Plone Conference 2010
Wednesday, October 27, 2010
PLONE CONFERENCE 2010
• Java Based
• Full-Text Search
• Web Services API
• Standards Based Interfaces
• Scalable
• XML Configuration
• Extensible
What is Solr?
Wednesday, October 27, 2010
PLONE CONFERENCE 2010
• Indexing
• Query
Playing with Solr
Wednesday, October 27, 2010
PLONE CONFERENCE 2010
Wednesday, October 27, 2010
PLONE CONFERENCE 2010
Wednesday, October 27, 2010
PLONE CONFERENCE 2010
• Data Schema
• Faceted Search
• Administrative
Interface
• Incremental Updates
• Supports Sharding
• Index Databases, Local
Files and Web Pages
• Supports Multiple
Indexes
Solr Features
Wednesday, October 27, 2010
PLONE CONFERENCE 2010
• Stopwords
• Synonyms
• Highlighted Context
Snippets
• Spelling Suggestions
• More Like This
Suggestions
• Supports Rich
Documents
Solr Features
Wednesday, October 27, 2010
PLONE CONFERENCE 2010
Wednesday, October 27, 2010
PLONE CONFERENCE 2010
Wednesday, October 27, 2010
PLONE CONFERENCE 2010
Wednesday, October 27, 2010
PLONE CONFERENCE 2010Solr Performance
• Wiktionary Dataset
• 49.5 Millions lines of XML
• 1.3 GB of data
• 1.7 Million Pages Indexed in 5.5 hours
• ZODB Size after import 1.1GB
Wednesday, October 27, 2010
PLONE CONFERENCE 2010
collective.solr
Integration Options
with Plone
Wednesday, October 27, 2010
PLONE CONFERENCE 2010
• Monkey Patching
• Relies on collective.indexing
• Duplicates all indexes
• Sub-Optimal Integration with Zope Transactions
• Relies on Thread Locals
collective.solr Issues
Wednesday, October 27, 2010
PLONE CONFERENCE 2010
What to do?
Wednesday, October 27, 2010
PLONE CONFERENCE 2010
Reevaluate
Wednesday, October 27, 2010
PLONE CONFERENCE 2010
• No Monkey Patching
• Simpler Code
Solr Integration as a
Catalog Index
Wednesday, October 27, 2010
PLONE CONFERENCE 2010
• ZCatalog Index
• Doesn't depend on
Plone
• Utilizes new
foreign_connections
Connection Method
• Pass through Solr
Queries
• Direct access to the
Solr Response
Enter alm.solrindex
Wednesday, October 27, 2010
PLONE CONFERENCE 2010
Wednesday, October 27, 2010
PLONE CONFERENCE 2010
Wednesday, October 27, 2010
PLONE CONFERENCE 2010
• Still handled by the ZCatalog
• Could change in the future
Sorting
Wednesday, October 27, 2010
PLONE CONFERENCE 2010
• Handle Parsing Attributes for Indexing
• Translate field-specific queries to Solr
• Registered as Zope Utilities
alm.solrindex Field
Handlers
Wednesday, October 27, 2010
PLONE CONFERENCE 2010
<html>
<body>
<h3>Code Sample</h3>
<p>Replace this text!</p>
</body>
</html>
Example Handler
class TextFieldHandler(DefaultFieldHandler):
def parse_query(self, field, field_query):
name = field.name
request = {name: field_query}
record = parseIndexRequest(request, name, ('query',))
if not record.keys:
return None
query_str = ' '.join(record.keys)
if not query_str:
return None
return {'q': u'+%s:%s' % (name, quote_query(query_str))}
Wednesday, October 27, 2010
PLONE CONFERENCE 2010
• GenericSetup Profile
• Tests
• Uses solrpy instead of
the unsupported
solr.py
Other alm.solrindex
Features
Wednesday, October 27, 2010
PLONE CONFERENCE 2010
• Can replace several ZCatalog indexes
• Remove any indexes you have replaced
• Use it for all Text Indexes
• Still Utilize the ZCatalog Indexes for Everything Else
Tips
Wednesday, October 27, 2010
PLONE CONFERENCE 2010
Demo
Project Gutenburg Data
Wednesday, October 27, 2010
PLONE CONFERENCE 2010
Questions?
Wednesday, October 27, 2010
Check out
sixfeetup.com/demos
Wednesday, October 27, 2010

Más contenido relacionado

Similar a Enterprise search in plone using solr

Instant ECM with SharePoint 2010
Instant ECM with SharePoint 2010Instant ECM with SharePoint 2010
Instant ECM with SharePoint 2010Corey Roth
 
Let's Build an Inverted Index: Introduction to Apache Lucene/Solr
Let's Build an Inverted Index: Introduction to Apache Lucene/SolrLet's Build an Inverted Index: Introduction to Apache Lucene/Solr
Let's Build an Inverted Index: Introduction to Apache Lucene/SolrSease
 
Introduction to Solr
Introduction to SolrIntroduction to Solr
Introduction to SolrErik Hatcher
 
20130310 solr tuorial
20130310 solr tuorial20130310 solr tuorial
20130310 solr tuorialChris Huang
 
Why RESTful Design for the Cloud is Best
Why RESTful Design for the Cloud is BestWhy RESTful Design for the Cloud is Best
Why RESTful Design for the Cloud is BestGalder Zamarreño
 
Building OBO Foundry ontology using semantic web tools
Building OBO Foundry ontology using semantic web toolsBuilding OBO Foundry ontology using semantic web tools
Building OBO Foundry ontology using semantic web toolsMelanie Courtot
 
PojoSR or OSGi (µ)Services For the Rest of Us
PojoSR or OSGi (µ)Services For the Rest of UsPojoSR or OSGi (µ)Services For the Rest of Us
PojoSR or OSGi (µ)Services For the Rest of UsOSGiUsers
 
Tagging search solution design Advanced edition
Tagging search solution design Advanced editionTagging search solution design Advanced edition
Tagging search solution design Advanced editionAlexander Tokarev
 
Tagging search solution design
Tagging search solution designTagging search solution design
Tagging search solution designAlexander Tokarev
 
Building Software Backend (Web API)
Building Software Backend (Web API)Building Software Backend (Web API)
Building Software Backend (Web API)Alexander Goida
 
"How Mozilla Uses Selenium"
"How Mozilla Uses Selenium""How Mozilla Uses Selenium"
"How Mozilla Uses Selenium"Stephen Donner
 
Introduction to Zotero: A Free, Open-Source Tool to Manage and Share Citation...
Introduction to Zotero: A Free, Open-Source Tool to Manage and Share Citation...Introduction to Zotero: A Free, Open-Source Tool to Manage and Share Citation...
Introduction to Zotero: A Free, Open-Source Tool to Manage and Share Citation...Janet Crum
 
Ontology Web Services
Ontology Web ServicesOntology Web Services
Ontology Web ServicesTrish Whetzel
 
Intro to Apache Solr for Drupal
Intro to Apache Solr for DrupalIntro to Apache Solr for Drupal
Intro to Apache Solr for DrupalChris Caple
 
Introduction to Solr
Introduction to SolrIntroduction to Solr
Introduction to SolrErik Hatcher
 
Lucene BootCamp
Lucene BootCampLucene BootCamp
Lucene BootCampGokulD
 
Solr Recipes Workshop
Solr Recipes WorkshopSolr Recipes Workshop
Solr Recipes WorkshopErik Hatcher
 

Similar a Enterprise search in plone using solr (20)

Instant ECM with SharePoint 2010
Instant ECM with SharePoint 2010Instant ECM with SharePoint 2010
Instant ECM with SharePoint 2010
 
Let's Build an Inverted Index: Introduction to Apache Lucene/Solr
Let's Build an Inverted Index: Introduction to Apache Lucene/SolrLet's Build an Inverted Index: Introduction to Apache Lucene/Solr
Let's Build an Inverted Index: Introduction to Apache Lucene/Solr
 
Introduction to Solr
Introduction to SolrIntroduction to Solr
Introduction to Solr
 
20130310 solr tuorial
20130310 solr tuorial20130310 solr tuorial
20130310 solr tuorial
 
Why RESTful Design for the Cloud is Best
Why RESTful Design for the Cloud is BestWhy RESTful Design for the Cloud is Best
Why RESTful Design for the Cloud is Best
 
Planidoo & Zotonic
Planidoo & ZotonicPlanidoo & Zotonic
Planidoo & Zotonic
 
Building OBO Foundry ontology using semantic web tools
Building OBO Foundry ontology using semantic web toolsBuilding OBO Foundry ontology using semantic web tools
Building OBO Foundry ontology using semantic web tools
 
PojoSR or OSGi (µ)Services For the Rest of Us
PojoSR or OSGi (µ)Services For the Rest of UsPojoSR or OSGi (µ)Services For the Rest of Us
PojoSR or OSGi (µ)Services For the Rest of Us
 
Tagging search solution design Advanced edition
Tagging search solution design Advanced editionTagging search solution design Advanced edition
Tagging search solution design Advanced edition
 
Tagging search solution design
Tagging search solution designTagging search solution design
Tagging search solution design
 
Building Software Backend (Web API)
Building Software Backend (Web API)Building Software Backend (Web API)
Building Software Backend (Web API)
 
Apache Solr
Apache SolrApache Solr
Apache Solr
 
ActiveRecord 2.3
ActiveRecord 2.3ActiveRecord 2.3
ActiveRecord 2.3
 
"How Mozilla Uses Selenium"
"How Mozilla Uses Selenium""How Mozilla Uses Selenium"
"How Mozilla Uses Selenium"
 
Introduction to Zotero: A Free, Open-Source Tool to Manage and Share Citation...
Introduction to Zotero: A Free, Open-Source Tool to Manage and Share Citation...Introduction to Zotero: A Free, Open-Source Tool to Manage and Share Citation...
Introduction to Zotero: A Free, Open-Source Tool to Manage and Share Citation...
 
Ontology Web Services
Ontology Web ServicesOntology Web Services
Ontology Web Services
 
Intro to Apache Solr for Drupal
Intro to Apache Solr for DrupalIntro to Apache Solr for Drupal
Intro to Apache Solr for Drupal
 
Introduction to Solr
Introduction to SolrIntroduction to Solr
Introduction to Solr
 
Lucene BootCamp
Lucene BootCampLucene BootCamp
Lucene BootCamp
 
Solr Recipes Workshop
Solr Recipes WorkshopSolr Recipes Workshop
Solr Recipes Workshop
 

Más de Calvin Hendryx-Parker

Más de Calvin Hendryx-Parker (8)

Plone and Drupal -- CMS Coexistance in Higher Education
Plone and Drupal -- CMS Coexistance in Higher EducationPlone and Drupal -- CMS Coexistance in Higher Education
Plone and Drupal -- CMS Coexistance in Higher Education
 
Plone roadmap
Plone roadmapPlone roadmap
Plone roadmap
 
How to seal the deal
How to seal the dealHow to seal the deal
How to seal the deal
 
2010 py ohio supervisor talk
2010 py ohio supervisor talk2010 py ohio supervisor talk
2010 py ohio supervisor talk
 
Social Networking Tools Session Three
Social Networking Tools Session ThreeSocial Networking Tools Session Three
Social Networking Tools Session Three
 
Social Networking Tools Session One
Social Networking Tools   Session OneSocial Networking Tools   Session One
Social Networking Tools Session One
 
Social Networking Tools Session Two
Social Networking Tools   Session TwoSocial Networking Tools   Session Two
Social Networking Tools Session Two
 
Plone's Anatomy
Plone's AnatomyPlone's Anatomy
Plone's Anatomy
 

Último

Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Enterprise Knowledge
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoffsammart93
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Scriptwesley chun
 
Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024The Digital Insurer
 
GenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdfGenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdflior mazor
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptxHampshireHUG
 
Developing An App To Navigate The Roads of Brazil
Developing An App To Navigate The Roads of BrazilDeveloping An App To Navigate The Roads of Brazil
Developing An App To Navigate The Roads of BrazilV3cube
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)wesley chun
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...DianaGray10
 
Tech Trends Report 2024 Future Today Institute.pdf
Tech Trends Report 2024 Future Today Institute.pdfTech Trends Report 2024 Future Today Institute.pdf
Tech Trends Report 2024 Future Today Institute.pdfhans926745
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Drew Madelung
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024The Digital Insurer
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherRemote DBA Services
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)Gabriella Davis
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...apidays
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdfhans926745
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationSafe Software
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024Rafal Los
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProduct Anonymous
 

Último (20)

Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 
Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024
 
GenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdfGenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdf
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
 
Developing An App To Navigate The Roads of Brazil
Developing An App To Navigate The Roads of BrazilDeveloping An App To Navigate The Roads of Brazil
Developing An App To Navigate The Roads of Brazil
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
 
Tech Trends Report 2024 Future Today Institute.pdf
Tech Trends Report 2024 Future Today Institute.pdfTech Trends Report 2024 Future Today Institute.pdf
Tech Trends Report 2024 Future Today Institute.pdf
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a Fresher
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
 

Enterprise search in plone using solr