Patrick Beaucamp
           Founder of the Vanilla Project
       Mail : Patrick.beaucamp@bpm-conseil.com




How to Gain Greater Business Intelligence
     with Vanilla from Solr/Lucene




                 LuceneRevolution, Boston
Presentation Agenda
Vanilla powered by Lucene
- Report Indexation, Search Interface
- External document management
- Evolution & constraints

Steps to Solr/Lucene Adoption
- Indexation, Storage, Search
- Embedded Solr/Lucene
- External Solr/Lucene Platform

Key Benefits for Vanilla powered by Solr/Lucene
- Cluster Architecture
- Cache Mechanism
- Support for enhanced search language


Some Vanilla Features
Flash maps and charts: Reports, Cubes and Dashboards

Vanilla Apps: Android and iPhone




Vanilla Powered by Lucene (1/6)
Vanilla is a full Business Intelligence platform that provides:
- Reporting, OLAP, Dashboards, KPIs, Map Visualisation
- ETL, Workflow, Document Management, Search Engine




Vanilla Powered by Lucene (2/6)
Report Indexation
- The search engine is Apache Lucene (since summer 2010)
- External documents and Vanilla reports are indexed
- Different indexation strategies for documents:
    – No indexation
    – Real-time indexation
    – Late indexation

Two modules manage the indexation strategy
- Enterprise Services to set document properties
- Norparena to manage indexation
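
The three strategies above can be pictured as a small dispatcher. This is only a sketch under stated assumptions: the class `IndexationRouter`, its method names and the in-memory lists are hypothetical and are not taken from Vanilla's code base; a plain list stands in for the Lucene index.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

/** Hypothetical sketch: routes a stored document according to its indexation strategy. */
final class IndexationRouter {

    enum Strategy { NONE, REAL_TIME, LATE }

    // Stand-in for the search index: ids of documents that are searchable.
    private final List<String> indexed = new ArrayList<>();
    // Documents waiting for the next deferred indexation run.
    private final Queue<String> pending = new ArrayDeque<>();

    /** Called whenever a document is stored in the repository. */
    void onDocumentStored(String docId, Strategy strategy) {
        switch (strategy) {
            case REAL_TIME:
                indexed.add(docId);   // searchable immediately
                break;
            case LATE:
                pending.add(docId);   // searchable after the next batch run
                break;
            case NONE:
            default:
                break;                // never searchable
        }
    }

    /** Indexes every deferred document and returns how many were processed. */
    int runLateIndexation() {
        int processed = 0;
        while (!pending.isEmpty()) {
            indexed.add(pending.remove());
            processed++;
        }
        return processed;
    }

    /** Returns a copy of the ids that are currently searchable. */
    List<String> indexedDocuments() {
        return new ArrayList<>(indexed);
    }

    public static void main(String[] args) {
        IndexationRouter router = new IndexationRouter();
        router.onDocumentStored("report-1", Strategy.REAL_TIME);
        router.onDocumentStored("contract-7", Strategy.LATE);
        router.onDocumentStored("draft-3", Strategy.NONE);
        System.out.println(router.indexedDocuments());  // only the real-time document so far
        router.runLateIndexation();
        System.out.println(router.indexedDocuments());  // the deferred document is now included
    }
}
```

In this sketch a scheduler would call `runLateIndexation()` periodically; how Norparena actually triggers late indexation is not described on the slide.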



Vanilla Powered by Lucene (3/6)
Search Interface
- Search Interface available from Vanilla Portal
- Search against Lucene index (inside Vanilla)
- Search results are combined with security on documents
   – The list contains all documents
   – Documents are ordered by popularity
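
Combining raw hits with document security and a popularity ordering could look roughly like the following. The slide does not say how access rights or popularity are stored, so this sketch assumes each hit carries one required group and a view counter; `SecuredSearch`, `Hit` and all field names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical sketch: post-filters index hits by security, then sorts by popularity. */
final class SecuredSearch {

    /** One raw hit from the index; the fields are illustrative assumptions. */
    static final class Hit {
        final String docId;
        final String requiredGroup;  // group allowed to read the document
        final int popularity;        // e.g. a view counter

        Hit(String docId, String requiredGroup, int popularity) {
            this.docId = docId;
            this.requiredGroup = requiredGroup;
            this.popularity = popularity;
        }
    }

    /** Keeps only hits the user may read and returns their ids, most popular first. */
    static List<String> visibleByPopularity(List<Hit> hits, Set<String> userGroups) {
        List<Hit> visible = new ArrayList<>();
        for (Hit hit : hits) {
            if (userGroups.contains(hit.requiredGroup)) {
                visible.add(hit);
            }
        }
        visible.sort(Comparator.comparingInt((Hit h) -> h.popularity).reversed());
        List<String> ids = new ArrayList<>();
        for (Hit hit : visible) {
            ids.add(hit.docId);
        }
        return ids;
    }

    public static void main(String[] args) {
        List<Hit> hits = Arrays.asList(
                new Hit("sales-report", "finance", 3),
                new Hit("salary-grid", "hr", 10),
                new Hit("budget-cube", "finance", 7));
        Set<String> groups = new HashSet<>(Arrays.asList("finance"));
        System.out.println(visibleByPopularity(hits, groups));
    }
}
```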




Vanilla Powered by Lucene (4/6)
External document management
- Various document formats are supported (Lucene)
- Additional properties can be set on documents for later use as search criteria
- Check-in / check-out on documents for versioning
- Search runs against the latest document version
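
The check-in / check-out rule and the "latest version only" search can be illustrated with a minimal model. Everything here is an assumption made for illustration: `VersionedDocument` and its methods are hypothetical names, versions are plain strings, and a substring test stands in for a real index lookup.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch: a document whose searchable content is always its latest version. */
final class VersionedDocument {

    private final List<String> versions = new ArrayList<>();
    private String lockedBy;  // null while the document is not checked out

    VersionedDocument(String initialContent) {
        versions.add(initialContent);
    }

    /** Locks the document for one user; a second check-out is rejected. */
    void checkOut(String user) {
        if (lockedBy != null) {
            throw new IllegalStateException("already checked out by " + lockedBy);
        }
        lockedBy = user;
    }

    /** Stores a new version, releases the lock and returns the new version number. */
    int checkIn(String user, String newContent) {
        if (!user.equals(lockedBy)) {
            throw new IllegalStateException("not checked out by " + user);
        }
        versions.add(newContent);
        lockedBy = null;
        return versions.size();
    }

    /** Only the most recent version is exposed to search. */
    String searchableContent() {
        return versions.get(versions.size() - 1);
    }

    /** Case-insensitive substring match against the latest version. */
    boolean matches(String term) {
        return searchableContent().toLowerCase().contains(term.toLowerCase());
    }

    public static void main(String[] args) {
        VersionedDocument doc = new VersionedDocument("Budget 2011");
        doc.checkOut("alice");
        doc.checkIn("alice", "Budget 2012");
        System.out.println(doc.matches("2012"));  // the latest version is searched
        System.out.println(doc.matches("2011"));  // older versions are not
    }
}
```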




Vanilla Powered by Lucene (5/6)
Evolution and constraints
- No clustering available for the search engine (embedded API), as opposed to Vanilla Report Services
- Limitations in language and keywords (internal search)
- No cache to manage search result sets, as opposed to Vanilla datasets, which are cached with Memcached

- Requests from customers to be compliant with enterprise search engines → need to set up an external search architecture




Vanilla Powered by Lucene (6/6)
   Embedded Lucene API inside Vanilla Platform - Video




Steps to Solr/Lucene Adoption (1/9)
   Solr/Lucene is the natural evolution of any embedded Lucene platform

Solr version: 3.5

Indexation
A Vanilla Lucene index can be transferred to and read by Solr/Lucene
(a Solr/Lucene index is not usable inside the Vanilla Platform)

Storage
Vanilla search indexes can be managed by a Solr/Lucene platform

Search
The search language is compatible




Steps to Solr/Lucene Adoption (2/9)
                Embedded Solr/Lucene inside Vanilla Platform

No changes needed in Vanilla code: use of the SolrJ API

Immediately provides additional features such as new keywords

Potential upgrade to Solr/Lucene Enterprise




Steps to Solr/Lucene Adoption (3/9)
From Embedded Lucene to Embedded Solr/Lucene inside Vanilla Platform




Steps to Solr/Lucene Adoption (4/9)
    Embedded Solr/Lucene inside Vanilla Platform - Video




Steps to Solr/Lucene Adoption (5/9)
                Solr/Lucene Platform with a Vanilla Platform

Changes are needed in Vanilla code to separate the document management, indexation
and search APIs → 10 man-days of work

Document Management API
Easy to move toward CMIS compliance


Indexation & Search API
Solr/Lucene oriented & compliant, but now open to any other search platform
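
One way to read "open to any other search platform" is that the platform codes against a small search contract rather than a concrete engine. The sketch below assumes such a contract; `SearchBackend` and `InMemoryBackend` are hypothetical names and do not reproduce Vanilla's actual API. A SolrJ-based class would be another implementation of the same interface.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical contract the platform depends on instead of a concrete search engine. */
interface SearchBackend {
    void index(String docId, String text);
    List<String> search(String term);
}

/** Trivial in-memory implementation, used here only to show the contract in action. */
final class InMemoryBackend implements SearchBackend {

    // Insertion-ordered so results come back in indexing order.
    private final Map<String, String> docs = new LinkedHashMap<>();

    @Override
    public void index(String docId, String text) {
        docs.put(docId, text.toLowerCase());
    }

    @Override
    public List<String> search(String term) {
        String needle = term.toLowerCase();
        List<String> ids = new ArrayList<>();
        for (Map.Entry<String, String> entry : docs.entrySet()) {
            if (entry.getValue().contains(needle)) {
                ids.add(entry.getKey());
            }
        }
        return ids;
    }

    public static void main(String[] args) {
        SearchBackend backend = new InMemoryBackend();  // swap the implementation here
        backend.index("r1", "Quarterly Sales Report");
        backend.index("r2", "HR dashboard");
        System.out.println(backend.search("sales"));
    }
}
```

Callers hold only a `SearchBackend` reference, so replacing the engine means changing one constructor call rather than every call site.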




Steps to Solr/Lucene Adoption (6/9)
                                  Coding Before

Example of code (API) before the split

 - Direct use of the Lucene API

 - Parse the document content using Apache Tika

 - Generate Lucene queries




Steps to Solr/Lucene Adoption (7/9)
                                  Coding After

Example of code (API) after the split

 - Easy-to-use SolrJ API

 - Distributed search

 - Indexation with automatic parsing (using Apache Tika)




Steps to Solr/Lucene Adoption (8/9)
    Solr/Lucene Platform with Vanilla Platform - Screenshot




Steps to Solr/Lucene Adoption (9/9)
     Solr/Lucene Platform with Vanilla Platform - Video




Key Benefits for Vanilla Powered by Solr/Lucene (1/4)
Clustering Search Architecture, outside of Vanilla

Search results clustering implementation (CarrotClusteringEngine) is based on the
Carrot2 framework.




Key Benefits for Vanilla Powered by Solr/Lucene (2/4)
Additional query language to perform searches

Solr uses the Lucene search library and extends it:

- A Real Data Schema, with Numeric Types, Dynamic Fields, Unique Keys
- Powerful Extensions to the Lucene Query Language
- Faceted Search and Filtering
- Geospatial Search
- Advanced, Configurable Text Analysis
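
Of the extensions listed above, faceting is the easiest to picture: for a set of matching documents, count how many carry each value of a field. In Solr this is requested with the `facet=true` and `facet.field` query parameters; the sketch below only reproduces the counting idea in memory and is not Solr's implementation. `FacetCounter` and the `type` field are hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical sketch of what a facet count computes over a result set. */
final class FacetCounter {

    /** Counts how many matching documents carry each value of the given field. */
    static Map<String, Integer> countFacet(List<Map<String, String>> matches, String field) {
        Map<String, Integer> counts = new TreeMap<>();  // sorted by facet value
        for (Map<String, String> doc : matches) {
            String value = doc.get(field);
            if (value != null) {
                counts.merge(value, 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        List<Map<String, String>> matches = new ArrayList<>();
        matches.add(Collections.singletonMap("type", "report"));
        matches.add(Collections.singletonMap("type", "dashboard"));
        matches.add(Collections.singletonMap("type", "report"));
        System.out.println(countFacet(matches, "type"));  // prints {dashboard=1, report=2}
    }
}
```

A search screen would typically show these counts next to each value so the user can narrow the results with one click.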




Key Benefits for Vanilla Powered by Solr/Lucene (3/4)
New methods to manage result sets (binary, XML, JSON)

Solr is an enterprise search server with a REST-like API.
You put documents in it (called "indexing") via XML, JSON or binary over HTTP.
You query it via HTTP GET and receive XML, JSON, or binary results.

- Advanced Full-Text Search Capabilities
- Optimized for High Volume Web Traffic
- Standards Based Open Interfaces - XML,JSON and HTTP




Key Benefits for Vanilla Powered by Solr/Lucene (4/4)
Cache Mechanism

Solr caches are associated with an Index Searcher

Three cache implementations:
solr.LRUCache (LRU = Least Recently Used, in memory),
solr.FastLRUCache,
solr.LFUCache (LFU = Least Frequently Used)

Many configuration parameters for cache optimisation
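
To make the LRU policy concrete, here is a minimal least-recently-used cache built on `java.util.LinkedHashMap`. It illustrates the eviction rule only and is not Solr's cache code; `LruCache` is a hypothetical class, and its capacity plays a role comparable to the `size` attribute of a cache declared in `solrconfig.xml`.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical sketch of LRU eviction: the entry unused for the longest time is dropped first. */
final class LruCache<K, V> extends LinkedHashMap<K, V> {

    private static final long serialVersionUID = 1L;

    private final int capacity;

    LruCache(int capacity) {
        super(16, 0.75f, true);  // true = iterate in access order, oldest access first
        this.capacity = capacity;
    }

    /** Invoked by LinkedHashMap after each insertion; returning true evicts the eldest entry. */
    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > capacity;
    }

    public static void main(String[] args) {
        LruCache<String, String> cache = new LruCache<>(2);
        cache.put("q=sales", "result set 1");
        cache.put("q=budget", "result set 2");
        cache.get("q=sales");                  // refreshes q=sales
        cache.put("q=hr", "result set 3");     // evicts q=budget, the least recently used
        System.out.println(cache.keySet());
    }
}
```

An LFU cache differs in that it evicts the entry with the fewest hits rather than the oldest access, which favours queries that stay popular over time.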




Next Steps
Upgrade to Solr 4.0

New features for document life-cycle management

Roadmap for better internationalisation:
- 10 languages available (not Japanese)
- Search Translation management




Documentation and tutorials are available on our websites:

www.bpm-conseil.com and forge.bpm-conseil.com

               Thanks for your attention




