SlideShare a Scribd company logo
1 of 24
Download to read offline
Lucene/Solr
  Case Studies
     Presented by Erik Hatcher
          March 25, 2009
          ApacheCon EU
            Amsterdam
erik.hatcher@lucidimagination.com




                                    1
Lucene Development
          with Ant

<index index=quot;${index.base.dir}/indexquot;
     xmlns=quot;antlib:org.apache.lucene.antquot;>
  <fileset dir=quot;${files.dir}quot;/>
</index>




                                             2
Rossetti Archive
•   Purpose: Dramatically improve findability and
    serendipitous discovery of Rossetti materials

•   Data Source: TEI-like XML

•   Challenges: case sensitive search, academic
    relevance tuning

•   Technologies: Lucene, Tapestry

•   http://www.rossettiarchive.org/rose/


                                                    3
4
5
Collex
•   Purpose: Build search/discover/share platform for
    scholarly objects, starting with 19th century
    domain (NINES) but aiming general purpose.

•   Data Sources: RDF and MARC

•   Challenges: Tagging update speed

•   Technologies: Solr, Ruby on Rails

•   http://www.collex.org



                                                        6
7
8
9
Blacklight
•   Purpose: Open source scalable clean next generation library
    discovery interface.

•   Data Sources: MARC, Fedora, EAD, ... anything

•   Challenges: academia, competitors

•   Technologies: Solr, Ruby, Rails, Java indexer(SolrMarc)

•   quot;A process, not a productquot;

    •   http://code4lib.org/node/177

•   http://blacklightopac.org/



                                                                  10
11
12
13
Blacklight story

• Bethany Nowviskie, Bess Sadler, and Erik
  Hatcher. “Adapting an Open Source,
  Scholarly Web 2.0 System for Findability in
  Library Catalogs.” Library 2.0 Initiatives in
  Academic Libraries. (Laura Cohen, ed.).
  Association of College and Research
  Libraries: Chicago, 2008.



                                                  14
Flare

• Distilled from Blacklight development,
  proof-of-concept Rails plugin
• Features: suggest, saved searches, pie chart
  faceting, Simile Timeline/Exhibit integration
• http://wiki.apache.org/solr/Flare

                                                  15
Solritas

• Light-weight Velocity templated Solr output
• rapid prototyping
• http://wiki.apache.org/solr/Solritas


                                                16
LucidFind
• Purpose: Company technology showcase,
  community focused service. Indexed
  lucene.apache.org/*: wiki, web, code, issues,
  e-mail, nice UI
• Challenges: None to speak of
• Technologies: Solr, PHP, Ant
• http://www.lucidimagination.com/search
                                                  17
Lucene
                              Wiki
                  Lucene
                   Email                 Lucene
                                      Issue Tracker


         Lucene
                                                      Lucid
          Web
                                                      Blog



Lucene
 Code                                                     Lucid

                           Solr                           CMS




                                         Search Results




                                                                  18
19
Powered by Ant
$ ant -p
Buildfile: build.xml

Main targets:

 archive-focus-logs    Archive rolling log files for posterity
 commit                Commit to Solr
 delete-source         Delete specified source from index
 index-code            Index Lucene projects code
 index-issues          Index JIRA issues and comments
 index-lia             Index Lucene in Action (1st edition)
 index-lucid           Index Lucid site and articles
 index-mail            Index mail
 index-web             Index Lucene web content
 index-wiki            Index Lucene wiki
 optimize              Optimize Solr index




                                                                 20
Questions?



             21
Answer:
 quot;it dependsquot;




                22
lucidimagination.com

                       23
e-book now available!
      Print coming this summer
  http://www.manning.com/hatcher3
                                    24

More Related Content

Similar to Lucene Case Studies ApacheCon EU 2009

ARIADNE federation
ARIADNE federationARIADNE federation
ARIADNE federationJoris Klerkx
 
Rapid prototyping with solr - By Erik Hatcher
Rapid prototyping with solr -  By Erik Hatcher Rapid prototyping with solr -  By Erik Hatcher
Rapid prototyping with solr - By Erik Hatcher lucenerevolution
 
Linq (from the inside)
Linq (from the inside)Linq (from the inside)
Linq (from the inside)Mike Clement
 
O2 presentation jan 09 - v1.00
O2  presentation   jan 09 - v1.00O2  presentation   jan 09 - v1.00
O2 presentation jan 09 - v1.00Dinis Cruz
 
Software Architecture Erosion and Modernization
Software Architecture Erosion and ModernizationSoftware Architecture Erosion and Modernization
Software Architecture Erosion and Modernizationbmerkle
 
LibX 2.0
LibX 2.0LibX 2.0
LibX 2.0eby
 
Cohere: Towards Web 2.0 Argumentation
Cohere: Towards Web 2.0 ArgumentationCohere: Towards Web 2.0 Argumentation
Cohere: Towards Web 2.0 ArgumentationSimon Buckingham Shum
 
Ruby on-rails-101-presentation-slides-for-a-five-day-introductory-course-1194...
Ruby on-rails-101-presentation-slides-for-a-five-day-introductory-course-1194...Ruby on-rails-101-presentation-slides-for-a-five-day-introductory-course-1194...
Ruby on-rails-101-presentation-slides-for-a-five-day-introductory-course-1194...Nilesh Panchal
 
Building Web Archiving Technology, Together
Building Web Archiving Technology, TogetherBuilding Web Archiving Technology, Together
Building Web Archiving Technology, Togethernullhandle
 
Suche mit Apache Lucene & Co.
Suche mit Apache Lucene & Co.Suche mit Apache Lucene & Co.
Suche mit Apache Lucene & Co.inovex GmbH
 
Culerity and Headless Full Stack Integration Testing
Culerity and Headless Full Stack Integration TestingCulerity and Headless Full Stack Integration Testing
Culerity and Headless Full Stack Integration TestingPatrick Huesler
 
Is Enterprise Search Ripe for Open Source Disruption?
Is Enterprise Search Ripe for Open Source Disruption?Is Enterprise Search Ripe for Open Source Disruption?
Is Enterprise Search Ripe for Open Source Disruption?Enterprise 2.0 Conference
 
Storm crawler apachecon_na_2015
Storm crawler apachecon_na_2015Storm crawler apachecon_na_2015
Storm crawler apachecon_na_2015ontopic
 
Library discovery: past, present and some futures
Library discovery: past, present and some futuresLibrary discovery: past, present and some futures
Library discovery: past, present and some futureslisld
 
Solr Flair: Search User Interfaces Powered by Apache Solr (ApacheCon US 2009,...
Solr Flair: Search User Interfaces Powered by Apache Solr (ApacheCon US 2009,...Solr Flair: Search User Interfaces Powered by Apache Solr (ApacheCon US 2009,...
Solr Flair: Search User Interfaces Powered by Apache Solr (ApacheCon US 2009,...Erik Hatcher
 
Web Technology Trends (early 2009)
Web Technology Trends (early 2009)Web Technology Trends (early 2009)
Web Technology Trends (early 2009)Prodosh Banerjee
 
What is New in W3C land?
What is New in W3C land?What is New in W3C land?
What is New in W3C land?Ivan Herman
 

Similar to Lucene Case Studies ApacheCon EU 2009 (20)

ARIADNE federation
ARIADNE federationARIADNE federation
ARIADNE federation
 
Rapid Prototyping with Solr
Rapid Prototyping with SolrRapid Prototyping with Solr
Rapid Prototyping with Solr
 
Rapid prototyping with solr - By Erik Hatcher
Rapid prototyping with solr -  By Erik Hatcher Rapid prototyping with solr -  By Erik Hatcher
Rapid prototyping with solr - By Erik Hatcher
 
Linq (from the inside)
Linq (from the inside)Linq (from the inside)
Linq (from the inside)
 
O2 presentation jan 09 - v1.00
O2  presentation   jan 09 - v1.00O2  presentation   jan 09 - v1.00
O2 presentation jan 09 - v1.00
 
Publishing Linked Data from RDB
Publishing Linked Data from RDBPublishing Linked Data from RDB
Publishing Linked Data from RDB
 
Software Architecture Erosion and Modernization
Software Architecture Erosion and ModernizationSoftware Architecture Erosion and Modernization
Software Architecture Erosion and Modernization
 
LibX 2.0
LibX 2.0LibX 2.0
LibX 2.0
 
Cohere: Towards Web 2.0 Argumentation
Cohere: Towards Web 2.0 ArgumentationCohere: Towards Web 2.0 Argumentation
Cohere: Towards Web 2.0 Argumentation
 
Ruby on-rails-101-presentation-slides-for-a-five-day-introductory-course-1194...
Ruby on-rails-101-presentation-slides-for-a-five-day-introductory-course-1194...Ruby on-rails-101-presentation-slides-for-a-five-day-introductory-course-1194...
Ruby on-rails-101-presentation-slides-for-a-five-day-introductory-course-1194...
 
Building Web Archiving Technology, Together
Building Web Archiving Technology, TogetherBuilding Web Archiving Technology, Together
Building Web Archiving Technology, Together
 
Suche mit Apache Lucene & Co.
Suche mit Apache Lucene & Co.Suche mit Apache Lucene & Co.
Suche mit Apache Lucene & Co.
 
Culerity and Headless Full Stack Integration Testing
Culerity and Headless Full Stack Integration TestingCulerity and Headless Full Stack Integration Testing
Culerity and Headless Full Stack Integration Testing
 
Is Enterprise Search Ripe for Open Source Disruption?
Is Enterprise Search Ripe for Open Source Disruption?Is Enterprise Search Ripe for Open Source Disruption?
Is Enterprise Search Ripe for Open Source Disruption?
 
Storm crawler apachecon_na_2015
Storm crawler apachecon_na_2015Storm crawler apachecon_na_2015
Storm crawler apachecon_na_2015
 
Library discovery: past, present and some futures
Library discovery: past, present and some futuresLibrary discovery: past, present and some futures
Library discovery: past, present and some futures
 
Solr Flair: Search User Interfaces Powered by Apache Solr (ApacheCon US 2009,...
Solr Flair: Search User Interfaces Powered by Apache Solr (ApacheCon US 2009,...Solr Flair: Search User Interfaces Powered by Apache Solr (ApacheCon US 2009,...
Solr Flair: Search User Interfaces Powered by Apache Solr (ApacheCon US 2009,...
 
Web Technology Trends (early 2009)
Web Technology Trends (early 2009)Web Technology Trends (early 2009)
Web Technology Trends (early 2009)
 
What is New in W3C land?
What is New in W3C land?What is New in W3C land?
What is New in W3C land?
 
Tel Vortrag
Tel VortragTel Vortrag
Tel Vortrag
 

More from Erik Hatcher

Lucene's Latest (for Libraries)
Lucene's Latest (for Libraries)Lucene's Latest (for Libraries)
Lucene's Latest (for Libraries)Erik Hatcher
 
Solr Indexing and Analysis Tricks
Solr Indexing and Analysis TricksSolr Indexing and Analysis Tricks
Solr Indexing and Analysis TricksErik Hatcher
 
Solr Powered Libraries
Solr Powered LibrariesSolr Powered Libraries
Solr Powered LibrariesErik Hatcher
 
Solr Query Parsing
Solr Query ParsingSolr Query Parsing
Solr Query ParsingErik Hatcher
 
"Solr Update" at code4lib '13 - Chicago
"Solr Update" at code4lib '13 - Chicago"Solr Update" at code4lib '13 - Chicago
"Solr Update" at code4lib '13 - ChicagoErik Hatcher
 
Query Parsing - Tips and Tricks
Query Parsing - Tips and TricksQuery Parsing - Tips and Tricks
Query Parsing - Tips and TricksErik Hatcher
 
Lucene for Solr Developers
Lucene for Solr DevelopersLucene for Solr Developers
Lucene for Solr DevelopersErik Hatcher
 
Introduction to Solr
Introduction to SolrIntroduction to Solr
Introduction to SolrErik Hatcher
 
Lucene for Solr Developers
Lucene for Solr DevelopersLucene for Solr Developers
Lucene for Solr DevelopersErik Hatcher
 
Introduction to Solr
Introduction to SolrIntroduction to Solr
Introduction to SolrErik Hatcher
 
Rapid Prototyping with Solr
Rapid Prototyping with SolrRapid Prototyping with Solr
Rapid Prototyping with SolrErik Hatcher
 
Lucene for Solr Developers
Lucene for Solr DevelopersLucene for Solr Developers
Lucene for Solr DevelopersErik Hatcher
 
What's New in Solr 3.x / 4.0
What's New in Solr 3.x / 4.0What's New in Solr 3.x / 4.0
What's New in Solr 3.x / 4.0Erik Hatcher
 
Solr Application Development Tutorial
Solr Application Development TutorialSolr Application Development Tutorial
Solr Application Development TutorialErik Hatcher
 

More from Erik Hatcher (20)

Ted Talk
Ted TalkTed Talk
Ted Talk
 
Solr Payloads
Solr PayloadsSolr Payloads
Solr Payloads
 
it's just search
it's just searchit's just search
it's just search
 
Lucene's Latest (for Libraries)
Lucene's Latest (for Libraries)Lucene's Latest (for Libraries)
Lucene's Latest (for Libraries)
 
Solr Indexing and Analysis Tricks
Solr Indexing and Analysis TricksSolr Indexing and Analysis Tricks
Solr Indexing and Analysis Tricks
 
Solr Powered Libraries
Solr Powered LibrariesSolr Powered Libraries
Solr Powered Libraries
 
Solr Query Parsing
Solr Query ParsingSolr Query Parsing
Solr Query Parsing
 
"Solr Update" at code4lib '13 - Chicago
"Solr Update" at code4lib '13 - Chicago"Solr Update" at code4lib '13 - Chicago
"Solr Update" at code4lib '13 - Chicago
 
Query Parsing - Tips and Tricks
Query Parsing - Tips and TricksQuery Parsing - Tips and Tricks
Query Parsing - Tips and Tricks
 
Solr 4
Solr 4Solr 4
Solr 4
 
Solr Recipes
Solr RecipesSolr Recipes
Solr Recipes
 
Lucene for Solr Developers
Lucene for Solr DevelopersLucene for Solr Developers
Lucene for Solr Developers
 
Solr Flair
Solr FlairSolr Flair
Solr Flair
 
Introduction to Solr
Introduction to SolrIntroduction to Solr
Introduction to Solr
 
Lucene for Solr Developers
Lucene for Solr DevelopersLucene for Solr Developers
Lucene for Solr Developers
 
Introduction to Solr
Introduction to SolrIntroduction to Solr
Introduction to Solr
 
Rapid Prototyping with Solr
Rapid Prototyping with SolrRapid Prototyping with Solr
Rapid Prototyping with Solr
 
Lucene for Solr Developers
Lucene for Solr DevelopersLucene for Solr Developers
Lucene for Solr Developers
 
What's New in Solr 3.x / 4.0
What's New in Solr 3.x / 4.0What's New in Solr 3.x / 4.0
What's New in Solr 3.x / 4.0
 
Solr Application Development Tutorial
Solr Application Development TutorialSolr Application Development Tutorial
Solr Application Development Tutorial
 

Recently uploaded

"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr BaganFwdays
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubKalema Edgar
 
From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .Alan Dix
 
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptxThe Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptxLoriGlavin3
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek SchlawackFwdays
 
TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024Lonnie McRorey
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsSergiu Bodiu
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationSlibray Presentation
 
The State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxThe State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxLoriGlavin3
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Commit University
 
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxDigital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxLoriGlavin3
 
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdfHyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdfPrecisely
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyAlfredo García Lavilla
 
Moving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfMoving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfLoriGlavin3
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024BookNet Canada
 
DSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningDSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningLars Bell
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebUiPathCommunity
 
Take control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteTake control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteDianaGray10
 
What is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdfWhat is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdfMounikaPolabathina
 

Recently uploaded (20)

"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding Club
 
From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .
 
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptxThe Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
 
TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platforms
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck Presentation
 
The State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxThe State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptx
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!
 
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxDigital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
 
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdfHyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easy
 
Moving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfMoving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdf
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
 
DSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningDSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine Tuning
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio Web
 
Take control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteTake control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test Suite
 
What is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdfWhat is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdf
 

Lucene Case Studies ApacheCon EU 2009

  • 1. Lucene/Solr Case Studies Presented by Erik Hatcher March 25, 2009 ApacheCon EU Amsterdam erik.hatcher@lucidimagination.com 1
  • 2. Lucene Development with Ant <index index=quot;${index.base.dir}/indexquot; xmlns=quot;antlib:org.apache.lucene.antquot;> <fileset dir=quot;${files.dir}quot;/> </index> 2
  • 3. Rossetti Archive • Purpose: Dramatically improve findability and serendipitous discovery of Rossetti materials • Data Source: TEI-like XML • Challenges: case sensitive search, academic relevance tuning • Technologies: Lucene, Tapestry • http://www.rossettiarchive.org/rose/ 3
  • 4. 4
  • 5. 5
  • 6. Collex • Purpose: Build search/discover/share platform for scholarly objects, starting with 19th century domain (NINES) but aiming general purpose. • Data Sources: RDF and MARC • Challenges: Tagging update speed • Technologies: Solr, Ruby on Rails • http://www.collex.org 6
  • 7. 7
  • 8. 8
  • 9. 9
  • 10. Blacklight • Purpose: Open source scalable clean next generation library discovery interface. • Data Sources: MARC, Fedora, EAD, ... anything • Challenges: academia, competitors • Technologies: Solr, Ruby, Rails, Java indexer(SolrMarc) • quot;A process, not a productquot; • http://code4lib.org/node/177 • http://blacklightopac.org/ 10
  • 11. 11
  • 12. 12
  • 13. 13
  • 14. Blacklight story • Bethany Nowviskie, Bess Sadler, and Erik Hatcher. “Adapting an Open Source, Scholarly Web 2.0 System for Findability in Library Catalogs.” Library 2.0 Initiatives in Academic Libraries. (Laura Cohen, ed.). Association of College and Research Libraries: Chicago, 2008. 14
  • 15. Flare • Distilled from Blacklight development, proof-of-concept Rails plugin • Features: suggest, saved searches, pie chart faceting, Simile Timeline/Exhibit integration • http://wiki.apache.org/solr/Flare 15
  • 16. Solritas • Light-weight Velocity templated Solr output • rapid prototyping • http://wiki.apache.org/solr/Solritas 16
  • 17. LucidFind • Purpose: Company technology showcase, community focused service. Indexed lucene.apache.org/*: wiki, web, code, issues, e-mail, nice UI • Challenges: None to speak of • Technologies: Solr, PHP, Ant • http://www.lucidimagination.com/search 17
  • 18. Lucene Wiki Lucene Email Lucene Issue Tracker Lucene Lucid Web Blog Lucene Code Lucid Solr CMS Search Results 18
  • 19. 19
  • 20. Powered by Ant $ ant -p Buildfile: build.xml Main targets: archive-focus-logs Archive rolling log files for posterity commit Commit to Solr delete-source Delete specified source from index index-code Index Lucene projects code index-issues Index JIRA issues and comments index-lia Index Lucene in Action (1st edition) index-lucid Index Lucid site and articles index-mail Index mail index-web Index Lucene web content index-wiki Index Lucene wiki optimize Optimize Solr index 20
  • 24. e-book now available! Print coming this summer http://www.manning.com/hatcher3 24