SlideShare a Scribd company logo
1 of 16
Download to read offline
Oslo Solr Community   March 20th 2012




                What is new in Solr 4.0ß
                     Jan Høydahl




Sponsors:
Agenda                                        2




–   Solr/Lucene 4 ß, what, when?
–   Near-Realtime-Search
–   SolrCloud
–   Better Spellchecker
–   Flex – smaller index
–   Pluggable Ranking
–   Sort by Function
–   Result field aliasing and pseudo fields
–   Pivot facets
–   Join query
–   New Admin GUI

– And what about Solr 3.6 ?
4.0 beta?                                 3




–   Never released a public beta before
–   So many changes, it makes sense
–   Time frame??
–   Stability
Near-Realtime-Search                                                              4




– Before:
      • Add, add add add (not searchable)
      • Commit (new segment written → searchable)
– 4.0:
      •    In-memory index
      •    Add
      •    Soft-commit-(within/auto)
      •    Real-time GET:
<!-- realtime get handler, guaranteed to return the latest stored fields of
          any document, without the need to commit or open a new searcher.   The
          current implementation relies on the updateLog feature being enabled.
-->
 <requestHandler name="/get" class="solr.RealTimeGetHandler">
      <lst name="defaults">
          <str name="omitHeader">true</str>
      </lst>
 </requestHandler>
Solr Cloud                                                          5




         – Solr Cloud is the popular name for an initiative to make Solr
           more easily scalable and managable in a distributed world
         – Enables centralized configuration and cluster status
           monitoring
         – Solr 4.0ß contains the first features
               •   Apache ZooKeeper support, including built-in ZK
               •   Support for auto distributed/LB query (by means of ZK)
               •   Fault tolerant indexing and recovery
               •   Add a new node and let it discover its role and sync up
         – Expected features to come
               • Tools to manage the config in ZK
               • Re-balancing of shards



http://wiki.apache.org/solr/SolrCloud
6

      Solr Cloud...

– New concepts:
   • Collection: Cores making up one data set
   • ZooKeeper: Central coordination server
– Easier distributed search:
   • /solr/web/select?q=*:*&distrib=true
       – This queries all cores in same "collection"
– Easier distributed indexing:
   • http://<any.server>/solr/web/update...
Solr Cloud on the index side...   7




http://wiki.apache.org/solr/SolrCloud
Better spellchecker         8




– Direct SpellChecker
– Automaton based
  (no extra lucene-index)
– No long build times
– Better performance
– Better accuracy (?)
Flex – smaller index              9




– Lucene's Flex APIs
– Lets you plug in your own
  codecs
– Greater flexibility in how
  you can represent the
  binary index
– Opens up for many new
  features
   •   DocValues
   •   Pluggable ranking
   •   TEXT index
   •   Store as UTF-8[]
   •   Or other encoding for
       space saving for Chinese
Pluggable Ranking                              10




– Lucene uses TF/IDF and VSM
– Now support for BM25




– Plug your own!
– Hopefully attracts researchers
– Also, pluggable Similarity class per field
Sort by Function                        11




– q=foo&sort=sub(price,discount) desc
– q=foo&sort=dist(2, x, y, 0, 0) asc
Result field aliasing and pseudo fields                            12




– Aliasing:
   • q=foo&fl=score,tittel:title,rabattpris:sub(price,discount)
– Field name globbing:
   • q=foo&fl=score,t*
– Pseudo fields:
   • q=foo&fl=score,[explain],[docid],[shard],[value v=42 t=int]
Pivot facets                       13




– Multi dimensional facets
   • &facet.pivot=cat,popularity
Join query                            14




– Simple Join feature (inner join)
– &q={!join from=manu_id to=id}ipod
New Admin GUI   15
Solr 3.6                                                16




– SOLR-2764*: NorwegianLightStemmer,
              NorwegianMinimalStemmer
– SOLR-2202*: Money/Currency FieldType
– SOLR-2826*: URLClassify Update Processor
– SOLR-3056: Japanese field type in schema.xml
– SOLR-3026*: eDismax user fields
– SOLR-3140*: omitNorms default for all numeric field types
– SOLR-2901*: Upgrade Solr to Tika 1.0
– SOLR-1709: Distributed Date and Range Faceting
– SOLR-2487*: Do not include slf4j-jdk14 jar in WAR
– SOLR-2509*: spellcheck StringIndexOutOfBoundsException

* Committed by Jan Høydahl

More Related Content

What's hot

How to make a simple cheap high availability self-healing solr cluster
How to make a simple cheap high availability self-healing solr clusterHow to make a simple cheap high availability self-healing solr cluster
How to make a simple cheap high availability self-healing solr cluster
lucenerevolution
 
NYC Lucene/Solr Meetup: Spark / Solr
NYC Lucene/Solr Meetup: Spark / SolrNYC Lucene/Solr Meetup: Spark / Solr
NYC Lucene/Solr Meetup: Spark / Solr
thelabdude
 
Scaling search in Oak with Solr
Scaling search in Oak with Solr Scaling search in Oak with Solr
Scaling search in Oak with Solr
Tommaso Teofili
 

What's hot (20)

Cross Datacenter Replication in Apache Solr 6
Cross Datacenter Replication in Apache Solr 6Cross Datacenter Replication in Apache Solr 6
Cross Datacenter Replication in Apache Solr 6
 
Scaling SolrCloud to a large number of Collections
Scaling SolrCloud to a large number of CollectionsScaling SolrCloud to a large number of Collections
Scaling SolrCloud to a large number of Collections
 
Scaling SolrCloud to a Large Number of Collections - Fifth Elephant 2014
Scaling SolrCloud to a Large Number of Collections - Fifth Elephant 2014Scaling SolrCloud to a Large Number of Collections - Fifth Elephant 2014
Scaling SolrCloud to a Large Number of Collections - Fifth Elephant 2014
 
Deploying and managing Solr at scale
Deploying and managing Solr at scaleDeploying and managing Solr at scale
Deploying and managing Solr at scale
 
SFBay Area Solr Meetup - June 18th: Benchmarking Solr Performance
SFBay Area Solr Meetup - June 18th: Benchmarking Solr PerformanceSFBay Area Solr Meetup - June 18th: Benchmarking Solr Performance
SFBay Area Solr Meetup - June 18th: Benchmarking Solr Performance
 
Rebalance API for SolrCloud: Presented by Nitin Sharma, Netflix & Suruchi Sha...
Rebalance API for SolrCloud: Presented by Nitin Sharma, Netflix & Suruchi Sha...Rebalance API for SolrCloud: Presented by Nitin Sharma, Netflix & Suruchi Sha...
Rebalance API for SolrCloud: Presented by Nitin Sharma, Netflix & Suruchi Sha...
 
How to make a simple cheap high availability self-healing solr cluster
How to make a simple cheap high availability self-healing solr clusterHow to make a simple cheap high availability self-healing solr cluster
How to make a simple cheap high availability self-healing solr cluster
 
Building and Running Solr-as-a-Service: Presented by Shai Erera, IBM
Building and Running Solr-as-a-Service: Presented by Shai Erera, IBMBuilding and Running Solr-as-a-Service: Presented by Shai Erera, IBM
Building and Running Solr-as-a-Service: Presented by Shai Erera, IBM
 
Solr cluster with SolrCloud at lucenerevolution (tutorial)
Solr cluster with SolrCloud at lucenerevolution (tutorial)Solr cluster with SolrCloud at lucenerevolution (tutorial)
Solr cluster with SolrCloud at lucenerevolution (tutorial)
 
How SolrCloud Changes the User Experience In a Sharded Environment
How SolrCloud Changes the User Experience In a Sharded EnvironmentHow SolrCloud Changes the User Experience In a Sharded Environment
How SolrCloud Changes the User Experience In a Sharded Environment
 
Deploying and managing SolrCloud in the cloud using the Solr Scale Toolkit
Deploying and managing SolrCloud in the cloud using the Solr Scale ToolkitDeploying and managing SolrCloud in the cloud using the Solr Scale Toolkit
Deploying and managing SolrCloud in the cloud using the Solr Scale Toolkit
 
Best practices for highly available and large scale SolrCloud
Best practices for highly available and large scale SolrCloudBest practices for highly available and large scale SolrCloud
Best practices for highly available and large scale SolrCloud
 
Oak / Solr integration
Oak / Solr integrationOak / Solr integration
Oak / Solr integration
 
Scaling search with SolrCloud
Scaling search with SolrCloudScaling search with SolrCloud
Scaling search with SolrCloud
 
What's New on AWS and What it Means to You
What's New on AWS and What it Means to YouWhat's New on AWS and What it Means to You
What's New on AWS and What it Means to You
 
Tuning Solr and its Pipeline for Logs: Presented by Rafał Kuć & Radu Gheorghe...
Tuning Solr and its Pipeline for Logs: Presented by Rafał Kuć & Radu Gheorghe...Tuning Solr and its Pipeline for Logs: Presented by Rafał Kuć & Radu Gheorghe...
Tuning Solr and its Pipeline for Logs: Presented by Rafał Kuć & Radu Gheorghe...
 
Solr Compute Cloud - An Elastic SolrCloud Infrastructure
Solr Compute Cloud - An Elastic SolrCloud Infrastructure Solr Compute Cloud - An Elastic SolrCloud Infrastructure
Solr Compute Cloud - An Elastic SolrCloud Infrastructure
 
Intro to Apache Solr
Intro to Apache SolrIntro to Apache Solr
Intro to Apache Solr
 
NYC Lucene/Solr Meetup: Spark / Solr
NYC Lucene/Solr Meetup: Spark / SolrNYC Lucene/Solr Meetup: Spark / Solr
NYC Lucene/Solr Meetup: Spark / Solr
 
Scaling search in Oak with Solr
Scaling search in Oak with Solr Scaling search in Oak with Solr
Scaling search in Oak with Solr
 

Viewers also liked

Fedora Iptables
Fedora IptablesFedora Iptables
Fedora Iptables
zubin71
 
2016 bd2k bgood_wikidata
2016 bd2k bgood_wikidata2016 bd2k bgood_wikidata
2016 bd2k bgood_wikidata
Benjamin Good
 
Light steel villa catalogue log
Light steel villa catalogue logLight steel villa catalogue log
Light steel villa catalogue log
eishimachinery
 
Gene Wiki at Phenotype RCN annual meeting
Gene Wiki at Phenotype RCN annual meetingGene Wiki at Phenotype RCN annual meeting
Gene Wiki at Phenotype RCN annual meeting
Benjamin Good
 
Short update on The Cure game first week
Short update on The Cure game first weekShort update on The Cure game first week
Short update on The Cure game first week
Benjamin Good
 

Viewers also liked (20)

Gene Wiki and Mark2Cure update for BD2K
Gene Wiki and Mark2Cure update for BD2KGene Wiki and Mark2Cure update for BD2K
Gene Wiki and Mark2Cure update for BD2K
 
The National Society For The Protection Of Hmmm
The National Society For The Protection Of HmmmThe National Society For The Protection Of Hmmm
The National Society For The Protection Of Hmmm
 
IMSafer Angel Round
IMSafer Angel RoundIMSafer Angel Round
IMSafer Angel Round
 
2016 mem good
2016 mem good2016 mem good
2016 mem good
 
Bio Logical Mass Collaboration3
Bio Logical Mass Collaboration3Bio Logical Mass Collaboration3
Bio Logical Mass Collaboration3
 
Fedora Iptables
Fedora IptablesFedora Iptables
Fedora Iptables
 
2016 bd2k bgood_wikidata
2016 bd2k bgood_wikidata2016 bd2k bgood_wikidata
2016 bd2k bgood_wikidata
 
Citizen sciencepanel2015 pdf
Citizen sciencepanel2015 pdfCitizen sciencepanel2015 pdf
Citizen sciencepanel2015 pdf
 
Buyer Remorse
Buyer RemorseBuyer Remorse
Buyer Remorse
 
Light steel villa catalogue log
Light steel villa catalogue logLight steel villa catalogue log
Light steel villa catalogue log
 
Computing on the shoulders of giants
Computing on the shoulders of giantsComputing on the shoulders of giants
Computing on the shoulders of giants
 
Welcome to Ukraine - SunCity Travel LLC
Welcome to Ukraine - SunCity Travel LLCWelcome to Ukraine - SunCity Travel LLC
Welcome to Ukraine - SunCity Travel LLC
 
EISHI CO. main eps machine catalogue
EISHI CO. main eps machine catalogueEISHI CO. main eps machine catalogue
EISHI CO. main eps machine catalogue
 
Mark Hopper Product And Marketing Exec 2010
Mark Hopper Product And Marketing Exec 2010Mark Hopper Product And Marketing Exec 2010
Mark Hopper Product And Marketing Exec 2010
 
Scripps bioinformatics seminar_day_2
Scripps bioinformatics seminar_day_2Scripps bioinformatics seminar_day_2
Scripps bioinformatics seminar_day_2
 
Gene Wiki at Phenotype RCN annual meeting
Gene Wiki at Phenotype RCN annual meetingGene Wiki at Phenotype RCN annual meeting
Gene Wiki at Phenotype RCN annual meeting
 
Short update on The Cure game first week
Short update on The Cure game first weekShort update on The Cure game first week
Short update on The Cure game first week
 
(Bio)Hackathons
(Bio)Hackathons(Bio)Hackathons
(Bio)Hackathons
 
Channeling Collaborative Spirit
Channeling Collaborative SpiritChanneling Collaborative Spirit
Channeling Collaborative Spirit
 
2015 6 bd2k_biobranch_knowbio
2015 6 bd2k_biobranch_knowbio2015 6 bd2k_biobranch_knowbio
2015 6 bd2k_biobranch_knowbio
 

Similar to Oslo Solr MeetUp March 2012 - Solr4 alpha

Keynote Yonik Seeley & Steve Rowe lucene solr roadmap
Keynote   Yonik Seeley & Steve Rowe lucene solr roadmapKeynote   Yonik Seeley & Steve Rowe lucene solr roadmap
Keynote Yonik Seeley & Steve Rowe lucene solr roadmap
lucenerevolution
 
KEYNOTE: Lucene / Solr road map
KEYNOTE: Lucene / Solr road mapKEYNOTE: Lucene / Solr road map
KEYNOTE: Lucene / Solr road map
lucenerevolution
 
What's New in Solr 3.x / 4.0
What's New in Solr 3.x / 4.0What's New in Solr 3.x / 4.0
What's New in Solr 3.x / 4.0
Erik Hatcher
 

Similar to Oslo Solr MeetUp March 2012 - Solr4 alpha (20)

What's new in Lucene and Solr 4.x
What's new in Lucene and Solr 4.xWhat's new in Lucene and Solr 4.x
What's new in Lucene and Solr 4.x
 
Solr 4
Solr 4Solr 4
Solr 4
 
Solr search engine with multiple table relation
Solr search engine with multiple table relationSolr search engine with multiple table relation
Solr search engine with multiple table relation
 
What's new in solr june 2014
What's new in solr june 2014What's new in solr june 2014
What's new in solr june 2014
 
Keynote Yonik Seeley & Steve Rowe lucene solr roadmap
Keynote   Yonik Seeley & Steve Rowe lucene solr roadmapKeynote   Yonik Seeley & Steve Rowe lucene solr roadmap
Keynote Yonik Seeley & Steve Rowe lucene solr roadmap
 
KEYNOTE: Lucene / Solr road map
KEYNOTE: Lucene / Solr road mapKEYNOTE: Lucene / Solr road map
KEYNOTE: Lucene / Solr road map
 
Solr Powered Lucene
Solr Powered LuceneSolr Powered Lucene
Solr Powered Lucene
 
What's New in Solr 3.x / 4.0
What's New in Solr 3.x / 4.0What's New in Solr 3.x / 4.0
What's New in Solr 3.x / 4.0
 
Apache Solr - Enterprise search platform
Apache Solr - Enterprise search platformApache Solr - Enterprise search platform
Apache Solr - Enterprise search platform
 
From Lucene to Solr 4 Trunk
From Lucene to Solr 4 TrunkFrom Lucene to Solr 4 Trunk
From Lucene to Solr 4 Trunk
 
Introduction to Solr
Introduction to SolrIntroduction to Solr
Introduction to Solr
 
What's new in Solr 5.0
What's new in Solr 5.0What's new in Solr 5.0
What's new in Solr 5.0
 
(Re)Indexing Large Repositories in Alfresco
(Re)Indexing Large Repositories in Alfresco(Re)Indexing Large Repositories in Alfresco
(Re)Indexing Large Repositories in Alfresco
 
Apache Solr! Enterprise Search Solutions at your Fingertips!
Apache Solr! Enterprise Search Solutions at your Fingertips!Apache Solr! Enterprise Search Solutions at your Fingertips!
Apache Solr! Enterprise Search Solutions at your Fingertips!
 
SolrCloud on Hadoop
SolrCloud on HadoopSolrCloud on Hadoop
SolrCloud on Hadoop
 
Hortonworks Technical Workshop - HDP Search
Hortonworks Technical Workshop - HDP Search Hortonworks Technical Workshop - HDP Search
Hortonworks Technical Workshop - HDP Search
 
Meet Solr For The Tirst Again
Meet Solr For The Tirst AgainMeet Solr For The Tirst Again
Meet Solr For The Tirst Again
 
Building a near real time search engine & analytics for logs using solr
Building a near real time search engine & analytics for logs using solrBuilding a near real time search engine & analytics for logs using solr
Building a near real time search engine & analytics for logs using solr
 
Solr 101
Solr 101Solr 101
Solr 101
 
IT talk SPb "Full text search for lazy guys"
IT talk SPb "Full text search for lazy guys" IT talk SPb "Full text search for lazy guys"
IT talk SPb "Full text search for lazy guys"
 

More from Cominvent AS

Open source breakfast norge findwise
Open source breakfast norge findwiseOpen source breakfast norge findwise
Open source breakfast norge findwise
Cominvent AS
 

More from Cominvent AS (7)

Solr's missing plugin ecosystem
Solr's missing plugin ecosystemSolr's missing plugin ecosystem
Solr's missing plugin ecosystem
 
Dagens Næringslivs overgang til Lucene/Solr søk
Dagens Næringslivs overgang til Lucene/Solr søkDagens Næringslivs overgang til Lucene/Solr søk
Dagens Næringslivs overgang til Lucene/Solr søk
 
Key topics when migrating from FAST to Solr, EuroCon 2010
Key topics when migrating from FAST to Solr, EuroCon 2010Key topics when migrating from FAST to Solr, EuroCon 2010
Key topics when migrating from FAST to Solr, EuroCon 2010
 
Open source breakfast norge findwise
Open source breakfast norge findwiseOpen source breakfast norge findwise
Open source breakfast norge findwise
 
Frokostseminar mai 2010 solr open source cominvent as
Frokostseminar mai 2010 solr open source cominvent asFrokostseminar mai 2010 solr open source cominvent as
Frokostseminar mai 2010 solr open source cominvent as
 
Migrating Fast to Solr
Migrating Fast to SolrMigrating Fast to Solr
Migrating Fast to Solr
 
Cominvent AS company Presentation
Cominvent AS company PresentationCominvent AS company Presentation
Cominvent AS company Presentation
 

Recently uploaded

Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
Joaquim Jorge
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Service
giselly40
 

Recently uploaded (20)

Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Service
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century education
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Tech Trends Report 2024 Future Today Institute.pdf
Tech Trends Report 2024 Future Today Institute.pdfTech Trends Report 2024 Future Today Institute.pdf
Tech Trends Report 2024 Future Today Institute.pdf
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024
 

Oslo Solr MeetUp March 2012 - Solr4 alpha

  • 1. Oslo Solr Community March 20th 2012 What is new in Solr 4.0ß Jan Høydahl Sponsors:
  • 2. Agenda 2 – Solr/Lucene 4 ß, what, when? – Near-Realtime-Search – SolrCloud – Better Spellchecker – Flex – smaller index – Pluggable Ranking – Sort by Function – Result field aliasing and pseudo fields – Pivot facets – Join query – New Admin GUI – And what about Solr 3.6 ?
  • 3. 4.0 beta? 3 – Never released a public beta before – So many changes, it makes sense – Time frame?? – Stability
  • 4. Near-Realtime-Search 4 – Before: • Add, add add add (not searchable) • Commit (new segment written → searchable) – 4.0: • In-memory index • Add • Soft-commit-(within/auto) • Real-time GET: <!-- realtime get handler, guaranteed to return the latest stored fields of any document, without the need to commit or open a new searcher. The current implementation relies on the updateLog feature being enabled. --> <requestHandler name="/get" class="solr.RealTimeGetHandler"> <lst name="defaults"> <str name="omitHeader">true</str> </lst> </requestHandler>
  • 5. Solr Cloud 5 – Solr Cloud is the popular name for an initiative to make Solr more easily scalable and managable in a distributed world – Enables centralized configuration and cluster status monitoring – Solr 4.0ß contains the first features • Apache ZooKeeper support, including built-in ZK • Support for auto distributed/LB query (by means of ZK) • Fault tolerant indexing and recovery • Add a new node and let it discover its role and sync up – Expected features to come • Tools to manage the config in ZK • Re-balancing of shards http://wiki.apache.org/solr/SolrCloud
  • 6. 6 Solr Cloud... – New concepts: • Collection: Cores making up one data set • ZooKeeper: Central coordination server – Easier distributed search: • /solr/web/select?q=*:*&distrib=true – This queries all cores in same "collection" – Easier distributed indexing: • http://<any.server>/solr/web/update...
  • 7. Solr Cloud on the index side... 7 http://wiki.apache.org/solr/SolrCloud
  • 8. Better spellchecker 8 – Direct SpellChecker – Automaton based (no extra lucene-index) – No long build times – Better performance – Better accuracy (?)
  • 9. Flex – smaller index 9 – Lucene's Flex APIs – Lets you plug in your own codecs – Greater flexibility in how you can represent the binary index – Opens up for many new features • DocValues • Pluggable ranking • TEXT index • Store as UTF-8[] • Or other encoding for space saving for Chinese
  • 10. Pluggable Ranking 10 – Lucene uses TF/IDF and VSM – Now support for BM25 – Plug your own! – Hopefully attracts researchers – Also, pluggable Similarity class per field
  • 11. Sort by Function 11 – q=foo&sort=sub(price,discount) desc – q=foo&sort=dist(2, x, y, 0, 0) asc
  • 12. Result field aliasing and pseudo fields 12 – Aliasing: • q=foo&fl=score,tittel:title,rabattpris:sub(price,discount) – Field name globbing: • q=foo&fl=score,t* – Pseudo fields: • q=foo&fl=score,[explain],[docid],[shard],[value v=42 t=int]
  • 13. Pivot facets 13 – Multi dimensional facets • &facet.pivot=cat,popularity
  • 14. Join query 14 – Simple Join feature (inner join) – &q={!join from=manu_id to=id}ipod
  • 16. Solr 3.6 16 – SOLR-2764*: NorwegianLightStemmer, NorwegianMinimalStemmer – SOLR-2202*: Money/Currency FieldType – SOLR-2826*: URLClassify Update Processor – SOLR-3056: Japanese field type in schema.xml – SOLR-3026*: eDismax user fields – SOLR-3140*: omitNorms default for all numeric field types – SOLR-2901*: Upgrade Solr to Tika 1.0 – SOLR-1709: Distributed Date and Range Faceting – SOLR-2487*: Do not include slf4j-jdk14 jar in WAR – SOLR-2509*: spellcheck StringIndexOutOfBoundsException * Committed by Jan Høydahl