Meet Solr for the first time again 
Varun Thacker
Apache Solr has a huge install base and tremendous momentum 
Solr is both established & growing 
Most widely used search
solution on the planet.
8M+ total downloads
250,000+ monthly downloads
You use Solr every day.
Solr has tens of thousands 
of applications in production. 
2500+ open Solr jobs. 
Activity Summary 
30 Day summary 
Aug 18 - Sep 17 2014 
• 128 Commits 
• 18 Contributors 
12 Month Summary 
Sep 17, 2013 - Sep 17, 2014 
• 1351 Commits 
• 29 Contributors 
via https://www.openhub.net/p/solr
Search - Until recently 
• Large organizations (Enterprise) 
• Expensive 
• Complex 
• $$$$$
New Age Search 
• Everyone… startups, websites 
• Special use cases 
• E-commerce 
• Mails and personal data 
• Personal data - Across devices 
• Social and Local! 
• Analytics
Decision making! 
• Short time frame 
• Confidence measure: 
• Getting started quickly
• Configure and see the tip of the iceberg
• Issues surface only later in the story
Until recently… 
• Getting started: 
• Download 
• java -jar start.jar 
• SolrCloud, getting started…. 
• Download 
• Copy example directory ‘x’ times over. 
• java -Dbootstrap_confdir=./solr/collection1/conf -Dcollection.configName=myconf -DzkRun -DnumShards=2 -jar start.jar
• java -Djetty.port=7574 -DzkHost=localhost:9983 -jar start.jar 
• It runs!
Times… they are a changin… 
• Download 
• cd solr 
• Standalone: bin/solr start 
• SolrCloud, example, interactive: 
• bin/solr start -e cloud (< 2 minutes!)
Let’s index some data 
• Flexible JSON indexing - Solr accepts arbitrary JSON
documents and maps them to the fields you want
indexed
• More reading: https://lucidworks.com/blog/indexing-custom-json-data/
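As a rough sketch of what such a request could look like, the Python snippet below only builds the URL for Solr's custom JSON endpoint (`/update/json/docs`) using the `split` and `f` mapping parameters. The host, the collection name `collection1`, and the field names and JSON paths are assumptions made up for illustration; nothing is sent to a server.

```python
from urllib.parse import urlencode


def json_docs_url(base_url, collection, split, field_mappings):
    """Build a URL for indexing arbitrary JSON via /update/json/docs.

    split          -- JSON path at which the input is split into documents
    field_mappings -- list of "solr_field:/json/path" strings (the f parameter)
    """
    params = [("split", split)]
    params += [("f", mapping) for mapping in field_mappings]
    params.append(("commit", "true"))
    return f"{base_url}/solr/{collection}/update/json/docs?{urlencode(params)}"


# Hypothetical host, collection, and mappings, for illustration only.
url = json_docs_url(
    "http://localhost:8983",
    "collection1",
    "/exams",
    ["first:/first", "subject:/exams/subject"],
)
print(url)
```

The JSON body would then be POSTed to that URL with a `Content-Type: application/json` header, using whichever HTTP client you prefer.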
Managed Schema 
• Solr is the schema owner 
• REST APIs - Hide the implementation details 
• Schema-less mode 
• Add and update fields and field types
• More reading: https://lucidworks.com/blog/schemaless-solr-part-1/
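To make the "REST APIs hide the implementation details" point concrete, here is a minimal sketch that prepares a Schema API request adding one field. It uses the `add-field` command form accepted by Solr's bulk Schema API; the exact endpoint shape differs between Solr versions, so check the reference guide for yours. The host, the collection name, the field name `price`, and the field type `pfloat` are assumptions, and the snippet only builds the request without sending it.

```python
import json


def add_field_command(name, field_type, stored=True, indexed=True, multi_valued=False):
    """Return a Schema API command that adds a single field."""
    return {
        "add-field": {
            "name": name,
            "type": field_type,
            "stored": stored,
            "indexed": indexed,
            "multiValued": multi_valued,
        }
    }


def schema_request(base_url, collection, command):
    """Return (url, body, headers) for a POST to the collection's schema endpoint."""
    url = f"{base_url}/solr/{collection}/schema"
    body = json.dumps(command).encode("utf-8")
    headers = {"Content-Type": "application/json"}
    return url, body, headers


# Hypothetical field and collection, for illustration only.
command = add_field_command("price", "pfloat")
schema_url, schema_body, schema_headers = schema_request(
    "http://localhost:8983", "collection1", command
)
print(schema_url)
print(schema_body.decode("utf-8"))
```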
Configuration APIs 
• Configure Solr using APIs 
• solrconfig.xml… What did you say?
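A small sketch of what "configure Solr using APIs" can mean in practice: instead of editing solrconfig.xml by hand, a `set-property` command is POSTed to the collection's `/config` endpoint (available in Solr releases that ship the Config API). The host, the collection name, and the chosen value of 5000 ms are assumptions for illustration; the snippet only builds the payload.

```python
import json


def set_property_command(properties):
    """Return a Config API command that sets one or more editable properties."""
    return {"set-property": dict(properties)}


# Hypothetical collection; the property controls the soft auto-commit interval.
config_url = "http://localhost:8983/solr/collection1/config"
config_payload = set_property_command({"updateHandler.autoSoftCommit.maxTime": 5000})

print(config_url)
print(json.dumps(config_payload))
```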
Solr Scale Toolkit 
• Easily deploy SolrCloud clusters 
• Live patching and rolling restarts 
• Dependency on AWS soon to go away 
• Chef or Puppet are still valid approaches
• More reading: http://lucidworks.com/blog/introducing-the-solr-scale-toolkit/
Talking about the Admin UI… 
• Already improved from 3.x 
• Uploading documents 
• Collections API is coming soon 
Collection Actions
Recently Added Features 
• Document expiration and Time To Live (TTL) 
• Cursors: Efficient Deep Paging 
• Export Sorted Result Sets 
• SSL support in SolrCloud 
• Distributed Pivot Faceting 
• Suggester v2 
• CollapsingQParserPlugin 
• ReRankingQParserPlugin 
• Collections API improvements
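Of the features above, cursor-based deep paging is the easiest to show in a few lines. The sketch below follows the documented protocol: start with `cursorMark=*`, sort on criteria that include the unique key, pass each response's `nextCursorMark` into the next request, and stop once the cursor no longer changes. The `fetch_page` callable is a hypothetical stand-in for whatever HTTP client queries `/select`, and `make_fake_fetch` is an in-memory fake (not Solr behaviour) so the loop can be run without a server.

```python
def iterate_with_cursor(fetch_page, query, sort, rows=100):
    """Yield every matching document using cursorMark deep paging.

    fetch_page -- callable taking a dict of query parameters and returning
                  the parsed JSON response (hypothetical; supply your own)
    sort       -- must include the unique key field, e.g. "id asc"
    """
    cursor = "*"  # the initial cursor value
    while True:
        params = {"q": query, "sort": sort, "rows": rows, "cursorMark": cursor}
        response = fetch_page(params)
        for doc in response["response"]["docs"]:
            yield doc
        next_cursor = response["nextCursorMark"]
        if next_cursor == cursor:  # an unchanged cursor means no more results
            break
        cursor = next_cursor


def make_fake_fetch(docs):
    """In-memory stand-in for a server; it encodes the cursor as a list offset."""
    def fetch(params):
        start = 0 if params["cursorMark"] == "*" else int(params["cursorMark"])
        page = docs[start:start + params["rows"]]
        # With no documents left, echo the cursor back unchanged.
        next_cursor = str(start + len(page)) if page else params["cursorMark"]
        return {"response": {"docs": page}, "nextCursorMark": next_cursor}
    return fetch


sample_docs = [{"id": str(i)} for i in range(5)]
paged = list(iterate_with_cursor(make_fake_fetch(sample_docs), "*:*", "id asc", rows=2))
print(len(paged))
```

Unlike `start`/`rows` paging, the server does not have to collect and discard all preceding results for each page, which is why this approach stays efficient deep into a result set.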
There’s so much more coming up… 
• Schema Bulk API 
• Distributed IDF 
• Query DSL 
• Cross Data-center replication 
• Cluster Backup and Restore 
• SOLR - Make an application, not ‘war’.
It’s easy… and stable!
• Benchmarking 
• Tons of users testing it 
• Evolving test framework
Solr scalability is unmatched. 
• 10TB+ Index Size 
• 10 Billion+ Documents 
• 100 Million+ Daily Requests
Where is it headed? 
• Download 
• See that server directory? 
• Use start scripts 
• Send a document, or a few… 
• Things don’t really look the way they should? 
• Use the schema APIs 
• Add fields… not enough? 
• Add field types and then add fields 
• Configure Solr using REST APIs 
For Production: 
• Use Solr Scale Toolkit to deploy, patch and manage!
• Configure Solr using REST APIs
Lucidworks Fusion 
[Architecture diagram. Labels: Intelligent Search Services/API; Recommendation Module; Signal Processing; Analytics Service; Enrichment; Analytics Store; Services; Discovery Engine; Analyst Workbench; eCommerce Solution; Admin/Management; SiLK Log Analysis; Search/Discovery; Partner Solutions; Connector Framework]
Connect @ 
https://twitter.com/varunthacker 
http://in.linkedin.com/in/varunthacker 
varun.thacker@lucidworks.com
