First steps with Solr 4.7
Achraf KRID
Plan
•Definition
•Installation (with tomcat 7)
•Querying, faceting
•Indexing
•Configuration
•Symfony Integration
•Comparison
Solr?
Solr™ is the popular, blazing fast open source enterprise search platform from
the Apache Lucene™ project. Its major features include powerful full-text
search, hit highlighting, faceted search, near real-time indexing, dynamic
clustering, database integration, rich document (e.g., Word, PDF) handling, and
geospatial search.
Install Tomcat 7 + Solr 4.7
● sudo apt-get install tomcat7 tomcat7-admin
● /etc/tomcat7/tomcat-users.xml
● http://localhost:8080/manager/html
● curl http://archive.apache.org/dist/lucene/solr/4.7.2/solr-4.7.2.tgz | tar xvz
● cp ~/solr-4.7.2/example/lib/ext/* /usr/share/tomcat7/lib/
● cp ~/solr-4.7.2/dist/solr-4.7.2.war /var/lib/tomcat7/webapps/solr.war
● cp -R ~/solr-4.7.2/example/solr /var/lib/tomcat7
● chown -R tomcat7:tomcat7 /var/lib/tomcat7/solr
<tomcat-users>
<role rolename="manager-gui"/>
<user username="root" password="root" roles="manager-gui,admin-gui"/>
</tomcat-users>
Querying Data
java -jar start.jar
java -jar post.jar *.xml
http://solr/select?q=electronics
http://solr/select?q=electronics&sort=price+desc
http://solr/select?q=electronics&rows=50&start=50
http://solr/select?q=electronics&fl=name+price
http://solr/select?q=electronics&fq=inStock:true
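The request parameters on this slide can also be assembled programmatically. The following is a minimal sketch, not part of the original deck: the helper name `build_select_url` is hypothetical, and the base URL is the placeholder `http://solr/select` used above. It only builds the URL and does not contact a server.

```python
from urllib.parse import urlencode

# Placeholder base URL taken from the slide; replace with a real host and core.
SOLR_SELECT = "http://solr/select"


def build_select_url(q, sort=None, rows=None, start=None, fl=None, fq=None):
    """Build a /select URL from the common query parameters (hypothetical helper)."""
    params = [("q", q)]
    if sort is not None:
        params.append(("sort", sort))    # e.g. "price desc"
    if rows is not None:
        params.append(("rows", rows))    # page size
    if start is not None:
        params.append(("start", start))  # offset of the first returned document
    if fl is not None:
        params.append(("fl", fl))        # fields to return, e.g. "name price"
    if fq is not None:
        params.append(("fq", fq))        # filter query, e.g. "inStock:true"
    # urlencode turns spaces into "+" and percent-encodes characters such as ":".
    return SOLR_SELECT + "?" + urlencode(params)


if __name__ == "__main__":
    # Second page of 50 results, most expensive first, in-stock items only.
    print(build_select_url("electronics", sort="price desc",
                           rows=50, start=50, fq="inStock:true"))
```

Letting the library encode the values avoids hand-escaping spaces and colons in `sort` and `fq`.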
Facets
&facet=true&facet.field=cat&facet.field=inStock
&facet.query=price:[0 TO 10]&facet.query=price:[10 TO *]
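Facet parameters may repeat (`facet.field` and `facet.query` each appear twice above), so they are easier to build from a list of pairs than from a dictionary. A small sketch under that assumption; `build_facet_params` is a hypothetical helper name, and nothing here talks to Solr.

```python
from urllib.parse import urlencode, parse_qs


def build_facet_params(fields, queries):
    """Return the facet part of a query string (hypothetical helper)."""
    params = [("facet", "true")]
    params += [("facet.field", f) for f in fields]    # counts per distinct field value
    params += [("facet.query", q) for q in queries]   # counts per arbitrary query
    return urlencode(params)


if __name__ == "__main__":
    qs = build_facet_params(["cat", "inStock"],
                            ["price:[0 TO 10]", "price:[10 TO *]"])
    print(qs)
    # Decoding the string again shows that the repeated keys are preserved.
    print(parse_qs(qs))
```

The resulting string can be appended to any of the `/select` URLs from the previous slide.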
Schema.xml
<?xml version="1.0" encoding="UTF-8" ?>
<schema name="example" version="1.1">
  <types>
    <fieldType name="string" class="solr.StrField" sortMissingLast="true" omitNorms="true"/>
  </types>

  <fields>
    <field name="id" type="string" indexed="true" stored="true" required="true" />
    <field name="category" type="string" indexed="true" stored="true" multiValued="true" omitNorms="true"/>
    <field name="size" type="string" indexed="true" stored="true"/>
    <field name="text" type="string" indexed="true" stored="false" multiValued="true"/>
  </fields>

  <uniqueKey>id</uniqueKey>
  <defaultSearchField>text</defaultSearchField>
  <solrQueryParser defaultOperator="AND"/>

  <copyField source="id" dest="text"/>
  <copyField source="category" dest="text"/>
  <copyField source="size" dest="text"/>

</schema>
Where You Describe Your Data
Schema.xml
●<field> Describes How You Deal With Specific Named Fields
●<dynamicField> Describes How To Deal With Fields That Match A Glob
(Unless There Is A Specific <field> For Them)
●<copyField> Describes How To Construct Fields From Other Fields
<field name="title" type="text" stored="false" />
<dynamicField name="price*" type="sfloat" indexed="true" />
<copyField source="*" dest="catchall" />
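The precedence described above (an explicit <field> wins, otherwise a matching <dynamicField> glob applies) can be illustrated with a toy lookup. This is only a sketch: `resolve_field_type` and the two tables are hypothetical, Python's `fnmatch` is more permissive than Solr's patterns, and how Solr chooses between several overlapping dynamic fields is not modeled here.

```python
from fnmatch import fnmatchcase

# Toy schema mirroring the slide: one explicit field and one dynamic field.
EXPLICIT_FIELDS = {"title": "text"}
DYNAMIC_FIELDS = [("price*", "sfloat")]   # (glob pattern, field type)


def resolve_field_type(name):
    """Return the type for a field name, or None if nothing in the schema matches."""
    if name in EXPLICIT_FIELDS:            # an explicit <field> always wins
        return EXPLICIT_FIELDS[name]
    for pattern, field_type in DYNAMIC_FIELDS:
        if fnmatchcase(name, pattern):     # otherwise fall back to a glob match
            return field_type
    return None


if __name__ == "__main__":
    for name in ("title", "price_eur", "color"):
        print(name, "->", resolve_field_type(name))
```

With this toy schema, a document field named `price_eur` needs no declaration of its own, while `color` would be rejected as unknown.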
Schema.xml
Analyzer
<fieldType name="text_greek" class="solr.TextField">
<analyzer class="org.apache.lucene.analysis.el.GreekAnalyzer"/>
</fieldType>
Schema.xml
Tokenizers: A Tokenizer splits a stream of characters (from each individual
field value) into a series of tokens. There can be only one Tokenizer in each
Analyzer.
Token Filters: Tokens produced by the Tokenizer are passed through a series
of Token Filters that add, change, or remove tokens. The field is then indexed by
the resulting token stream.
<analyzer type="index">
<tokenizer class="solr.StandardTokenizerFactory"/>
<filter class="solr.StopFilterFactory" ignoreCase="true"
words="stopwords.txt" enablePositionIncrements="true" />
<filter class="solr.LowerCaseFilterFactory"/>
</analyzer>
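The tokenizer-then-filters chain above can be mimicked in a few lines to see what ends up in the index. This is a rough illustration only: the regular expression is a crude stand-in for StandardTokenizer, the stop-word set is an assumed replacement for stopwords.txt, and position increments are ignored.

```python
import re

STOPWORDS = {"the", "a", "of"}   # stand-in for stopwords.txt (assumed contents)


def tokenize(text):
    """Very rough stand-in for a tokenizer: split on non-word characters."""
    return re.findall(r"\w+", text)


def stop_filter(tokens, stopwords, ignore_case=True):
    """Drop tokens that appear in the stop-word set."""
    for token in tokens:
        key = token.lower() if ignore_case else token
        if key not in stopwords:
            yield token


def lowercase_filter(tokens):
    """Lower-case every remaining token."""
    for token in tokens:
        yield token.lower()


def analyze(text):
    """Run the chain in the same order as the <analyzer> above."""
    tokens = tokenize(text)
    tokens = stop_filter(tokens, STOPWORDS, ignore_case=True)
    tokens = lowercase_filter(tokens)
    return list(tokens)


if __name__ == "__main__":
    print(analyze("The Price of a Solr Book"))  # ['price', 'solr', 'book']
```

Because the stop filter runs before the lower-case filter, it has to ignore case itself, which is what `ignoreCase="true"` does in the configuration.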
AnalysisTool: Output
Solrconfig.xml
solrconfig.xml is where you configure options for how this Solr instance should
behave.
Solrconfig.xml
DataImportHandler
● Create a lib/ directory in /var/lib/tomcat7/solr
● Add the DataImportHandler jars to lib/:
cp ~/solr-4.7.2/dist/solr-dataimporthandler-*.jar /var/lib/tomcat7/solr/lib
● Add the DBMS driver (for MySQL → mysql-connector-java-bin.jar)
● Add a requestHandler to solr/conf/solrconfig.xml
<requestHandler name="/dataimport"
class="org.apache.solr.handler.dataimport.DataImportHandler">
<lst name="defaults">
<str name="config">data-config.xml</str>
</lst>
</requestHandler>
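Once the handler is registered, an import is started by calling it with a `command` parameter such as `full-import` or `status`. The sketch below only builds those URLs. The host, port and core name in `CORE_URL` are assumptions (`collection1` is the core shipped with the Solr 4.x example), and `dataimport_url` is a hypothetical helper.

```python
from urllib.parse import urlencode

# Assumed base URL of the core; host, port and core name are placeholders.
CORE_URL = "http://localhost:8080/solr/collection1"


def dataimport_url(command, **options):
    """Build a URL for the /dataimport handler registered above."""
    params = {"command": command}
    params.update(options)               # e.g. clean="true", commit="true"
    return CORE_URL + "/dataimport?" + urlencode(params)


if __name__ == "__main__":
    print(dataimport_url("full-import", clean="true", commit="true"))
    print(dataimport_url("status"))
```

Opening the first URL in a browser or with curl would start a full import; the second reports its progress.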
Data-config.xml
<dataConfig>
  <dataSource type="JdbcDataSource" driver="com.mysql.jdbc.Driver"
              url="jdbc:mysql://127.0.0.1/restofraisweb" user="root" password="root" />
  <document name="restofrais">
    <entity name="Restaurant"
            query="SELECT id, nomRestaurant, ville_restaurant_id, code_postal_id, type_cuisine_id FROM Restaurant">
      <field column="id" name="id" />
      <field column="nomRestaurant" name="nomRestaurant" />
      <entity name="Ville"
              query="SELECT designation FROM Ville WHERE id=${Restaurant.ville_restaurant_id}">
        <field column="designation" name="ville" />
      </entity>
      <entity name="Boisson"
              query="SELECT designation FROM restaurants_boissons INNER JOIN Boisson ON Boisson.id = restaurants_boissons.boisson_id WHERE restaurants_boissons.restaurant_id=${Restaurant.id}">
        <field column="designation" name="boissons" />
      </entity>
    </entity>
  </document>
</dataConfig>
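The nested entities above mean: for every Restaurant row, run the child queries with that row's values substituted in, and collect the results into one Solr document. The sketch below imitates that loop against an in-memory SQLite database instead of MySQL. The table contents are made up for illustration, the function names are hypothetical, and bound parameters stand in for the ${...} substitution.

```python
import sqlite3


def build_documents(conn):
    """Emulate the nested <entity> loop: child queries run once per parent row."""
    docs = []
    restaurants = conn.execute(
        "SELECT id, nomRestaurant, ville_restaurant_id FROM Restaurant ORDER BY id"
    ).fetchall()
    for rest_id, name, ville_id in restaurants:
        doc = {"id": rest_id, "nomRestaurant": name}
        # Child entity "Ville": the parent's column value is bound into the query.
        row = conn.execute(
            "SELECT designation FROM Ville WHERE id = ?", (ville_id,)).fetchone()
        if row:
            doc["ville"] = row[0]
        # Child entity "Boisson": several rows become one multi-valued field.
        drinks = conn.execute(
            "SELECT designation FROM restaurants_boissons "
            "INNER JOIN Boisson ON Boisson.id = restaurants_boissons.boisson_id "
            "WHERE restaurants_boissons.restaurant_id = ? ORDER BY Boisson.id",
            (rest_id,))
        doc["boissons"] = [d[0] for d in drinks]
        docs.append(doc)
    return docs


def demo_connection():
    """In-memory database with made-up rows, for illustration only."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE Ville (id INTEGER PRIMARY KEY, designation TEXT);
        CREATE TABLE Boisson (id INTEGER PRIMARY KEY, designation TEXT);
        CREATE TABLE Restaurant (id INTEGER PRIMARY KEY, nomRestaurant TEXT,
                                 ville_restaurant_id INTEGER);
        CREATE TABLE restaurants_boissons (restaurant_id INTEGER, boisson_id INTEGER);
        INSERT INTO Ville VALUES (1, 'Tunis');
        INSERT INTO Boisson VALUES (1, 'Cafe'), (2, 'The');
        INSERT INTO Restaurant VALUES (10, 'Chez Demo', 1);
        INSERT INTO restaurants_boissons VALUES (10, 1), (10, 2);
    """)
    return conn


if __name__ == "__main__":
    for doc in build_documents(demo_connection()):
        print(doc)
```

Note that this pattern issues the child queries once per parent row, so the number of SQL statements grows with the number of restaurants.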
Symfony Integration
•Solr->Solarium->nelmio_solarium
•Composer.json
•AppKernel.php
•Config/config.yml
{
    "require": {
        "nelmio/solarium-bundle": "2.*"
    }
}
public function registerBundles()
{
    $bundles = array(
        ...
        new Nelmio\SolariumBundle\NelmioSolariumBundle(),
        ...
    );
    ...
}
nelmio_solarium:
    endpoints:
        default:
            host: 172.16.0.219
            port: 8080
            path: /solr/restofrais
            # core: solr
            timeout: 5
    clients:
        default:
            client_class: Solarium\Core\Client\Client
            adapter_class: Solarium\Core\Client\Adapter\Http
Symfony Integration
/**
 * @Route("/hello/{name}", name="_demo_hello")
 * @Template()
 */
public function helloAction($name)
{
    // Solarium client service registered by NelmioSolariumBundle
    $client = $this->get('solarium.client');

    // Build a select query restricted to the "ville" field
    $select = $client->createSelect();
    $select->setQuery("ville:".$name);

    $results = $client->select($select);

    return array(
        'search_results' => $results,
    );
}
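The controller concatenates the route parameter straight into the query string, so a value containing query-syntax characters (a colon, a hyphen, a space) changes the meaning of the query. A minimal sketch of escaping such a term, written in Python purely as an illustration: `escape_term` and `ville_query` are hypothetical names, and in a Symfony application the client library's own escaping helpers would normally be used instead.

```python
import re

# Characters (plus whitespace, "&&" and "||") with special meaning in the
# Lucene/Solr query syntax.
_SPECIAL = re.compile(r'([+\-!(){}\[\]^"~*?:\\/\s]|&&|\|\|)')


def escape_term(value):
    """Backslash-escape query-syntax characters in a user-supplied term."""
    return _SPECIAL.sub(r"\\\1", value)


def ville_query(name):
    """Build the same field query as helloAction, with the term escaped."""
    return "ville:" + escape_term(name)


if __name__ == "__main__":
    for name in ("Paris", "Saint-Denis", "La Marsa", "x:y"):
        print(ville_query(name))
```

With escaping in place, a name such as `La Marsa` is searched as a single term in the `ville` field rather than being split into two clauses.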
Apache Solr vs ElasticSearch
Solr & ElasticSearch
•Based on Apache Lucene
•Faceting
•Boosting
Solr :
•Pivot Facets
•One set of fields per schema, one schema per core
ElasticSearch :
•REST API
•Structured Query DSL
•Percolation
Links
●Solr wiki
http://wiki.apache.org/solr/
●Install Solr 4.6 with Tomcat 7 on Debian 7 :
http://pacoup.com/2014/02/05/install-solr-4-6-with-tomcat-7-on-debian-7
●solr-vs-elasticsearch.com :
http://solr-vs-elasticsearch.com
●Nelmio Solarium Bundle :
https://github.com/nelmio/NelmioSolariumBundle
●The Many Facets of Apache Solr, Yonik Seeley
http://www.youtube.com/watch?v=LyjiLYN-qIk
