Apache Solr is an enterprise search platform built on Apache Lucene. It provides fast, scalable search functionality and allows for spell checking, highlighting, faceting and more. Solr configurations are defined in schema.xml and solrconfig.xml files which specify fields, analyzers, caching and other settings. Documents are indexed and queried via HTTP requests to Solr servers. Liferay can integrate with Solr to offload search indexing and querying for improved performance in clustered environments.
Apache Solr and Liferay
1. Apache Solr
Enterprise search platform
from the Apache Lucene project
Rivet Logic Corporation
1800 Alexander Bell Drive
Suite 400
Reston, VA 20191
Ph: 703.955.3480 Fax: 703.234.7711
2. What is Solr?
● Search Server
● Built upon Apache Lucene
● Very fast
● Scalable in both query load and collection size
● Interoperable
● Extensible
● Lucene power exposed over HTTP
● Spell checking, highlighting, faceting, etc.
● Caching
● Replication
● Distributed search
7. Response Writer
● A Response Writer generates the formatted response of
a search.
● The wt parameter selects the Response Writer to be
used
● json, php, phps, python, ruby, xml, xslt, velocity
<queryResponseWriter name="xslt" class="org.apache.solr.request.XSLTResponseWriter">
<int name="xsltCacheLifetimeSeconds">5</int>
</queryResponseWriter>
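As a rough illustration of how the wt parameter travels with a request, here is a minimal sketch that builds a query URL by hand. The /select handler path, the base URL, and the class and method names are assumptions made for the example, not something the slides specify.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Sketch: builds a Solr query URL whose wt parameter names the Response Writer.
class SelectUrlSketch {

    // baseUrl is assumed to look like http://localhost:8983/solr (no trailing slash).
    static String selectUrl(String baseUrl, String query, String writer) {
        return baseUrl + "/select"
                + "?q=" + URLEncoder.encode(query, StandardCharsets.UTF_8)
                + "&wt=" + URLEncoder.encode(writer, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Ask for the same result set as JSON and as XML.
        System.out.println(selectUrl("http://localhost:8983/solr", "title:solr", "json"));
        System.out.println(selectUrl("http://localhost:8983/solr", "title:solr", "xml"));
    }
}
```

Only the wt value changes between the two requests; the query itself stays the same.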
8. Analyzers, Tokenizers, Filters
● The Analyzer class is a native Lucene concept that determines
how tokens are produced from a piece of text
<fieldType name="nametext" class="solr.TextField">
<analyzer class="org.apache.lucene.analysis.WhitespaceAnalyzer"/>
</fieldType>
● The job of a tokenizer is to break up a stream of text into
tokens
● A filter looks at each token in the stream sequentially
and decides whether to pass it along, replace it or discard it
<fieldType name="text" class="solr.TextField">
<analyzer>
<tokenizer class="solr.StandardTokenizerFactory"/>
<filter class="solr.StandardFilterFactory"/>
</analyzer>
</fieldType>
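The tokenizer and filter roles described above can be sketched in plain Java. This is a conceptual illustration only: it does not use the Lucene or Solr analysis classes, and the class name, method names, and stop-word list are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.Optional;
import java.util.Set;
import java.util.function.Function;

// Sketch of an analysis chain: one tokenizer followed by two filters.
class AnalysisChainSketch {

    // Tokenizer: breaks the raw text into tokens on whitespace.
    static List<String> tokenize(String text) {
        String trimmed = text.trim();
        if (trimmed.isEmpty()) {
            return new ArrayList<>();
        }
        return new ArrayList<>(Arrays.asList(trimmed.split("\\s+")));
    }

    // A filter sees each token in order and returns it unchanged,
    // returns a replacement, or returns empty to discard it.
    static List<String> applyFilter(List<String> tokens,
                                    Function<String, Optional<String>> filter) {
        List<String> out = new ArrayList<>();
        for (String token : tokens) {
            filter.apply(token).ifPresent(out::add);
        }
        return out;
    }

    static List<String> analyze(String text) {
        Set<String> stopWords = Set.of("the", "a", "of"); // hypothetical stop list
        List<String> tokens = tokenize(text);
        // Lowercase filter: replaces every token.
        tokens = applyFilter(tokens, t -> Optional.of(t.toLowerCase(Locale.ROOT)));
        // Stop filter: discards stop words and passes everything else along.
        tokens = applyFilter(tokens,
                t -> stopWords.contains(t) ? Optional.<String>empty() : Optional.of(t));
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(analyze("The Power of Lucene"));
    }
}
```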
9. Other features
● Highlighting
○ &hl=true&hl.fl=developers
● Synonyms
○ <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true"
expand="true"/>
● Spell check
○ The spell check component can return a list of alternative spelling
suggestions.
○ <searchComponent name="spellcheck" class="solr.SpellCheckComponent">
● Content Streams
○ Allows Solr server to fetch local or remote data itself. Must enable remote streaming in
solrconfig.xml
● Solr Cell
○ leveraging Tika, extracts and indexes rich documents such as Word, PDF, HTML, and many
other types
● More like this
○ http://wiki.apache.org/solr/MoreLikeThis
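To give a feel for what the synonym filter settings ignoreCase="true" and expand="true" mean, here is a simplified sketch in plain Java. It assumes a synonyms file made of comma-separated groups of single-word terms, ignores multi-word synonyms and explicit "=>" mappings, and does not use Solr's own classes; all names are hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Sketch: expands tokens into their synonym groups, case-insensitively.
class SynonymExpansionSketch {

    // Parses lines such as "tv, television" into a map from each term to its whole group.
    static Map<String, List<String>> parse(List<String> lines) {
        Map<String, List<String>> groups = new HashMap<>();
        for (String line : lines) {
            if (line.isBlank() || line.startsWith("#")) {
                continue; // skip blank lines and comments
            }
            List<String> group = new ArrayList<>();
            for (String term : line.split(",")) {
                group.add(term.trim().toLowerCase(Locale.ROOT));
            }
            for (String term : group) {
                groups.put(term, group);
            }
        }
        return groups;
    }

    // Replaces each token with its synonym group, or keeps it when no group matches.
    static List<String> expand(List<String> tokens, Map<String, List<String>> groups) {
        List<String> out = new ArrayList<>();
        for (String token : tokens) {
            List<String> group = groups.get(token.toLowerCase(Locale.ROOT));
            if (group != null) {
                out.addAll(group);
            } else {
                out.add(token);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        Map<String, List<String>> groups = parse(List.of("# sample file", "tv, television"));
        System.out.println(expand(List.of("TV", "guide"), groups));
    }
}
```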
10. Indexing with solrJ
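import java.net.URL;
import org.apache.solr.client.solrj.SolrServer;
import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;
import org.apache.solr.common.SolrInputDocument;

// Connect to the Solr instance over HTTP
SolrServer solr =
    new CommonsHttpSolrServer(
        new URL("http://localhost:8983/solr"));
// Build one document and add it to the index
SolrInputDocument doc = new SolrInputDocument();
doc.addField("id", "EXAMPLEDOC01");
doc.addField("title", "NOVAJUG SolrJ Example");
solr.add(doc);
solr.commit(); // after a batch, not per document
solr.optimize(); // periodically, if/when needed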
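Since documents are ultimately sent to Solr over HTTP, it may help to see the kind of XML update message a client produces for a document like the one above. The sketch below only builds the message as a string; the class and method names are hypothetical, and posting it to the update handler is left out.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Sketch: builds an XML <add> message for a single document.
class UpdateXmlSketch {

    // Escapes characters that cannot appear verbatim in XML text or attribute values.
    static String escape(String s) {
        return s.replace("&", "&amp;")
                .replace("<", "&lt;")
                .replace(">", "&gt;")
                .replace("\"", "&quot;");
    }

    // Wraps the given field name/value pairs in <add><doc>...</doc></add>.
    static String addMessage(Map<String, String> fields) {
        StringBuilder xml = new StringBuilder("<add><doc>");
        for (Map.Entry<String, String> field : fields.entrySet()) {
            xml.append("<field name=\"").append(escape(field.getKey())).append("\">")
               .append(escape(field.getValue()))
               .append("</field>");
        }
        return xml.append("</doc></add>").toString();
    }

    public static void main(String[] args) {
        Map<String, String> fields = new LinkedHashMap<>(); // keeps insertion order
        fields.put("id", "EXAMPLEDOC01");
        fields.put("title", "NOVAJUG SolrJ Example");
        System.out.println(addMessage(fields));
    }
}
```

A separate <commit/> message makes the added documents visible to searches, which is why committing once per batch is cheaper than committing per document.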
11. Data Import Handler
● Indexes relational database, XML data, and e-mail
sources
● Supports full and incremental/delta indexing
● Highly extensible with custom data sources,
transformers, etc
● http://wiki.apache.org/solr/DataImportHandler
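The difference between full and delta indexing can be sketched as a selection over rows by modification time. This is a conceptual illustration in plain Java, not the Data Import Handler's implementation; the Row type, its fields, and the method name are hypothetical.

```java
import java.time.Instant;
import java.util.ArrayList;
import java.util.List;

// Sketch: a full import takes every row, a delta import only rows changed since the last run.
class DeltaImportSketch {

    // Hypothetical source row carrying an id and a last-modified timestamp.
    static final class Row {
        final String id;
        final Instant lastModified;

        Row(String id, Instant lastModified) {
            this.id = id;
            this.lastModified = lastModified;
        }
    }

    // Pass null as lastIndexTime for a full import.
    static List<String> idsToIndex(List<Row> rows, Instant lastIndexTime) {
        List<String> ids = new ArrayList<>();
        for (Row row : rows) {
            if (lastIndexTime == null || row.lastModified.isAfter(lastIndexTime)) {
                ids.add(row.id);
            }
        }
        return ids;
    }

    public static void main(String[] args) {
        List<Row> rows = List.of(
                new Row("a", Instant.parse("2010-01-01T00:00:00Z")),
                new Row("b", Instant.parse("2010-03-01T00:00:00Z")));
        System.out.println(idsToIndex(rows, null));
        System.out.println(idsToIndex(rows, Instant.parse("2010-02-01T00:00:00Z")));
    }
}
```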
12. Replication
● Master is polled
● Replicant pulls Lucene index and optionally also Solr
configuration files
● Query throughput scaling: replicate and load balance
● http://wiki.apache.org/solr/SolrReplication
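One simple way to "replicate and load balance" is to rotate queries across the replicated servers. The following is a minimal round-robin sketch, assuming a fixed list of replica URLs; the host names and the class name are hypothetical, and a production setup would more likely use a dedicated load balancer with health checks.

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Sketch: picks replica URLs in rotation so query load is spread evenly.
class RoundRobinSketch {

    private final List<String> replicas;
    private final AtomicInteger next = new AtomicInteger();

    RoundRobinSketch(List<String> replicas) {
        if (replicas.isEmpty()) {
            throw new IllegalArgumentException("at least one replica is required");
        }
        this.replicas = List.copyOf(replicas);
    }

    // Returns the next replica; floorMod keeps the index valid if the counter overflows.
    String pick() {
        int index = Math.floorMod(next.getAndIncrement(), replicas.size());
        return replicas.get(index);
    }

    public static void main(String[] args) {
        RoundRobinSketch balancer = new RoundRobinSketch(
                List.of("http://solr1:8983/solr", "http://solr2:8983/solr"));
        for (int i = 0; i < 3; i++) {
            System.out.println(balancer.pick());
        }
    }
}
```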
14. Liferay + Solr: Motivation
● Centralizing search index in clustered Liferay
environment
● Performance improvement
○ Re-indexing costs too much for large DBs
○ Often the indexes of Liferay deployments in a cluster are not
synchronized
15. Liferay + Solr: Configuration 1
Install Solr (http://lucene.apache.org/solr)
Setting up environment variables
● SOLR_HOME = /${solr installed folder}
● JAVA_OPTS = "$JAVA_OPTS -Dsolr.solr.home=$SOLR_HOME/example/solr/data"
solr.xml
● Place the file under ${tomcat}/conf/Catalina/localhost/ with the following content
<?xml version="1.0" encoding="utf-8"?>
<Context docBase="$SOLR_HOME/apache-solr-1.4.0.war"
debug="0" crossContext="true">
<Environment name="solr/home" type="java.lang.String"
value="$SOLR_HOME" override="true" />
</Context>
16. Liferay + Solr: Configuration 2
schema.xml
● This file tells Solr how to index the data coming from Liferay, and can be
customized for your installation.
● Copy this file from the solr-web plugin to $SOLR_HOME/conf (you may have
to create the conf directory).
... <fields>
<field name="comments" type="text" indexed="true" stored="true" />
<field name="content" type="text" indexed="true" stored="true" />
<field name="description" type="text" indexed="true" stored="true" />
<field name="name" type="text" indexed="true" stored="true" />
<field name="properties" type="text" indexed="true" stored="true" />
<field name="title" type="text" indexed="true" stored="true" />
<field name="uid" type="string" indexed="true" stored="true" />
<field name="url" type="text" indexed="true" stored="true" />
<field name="userName" type="text" indexed="true" stored="true" />
<field name="version" type="text" indexed="true" stored="true" />
<dynamicField name="*" type="string" indexed="true" stored="true" />
</fields>
<uniqueKey>uid</uniqueKey>
<defaultSearchField>content</defaultSearchField>
... <copyField source="comments" dest="content"/> ... ...
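The schema fragment above declares a handful of named fields plus a catch-all dynamicField. A simplified sketch of how a field name might be resolved to a type: explicit fields are checked first, then dynamic patterns with a leading or trailing "*". The class is hypothetical, and checking patterns in declaration order is a simplification rather than Solr's actual matching rule.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Sketch: resolves a field name to a type using explicit fields, then dynamic patterns.
class FieldTypeLookupSketch {

    final Map<String, String> fields = new LinkedHashMap<>();
    final Map<String, String> dynamicFields = new LinkedHashMap<>();

    // Returns the type name, or null when nothing matches.
    String typeOf(String name) {
        String explicit = fields.get(name);
        if (explicit != null) {
            return explicit;
        }
        for (Map.Entry<String, String> entry : dynamicFields.entrySet()) {
            String pattern = entry.getKey();
            boolean matches = pattern.startsWith("*")
                    ? name.endsWith(pattern.substring(1))
                    : pattern.endsWith("*")
                        && name.startsWith(pattern.substring(0, pattern.length() - 1));
            if (matches) {
                return entry.getValue();
            }
        }
        return null;
    }

    public static void main(String[] args) {
        FieldTypeLookupSketch schema = new FieldTypeLookupSketch();
        schema.fields.put("content", "text");
        schema.fields.put("uid", "string");
        schema.dynamicFields.put("*", "string"); // catch-all, as in the fragment above
        System.out.println(schema.typeOf("content"));
        System.out.println(schema.typeOf("someOtherField")); // hypothetical field name
    }
}
```

With a "*" pattern in place, any field Liferay sends that is not listed explicitly is still accepted and treated as a plain string.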
17. Liferay + Solr: Configuration 3
Copy WAR file
● Copy the WAR file $SOLR_HOME/dist/apache-solr-${solr.version}.war
into $SOLR_HOME/example, where ${solr.version} is the Solr
version number, e.g., 1.4.0.
Start Liferay/tomcat
● Solr will be picked up and "solr" will be deployed automatically under
${tomcat}/webapps folder
Install solr-web Liferay plugin
● The latest version of the plugin can be checked out from the following location
http://svn.liferay.com/repos/public/plugins/trunk/webs/solr-web
● Build the checked out plugin and deploy it
18. Liferay + Solr: Configuration 4
Final Step
● We need to rebuild Liferay search indexes
● Control Panel > Server Administration
20. Liferay + Solr: Back to the default?
● Simply undeploy solr-web plugin
● Rebuild the search indexes using the Control Panel, as described
in the previous step