Harnessing The Power of Search - Liferay DEVCON 2015, Darmstadt, Germany

The core Search frameworks in Liferay 7 have been significantly retooled to benefit not only from Liferay's new modular architecture, but also from one of the most innovative players in the market: Elasticsearch, which replaces Lucene as the default search engine in Portal. This session covers clustering and scalability; unveils improvements for both Elasticsearch and Solr, such as aggregations, filters, geolocation, "more like this" and other new query types; and presents new Enterprise features such as out-of-the-box Marvel cluster monitoring and Shield security.

André "Arbo" Oliveira joined Liferay in early 2014 as a senior engineer and leads the Search Infrastructure team. He's been writing code for a living for 22 years, 14 of them as a Java developer and architect. Ever since discovering Elasticsearch, he's vowed never to write another SQL WHERE clause again.

  1. Harnessing The Power of Search. André Ricardo Barreto de Oliveira ("Arbo"), Software Engineer - Team Lead - Search. Darmstadt, Germany, 7 October 2015
  2. What's Search and why is it so cool?
  3. The dawn of Search
  4. Searching higher
  5. Search and the Digital Experience
  6. Understanding Search
  7. Inside the Search Engine: The Index
  8. Inside the Search Engine: The Index, Documents
  9. Inside the Search Engine: The Index, Documents, Fields
  10. Inside the Search Engine: The Index, Documents, Fields. Not that different from ye olde database?...
  11. Indexing documents

          PUT /megacorp/employee/1
          {
            "first_name" : "John",
            "last_name" : "Smith",
            "age" : 25,
            "about" : "I love to go rock climbing",
            "interests": [ "sports", "music" ]
          }

          PUT /megacorp/employee/2
          {
            "first_name" : "Jane",
            "last_name" : "Smith",
            "age" : 32,
            "about" : "I like to collect rock albums",
            "interests": [ "music" ]
          }

          PUT /megacorp/employee/3
          {
            "first_name" : "Douglas",
            "last_name" : "Fir",
            "age" : 35,
            "about": "I like to build cabinets",
            "interests": [ "forestry" ]
          }
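
To make the idea of "indexing" concrete, here is a minimal sketch of an inverted index built over the `about` field of the three employee documents above. It assumes the general term-to-documents structure that Lucene-based engines rely on; the class `TinyInvertedIndex` and its methods are hypothetical illustrations, not Elasticsearch or Liferay APIs.

```java
import java.util.*;

// Hypothetical illustration only; not Elasticsearch or Liferay code.
class TinyInvertedIndex {
    // term -> ids of the documents whose text contains that term
    private final Map<String, SortedSet<Integer>> postings = new HashMap<>();

    // Lowercase the text, split it on non-alphanumerics, and record each term.
    void index(int id, String text) {
        for (String term : text.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
            if (!term.isEmpty()) {
                postings.computeIfAbsent(term, t -> new TreeSet<>()).add(id);
            }
        }
    }

    // Ids of the documents containing the term, in ascending order.
    List<Integer> lookup(String term) {
        SortedSet<Integer> ids = postings.get(term.toLowerCase(Locale.ROOT));
        return ids == null ? new ArrayList<>() : new ArrayList<>(ids);
    }

    // The "about" fields of the three employees indexed on the slide.
    static TinyInvertedIndex sample() {
        TinyInvertedIndex index = new TinyInvertedIndex();
        index.index(1, "I love to go rock climbing");
        index.index(2, "I like to collect rock albums");
        index.index(3, "I like to build cabinets");
        return index;
    }

    public static void main(String[] args) {
        System.out.println(sample().lookup("rock")); // prints [1, 2]
    }
}
```

Looking up a term is then a single map access rather than a scan over every document, which is the main contrast with a plain database table scan.
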
  12. Queries and Filters

          GET /megacorp/employee/_search?q=last_name:Smith

          "hits": [
            {
              "_source": {
                "first_name": "John",
                "last_name": "Smith",
                "age": 25,
                "about": "I love to go rock climbing",
                "interests": [ "sports", "music" ]
              }
            },
            {
              "_source": {
                "first_name": "Jane",
                "last_name": "Smith",
                "age": 32,
                "about": "I like to collect rock albums",
                "interests": [ "music" ]
              }
            }
          ]

          GET /megacorp/employee/_search
          {
            "query" : {
              "filtered" : {
                "filter" : {
                  "range" : { "age" : { "gt" : 21 } }
                },
                "query" : {
                  "match" : { "last_name" : "smith" }
                }
              }
            }
          }
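
The filtered query above combines a yes/no filter (`age` greater than 21) with a match on `last_name`. The sketch below imitates that combination over the same three employees in plain Java. The class `FilteredSearchSketch` is hypothetical, and the case-insensitive comparison is only a stand-in for the analysis that lets `smith` match `Smith`.

```java
import java.util.*;
import java.util.stream.*;

// Hypothetical illustration; not the Elasticsearch client API.
class FilteredSearchSketch {
    static final class Employee {
        final String firstName;
        final String lastName;
        final int age;

        Employee(String firstName, String lastName, int age) {
            this.firstName = firstName;
            this.lastName = lastName;
            this.age = age;
        }
    }

    static final List<Employee> EMPLOYEES = Arrays.asList(
        new Employee("John", "Smith", 25),
        new Employee("Jane", "Smith", 32),
        new Employee("Douglas", "Fir", 35));

    // First names of employees older than minAgeExclusive whose last name matches.
    static List<String> search(String lastName, int minAgeExclusive) {
        return EMPLOYEES.stream()
            // filter part: a plain yes/no test, no relevance involved
            .filter(e -> e.age > minAgeExclusive)
            // query part: stand-in for an analyzed match on last_name
            .filter(e -> e.lastName.equalsIgnoreCase(lastName))
            .map(e -> e.firstName)
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(search("smith", 21)); // prints [John, Jane]
    }
}
```
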
  13. Full-Text Search

          GET /megacorp/employee/_search
          {
            "query" : {
              "match" : { "about" : "rock climbing" }
            }
          }

          "hits": [
            {
              "_score": 0.16273327,
              "_source": {
                "first_name": "John",
                "last_name": "Smith",
                "age": 25,
                "about": "I love to go rock climbing",
                "interests": [ "sports", "music" ]
              }
            },
            {
              "_score": 0.016878016,
              "_source": {
                "first_name": "Jane",
                "last_name": "Smith",
                "age": 32,
                "about": "I like to collect rock albums",
                "interests": [ "music" ]
              }
            }
          ]
  14. Analysis and Analyzers

      Input text: Set the shape to semi-transparent by calling Set_Trans(5)

      ● Standard analyzer: set, the, shape, to, semi, transparent, by, calling, set_trans, 5
      ● Simple analyzer: set, the, shape, to, semi, transparent, by, calling, set, trans
      ● Whitespace analyzer: Set, the, shape, to, semi-transparent, by, calling, Set_Trans(5)
      ● English language analyzer: set, shape, semi, transpar, call, set_tran, 5
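
Two of the analyzers above are simple enough to imitate in a few lines. The sketch below assumes that the simple analyzer splits on any non-letter and lowercases, and that the whitespace analyzer splits on whitespace only. The class `AnalyzerSketch` is a hypothetical stand-in, not the Lucene or Elasticsearch implementation.

```java
import java.util.*;
import java.util.stream.*;

// Hypothetical stand-ins; the real analyzers live in Lucene/Elasticsearch.
class AnalyzerSketch {
    // Like the "simple" analyzer: break on anything that is not a letter, then lowercase.
    static List<String> simple(String text) {
        return Arrays.stream(text.split("[^\\p{L}]+"))
            .filter(token -> !token.isEmpty())
            .map(token -> token.toLowerCase(Locale.ROOT))
            .collect(Collectors.toList());
    }

    // Like the "whitespace" analyzer: break on whitespace only, keep case and punctuation.
    static List<String> whitespace(String text) {
        return Arrays.asList(text.trim().split("\\s+"));
    }

    public static void main(String[] args) {
        String text = "Set the shape to semi-transparent by calling Set_Trans(5)";
        System.out.println(simple(text));
        System.out.println(whitespace(text));
    }
}
```

Run on the slide's sentence, both methods reproduce the token lists shown for the simple and whitespace analyzers. The standard and English analyzers involve Unicode segmentation, stop words and stemming, and are not attempted here.
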
  15. Field mappings

          { "number_of_clicks": { "type": "integer" } }

          { "tag": { "type": "string", "index": "not_analyzed" } }

          { "tweet": { "type": "string", "analyzer": "english" } }
  16. Analytics and Aggregations

          GET /megacorp/employee/_search
          {
            "query": {
              "match": { "last_name": "smith" }
            },
            "aggs" : {
              "all_interests" : {
                "terms" : { "field" : "interests" },
                "aggs" : {
                  "avg_age" : {
                    "avg" : { "field" : "age" }
                  }
                }
              }
            }
          }

          "buckets": [
            { "key": "music", "doc_count": 2, "avg_age": { "value": 28.5 } },
            { "key": "sports", "doc_count": 1, "avg_age": { "value": 25 } }
          ]
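
The terms aggregation with a nested average can be reproduced by hand, which helps when reading the bucket output above. This is a minimal sketch over the same three employees. `TermsAggregationSketch` and `Bucket` are hypothetical names, and buckets are ordered by key for simplicity rather than by document count.

```java
import java.util.*;

// Hypothetical illustration of a terms aggregation with an avg sub-aggregation.
class TermsAggregationSketch {
    static final class Employee {
        final String lastName;
        final int age;
        final List<String> interests;

        Employee(String lastName, int age, String... interests) {
            this.lastName = lastName;
            this.age = age;
            this.interests = Arrays.asList(interests);
        }
    }

    static final class Bucket {
        long docCount;
        long ageSum;

        double avgAge() {
            return (double) ageSum / docCount;
        }
    }

    static final List<Employee> EMPLOYEES = Arrays.asList(
        new Employee("Smith", 25, "sports", "music"),
        new Employee("Smith", 32, "music"),
        new Employee("Fir", 35, "forestry"));

    // One bucket per interest among employees whose last name matches the query.
    static Map<String, Bucket> interestsOf(String lastName) {
        Map<String, Bucket> buckets = new TreeMap<>();
        for (Employee e : EMPLOYEES) {
            if (!e.lastName.equalsIgnoreCase(lastName)) {
                continue; // the query restricts which documents are aggregated
            }
            for (String interest : e.interests) {
                Bucket bucket = buckets.computeIfAbsent(interest, k -> new Bucket());
                bucket.docCount++;
                bucket.ageSum += e.age;
            }
        }
        return buckets;
    }

    public static void main(String[] args) {
        interestsOf("smith").forEach((key, b) ->
            System.out.println(key + " " + b.docCount + " " + b.avgAge()));
    }
}
```

For the query `smith` this yields a `music` bucket with 2 documents and average age 28.5, and a `sports` bucket with 1 document and average age 25, matching the response on the slide.
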
  17. The Liferay Search Infrastructure
  18. The Liferay Search architecture

      ● Liferay Portal. Assets: web content, message boards, wiki pages...
      ● Search infrastructure (Magic happens here)
      ● Search engine(s). Indices, documents, analysis...
  19. The Liferay Search Engine plugins

          public interface SearchEngine {
              public IndexSearcher getIndexSearcher();
              public IndexWriter getIndexWriter();
          }

          public class ElasticsearchSearchEngine extends BaseSearchEngine
          public class ElasticsearchIndexSearcher extends BaseIndexSearcher
          public class ElasticsearchIndexWriter extends BaseIndexWriter

          public class SolrSearchEngine extends BaseSearchEngine
          public class SolrIndexSearcher extends BaseIndexSearcher
          public class SolrIndexWriter extends BaseIndexWriter
  20. The Liferay Document Mappings

      Solr: schema.xml

          <fields>
            <field indexed="true" name="articleId" stored="true" type="string_keyword_lowercase" />
            <field indexed="true" name="companyId" stored="true" type="long" />
            <field indexed="true" name="emailAddress" stored="true" type="string" />
          </fields>

      Elasticsearch: liferay-type-mappings.json

          "LiferayDocumentType": {
            "properties": {
              "articleId": {
                "analyzer": "keyword_lowercase",
                "store": "yes",
                "type": "string"
              },
              "companyId": {
                "index": "not_analyzed",
                "store": "yes",
                "type": "string"
              },
              "emailAddress": {
                "index": "not_analyzed",
                "store": "yes",
                "type": "string"
              }
            }
          }
  21. From Portal assets to Index documents…

          public interface Indexer<T> {
              public Document getDocument(T object);
          }

          public class JournalArticleIndexer extends BaseIndexer<JournalArticle> {
              protected Document doGetDocument(JournalArticle journalArticle) {
                  Document document = getBaseModelDocument(CLASS_NAME, journalArticle);

                  // languageId and content come from code elided on the slide
                  document.addText(
                      LocalizationUtil.getLocalizedName(Field.CONTENT, languageId), content);
                  document.addKeyword(Field.VERSION, journalArticle.getVersion());
                  document.addDate("displayDate", journalArticle.getDisplayDate());

                  return document;
              }
          }

          public class MBMessageIndexer extends BaseIndexer<MBMessage> {
              protected Document doGetDocument(MBMessage mbMessage) {
                  Document document = getBaseModelDocument(CLASS_NAME, mbMessage);

                  document.addText(Field.CONTENT, processContent(mbMessage));
                  // discussion comes from code elided on the slide
                  document.addKeyword("discussion", discussion == null ? false : true);

                  if (mbMessage.isAnonymous()) {
                      document.remove(Field.USER_NAME);
                  }

                  return document;
              }
          }

          public interface Document {
              public void addKeyword(String name, String value);
              public void addNumber(String name, long value);
          }
  22. … from Search Box to queries and filters

          public class JournalArticleIndexer extends BaseIndexer<JournalArticle> {
              public void postProcessSearchQuery(
                  BooleanQuery searchQuery, BooleanFilter fullQueryBooleanFilter,
                  SearchContext searchContext) {

                  addSearchTerm(searchQuery, searchContext, Field.ARTICLE_ID, false);
                  addSearchLocalizedTerm(searchQuery, searchContext, Field.CONTENT, false);
                  addSearchLocalizedTerm(searchQuery, searchContext, Field.TITLE, false);
                  addSearchTerm(searchQuery, searchContext, Field.USER_NAME, false);
              }
          }

          public class MBThreadIndexer extends BaseIndexer<MBThread> {
              public void postProcessContextBooleanFilter(
                  BooleanFilter contextBooleanFilter, SearchContext searchContext) {

                  contextBooleanFilter.addRequiredTerm("discussion", discussion);

                  if ((endDate > 0) && (startDate > 0)) {
                      contextBooleanFilter.addRangeTerm("lastPostDate", startDate, endDate);
                  }
              }
          }
  23. Classic query types (and filters)

      TermQuery / TermFilter

          "term" : { "locale" : "de_DE" }

      TermRangeQuery / RangeTermFilter

          "range" : { "age" : { "gte" : 8, "lte" : 42 } }

      WildcardQuery

          "wildcard" : { "company" : "L*ray" }

      StringQuery

          "query_string": {
            "query": "(content:this OR name:this) AND (content:that OR name:that)"
          }

      BooleanQuery / BooleanFilter

          "bool" : {
            "must" : { "term" : { "locale" : "de_DE" } },
            "must_not" : { "range" : { "age" : { "from" : 8, "to" : 42 } } },
            "should" : [
              { "wildcard" : { "company" : "L*ray" } },
              { "term" : { "product" : "Portal" } }
            ]
          }
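
The `bool` example above can be read as three lists of clauses: every `must` clause has to match, no `must_not` clause may match, and `should` clauses are optional extras. The sketch below models that reading with plain predicates over a document represented as a map. `BoolQuerySketch` and its helpers are hypothetical, and counting matched `should` clauses is a deliberate simplification of real relevance scoring.

```java
import java.util.*;
import java.util.function.Predicate;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

// Hypothetical model of must / must_not / should; not the Liferay or Elasticsearch API.
class BoolQuerySketch {
    final List<Predicate<Map<String, Object>>> must = new ArrayList<>();
    final List<Predicate<Map<String, Object>>> mustNot = new ArrayList<>();
    final List<Predicate<Map<String, Object>>> should = new ArrayList<>();

    static Predicate<Map<String, Object>> term(String field, Object value) {
        return doc -> value.equals(doc.get(field));
    }

    static Predicate<Map<String, Object>> range(String field, int from, int to) {
        return doc -> {
            Object v = doc.get(field);
            return v instanceof Integer && (Integer) v >= from && (Integer) v <= to;
        };
    }

    // '*' matches any run of characters; everything else is taken literally.
    static Predicate<Map<String, Object>> wildcard(String field, String pattern) {
        String regex = Arrays.stream(pattern.split("\\*", -1))
            .map(Pattern::quote)
            .collect(Collectors.joining(".*"));
        return doc -> doc.get(field) instanceof String
            && ((String) doc.get(field)).matches(regex);
    }

    boolean matches(Map<String, Object> doc) {
        return must.stream().allMatch(p -> p.test(doc))
            && mustNot.stream().noneMatch(p -> p.test(doc));
    }

    // Simplified stand-in for scoring: how many optional clauses matched.
    long shouldMatches(Map<String, Object> doc) {
        return should.stream().filter(p -> p.test(doc)).count();
    }

    // The bool query from the slide.
    static BoolQuerySketch slideExample() {
        BoolQuerySketch query = new BoolQuerySketch();
        query.must.add(term("locale", "de_DE"));
        query.mustNot.add(range("age", 8, 42));
        query.should.add(wildcard("company", "L*ray"));
        query.should.add(term("product", "Portal"));
        return query;
    }

    // Convenience builder: doc("field", value, "field", value, ...)
    static Map<String, Object> doc(Object... keyValues) {
        Map<String, Object> document = new HashMap<>();
        for (int i = 0; i + 1 < keyValues.length; i += 2) {
            document.put((String) keyValues[i], keyValues[i + 1]);
        }
        return document;
    }
}
```

A document with locale `de_DE`, age 50, company `Liferay` and product `Portal` matches and satisfies both optional clauses, while the same document with age 30 is rejected by the `must_not` range.
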
  24. Speaking to the Search Engine

          public interface Query {
              public BooleanFilter getPreBooleanFilter();
              public Filter getPostFilter();
          }

          public interface Filter {
              public Boolean isCached();
          }

          public class StringQueryTranslatorImpl implements StringQueryTranslator {
              public QueryBuilder translate(StringQuery stringQuery) {
                  // Elasticsearch Client Java API
                  return QueryBuilders.queryStringQuery(stringQuery.getQuery());
              }
          }

          public class ElasticsearchIndexSearcher extends BaseIndexSearcher {
              protected SearchResponse doSearch(
                  SearchContext searchContext, Query query) {

                  // Elasticsearch Client Java API
                  Client client = _elasticsearchConnectionManager.getClient();

                  SearchRequestBuilder searchRequestBuilder = client.prepareSearch(
                      getSelectedIndexNames(queryConfig, searchContext));

                  QueryBuilder queryBuilder = _queryTranslator.translate(
                      query, searchContext);

                  searchRequestBuilder.setQuery(queryBuilder);

                  SearchResponse searchResponse = searchRequestBuilder.get();

                  return searchResponse;
              }
          }
  25. Search in Liferay 7
  26. What's new in Liferay 7

      Liferay 6
        ● Embedded Lucene by default
        ● Remote: Solr only
        ● Solr 4
        ● Portal-centric Lucene clustering

      Liferay 7
        ● Embedded Elasticsearch by default
        ● Remote: Elasticsearch and Solr
        ● Solr 5.x and SolrCloud
        ● Native, transparent Elasticsearch clustering
        ● Queries + Filters + Boosting + Geolocation
        ● Extensibility and modularization
        ● Enterprise extras
          ○ Shield for security
          ○ Marvel for cluster monitoring
          ○ Kibana for visualization
  27. New Queries

      MatchQuery

          "match" : {
            "subject" : { "query" : "Liferay Portal", "type" : "phrase" }
          }

      MoreLikeThisQuery

          "more_like_this" : {
            "fields" : ["title", "content"],
            "like_text" : "Search In Liferay 7",
            "min_term_freq" : 1,
            "max_query_terms" : 12
          }

      DisMaxQuery

          "dis_max" : {
            "tie_breaker" : 0.7,
            "queries" : [
              { "term" : { "age" : 34 } },
              { "term" : { "age" : 35 } }
            ]
          }

      FuzzyQuery

          "fuzzy" : {
            "user" : { "value" : "ed", "fuzziness" : 2, "max_expansions": 100 }
          }

      MatchAllQuery / MatchAllFilter

          "match_all" : { "boost" : 1.2 }

      MultiMatchQuery

          "multi_match" : {
            "query": "Enterprise. Open Source. For Life",
            "type": "most_fields",
            "fields": [ "title", "title.original", "title.shingles" ]
          }
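
The `tie_breaker` in the DisMaxQuery above controls how much the non-winning sub-queries contribute. As a minimal sketch, assuming the usual definition (the best sub-query score plus `tie_breaker` times the sum of the remaining scores), the combination can be written as follows. `DisMaxSketch` is a hypothetical helper and the input scores are made-up numbers, not values produced by a search engine.

```java
// Hypothetical helper showing how dis_max combines sub-query scores.
class DisMaxSketch {
    // Best score, plus tieBreaker times the sum of all the other scores.
    static double score(double tieBreaker, double... subScores) {
        double max = 0.0;
        double sum = 0.0;
        for (double s : subScores) {
            max = Math.max(max, s);
            sum += s;
        }
        return max + tieBreaker * (sum - max);
    }

    public static void main(String[] args) {
        // Illustrative scores only; tie_breaker 0.7 as on the slide.
        System.out.println(score(0.7, 1.0, 0.5));
    }
}
```

With `tie_breaker` set to 0 only the best sub-query counts, and with 1 the result is the plain sum of all sub-query scores.
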
  28. New Filters

      ExistsFilter

          "exists" : { "field" : "emailAddress" }

      MissingFilter

          "missing" : { "field" : "emailAddress" }

      PrefixFilter

          "prefix" : { "product" : "life" }

      TermsFilter

          "terms" : { "locale" : ["de_DE", "pt_BR", "en_CA"] }

      QueryFilter

          "fquery" : {
            "query" : {
              "bool" : {
                "must" : [
                  { "wildcard" : { "company" : "L*ray" } },
                  { "term" : { "product" : "Portal" } }
                ]
              }
            },
            "_cache" : true
          }
  29. Geolocation filters

      GeoDistanceFilter

          "geo_distance" : {
            "distance" : "12km",
            "pin.location" : { "lat" : 40, "lon" : -70 }
          }

      GeoBoundingBoxFilter

          "geo_bounding_box" : {
            "pin.location" : {
              "top_left" : { "lat" : 40.73, "lon" : -74.1 },
              "bottom_right" : { "lat" : 40.01, "lon" : -71.12 }
            }
          }

      GeoDistanceRangeFilter

          "geo_distance_range" : {
            "from" : "200km",
            "to" : "400km",
            "pin.location" : { "lat" : 40, "lon" : -70 }
          }

      GeoPolygonFilter

          "geo_polygon" : {
            "person.location" : {
              "points" : [ [-70, 40], [-80, 30], [-90, 20] ]
            }
          }
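
A distance filter such as the `geo_distance` example above boils down to computing the distance between two coordinates and comparing it to a threshold. The sketch below uses the haversine formula on a spherical Earth, which is an assumption made for illustration and not necessarily the computation Elasticsearch performs. `GeoDistanceSketch` and its methods are hypothetical.

```java
// Hypothetical sketch of a distance check; haversine formula on a spherical Earth.
class GeoDistanceSketch {
    static final double EARTH_RADIUS_KM = 6371.0; // mean radius, an approximation

    // Great-circle distance in kilometres between two lat/lon points given in degrees.
    static double haversineKm(double lat1, double lon1, double lat2, double lon2) {
        double dLat = Math.toRadians(lat2 - lat1);
        double dLon = Math.toRadians(lon2 - lon1);
        double a = Math.pow(Math.sin(dLat / 2), 2)
            + Math.cos(Math.toRadians(lat1)) * Math.cos(Math.toRadians(lat2))
            * Math.pow(Math.sin(dLon / 2), 2);
        return 2 * EARTH_RADIUS_KM * Math.asin(Math.sqrt(a));
    }

    // True when the point lies within the given distance of the pin.
    static boolean withinKm(double km, double pinLat, double pinLon, double lat, double lon) {
        return haversineKm(pinLat, pinLon, lat, lon) <= km;
    }

    public static void main(String[] args) {
        // Pin at lat 40, lon -70 with a 12 km radius, as on the slide.
        System.out.println(withinKm(12, 40, -70, 40.05, -70));
        System.out.println(withinKm(12, 40, -70, 41, -70));
    }
}
```

A range variant in the spirit of `geo_distance_range` would compare the same distance against both a lower and an upper bound.
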
  30. Query-time boosting

          "should": [
            {
              "match": {
                "title": { "query": "Liferay Portal", "boost": 2 }
              }
            },
            {
              "match": {
                "content": { "query": "Liferay Portal" }
              }
            }
          ]
  31. New Aggregations: Top Hits

          "terms": { "field": "conference", "size": 2 },
          "aggs": {
            "talks": {
              "top_hits": {
                "size" : 1,
                "sort": [ { "attendees": { "order": "desc" } } ]
              }
            }
          }

          {
            "key": "Liferay DEVCON",
            "talks": {
              "hits": [ { "_source": { "title": "The Power of Search" } } ]
            }
          },
          {
            "key": "Liferay North America Symposium",
            "talks": {
              "hits": [ { "_source": { "title": "The ELK Stack" } } ]
            }
          }
  32. New Aggregations: Extended Stats

          "extended_stats" : { "field" : "attendees" }

          "attendees_per_talk_stats": {
            "count": 9,
            "min": 72,
            "max": 99,
            "avg": 86,
            "sum": 774,
            "sum_of_squares": 67028,
            "std_deviation": 7.180219742846005
          }
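
The figures in the extended stats response above are related: the average and the standard deviation can be derived from `count`, `sum` and `sum_of_squares` alone. The sketch below assumes the population formula (mean of squares minus square of the mean), which is consistent with the numbers on the slide. `ExtendedStatsSketch` is a hypothetical helper, not an Elasticsearch API.

```java
// Hypothetical helper deriving avg and std_deviation from count, sum and sum_of_squares.
class ExtendedStatsSketch {
    static double avg(long count, double sum) {
        return sum / count;
    }

    // Population variance: mean of the squares minus the square of the mean.
    static double variance(long count, double sum, double sumOfSquares) {
        double mean = sum / count;
        return sumOfSquares / count - mean * mean;
    }

    static double stdDeviation(long count, double sum, double sumOfSquares) {
        return Math.sqrt(variance(count, sum, sumOfSquares));
    }

    public static void main(String[] args) {
        // count, sum and sum_of_squares taken from the slide
        System.out.println(avg(9, 774));
        System.out.println(stdDeviation(9, 774, 67028));
    }
}
```

For the slide's values this gives an average of 86 and a standard deviation of roughly 7.18, in line with the reported `avg` and `std_deviation`.
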
  33. Modularity and Search

      ● OSGi
      ● Liferay's default Search Engine: now a plugin in itself
      ● Extension points in Search
        ○ Node Settings contributors → fine tune your cluster
        ○ Index Settings contributors → fine tune your shards and logs
        ○ Analyzers and Mappings contributors → fine tune your fields and queries

  34. Liferay 7: Enter Elasticsearch
  35. Why Elasticsearch?

      ● Best of breed
      ● Built for modern web applications
      ● Distributed and clusterable by design
      ● Lucene based
      ● Multi-tenancy
      ● Great vendor support
      ● Great monitoring tools: Marvel, Logstash

  36. Great for Developers

      ● Open Source
      ● Amazing documentation
      ● High "just works" factor, e.g. zero-config indexing and clustering
      ● REST for queries, health, admin - everything
      ● Update live settings programmatically
      ● Great Java Client API
      ● Pretty JSON for talks ;-)

  37. Clustering with Liferay and Elasticsearch: Production mode, Dev mode
  38. Scaling and tuning made easy
  39. Enterprise-level Search in Liferay 7 EE
  40. Security: Shield

      ● Protect your Liferay index with a username and password
      ● SSL/TLS encryption for traffic within the Liferay Elasticsearch cluster
      ● Elasticsearch plugin - no need for an external security solution
      ● Restrict access to Liferay Portal instances with IP filtering

  41. Monitoring: Marvel
  42. Visualization: Kibana
  43. Thanks and happy searching!

      http://j.mp/SearchLiferayDevcon2015
      andre.oliveira@liferay.com
      github.com/arboliveira
      @arbocombr