#LSNA17
SEO                      | Relevance
Pages                    | Liferay assets
Whole text is indexed    | Key/value docs are indexed
Opaque ranking criteria  | Scored queries, filters, field types
Reverse engineer         | Fine tune
Third-party algorithms   | Search engine that you control
#LSNA17
GET /_search?explain
{
  "query" : {
    "term" : { "tag" : "LSNA17" }
  }
}

GET /index/type/0/_explain?q=user_id:2

"value" : 2.7051764,
"description" : "score(doc=0,freq=1.0), product of:",
"details" : [ {
  "value" : 0.66422296,
  "description" : "queryWeight, product of:",
  "details" : [ {
    "value" : 4.0726933,
    "description" : "idf(docFreq=4, maxDocs=108)"
  }, {
    "value" : 0.16309182,
    "description" : "queryNorm"
  } ]
}, {
  "value" : 4.0726933,
  "description" : "fieldWeight in 0, product of:",
  "details" : [ {
    "value" : 1.0,
    "description" : "tf(freq=1.0), with freq of:",
    "details" : [ {
      "value" : 1.0,
      "description" : "termFreq=1.0"
    } ]
  }, {
    "value" : 4.0726933,
    "description" : "idf(docFreq=4, maxDocs=108)"
  }, {
    "value" : 1.0,
    "description" : "fieldNorm(doc=0)"
  } ]
} ]

"failure to match filter: cache(user_id:[2 TO 2])"
#LSNA17
query = apple eclipse zzz yyy xxx qqq kkk ttt rrr
2.345 doc1: apple banana
2.345 doc2: eclipse moon sun
16.415 doc3: zzz yyy xxx qqq kkk ttt rrr 111
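The three scores above are consistent with each matching term adding the same amount to a document's score: doc1 and doc2 match one query term each (2.345), while doc3 matches seven of the nine (7 × 2.345 = 16.415). A minimal sketch of that arithmetic follows. It assumes a flat per-term score, which real TF/IDF scoring would not give, and the class and method names are hypothetical.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

final class ClauseSumSketch {

    // Assumption for illustration only: every matching term contributes the same score.
    static final double PER_TERM_SCORE = 2.345;

    // Counts how many whitespace-separated query terms occur in the document.
    static int matchingTerms(String query, String doc) {
        Set<String> docTerms = new HashSet<>(Arrays.asList(doc.split("\\s+")));
        int matches = 0;
        for (String term : query.split("\\s+")) {
            if (docTerms.contains(term)) {
                matches++;
            }
        }
        return matches;
    }

    // The document score is the sum of the per-term contributions.
    static double score(String query, String doc) {
        return matchingTerms(query, doc) * PER_TERM_SCORE;
    }

    public static void main(String[] args) {
        String query = "apple eclipse zzz yyy xxx qqq kkk ttt rrr";
        String[] docs = {
            "apple banana",
            "eclipse moon sun",
            "zzz yyy xxx qqq kkk ttt rrr 111"
        };
        for (String doc : docs) {
            System.out.println(String.format(Locale.ROOT, "%.3f  %s", score(query, doc), doc));
        }
    }
}
```

The point of the sketch is that a long query full of noise terms can push a document to the top simply because it matches more clauses.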
#LSNA17
(Term Frequency/Inverse Document Frequency)

Factor                     | In question form...                          | Score increases...
Term frequency             | How often does the term appear in the field? | + When the term appears many times in the text
Inverse document frequency | How rare is the term in the whole index?     | + When the term is found in this document and not many others
Field-length norm          | How short is the field where the term is?    | + When there isn't much else in the same field (like a title)
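The three factors can be sketched with the classic Lucene TF/IDF formulas: tf = sqrt(freq), idf = 1 + ln(maxDocs / (docFreq + 1)), and a field-length norm of 1 / sqrt(number of terms in the field). The sketch below plugs in the numbers from the explain output earlier in the deck (docFreq=4, maxDocs=108, queryNorm=0.16309182, fieldNorm=1.0). The class and method names are hypothetical, and Lucene stores the field norm in a lossy encoded form, so real values can differ slightly.

```java
final class TfIdfSketch {

    // Term frequency: grows with the square root of the number of occurrences.
    static double tf(double freq) {
        return Math.sqrt(freq);
    }

    // Inverse document frequency: the fewer documents contain the term, the higher the value.
    static double idf(long docFreq, long maxDocs) {
        return 1.0 + Math.log((double) maxDocs / (docFreq + 1));
    }

    // Field-length norm: the fewer terms in the field, the higher the value.
    static double fieldLengthNorm(int termsInField) {
        return 1.0 / Math.sqrt(termsInField);
    }

    // score = queryWeight * fieldWeight, following the structure of the explain output.
    static double score(double freq, long docFreq, long maxDocs, double queryNorm, double fieldNorm) {
        double idf = idf(docFreq, maxDocs);
        double queryWeight = idf * queryNorm;
        double fieldWeight = tf(freq) * idf * fieldNorm;
        return queryWeight * fieldWeight;
    }

    public static void main(String[] args) {
        System.out.println("idf   = " + idf(4, 108));
        System.out.println("score = " + score(1.0, 4, 108, 0.16309182, 1.0));
    }
}
```

With these inputs the computed idf and score agree with the explain values (4.0726933 and 2.7051764) to within rounding.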
#LSNA17
{ "must" : { "bool" : { "should" : [ { "match" : { "content_en_US" : { "query" : "pigeon", "type" : "boolean" } } }, { "match" : {
"content_en_US" : { "query" : "pigeon", "type" : "phrase_prefix" } } } ] } }, "should" : { "match" : { "content_en_US" : { "query" : "pigeon",
"type" : "phrase", "boost" : 2.0 } } } } }, { "bool" : { "must" : { "bool" : { "should" : [ { "match" : { "description_en_US" : { "query" : "pigeon",
"type" : "boolean" } } }, { "match" : { "description_en_US" : { "query" : "pigeon", "type" : "phrase_prefix" } } } ] } }, "should" : { "match" : {
"description_en_US" : { "query" : "pigeon", "type" : "phrase", "boost" : 2.0 } } } } }, { "bool" : { "must" : { "bool" : { "should" : [ { "match" : {
"entryClassPK" : { "query" : "pigeon", "type" : "boolean" } } }, { "match" : { "entryClassPK" : { "query" : "pigeon", "type" : "phrase_prefix"
} } } ] } }, "should" : { "match" : { "entryClassPK" : { "query" : "pigeon", "type" : "phrase", "boost" : 2.0 } } } } }, { "bool" : { "must" : { "bool" : {
"should" : [ { "match" : { "title_en_US" : { "query" : "pigeon", "type" : "boolean" } } }, { "match" : { "title_en_US" : { "query" : "pigeon",
"type" : "phrase_prefix" } } } ] } }, "should" : { "match" : { "title_en_US" : { "query" : "pigeon", "type" : "phrase", "boost" : 2.0 } } } } }
#LSNA17
● → FacetedSearcher →
● Indexer
● fields
● score
{ "match" : { "content_en_US" : { "query" : "pigeon", "type" : "phrase_prefix" } } }
{ "match" : { "description_en_US" : { "query" : "pigeon", "type" : "phrase", "boost" : 2.0 } } }
{ "match" : { "entryClassPK" : { "query" : "pigeon", "type" : "boolean" } } }
#LSNA17
Natural language?
string: text
● TF/IDF
● case insensitive
Score!

IDs and serials?
string: keyword
● not_analyzed
● case sensitive
● match | no match
No score!

Non-string data?
integer, date, geo_point...
● match | no match
No score!

(... "no score" is really a constant score of 1)
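The difference between a text field and a keyword field can be sketched as follows. This is a simplified illustration, not how Elasticsearch implements analysis: the `analyze` method is a hypothetical stand-in for an analyzer that only lowercases and splits on non-alphanumeric characters, and the class name is made up.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

final class FieldTypeSketch {

    // Hypothetical stand-in for an analyzer: lowercase, then split on non-alphanumerics.
    static List<String> analyze(String value) {
        List<String> tokens = new ArrayList<>();
        for (String token : value.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (!token.isEmpty()) {
                tokens.add(token);
            }
        }
        return tokens;
    }

    // Text field: the query is analyzed too, and any shared token is a match.
    static boolean matchesText(String fieldValue, String query) {
        List<String> fieldTokens = analyze(fieldValue);
        for (String token : analyze(query)) {
            if (fieldTokens.contains(token)) {
                return true;
            }
        }
        return false;
    }

    // Keyword field: the whole value is compared exactly, so case matters.
    static boolean matchesKeyword(String fieldValue, String query) {
        return fieldValue.equals(query);
    }

    public static void main(String[] args) {
        System.out.println(analyze("Liferay Symposium North America"));
        System.out.println(matchesText("Liferay Symposium North America", "symposium"));
        System.out.println(matchesKeyword("ABC-123", "abc-123"));
    }
}
```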
#LSNA17
// IndexSettingsContributor
typeMappingsHelper.addTypeMappings(indexName, myCustomFieldMappings);

liferay-type-mappings.json
"content": {
    "index": "analyzed",
    "type": "string"
},
"organizationId": {
    "index": "not_analyzed",
    "type": "string"
},
"publishDate": {
    "format": "yyyyMMddHHmmss",
    "type": "date"
}
#LSNA17
• Analyzed human searches
• query types
• combinations
• best relevance
Favor text fields over keyword fields.
#LSNA17
"*ubstrin*"
• lowercase
• * → "full scan" ↓↓↓
• don't score
#LSNA17
1. Full-text search
2. Prefix
3. n-grams
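The n-gram option can be sketched like this: the indexed term is cut into overlapping fragments of length n at index time, so a substring search becomes ordinary term lookups instead of a wildcard scan. The class and method names are hypothetical, and the check is deliberately named `mayContain` because sharing all n-grams does not strictly prove a contiguous substring match.

```java
import java.util.ArrayList;
import java.util.List;

final class NGramSketch {

    // Produces all overlapping character n-grams of the term, in order.
    static List<String> ngrams(String term, int n) {
        List<String> grams = new ArrayList<>();
        for (int i = 0; i + n <= term.length(); i++) {
            grams.add(term.substring(i, i + n));
        }
        return grams;
    }

    // True when every n-gram of the query is among the n-grams of the indexed term.
    // A query shorter than n has no n-grams and therefore trivially passes.
    static boolean mayContain(String indexedTerm, String query, int n) {
        return ngrams(indexedTerm, n).containsAll(ngrams(query, n));
    }

    public static void main(String[] args) {
        System.out.println(ngrams("substring", 3));
        System.out.println(mayContain("substring", "ubstrin", 3));
    }
}
```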
#LSNA17
• Match →
• Prefix →
• Phrase →
Know your field, use the right queries.
#LSNA17
Write a field-specific query builder

@Component(service = FieldQueryBuilder.class, immediate = true)
public class MyFieldQueryBuilder implements FieldQueryBuilder {

    public Query build(String field, String keywords) {
        // Fine-tune the right queries for your field
        myBooleanQuery.add(q1, MUST);
        myBooleanQuery.add(q2, SHOULD);
        ...
    }
}
#LSNA17
多言語検索 (Multilingual search)
• Map
• suffix →
• "b" "a" "d"
• Stemming, stopwords
(https://www.elastic.co/guide/en/elasticsearch/guide/current/using-language-analyzers.html)
Pick the right language analyzer.
#LSNA17
document.addText("myField_ja_JP", japanese);
document.addText("myField_en_US", english);
Locale defaultLocale = portal.getSiteDefaultLocale(groupId);
document.addText(getLocalizedName("myField", defaultLocale), translation);
addSearchLocalizedTerm(searchQuery, searchContext, "myField");
searchContext.setLocale(themeDisplay.getLocale());
liferay-type-mappings.json
"template_ja": {
    "mapping": {
        "analyzer": "kuromoji"
    },
    "match": "\\w+_ja_[A-Z]{2}\\b"
}
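How the localized field name and the template pattern fit together can be sketched as below. It assumes the template's `match` value is the regular expression `\w+_ja_[A-Z]{2}\b` (written with doubled backslashes in JSON) and that it is tested against the whole field name. `localizedName` is a hypothetical helper that mimics the field + "_" + locale naming convention used on the slide; it is not the Liferay API.

```java
import java.util.Locale;
import java.util.regex.Pattern;

final class LocalizedFieldSketch {

    // Assumed regular expression behind the "template_ja" mapping.
    static final Pattern JA_TEMPLATE = Pattern.compile("\\w+_ja_[A-Z]{2}\\b");

    // Hypothetical helper: appends the locale (for example ja_JP) to the base field name.
    static String localizedName(String field, Locale locale) {
        return field + "_" + locale.toString();
    }

    // True when the whole field name matches the Japanese template pattern.
    static boolean usesJapaneseAnalyzer(String fieldName) {
        return JA_TEMPLATE.matcher(fieldName).matches();
    }

    public static void main(String[] args) {
        String japaneseField = localizedName("myField", Locale.JAPAN);
        String englishField = localizedName("myField", Locale.US);
        System.out.println(japaneseField + " -> " + usesJapaneseAnalyzer(japaneseField));
        System.out.println(englishField + " -> " + usesJapaneseAnalyzer(englishField));
    }
}
```

Only the field indexed under the Japanese locale suffix picks up the kuromoji analyzer; the English field falls through to whatever other template matches it.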
#LSNA17
• description, content
• title, title_en_US
• content
2x matching query clauses = inflated relevance.
Match once and only once.
#LSNA17
If already indexing once...
document.addText(getLocalizedName("myField", languageId), translation);
… no need to index twice...
// DON'T //// document.addText("myField", content);
… match once and only once.
addSearchLocalizedTerm(searchQuery, searchContext, "myField");
// DON'T //// addSearchTerm(searchQuery, searchContext, "myField");
#LSNA17
• docs
• value
• display
• highlight
Index for rendering, render from doc.
#LSNA17
analyzed
✔
✗
[30] Liferay
[15] DXP
[15] Symposium
#LSNA17
not_analyzed
✔
✗
[15] Liferay DXP
[15] Liferay Symposium
#LSNA17
• Aggregate not_analyzed
– [15] Liferay DXP
– [15] Liferay Symposium
• Match analyzed
2 fields, 1 analyzed, 1 not_analyzed.
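The facet counts on the previous slides can be reproduced with a small sketch. It assumes 15 documents valued "Liferay DXP" and 15 valued "Liferay Symposium", which is what the bucket counts suggest. The tokenizer here only splits on whitespace and keeps the original case to mirror the slide, whereas a real analyzer would typically lowercase as well; class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class FacetSketch {

    // Buckets on whole values, as aggregating a not_analyzed field would.
    static Map<String, Integer> countWholeValues(List<String> values) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String value : values) {
            counts.merge(value, 1, Integer::sum);
        }
        return counts;
    }

    // Buckets on individual tokens, as aggregating an analyzed field would (simplified).
    static Map<String, Integer> countTokens(List<String> values) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String value : values) {
            for (String token : value.split("\\s+")) {
                counts.merge(token, 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        List<String> values = new ArrayList<>();
        values.addAll(Collections.nCopies(15, "Liferay DXP"));
        values.addAll(Collections.nCopies(15, "Liferay Symposium"));
        System.out.println("analyzed:     " + countTokens(values));
        System.out.println("not_analyzed: " + countWholeValues(values));
    }
}
```

Token buckets are useless as facet labels, while whole-value buckets cannot be searched word by word, which is why the same content ends up in two fields.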
#LSNA17
Search on the text field
new MatchQuery("myfield", keywords);
Aggregate on the keyword field
myFacet.setFieldName("myfield.raw");
#LSNA17
• multifields
(https://www.elastic.co/guide/en/elasticsearch/guide/current/aggregations-and-analysis.html)
• Copy Fields
(https://wiki.apache.org/solr/SchemaXml#Copy_Fields)
• analyzed
• not_analyzed
#LSNA17
• elasticsearch-head
• Solr Admin
• query string
• explain
Tweak clauses, re-run query, repeat.
#LSNA17
Thank you!
And lots of relevant content at #LSNA17
Liferay Search: Best Practices to Dramatically Improve Relevance - Liferay Symposium North America 2017, Austin, USA

  • 3. #LSNA17 SEO Relevance Pages Liferay assets Whole text is indexed Key/value docs are indexed Opaque ranking criteria Scored queries, filters, field types Reverse engineer Fine tune Third party algorithms Search engine that you control
  • 4. #LSNA17 GET /_search?explain { "query" : { "term" : { "tag" : "LSNA17" } } } GET /index/type/0/ _explain?q=user_id:2 "value" : 2.7051764, "description" : "score(doc=0,freq=1.0), product of:", "details" : [ { "value" : 0.66422296, "description" : "queryWeight, product of:", "details" : [ { "value" : 4.0726933, "description" : "idf(docFreq=4, maxDocs=108)" }, { "value" : 0.16309182, "description" : "queryNorm" } ] }, { "value" : 4.0726933, "description" : "fieldWeight in 0, product of:", "details" : [ { "value" : 1.0, "description" : "tf(freq=1.0), with freq of:", "details" : [ { "value" : 1.0, "description" : "termFreq=1.0" } ] }, { "value" : 4.0726933, "description" : "idf(docFreq=4, maxDocs=108)" }, { "value" : 1.0, "description" : "fieldNorm(doc=0)" "failure to match filter: cache(user_id:[2 TO 2])"
  • 5. query = apple eclipse zzz yyy xxx qqq kkk ttt rrr
       2.345  doc1: apple banana
       2.345  doc2: eclipse moon sun
      16.415  doc3: zzz yyy xxx qqq kkk ttt rrr 111
  • 6. TF/IDF (Term Frequency/Inverse Document Frequency)
      • Term frequency. In question form: how often does the term appear in the field? Score increases when the term appears many times in the text.
      • Inverse document frequency. In question form: how rare is the term in the whole index? Score increases when the term is found in this document and not in many others.
      • Field-length norm. In question form: how short is the field that contains the term? Score increases when there isn't much else in the same field (a title, for example).
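The three factors above can be tied back to the numbers in the explain output on slide 4. The following is a minimal sketch, assuming Lucene's classic TF/IDF similarity (tf as the square root of the term count, idf as 1 + ln(maxDocs / (docFreq + 1))); the class and method names are made up for illustration and are not part of Liferay or Elasticsearch. With docFreq=4, maxDocs=108, fieldNorm=1.0 and the queryNorm shown on the slide, the result should land close to the reported idf of 4.0726933 and score of 2.7051764.

```java
public class TfIdfSketch {

    // Term frequency factor: square root of the raw term count
    // (assumption: classic Lucene TF/IDF similarity).
    static double tf(double freq) {
        return Math.sqrt(freq);
    }

    // Inverse document frequency: rarer terms get a larger value.
    static double idf(long docFreq, long maxDocs) {
        return 1.0 + Math.log((double) maxDocs / (docFreq + 1));
    }

    // Mirrors the explain tree: score = queryWeight * fieldWeight.
    static double score(double freq, long docFreq, long maxDocs,
                        double fieldNorm, double queryNorm) {
        double idf = idf(docFreq, maxDocs);
        double queryWeight = idf * queryNorm;            // "queryWeight, product of:"
        double fieldWeight = tf(freq) * idf * fieldNorm; // "fieldWeight in 0, product of:"
        return queryWeight * fieldWeight;
    }

    public static void main(String[] args) {
        // Values taken from the explain output on slide 4.
        System.out.println("idf   = " + idf(4, 108));
        System.out.println("score = " + score(1.0, 4, 108, 1.0, 0.16309182));
    }
}
```

Because idf enters both queryWeight and fieldWeight, a rare term counts twice over, which is one way to read the doc3 score on slide 5.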
  • 7.
      { "must" : { "bool" : { "should" : [
            { "match" : { "content_en_US" : { "query" : "pigeon", "type" : "boolean" } } },
            { "match" : { "content_en_US" : { "query" : "pigeon", "type" : "phrase_prefix" } } } ] } },
        "should" : { "match" : { "content_en_US" : { "query" : "pigeon", "type" : "phrase", "boost" : 2.0 } } } } },
      { "bool" : {
        "must" : { "bool" : { "should" : [
            { "match" : { "description_en_US" : { "query" : "pigeon", "type" : "boolean" } } },
            { "match" : { "description_en_US" : { "query" : "pigeon", "type" : "phrase_prefix" } } } ] } },
        "should" : { "match" : { "description_en_US" : { "query" : "pigeon", "type" : "phrase", "boost" : 2.0 } } } } },
      { "bool" : {
        "must" : { "bool" : { "should" : [
            { "match" : { "entryClassPK" : { "query" : "pigeon", "type" : "boolean" } } },
            { "match" : { "entryClassPK" : { "query" : "pigeon", "type" : "phrase_prefix" } } } ] } },
        "should" : { "match" : { "entryClassPK" : { "query" : "pigeon", "type" : "phrase", "boost" : 2.0 } } } } },
      { "bool" : {
        "must" : { "bool" : { "should" : [
            { "match" : { "title_en_US" : { "query" : "pigeon", "type" : "boolean" } } },
            { "match" : { "title_en_US" : { "query" : "pigeon", "type" : "phrase_prefix" } } } ] } },
        "should" : { "match" : { "title_en_US" : { "query" : "pigeon", "type" : "phrase", "boost" : 2.0 } } } } }
  • 8. ● → FacetedSearcher →
      ● Indexer
      ● fields
      ● score
      { "match" : { "content_en_US" : { "query" : "pigeon", "type" : "phrase_prefix" } } }
      { "match" : { "description_en_US" : { "query" : "pigeon", "type" : "phrase", "boost" : 2.0 } } }
      { "match" : { "entryClassPK" : { "query" : "pigeon", "type" : "boolean" } } }
  • 9. Natural language? string: text
        ● TF/IDF
        ● case insensitive
        Score!
      IDs and Serials? string: keyword
        ● not_analyzed
        ● case sensitive
        ● match | no match
        No score!
      Non string data? integer, date, geo_point...
        ● match | no match
        No score!
      (... "no score" really a const = 1)
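The practical difference between a text field and a keyword field can be shown without a search engine. This is a toy sketch, not how Elasticsearch or Liferay implement analysis: the `analyze` method is a stand-in for a real analyzer (it only lowercases and splits on non-alphanumeric characters), and all names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

public class FieldTypeSketch {

    // Toy analyzer (assumption): lowercase, then split on anything
    // that is not a letter or a digit.
    static List<String> analyze(String value) {
        List<String> tokens = new ArrayList<>();
        for (String token : value.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (!token.isEmpty()) {
                tokens.add(token);
            }
        }
        return tokens;
    }

    // Text field: both sides are analyzed, so matching is case insensitive
    // and works on individual words.
    static boolean matchesText(String indexedValue, String query) {
        List<String> indexedTokens = analyze(indexedValue);
        for (String token : analyze(query)) {
            if (indexedTokens.contains(token)) {
                return true;
            }
        }
        return false;
    }

    // Keyword field: the value is stored as one untouched term, so only an
    // exact, case-sensitive match succeeds.
    static boolean matchesKeyword(String indexedValue, String query) {
        return indexedValue.equals(query);
    }
}
```

For instance, `matchesText("Liferay Symposium", "symposium")` is true while `matchesKeyword("Liferay Symposium", "symposium")` is false, which is why IDs and serials belong in keyword fields and human-written text does not.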
  • 10. // IndexSettingsContributor
       typeMappingsHelper.addTypeMappings(indexName, myCustomFieldMappings);

       liferay-type-mappings.json
       "content": {
         "index": "analyzed",
         "type": "string"
       },
       "organizationId": {
         "index": "not_analyzed",
         "type": "string"
       },
       "publishDate": {
         "format": "yyyyMMddHHmmss",
         "type": "date"
       }
  • 11. • Analyzed human searches
       • query types
       • combinations
       • best relevance
       Favor text fields over keyword fields.
  • 12. "*ubstrin*"
       • lowercase
       • * → "full scan" ↓↓↓
       • don't score
  • 13.
       1. full text search
       2. Prefix
       3. n-grams
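To see why n-grams are an alternative to a `"*ubstrin*"` wildcard, it helps to look at what gets indexed. The sketch below is a simplified illustration, not Elasticsearch's ngram tokenizer: it lowercases a token and emits every character n-gram between two assumed sizes. Once those grams are stored as terms, a substring query becomes an ordinary term lookup instead of a scan. The class name and the gram sizes are assumptions.

```java
import java.util.LinkedHashSet;
import java.util.Locale;
import java.util.Set;

public class NGramSketch {

    // Emits all character n-grams of the lowercased token whose length is
    // between minGram and maxGram (inclusive), in order of first appearance.
    static Set<String> ngrams(String token, int minGram, int maxGram) {
        String lower = token.toLowerCase(Locale.ROOT);
        Set<String> grams = new LinkedHashSet<>();
        for (int n = minGram; n <= maxGram; n++) {
            for (int start = 0; start + n <= lower.length(); start++) {
                grams.add(lower.substring(start, start + n));
            }
        }
        return grams;
    }

    public static void main(String[] args) {
        // "ubstrin" is among the 7-grams of "Substring", so it can be found
        // as an exact indexed term.
        System.out.println(ngrams("Substring", 3, 7).contains("ubstrin"));
    }
}
```

The trade-off is index size: every token produces many grams, so the gram range is usually kept narrow.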
  • 14. • Match →
       • Prefix →
       • Phrase →
       Know your field, use the right queries.
  • 15. Write a field specific query builder
       @Component(service = FieldQueryBuilder.class, immediate = true)
       public class MyFieldQueryBuilder implements FieldQueryBuilder {
           public Query build(String field, String keywords) {

       Fine tune the right queries for your field
       myBooleanQuery.add(q1, MUST);
       myBooleanQuery.add(q2, SHOULD);
       ...
  • 16. 多言語検索 (Japanese for "multilingual search")
       • Map
       • suffix →
       • "b" "a" "d"
       • Stemming, stopwords (https://www.elastic.co/guide/en/elasticsearch/guide/current/using-language-analyzers.html)
       Pick the right language analyzer.
  • 17. document.addText("myField_ja_JP", japanese);
       document.addText("myField_en_US", english);

       Locale defaultLocale = portal.getSiteDefaultLocale(groupId);
       document.addText(getLocalizedName("myField", defaultLocale), translation);

       addSearchLocalizedTerm(searchQuery, searchContext, "myField");

       searchContext.setLocale(themeDisplay.getLocale());

       liferay-type-mappings.json
       "template_ja": {
         "mapping": {
           "analyzer": "kuromoji"
         },
         "match": "\\w+_ja_[A-Z]{2}\\b"
       }
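The naming convention is what connects the indexing code to the mapping template: a field named with a `_ja_XX` suffix is picked up by the Japanese template and analyzed with kuromoji. The sketch below illustrates that link under two assumptions: that the template's match pattern is the regular expression `\w+_ja_[A-Z]{2}\b` (the transcript appears to have lost its backslashes), and that the pattern must match the whole field name. `localizedName` is a hypothetical stand-in for the `getLocalizedName` call on the slide, not Liferay's implementation.

```java
import java.util.Locale;
import java.util.regex.Pattern;

public class LocalizedFieldSketch {

    // Assumed reconstruction of the "template_ja" match pattern.
    static final Pattern JAPANESE_FIELD = Pattern.compile("\\w+_ja_[A-Z]{2}\\b");

    // Hypothetical helper: appends the language and country to the field
    // name, e.g. "myField" + Locale.JAPAN -> "myField_ja_JP".
    static String localizedName(String field, Locale locale) {
        return field + "_" + locale.getLanguage() + "_" + locale.getCountry();
    }

    // True when the field name would fall under the Japanese template.
    static boolean usesJapaneseAnalyzer(String fieldName) {
        return JAPANESE_FIELD.matcher(fieldName).matches();
    }

    public static void main(String[] args) {
        String japaneseField = localizedName("myField", Locale.JAPAN);
        String englishField = localizedName("myField", Locale.US);
        System.out.println(japaneseField + " -> " + usesJapaneseAnalyzer(japaneseField));
        System.out.println(englishField + " -> " + usesJapaneseAnalyzer(englishField));
    }
}
```

A field indexed under the plain name `myField` matches no language template, so it would be analyzed with whatever the default mapping provides.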
  • 18. • description, content
       • title, title_en_US
       • content
       2x matching query clauses = inflated relevance.
       Match once and only once.
  • 19. If already indexing once...
       document.addText(getLocalizedName("myField", languageId), translation);
       … no need to index twice...
       // DON'T
       //// document.addText("myField", content);
       … match once and only once.
       addSearchLocalizedTerm(searchQuery, searchContext, "myField");
       // DON'T
       //// addSearchTerm(searchQuery, searchContext, "myField");
  • 20. • docs
       • value
       • display
       • highlight
       Index for rendering, render from doc.
  • 23. • Aggregate not_analyzed
         – [15] Liferay DXP
         – [15] Liferay Symposium
       • Match analyzed
         – 2 fields, 1 analyzed, 1 not_analyzed.
  • 24. Search on the text field
       new MatchQuery("myfield", keywords);
       Aggregate on the keyword field
       myFacet.setFieldName("myfield.raw");
  • 25. • multifields (https://www.elastic.co/guide/en/elasticsearch/guide/current/aggregations-and-analysis.html)
       • Copy Fields (https://wiki.apache.org/solr/SchemaXml#Copy_Fields)
       • analyzed
       • not_analyzed
  • 26. • elasticsearch-head
       • Solr Admin
       • query string
       • explain
       Tweak clauses, re-run query, repeat.
  • 30. Thank you! And lots of relevant content at #LSNA17