1. Google Is Just a Two Page Site
Relevant Results with Sitecore.ContentSearch
Martina Helene Welander
Technical Consulting Engineer, Sitecore
5. Speaker
• Technical Consulting Engineer at Sitecore
• Community and Information Enthusiast
• Ecosystem Sites with Dnepropetrovsk Team
• @mhwelander / mhwelander.net
Martina Helene Welander
26. Where Sitecore adds value
• Source content to index to strongly typed object – and back again!
• You can actually index anything
• Provider model – Solr, Lucene, Elasticsearch, Azure Search
• Provider-agnostic LINQ-based search API
• Highly configurable
37. Hello my name is Martina
“Hello”, “my”, “name”, “is”, “Martina”
38. Types of Tokenizer
StandardTokenizer
“My name is Martina” -> “My”, “name”, “is”, “Martina”
KeywordTokenizer
“My name is Martina” -> “My name is Martina”
N-Gram Tokenizer (Min 4, Max 5)
“sitecore” -> “site”, “itec”, “teco”, “ecor”, “core”, “sitec”, “iteco” … etc
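The three tokenizers above can be approximated in a few lines of plain C#. This is a minimal sketch for illustration only: the class and method names are hypothetical, the "standard" splitter is a rough stand-in that only splits on whitespace and basic punctuation, and none of this is the Lucene or Solr implementation.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper, not part of Lucene, Solr or Sitecore.
public static class TokenizerSketch
{
    // Rough stand-in for a standard tokenizer: split on whitespace and basic punctuation.
    public static List<string> Standard(string text) =>
        text.Split(new[] { ' ', '\t', '\n', ',', '.', '!', '?' },
                   StringSplitOptions.RemoveEmptyEntries).ToList();

    // Keyword tokenizer: the whole input becomes one token.
    public static List<string> Keyword(string text) =>
        new List<string> { text };

    // N-gram tokenizer: every substring whose length is between min and max.
    public static List<string> NGrams(string text, int min, int max)
    {
        var grams = new List<string>();
        for (int size = min; size <= max; size++)
        {
            for (int start = 0; start + size <= text.Length; start++)
            {
                grams.Add(text.Substring(start, size));
            }
        }
        return grams;
    }
}

public static class TokenizerDemo
{
    public static void Main()
    {
        Console.WriteLine(string.Join(" | ", TokenizerSketch.Standard("My name is Martina")));
        Console.WriteLine(string.Join(" | ", TokenizerSketch.Keyword("My name is Martina")));
        Console.WriteLine(string.Join(" | ", TokenizerSketch.NGrams("sitecore", 4, 5)));
    }
}
```

The n-gram case shows why partial-word search inflates the index: one eight-letter word already produces nine tokens at sizes 4 and 5.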
61. What makes something relevant? (tf.idf)
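• tf – term frequency: how often the term occurs in the document
• idf – inverse document frequency: how rare the term is across the index
• coord – number of query terms found in the document
• fieldNorm – field length normalisation: a match in a shorter field counts for more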
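To see how the four factors combine, here is a rough sketch modelled on my understanding of Lucene's classic TF-IDF similarity (tf as a square root, idf as 1 + ln(docs / (docFreq + 1)), field norm as 1 / sqrt(length)). Query normalisation and boosts are left out, the class and method names are hypothetical, and the real scorer stores norms with reduced precision, so treat the numbers as indicative only.

```csharp
using System;
using System.Linq;

// Hypothetical helper that mimics the shape of classic TF-IDF scoring.
public static class TfIdfSketch
{
    // tf: more occurrences help, with diminishing returns.
    public static double Tf(int termCountInField) => Math.Sqrt(termCountInField);

    // idf: terms found in few documents are worth more.
    public static double Idf(int totalDocs, int docsWithTerm) =>
        1.0 + Math.Log((double)totalDocs / (docsWithTerm + 1));

    // fieldNorm: a match in a short field counts for more than one in a long field.
    public static double FieldNorm(int fieldLength) => 1.0 / Math.Sqrt(fieldLength);

    // coord: fraction of the query terms that the document actually contains.
    public static double Coord(int matchedTerms, int queryTerms) =>
        (double)matchedTerms / queryTerms;

    // Each document is represented as the token array of a single field.
    public static double Score(string[] queryTerms, string[] fieldTokens, string[][] corpus)
    {
        double sum = 0;
        int matched = 0;
        foreach (var term in queryTerms)
        {
            int freq = fieldTokens.Count(t => t == term);
            if (freq == 0) continue;
            matched++;
            int docFreq = corpus.Count(doc => Array.IndexOf(doc, term) >= 0);
            double idf = Idf(corpus.Length, docFreq);
            sum += Tf(freq) * idf * idf * FieldNorm(fieldTokens.Length);
        }
        return Coord(matched, queryTerms.Length) * sum;
    }
}

public static class TfIdfDemo
{
    public static void Main()
    {
        var corpus = new[]
        {
            new[] { "scaling", "xdb" },
            new[] { "xdb", "overview", "and", "introduction" },
            new[] { "install", "guide" }
        };
        var query = new[] { "scaling", "xdb" };
        foreach (var doc in corpus)
        {
            Console.WriteLine(string.Join(" ", doc) + " => " + TfIdfSketch.Score(query, doc, corpus));
        }
    }
}
```

In this toy corpus the short document that contains both query terms outranks the longer one that contains only the more common term, which is the behaviour the four factors are meant to produce.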
67. #3 – I love you, PredicateBuilder
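using System;
using System.Linq.Expressions;
using Sitecore.ContentSearch.Linq.Utilities; // PredicateBuilder

// Seed with False when OR-ing: a True seed would make every document match
Expression<Func<ResultItem, bool>> predicate =
    PredicateBuilder.False<ResultItem>();
foreach (var word in list)
{
    predicate = predicate.Or(x =>
        x.Title.Contains(word));
}
Seed with False for ‘OR’,
True for ‘AND’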
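The "False for OR, True for AND" rule is easy to check outside Sitecore. The sketch below uses a small hand-rolled predicate helper (hypothetical, written for this example, not Sitecore's PredicateBuilder) and an in-memory list instead of a search index; ResultItem here is a stand-in class with only a Title.

```csharp
using System;
using System.Linq;
using System.Linq.Expressions;

// Stand-in for the indexed result type; only Title is needed for the demo.
public class ResultItem
{
    public string Title { get; set; }
}

// Hypothetical helper that mirrors the usual predicate-builder pattern.
public static class PredicateSketch
{
    public static Expression<Func<T, bool>> True<T>() => item => true;

    public static Expression<Func<T, bool>> False<T>() => item => false;

    // Combine two predicates with a short-circuiting OR over the same parameter.
    public static Expression<Func<T, bool>> Or<T>(
        this Expression<Func<T, bool>> left, Expression<Func<T, bool>> right)
    {
        var invoked = Expression.Invoke(right, left.Parameters.Cast<Expression>());
        return Expression.Lambda<Func<T, bool>>(
            Expression.OrElse(left.Body, invoked), left.Parameters);
    }

    // OR one Title.Contains clause per word onto the given seed, then count the matches.
    public static int CountMatches(
        Expression<Func<ResultItem, bool>> seed, string[] words, ResultItem[] items)
    {
        var predicate = seed;
        foreach (var word in words)
        {
            predicate = predicate.Or(x => x.Title.Contains(word));
        }
        return items.Count(predicate.Compile());
    }
}

public static class PredicateDemo
{
    public static void Main()
    {
        var items = new[]
        {
            new ResultItem { Title = "Scaling xDB" },
            new ResultItem { Title = "Install guide" },
            new ResultItem { Title = "Search tips" }
        };
        var words = new[] { "xDB", "Search" };

        Console.WriteLine("False seed: " +
            PredicateSketch.CountMatches(PredicateSketch.False<ResultItem>(), words, items));
        Console.WriteLine("True seed: " +
            PredicateSketch.CountMatches(PredicateSketch.True<ResultItem>(), words, items));
    }
}
```

With the False seed only the two titles containing a search word match; with the True seed all three items match, because "true OR anything" is always true.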
68. #4 – Boost
• At query time
• At index time (type or field)
• Rules-based
74. If the title…
• Like phrase (with slop)
• Contains phrase
• Starts with phrase
• Equals phrase
If the content…
• Like phrase (with slop)
• Contains phrase
• Starts with phrase
• Equals phrase
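One way to picture these tiers is as a ladder of increasingly strict checks, each adding its own boost when it matches. The sketch below works on plain strings rather than an index; the boost weights, the names and the slop handling are all assumptions made for illustration (real sloppy phrase matching in Lucene also tolerates reordered terms, which this simplified version does not).

```csharp
using System;
using System.Linq;

// Hypothetical scoring ladder; the weights are made up for the example.
public static class MatchBoostSketch
{
    const float LikeBoost = 1f, ContainsBoost = 2f, StartsWithBoost = 3f, EqualsBoost = 5f;

    static string[] Tokens(string text) =>
        text.ToLowerInvariant().Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);

    // Simplified "like phrase": the phrase words appear in order,
    // with at most `slop` other tokens in between.
    public static bool LikePhrase(string field, string phrase, int slop)
    {
        var f = Tokens(field);
        var p = Tokens(phrase);
        if (p.Length == 0) return false;
        for (int start = 0; start < f.Length; start++)
        {
            if (f[start] != p[0]) continue;
            int last = start, matched = 1;
            for (int i = start + 1; i < f.Length && matched < p.Length; i++)
            {
                if (f[i] == p[matched]) { last = i; matched++; }
            }
            int gaps = (last - start + 1) - p.Length;
            if (matched == p.Length && gaps <= slop) return true;
        }
        return false;
    }

    public static bool ContainsPhrase(string field, string phrase) => LikePhrase(field, phrase, 0);

    public static bool StartsWithPhrase(string field, string phrase)
    {
        var p = Tokens(phrase);
        return p.Length > 0 && Tokens(field).Take(p.Length).SequenceEqual(p);
    }

    public static bool EqualsPhrase(string field, string phrase) =>
        Tokens(field).SequenceEqual(Tokens(phrase));

    // Every tier that matches adds its boost, so stricter matches accumulate more.
    public static float Score(string field, string phrase, int slop = 1)
    {
        float score = 0f;
        if (LikePhrase(field, phrase, slop)) score += LikeBoost;
        if (ContainsPhrase(field, phrase)) score += ContainsBoost;
        if (StartsWithPhrase(field, phrase)) score += StartsWithBoost;
        if (EqualsPhrase(field, phrase)) score += EqualsBoost;
        return score;
    }
}

public static class MatchBoostDemo
{
    public static void Main()
    {
        var titles = new[]
        {
            "Scaling xDB", "Scaling xDB options", "Guide to scaling xDB",
            "Scaling your xDB", "Install guide"
        };
        foreach (var title in titles)
        {
            Console.WriteLine(title + " => " + MatchBoostSketch.Score(title, "scaling xdb"));
        }
    }
}
```

The same ladder can be applied to the content field with lower weights, so that a title match of a given strictness outranks the equivalent content match.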
86. Sitecore 7 ContentSearch Tips
“Finding a user’s search term in the title or keywords of a document is probably more relevant than one where the term is only in the body”
- Matt Burke
98. // TODO: On the plane home
• Keywords
• Location
• Pinning exact title matches – “scaling”
• Expected search phrases with boost – e.g. “scaling xDB”, “xDB scaling”, “xDB scaling options”
• Key Behaviour Cache – developer or editor?
• Common searches
99. It’s not all queries and indexes
• Vague titles are a bit of a nightmare
• Review use of keywords in content
• “I would never search for that!”
• Continuous user testing and tuning
100. What I learned
• It isn’t magic
• Get to know the provider
• Content and content structure matter
• Search is actually quite hard