What we have learned about SharePoint 2013 and Enterprise Search
Petter Skodvin-Hvammen, Tallak Hellebust
Agenda
• How to run a successful search project
• Architecture and infrastructure learnings
• User experience and search customizations
• How can you crawl thousands of file shares
• Discover associations and enrich indexed content
• What about search relevancy?
HOW TO RUN A SUCCESSFUL PROJECT
Sprint 0 – goal
Best Solution: Business Goals, Technology, User Needs
Sprint 0 – process
Analysis
• User interviews
• Stakeholder interviews
• Search logs
• Existing work and documentation
Technology Assessment
• Sources
• Information model
• Technology components
• Architecture
• Scaling
Concept Development
• Problem solving
• Information modus
• Mockups
• Clickable concept demo
• Best practices
• Concept testing
Enterprise Strategy
• Information marketplace
• Achieving business goals
Final Report
• Presentations
• Recommendations
• Project plan
• Quick wins
How to run a successful search project
• Sprint 0
• Planning
• Development
• Testing
• Demo
• Deployment
One sprint ahead
• Let the UX work be one sprint ahead of the technical team
• Produce a clickable prototype each sprint
• The prototype is a visual representation of the product backlog
• The technical team implements the prototype in the next sprint
Timeline (technical sprint / UX work running in parallel):
• UX (Sprint 0)
• Sprint 1 / UX (Sprint 2)
• Sprint 2 / UX (Sprint 3)
• Sprint 3 / UX (Sprint 4)
• Sprint n / UX (Sprint n+1)
Infrastructure Needs
Is Microsoft moving into the server hardware business?
[Diagram: search farm topology. Component labels shown: Query x2, WFE x2, FRONT x2, Index-0 to Index-3 (each shown twice), Doc Proc x10, Enrichment x10, Crawling x2, Analytics x2, Admin x2, Central Admin x1]
40 million documents
10 queries/second
SQL Server (x2)
• Admin DB
• Analytics DB
• Crawl DB
• Link DB
• Other SP DBs
Infrastructure Investments
What                Spec             Count  Total
SharePoint Server   Virtual Machine  12     12 VMs
CPU                 8 cores          12     96 cores
Memory              16 GB            12     192 GB
System Disk         150 GB           12     1.8 TB
Data Disk           450 GB           12     5.4 TB
Disk IO             200 (Indexer)    10     2 000 IOPS
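The totals in the table are simple per-VM figures multiplied by the VM count. A minimal sketch of that arithmetic follows; the dictionary keys and function name are illustrative only, and the disk IO row is assumed to apply to the 10 VMs the table counts for it.

```python
# Per-VM specification taken from the table; key names are illustrative.
VM_COUNT = 12
PER_VM = {"cpu_cores": 8, "memory_gb": 16, "system_disk_gb": 150, "data_disk_gb": 450}


def farm_totals(per_vm, count):
    """Multiply each per-VM figure by the number of VMs."""
    return {name: value * count for name, value in per_vm.items()}


totals = farm_totals(PER_VM, VM_COUNT)

# The disk IO row uses a count of 10 (assumed: only the VMs marked "Indexer").
indexer_iops = 200 * 10

print(totals)
print(f"{totals['system_disk_gb'] / 1000:.1f} TB system disk, "
      f"{totals['data_disk_gb'] / 1000:.1f} TB data disk, {indexer_iops} IOPS")
```

Keeping the per-VM figures in one place makes it easy to re-run the numbers when the VM count or disk size changes.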
• Physical Servers
• Database Servers
• Load Balancer
• SAN or local disk arrays
• Domain Controller
• Other networking
• Licenses for
– SharePoint Server
– SQL Server
– Windows Server
– CALs/eCALs
– Visual Studio
– Comperio FRONT
• UAT Env
• QA/Test Env
• Dev Envs
We have learned that…
You will need
• Funding!
• Time
• Documentation
• Network
• To automate
Performance will get you
• Add more CPU
• Add more Memory
• Optimize Disk IO
• Balance load wisely
• Tune distributed cache
• Know your antivirus
Capacity Test Findings
• Crawl rate declines 1% per million items indexed
• Query latency increases exponentially from 12 million items indexed per partition
• Database latency is insignificant during crawling
• Successfully crawled file shares via symbolic directory links
• Disk space usage is significantly lower than expected
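The first finding can be turned into a rough planning formula. The slide does not say whether the 1% decline is linear or compounding, so the sketch below supports both readings; the function name, parameters and the starting rate of 100 (a relative figure) are assumptions for illustration.

```python
def crawl_rate(initial_rate: float, items_millions: float,
               decline_per_million: float = 0.01, compound: bool = True) -> float:
    """Estimate the crawl rate after a number of million items are indexed.

    compound=True  : each additional million items costs 1% of the current rate.
    compound=False : each additional million items costs 1% of the initial rate.
    """
    if compound:
        return initial_rate * (1.0 - decline_per_million) ** items_millions
    return max(0.0, initial_rate * (1.0 - decline_per_million * items_millions))


# Relative crawl rate (100 = rate on an empty index) at a few index sizes.
for millions in (10, 20, 31, 40):
    print(millions,
          round(crawl_rate(100, millions), 1),
          round(crawl_rate(100, millions, compound=False), 1))
```

Either way, a full crawl of tens of millions of items takes noticeably longer than an estimate based on the initial rate would suggest.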
Crawl Rate / Indexed Items
Disk Space Usage
Server                                                    System Volume (C:), GB     Data Volume (E:), GB
                                                          Used   Free  Capacity      Used  Free  Capacity
Admin, Crawler, Content Processing, Analytics Processing  33.3   116   149           42    807   849
Query Processing, Index Partition 0                       34.4   115   149           270   579   849
Query Processing, Index Partition 1                       34.5   115   149           268   581   849
Crawler, Content Processing, Analytics Processing         34.5   115   149           55    794   849

Disk volume         Total
Number of servers   4
Data                52
Index               1 077 248
Logs                24 576
MB                  1 101 876
GB                  1 076
We reduced data volume
from 850 GB to 450 GB
Huge savings in storage costs!
The table above shows measured disk space usage for 31 million items indexed
Database Space Usage
Database                                   Capacity Test
Number of searchable items (in millions)   30
Search Service Application                 156
Analytics Reporting                        6
Crawl Store                                19 151
Links Store                                24 316
MB                                         43 628
GB                                         43
The table shows measured database space usage for 31 million items indexed
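A quick check of the database figures shows where the space goes. This sketch only re-adds the per-database values from the table; the variable names are illustrative. The rounded rows add up to 43 629 MB, 1 MB more than the total on the slide, presumably because the slide total was computed from unrounded values.

```python
# Measured database sizes in MB, as listed in the table.
DATABASE_MB = {
    "Search Service Application": 156,
    "Analytics Reporting": 6,
    "Crawl Store": 19_151,
    "Links Store": 24_316,
}

total_mb = sum(DATABASE_MB.values())
total_gb = total_mb / 1024

# Fraction of the total used by each database.
share = {name: mb / total_mb for name, mb in DATABASE_MB.items()}

print(f"{total_mb} MB = {total_gb:.1f} GB")
for name, fraction in share.items():
    print(f"{name}: {fraction:.1%}")
```

The crawl store and links store account for almost all of the database footprint, so those are the two to watch when sizing SQL Server storage.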
USER EXPERIENCE
&
SEARCH CUSTOMIZATIONS
Display templates
• Content search webpart
– Control, item
• Refinement webpart
– Control, item
• Search result webpart
– Control, group, item, hover
FRONT Search
• Advanced query and result processing
• Highly customizable business logic represented through reusable tasks and flows
• Lightweight development environment
• Lightweight deployment
• Fully integrated with SharePoint result presentation and display templates
• Fully integrated with SharePoint security
FRONT Search in SP2013
• Front webpart
– Handles communication between Front and UI
• Front app
– Handles claims security
• Front webservice
– Flow engine
FRONT Search in SP2013
• JavaScript events
– QueryIssuingEvent
– ResultReadyEvent
• Search REST API
– Query, postquery and suggestions
– JSON and XML result
– Windows security / claims
– http://host/site/_api/search
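As an illustration of the Search REST API mentioned above, the sketch below builds a GET request URL for the query endpoint under `_api/search`. It performs no network call. The host, site and query text are placeholders, the function name is hypothetical, and authentication (Windows security or claims) is left out entirely.

```python
from urllib.parse import quote, urlencode


def build_search_url(site_url, query_text, row_limit=10,
                     select_properties=("Title", "Path")):
    """Return a search REST query URL for the given site and query text."""
    # String values are wrapped in single quotes; embedded quotes are doubled.
    escaped = query_text.replace("'", "''")
    params = {
        "querytext": f"'{escaped}'",
        "rowlimit": str(row_limit),
        "selectproperties": "'" + ",".join(select_properties) + "'",
    }
    return f"{site_url.rstrip('/')}/_api/search/query?" + urlencode(params, quote_via=quote)


# Ask for a JSON result instead of the default XML.
HEADERS = {"Accept": "application/json;odata=verbose"}

url = build_search_url("http://host/site", "enterprise search")
print(url)
```

The same URL can be issued from JavaScript in a display template or from any server-side client that can present valid credentials.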
FRONT Search <=> Query rules
FRONT Search
• Conditions
– Analyze query
– Analyze request
– Full flexibility
• Tasks (Actions)
– Change query model
– Perform parallel queries
– Full flexibility
• Publishing
– Special conditions case
• Result processing
– Analyze result from a query
– Perform new queries based on result
– Change order/grouping/content of result
Query rules
• Conditions
– Six types
• Actions
– Add promoted result
– Add blocked result
– Change query
• Publishing
– When is the rule active
FRONT Search <=> Result sources
FRONT Search
• Source system
– SP 2013
– SP 2010
– FAST ESP
– Lucene/Solr
– …
• Query transformation
– Full control of query model
Result sources
• Source system
– Local SP 2013 index
– Remote SP 2013 index
– OpenSearch
• Query transformation
– Subset of content
[Diagram: search architecture. Content sources: HTTP, file shares, SharePoint, user profiles, Lotus Notes, Documentum, Exchange folders, custom (BCS). Other labels: Crawl, Admin, Link, Analytics Reporting. Legend: public API, unit of scale/role boundary, custom components]
Search UX examples have been removed from the presentation to preserve client IP.
Please contact Petter or Tallak if you would like to discuss search user experience.
How do you index
millions of documents
in thousands of file shares
in hundreds of locations?
Bonus! Support governance and operations
Challenges
• Max 50 content sources per service application
• Max 100 start addresses per content source
• Max 20 concurrent crawls per service application
• Limit bandwidth usage for specific server locations
• Limit crawler impact within local business hours
• Grant read access to crawler per file share
• Avoid token bloat issues with more than 1000 groups per account
• Manage indexing and crawling of each file share with minimum manual effort
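The first three limits can be checked mechanically against a planned crawl configuration. The sketch below is a hypothetical validator, not a SharePoint API; the limit values are the ones quoted in the list above, and the plan structure (content source name mapped to its start addresses) is an assumption.

```python
# Limits as quoted on the slide.
MAX_CONTENT_SOURCES = 50
MAX_START_ADDRESSES = 100
MAX_CONCURRENT_CRAWLS = 20


def check_plan(plan, concurrent_crawls):
    """Return a list of limit violations for a crawl plan.

    plan maps a content source name to its list of start addresses.
    """
    problems = []
    if len(plan) > MAX_CONTENT_SOURCES:
        problems.append(f"{len(plan)} content sources (max {MAX_CONTENT_SOURCES})")
    for name, addresses in plan.items():
        if len(addresses) > MAX_START_ADDRESSES:
            problems.append(
                f"{name}: {len(addresses)} start addresses (max {MAX_START_ADDRESSES})")
    if concurrent_crawls > MAX_CONCURRENT_CRAWLS:
        problems.append(f"{concurrent_crawls} concurrent crawls (max {MAX_CONCURRENT_CRAWLS})")
    return problems


# One start address per share does not fit in a single content source.
naive = {"all-shares": [f"file://server{i}/share" for i in range(3000)]}
print(check_plan(naive, concurrent_crawls=1))
```

A plan that groups many shares behind a handful of start addresses stays well inside all three limits.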
A Proven Approach
• Symbolic links in smart folder structure
\\impact\files\source\impact\account\symlink
• Content Sources per region with smart start addresses
file://impact/files/source/impact
• Content Enrichment to fix file paths in results
• Custom application for managing file shares and granting access to crawler
• Host aliases for crawler impact
• Custom timer job that syncs custom lists from custom app
• Custom timer job that creates/removes symbolic links
• Custom list: Locations
– Map server prefix to content source
– Map location to schedule and impact
• Custom list: File shares
– Map share to crawl account
– Map UNC to symlink
– Map share-specific metadata
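The timer job that creates and removes symbolic links could look roughly like the sketch below. It is not the authors' implementation: the function names are hypothetical, the folder layout (files / content source / impact / account / symlink) is inferred from the bullets above, and a real job would presumably run on Windows against UNC targets with the privileges needed to create links.

```python
import os
from pathlib import Path
from typing import Dict


def symlink_path(root, content_source, impact, account, name) -> Path:
    """Build the link location: root/files/<content source>/<impact>/<account>/<name>."""
    return Path(root) / "files" / content_source / impact / account / name


def sync_symlinks(root: Path, desired: Dict[Path, str]) -> None:
    """Make the links under root match `desired` (link path -> target)."""
    existing = set()
    # followlinks=False: list links but never descend into their targets.
    for dirpath, dirnames, filenames in os.walk(root, followlinks=False):
        for name in dirnames + filenames:
            candidate = Path(dirpath) / name
            if candidate.is_symlink():
                existing.add(candidate)
    for stale in existing - set(desired):
        stale.unlink()  # share removed from the list: drop its link
    for link, target in desired.items():
        if link not in existing:
            link.parent.mkdir(parents=True, exist_ok=True)
            os.symlink(target, link, target_is_directory=True)
```

Because the crawler only sees the folder tree, adding or removing a share becomes a list change plus one sync run, with no edits to content sources.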
Example Solution
Files in Norway
• Incremental Crawl every 6 hours
• Start address: file://default/files/norway/default
Files in India
• Incremental Crawl every night at 21:00 IST
• Start address: file://reduced/files/india/reduced
Crawl Rules
• file://*/user1/* account=user1
• file://*/user2/* account=user2
Crawler Impact Rules
• Server name: default
• Server name: reduced wait 60 secs
Folders
• files/norway/default/user1/symlink1
• files/norway/default/user1/symlink2
• files/norway/default/user2/symlink3
• files/india/reduced/user1/symlink4
• files/india/reduced/user1/symlink5
• files/india/reduced/user2/symlink6
Custom list: Locations
• Server Prefix: osl
• Content Source: norway
• Crawler Impact: default
Custom list: File Shares
• UNC Path: \\osl-file01\share1\hr
• Crawl Account: user2
• Symlink: files/norway/default/user2/symlink3
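To see how the crawl rules in this example pick an account, the sketch below matches a crawled URL against the two wildcard rules. It uses shell-style wildcards from Python's `fnmatch` as a stand-in; the exact matching semantics of SharePoint crawl rules may differ, and first-match-wins ordering is an assumption.

```python
from fnmatch import fnmatchcase

# Crawl rules from the example: URL pattern -> crawl account.
CRAWL_RULES = [
    ("file://*/user1/*", "user1"),
    ("file://*/user2/*", "user2"),
]


def crawl_account(url, rules=CRAWL_RULES):
    """Return the account of the first rule whose pattern matches, else None."""
    for pattern, account in rules:
        # Compare case-insensitively by lowercasing both sides.
        if fnmatchcase(url.lower(), pattern.lower()):
            return account
    return None


print(crawl_account("file://default/files/norway/default/user2/symlink3/report.docx"))
print(crawl_account("file://reduced/files/india/reduced/user1/symlink4/notes.txt"))
```

Putting the account name in the folder path is what lets two generic rules cover every share, instead of one rule per share.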
Discover associations in your indexed data using custom entity extractors
Explore how your indexed data is associated with terms often used by your business
• Examples
– Organization
– Projects
– Customers
– Products
Add metadata or clean up your indexed data using custom content enrichment
• Based on where the items are located, add info about
– Department
– Information owner
– Security classification
• Lookup name based on user account
• Remove company name from title for all web pages
• Normalize names
• Normalize phone numbers
• Fix search result link
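Three of the clean-up steps above are plain string transformations. The sketch below shows only that logic, not the content enrichment web service that would host it. All function names are hypothetical, the separators and default country code are assumptions, and the UNC path in the usage example is illustrative.

```python
import re
from typing import Mapping


def strip_company_name(title: str, company: str) -> str:
    """Remove a leading 'Company - ' or trailing ' | Company' from a page title."""
    name = re.escape(company)
    pattern = rf"^\s*{name}\s*[-|:]\s*|\s*[-|:]\s*{name}\s*$"
    return re.sub(pattern, "", title, flags=re.IGNORECASE).strip()


def normalize_phone(raw: str, default_country_code: str = "+47") -> str:
    """Reduce a phone number to '+<country code><digits>' (assumed target format)."""
    digits = re.sub(r"\D", "", raw)
    if raw.strip().startswith("+"):
        return "+" + digits
    if digits.startswith("00"):
        return "+" + digits[2:]
    return default_country_code + digits


def fix_result_link(url: str, symlink_to_unc: Mapping[str, str]) -> str:
    """Map a crawled symlink URL back to the UNC path users can open."""
    for prefix, unc in symlink_to_unc.items():
        rest = url[len(prefix):]
        if url.startswith(prefix) and (rest == "" or rest.startswith("/")):
            return unc + rest.replace("/", "\\")
    return url


# Hypothetical mapping from a symlink start path to the real share.
LINKS = {"file://default/files/norway/default/user2/symlink3": r"\\osl-file01\share1\hr"}
print(fix_result_link(
    "file://default/files/norway/default/user2/symlink3/policies/leave.docx", LINKS))
```

Each function is pure, so the rules can be unit tested before they are wired into the enrichment step.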
Synchronize Terms with Search Spelling and Synonyms Dictionaries
• Synchronize Spelling Inclusion («Custom Timer Job»)
• Synchronize Thesaurus («Custom Timer Job»)
• SSA
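One way the thesaurus half of such a job could work is to export the business terms to the CSV layout that SharePoint 2013 imports as a thesaurus (columns Key, Synonym, Language), which is then loaded into the SSA with `Import-SPEnterpriseSearchThesaurus`. The sketch below only produces the CSV text; the source of the terms and the function name are assumptions, not the authors' implementation.

```python
import csv
import io


def thesaurus_csv(synonyms, language=""):
    """Render {key: [synonym, ...]} as thesaurus CSV text.

    An empty language value leaves the Language column blank.
    """
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["Key", "Synonym", "Language"])
    for key in sorted(synonyms):
        for synonym in synonyms[key]:
            writer.writerow([key, synonym, language])
    return buffer.getvalue()


# Hypothetical terms, for illustration only.
terms = {"IM": ["Instant Messaging"], "HR": ["Human Resources", "Personnel"]}
print(thesaurus_csv(terms))
```

Generating the file from a managed list keeps the dictionary under the control of the business users who own the terms.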
How fast can you find what you are searching for?
• What should be indexed?
• What should be searchable?
• What should be displayed?
- Relevancy - Recall - Precision -
• How to weight a managed property?
• How to change the ranking model?
• How to tune ranking?
Managed Property Weighting
These are not ordered
by importance!
Change Ranking Model
• The default ranking model in SP 2013 did not fit us!
– PowerPoint files always won
– Complete matches in site titles and document titles were outranked by the number of partial matches in body
– Community sites were weighted lower than discussions and posts
We replaced the SP 2013 ranking model with the SP 2010 ranking model
Tune Ranking Model
Microsoft will soon release a tool for tuning ranking models!
1. Select ranking model to tune
2. Select result source to search
3. Add judgement sets
4. Add queries to judgement sets
5. Run queries and evaluate results
6. Add and tune features
7. Save and publish model
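Step 5 amounts to scoring ranked results against the judgements. The sketch below shows one common way to do that with precision and recall at a cut-off k. It is not the Microsoft tool: the judgement set structure and the `search` callable are assumptions for illustration.

```python
def precision_at_k(ranked, relevant, k):
    """Fraction of the top k results judged relevant (missing results count as misses)."""
    if k <= 0:
        return 0.0
    return sum(1 for doc in ranked[:k] if doc in relevant) / k


def recall_at_k(ranked, relevant, k):
    """Fraction of all judged-relevant documents found in the top k results."""
    if not relevant:
        return 0.0
    return len(set(ranked[:k]) & set(relevant)) / len(relevant)


def evaluate(judgement_set, search, k=5):
    """Score every query in a judgement set.

    judgement_set maps a query to the set of document ids judged relevant;
    search is any callable returning a ranked list of ids for a query.
    """
    scores = {}
    for query, relevant in judgement_set.items():
        ranked = search(query)
        scores[query] = (precision_at_k(ranked, relevant, k),
                         recall_at_k(ranked, relevant, k))
    return scores


# Hypothetical judgement set and a stand-in for the search engine.
judgements = {"travel policy": {"a", "c", "x"}}
print(evaluate(judgements, lambda query: ["a", "b", "c", "d"], k=4))
```

Re-running the same judgement sets after each change to the ranking model gives a before/after comparison instead of a gut feeling.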
THE END
Petter Skodvin-Hvammen
psh@adgruppen.no
@pettersh
Tallak Hellebust
tallak.hellebust@comperiosearch.com
@titakker

Más contenido relacionado

La actualidad más candente

10 Things I Like in SharePoint 2013 Search
10 Things I Like in SharePoint 2013 Search10 Things I Like in SharePoint 2013 Search
10 Things I Like in SharePoint 2013 SearchSPC Adriatics
 
Understanding and Applying Cloud Hybrid Search
Understanding and Applying Cloud Hybrid SearchUnderstanding and Applying Cloud Hybrid Search
Understanding and Applying Cloud Hybrid SearchJeff Fried
 
search driven intranets
search driven intranetssearch driven intranets
search driven intranetsJeff Fried
 
SharePoint 2013 search improvements
SharePoint 2013 search improvementsSharePoint 2013 search improvements
SharePoint 2013 search improvementsKunaal Kapoor
 
SPCAdriatics - 10 Things I Like In SharePoint 2013 Search
SPCAdriatics - 10 Things I Like In SharePoint 2013 SearchSPCAdriatics - 10 Things I Like In SharePoint 2013 Search
SPCAdriatics - 10 Things I Like In SharePoint 2013 SearchAgnes Molnar
 
Building enterprise records management solutions for share point 2010
Building enterprise records management solutions for share point 2010Building enterprise records management solutions for share point 2010
Building enterprise records management solutions for share point 2010Eric Shupps
 
SharePoint Conference North America 2018 - Las Vegas - Announcements
SharePoint Conference North America 2018 - Las Vegas - AnnouncementsSharePoint Conference North America 2018 - Las Vegas - Announcements
SharePoint Conference North America 2018 - Las Vegas - AnnouncementsNick Hobbs
 
SharePoint Workflows - SharePoint Saturday Twin Cities April 2012
SharePoint Workflows - SharePoint Saturday Twin Cities April 2012SharePoint Workflows - SharePoint Saturday Twin Cities April 2012
SharePoint Workflows - SharePoint Saturday Twin Cities April 2012Don Donais
 
Rev Your Engines: SharePoint Performance Best Practices
Rev Your Engines: SharePoint Performance Best PracticesRev Your Engines: SharePoint Performance Best Practices
Rev Your Engines: SharePoint Performance Best PracticesSPC Adriatics
 
Understanding and Configuring an Effective SharePoint 2013 Search
Understanding and Configuring an Effective SharePoint 2013 SearchUnderstanding and Configuring an Effective SharePoint 2013 Search
Understanding and Configuring an Effective SharePoint 2013 SearchMetanalysis
 
ECS2019 - Managing Content Types in the Modern World
ECS2019 - Managing Content Types in the Modern WorldECS2019 - Managing Content Types in the Modern World
ECS2019 - Managing Content Types in the Modern WorldMarc D Anderson
 
MetaVis Webinar - 10 Things I Like in SharePoint 2013 Search
MetaVis Webinar - 10 Things I Like in SharePoint 2013 SearchMetaVis Webinar - 10 Things I Like in SharePoint 2013 Search
MetaVis Webinar - 10 Things I Like in SharePoint 2013 SearchAgnes Molnar
 
SPS Twin Cities - Congratulations You Inherited a SharePoint Site
SPS Twin Cities - Congratulations You Inherited a SharePoint SiteSPS Twin Cities - Congratulations You Inherited a SharePoint Site
SPS Twin Cities - Congratulations You Inherited a SharePoint SiteDon Donais
 
Leveraging microsoft’s e discovery platform in your organization
Leveraging microsoft’s e discovery platform in your organizationLeveraging microsoft’s e discovery platform in your organization
Leveraging microsoft’s e discovery platform in your organizationDon Donais
 
Avoiding 10 common SharePoint Administration mistakes
Avoiding 10 common SharePoint Administration mistakesAvoiding 10 common SharePoint Administration mistakes
Avoiding 10 common SharePoint Administration mistakesBenjamin Athawes
 
Tips and tricks for complex migrations to SharePoint Online
Tips and tricks for complex migrations to SharePoint OnlineTips and tricks for complex migrations to SharePoint Online
Tips and tricks for complex migrations to SharePoint OnlineAndries den Haan
 
SharePoint Saturday St. Louis 2014: What SharePoint Admins need to know about...
SharePoint Saturday St. Louis 2014: What SharePoint Admins need to know about...SharePoint Saturday St. Louis 2014: What SharePoint Admins need to know about...
SharePoint Saturday St. Louis 2014: What SharePoint Admins need to know about...J.D. Wade
 
2014 TechFuse - Findability Within SharePoint 2013
2014 TechFuse - Findability Within SharePoint 20132014 TechFuse - Findability Within SharePoint 2013
2014 TechFuse - Findability Within SharePoint 2013Don Donais
 
What’s new in SharePoint 2016 Beta 2?
What’s new in SharePoint 2016 Beta 2?What’s new in SharePoint 2016 Beta 2?
What’s new in SharePoint 2016 Beta 2?Jason Himmelstein
 
5 Reasons Your Site Needs Acquia Search
5 Reasons Your Site Needs Acquia Search5 Reasons Your Site Needs Acquia Search
5 Reasons Your Site Needs Acquia SearchAcquia
 

La actualidad más candente (20)

10 Things I Like in SharePoint 2013 Search
10 Things I Like in SharePoint 2013 Search10 Things I Like in SharePoint 2013 Search
10 Things I Like in SharePoint 2013 Search
 
Understanding and Applying Cloud Hybrid Search
Understanding and Applying Cloud Hybrid SearchUnderstanding and Applying Cloud Hybrid Search
Understanding and Applying Cloud Hybrid Search
 
search driven intranets
search driven intranetssearch driven intranets
search driven intranets
 
SharePoint 2013 search improvements
SharePoint 2013 search improvementsSharePoint 2013 search improvements
SharePoint 2013 search improvements
 
SPCAdriatics - 10 Things I Like In SharePoint 2013 Search
SPCAdriatics - 10 Things I Like In SharePoint 2013 SearchSPCAdriatics - 10 Things I Like In SharePoint 2013 Search
SPCAdriatics - 10 Things I Like In SharePoint 2013 Search
 
Building enterprise records management solutions for share point 2010
Building enterprise records management solutions for share point 2010Building enterprise records management solutions for share point 2010
Building enterprise records management solutions for share point 2010
 
SharePoint Conference North America 2018 - Las Vegas - Announcements
SharePoint Conference North America 2018 - Las Vegas - AnnouncementsSharePoint Conference North America 2018 - Las Vegas - Announcements
SharePoint Conference North America 2018 - Las Vegas - Announcements
 
SharePoint Workflows - SharePoint Saturday Twin Cities April 2012
SharePoint Workflows - SharePoint Saturday Twin Cities April 2012SharePoint Workflows - SharePoint Saturday Twin Cities April 2012
SharePoint Workflows - SharePoint Saturday Twin Cities April 2012
 
Rev Your Engines: SharePoint Performance Best Practices
Rev Your Engines: SharePoint Performance Best PracticesRev Your Engines: SharePoint Performance Best Practices
Rev Your Engines: SharePoint Performance Best Practices
 
Understanding and Configuring an Effective SharePoint 2013 Search
Understanding and Configuring an Effective SharePoint 2013 SearchUnderstanding and Configuring an Effective SharePoint 2013 Search
Understanding and Configuring an Effective SharePoint 2013 Search
 
ECS2019 - Managing Content Types in the Modern World
ECS2019 - Managing Content Types in the Modern WorldECS2019 - Managing Content Types in the Modern World
ECS2019 - Managing Content Types in the Modern World
 
MetaVis Webinar - 10 Things I Like in SharePoint 2013 Search
MetaVis Webinar - 10 Things I Like in SharePoint 2013 SearchMetaVis Webinar - 10 Things I Like in SharePoint 2013 Search
MetaVis Webinar - 10 Things I Like in SharePoint 2013 Search
 
SPS Twin Cities - Congratulations You Inherited a SharePoint Site
SPS Twin Cities - Congratulations You Inherited a SharePoint SiteSPS Twin Cities - Congratulations You Inherited a SharePoint Site
SPS Twin Cities - Congratulations You Inherited a SharePoint Site
 
Leveraging microsoft’s e discovery platform in your organization
Leveraging microsoft’s e discovery platform in your organizationLeveraging microsoft’s e discovery platform in your organization
Leveraging microsoft’s e discovery platform in your organization
 
Avoiding 10 common SharePoint Administration mistakes
Avoiding 10 common SharePoint Administration mistakesAvoiding 10 common SharePoint Administration mistakes
Avoiding 10 common SharePoint Administration mistakes
 
Tips and tricks for complex migrations to SharePoint Online
Tips and tricks for complex migrations to SharePoint OnlineTips and tricks for complex migrations to SharePoint Online
Tips and tricks for complex migrations to SharePoint Online
 
SharePoint Saturday St. Louis 2014: What SharePoint Admins need to know about...
SharePoint Saturday St. Louis 2014: What SharePoint Admins need to know about...SharePoint Saturday St. Louis 2014: What SharePoint Admins need to know about...
SharePoint Saturday St. Louis 2014: What SharePoint Admins need to know about...
 
2014 TechFuse - Findability Within SharePoint 2013
2014 TechFuse - Findability Within SharePoint 20132014 TechFuse - Findability Within SharePoint 2013
2014 TechFuse - Findability Within SharePoint 2013
 
What’s new in SharePoint 2016 Beta 2?
What’s new in SharePoint 2016 Beta 2?What’s new in SharePoint 2016 Beta 2?
What’s new in SharePoint 2016 Beta 2?
 
5 Reasons Your Site Needs Acquia Search
5 Reasons Your Site Needs Acquia Search5 Reasons Your Site Needs Acquia Search
5 Reasons Your Site Needs Acquia Search
 

Destacado (8)

Globalizacion cultural 4
Globalizacion cultural 4Globalizacion cultural 4
Globalizacion cultural 4
 
Empower MediaMarketing
Empower MediaMarketingEmpower MediaMarketing
Empower MediaMarketing
 
Making share point governance work for business, it and users
Making share point governance work for business, it and usersMaking share point governance work for business, it and users
Making share point governance work for business, it and users
 
νικοσε1
νικοσε1νικοσε1
νικοσε1
 
St1
St1St1
St1
 
Γλυκό και χαρούμενο Πάσχα
Γλυκό και χαρούμενο ΠάσχαΓλυκό και χαρούμενο Πάσχα
Γλυκό και χαρούμενο Πάσχα
 
MOH UAE Pharmacy License Guidelines
MOH UAE Pharmacy License GuidelinesMOH UAE Pharmacy License Guidelines
MOH UAE Pharmacy License Guidelines
 
Globalizacion y cultura
Globalizacion y culturaGlobalizacion y cultura
Globalizacion y cultura
 

Similar a Share point 2013 enterprise search (public)

SharePoint 2013 Search Operations
SharePoint 2013 Search OperationsSharePoint 2013 Search Operations
SharePoint 2013 Search OperationsSPC Adriatics
 
ESPC14 380 So you think you can crawl? Stretching the Boundaries of SharePoin...
ESPC14 380 So you think you can crawl? Stretching the Boundaries of SharePoin...ESPC14 380 So you think you can crawl? Stretching the Boundaries of SharePoin...
ESPC14 380 So you think you can crawl? Stretching the Boundaries of SharePoin...Petter Skodvin-Hvammen
 
How did it go? The first large enterprise search project in Europe using Shar...
How did it go? The first large enterprise search project in Europe using Shar...How did it go? The first large enterprise search project in Europe using Shar...
How did it go? The first large enterprise search project in Europe using Shar...Petter Skodvin-Hvammen
 
Share point 2010 performance and capacity planning best practices
Share point 2010 performance and capacity planning best practicesShare point 2010 performance and capacity planning best practices
Share point 2010 performance and capacity planning best practicesEric Shupps
 
SPLive Orlando - 10 Things I Like in SharePoint 2013 Search
SPLive Orlando - 10 Things I Like in SharePoint 2013 SearchSPLive Orlando - 10 Things I Like in SharePoint 2013 Search
SPLive Orlando - 10 Things I Like in SharePoint 2013 SearchAgnes Molnar
 
DOXLON November 2016 - Data Democratization Using Splunk
DOXLON November 2016 - Data Democratization Using SplunkDOXLON November 2016 - Data Democratization Using Splunk
DOXLON November 2016 - Data Democratization Using SplunkOutlyer
 
SPConnections Amsterdam: Beyond the Search Center - Application or Solution? ...
SPConnections Amsterdam: Beyond the Search Center - Application or Solution? ...SPConnections Amsterdam: Beyond the Search Center - Application or Solution? ...
SPConnections Amsterdam: Beyond the Search Center - Application or Solution? ...Agnes Molnar
 
ESPC13 - 10 Things I Like in SharePoint 2013 Search
ESPC13 - 10 Things I Like in SharePoint 2013 SearchESPC13 - 10 Things I Like in SharePoint 2013 Search
ESPC13 - 10 Things I Like in SharePoint 2013 SearchAgnes Molnar
 
Alfresco Day Stockholm 2015 - Alfresco One
Alfresco Day Stockholm 2015 - Alfresco OneAlfresco Day Stockholm 2015 - Alfresco One
Alfresco Day Stockholm 2015 - Alfresco OneNicole Szigeti
 
Navigating the Mess of a Shared drive Migration to SharePoint
Navigating the Mess of a Shared drive Migration to SharePointNavigating the Mess of a Shared drive Migration to SharePoint
Navigating the Mess of a Shared drive Migration to SharePointJoanne Klein
 
SharePoint Saturday Paris 2015 Validating SharePoint 2013 Farm Before Go-Live
SharePoint Saturday Paris 2015   Validating SharePoint 2013 Farm Before Go-LiveSharePoint Saturday Paris 2015   Validating SharePoint 2013 Farm Before Go-Live
SharePoint Saturday Paris 2015 Validating SharePoint 2013 Farm Before Go-LiveChirag Patel
 
(ATS6-PLAT02) Accelrys Catalog and Protocol Validation
(ATS6-PLAT02) Accelrys Catalog and Protocol Validation(ATS6-PLAT02) Accelrys Catalog and Protocol Validation
(ATS6-PLAT02) Accelrys Catalog and Protocol ValidationBIOVIA
 
SplunkLive! - Getting started with Splunk
SplunkLive! - Getting started with SplunkSplunkLive! - Getting started with Splunk
SplunkLive! - Getting started with SplunkSplunk
 
SplunkLive Oslo/Stockholm Beginner Workshop
SplunkLive Oslo/Stockholm Beginner WorkshopSplunkLive Oslo/Stockholm Beginner Workshop
SplunkLive Oslo/Stockholm Beginner Workshopjenny_splunk
 
SPConnections - Search Administration in SharePoint 2013
SPConnections - Search Administration in SharePoint 2013SPConnections - Search Administration in SharePoint 2013
SPConnections - Search Administration in SharePoint 2013Agnes Molnar
 
SPCA2013 - Best Practices & Considerations for Designing Your SharePoint Logi...
SPCA2013 - Best Practices & Considerations for Designing Your SharePoint Logi...SPCA2013 - Best Practices & Considerations for Designing Your SharePoint Logi...
SPCA2013 - Best Practices & Considerations for Designing Your SharePoint Logi...NCCOMMS
 
2018 09-03 aOS Aachen - SharePoint demystified - Thomas Vochten
2018 09-03 aOS Aachen - SharePoint demystified - Thomas Vochten2018 09-03 aOS Aachen - SharePoint demystified - Thomas Vochten
2018 09-03 aOS Aachen - SharePoint demystified - Thomas VochtenaOS Community
 
II-SDV 2015, 20 - 21 April, in Nice
II-SDV 2015, 20 - 21 April, in NiceII-SDV 2015, 20 - 21 April, in Nice
II-SDV 2015, 20 - 21 April, in NiceDr. Haxel Consult
 
SharePoint Saturday The Conference 2011 - SP2010 Performance
SharePoint Saturday The Conference 2011 - SP2010 PerformanceSharePoint Saturday The Conference 2011 - SP2010 Performance
SharePoint Saturday The Conference 2011 - SP2010 PerformanceBrian Culver
 
Webinar: Fusion 3.1 - What's New
Webinar: Fusion 3.1 - What's NewWebinar: Fusion 3.1 - What's New
Webinar: Fusion 3.1 - What's NewLucidworks
 

Similar a Share point 2013 enterprise search (public) (20)

SharePoint 2013 Search Operations
SharePoint 2013 Search OperationsSharePoint 2013 Search Operations
SharePoint 2013 Search Operations
 
ESPC14 380 So you think you can crawl? Stretching the Boundaries of SharePoin...
ESPC14 380 So you think you can crawl? Stretching the Boundaries of SharePoin...ESPC14 380 So you think you can crawl? Stretching the Boundaries of SharePoin...
ESPC14 380 So you think you can crawl? Stretching the Boundaries of SharePoin...
 
How did it go? The first large enterprise search project in Europe using Shar...
How did it go? The first large enterprise search project in Europe using Shar...How did it go? The first large enterprise search project in Europe using Shar...
How did it go? The first large enterprise search project in Europe using Shar...
 
Share point 2010 performance and capacity planning best practices
Share point 2010 performance and capacity planning best practicesShare point 2010 performance and capacity planning best practices
Share point 2010 performance and capacity planning best practices
 
SPLive Orlando - 10 Things I Like in SharePoint 2013 Search
SPLive Orlando - 10 Things I Like in SharePoint 2013 SearchSPLive Orlando - 10 Things I Like in SharePoint 2013 Search
SPLive Orlando - 10 Things I Like in SharePoint 2013 Search
 
DOXLON November 2016 - Data Democratization Using Splunk
DOXLON November 2016 - Data Democratization Using SplunkDOXLON November 2016 - Data Democratization Using Splunk
DOXLON November 2016 - Data Democratization Using Splunk
 
SPConnections Amsterdam: Beyond the Search Center - Application or Solution? ...
SPConnections Amsterdam: Beyond the Search Center - Application or Solution? ...SPConnections Amsterdam: Beyond the Search Center - Application or Solution? ...
SPConnections Amsterdam: Beyond the Search Center - Application or Solution? ...
 
ESPC13 - 10 Things I Like in SharePoint 2013 Search
ESPC13 - 10 Things I Like in SharePoint 2013 SearchESPC13 - 10 Things I Like in SharePoint 2013 Search
ESPC13 - 10 Things I Like in SharePoint 2013 Search
 
Alfresco Day Stockholm 2015 - Alfresco One
Alfresco Day Stockholm 2015 - Alfresco OneAlfresco Day Stockholm 2015 - Alfresco One
Alfresco Day Stockholm 2015 - Alfresco One
 
Navigating the Mess of a Shared drive Migration to SharePoint
Navigating the Mess of a Shared drive Migration to SharePointNavigating the Mess of a Shared drive Migration to SharePoint
Navigating the Mess of a Shared drive Migration to SharePoint
 
SharePoint Saturday Paris 2015 Validating SharePoint 2013 Farm Before Go-Live
SharePoint Saturday Paris 2015   Validating SharePoint 2013 Farm Before Go-LiveSharePoint Saturday Paris 2015   Validating SharePoint 2013 Farm Before Go-Live
SharePoint Saturday Paris 2015 Validating SharePoint 2013 Farm Before Go-Live
 
(ATS6-PLAT02) Accelrys Catalog and Protocol Validation
(ATS6-PLAT02) Accelrys Catalog and Protocol Validation(ATS6-PLAT02) Accelrys Catalog and Protocol Validation
(ATS6-PLAT02) Accelrys Catalog and Protocol Validation
 
SplunkLive! - Getting started with Splunk
SplunkLive! - Getting started with SplunkSplunkLive! - Getting started with Splunk
SplunkLive! - Getting started with Splunk
 
SplunkLive Oslo/Stockholm Beginner Workshop
SplunkLive Oslo/Stockholm Beginner WorkshopSplunkLive Oslo/Stockholm Beginner Workshop
SplunkLive Oslo/Stockholm Beginner Workshop
 
SPConnections - Search Administration in SharePoint 2013
SPConnections - Search Administration in SharePoint 2013SPConnections - Search Administration in SharePoint 2013
SPConnections - Search Administration in SharePoint 2013
 
SPCA2013 - Best Practices & Considerations for Designing Your SharePoint Logi...
SPCA2013 - Best Practices & Considerations for Designing Your SharePoint Logi...SPCA2013 - Best Practices & Considerations for Designing Your SharePoint Logi...
SPCA2013 - Best Practices & Considerations for Designing Your SharePoint Logi...
 
2018 09-03 aOS Aachen - SharePoint demystified - Thomas Vochten
2018 09-03 aOS Aachen - SharePoint demystified - Thomas Vochten2018 09-03 aOS Aachen - SharePoint demystified - Thomas Vochten
2018 09-03 aOS Aachen - SharePoint demystified - Thomas Vochten
 
II-SDV 2015, 20 - 21 April, in Nice
II-SDV 2015, 20 - 21 April, in NiceII-SDV 2015, 20 - 21 April, in Nice
II-SDV 2015, 20 - 21 April, in Nice
 
SharePoint Saturday The Conference 2011 - SP2010 Performance
SharePoint Saturday The Conference 2011 - SP2010 PerformanceSharePoint Saturday The Conference 2011 - SP2010 Performance
SharePoint Saturday The Conference 2011 - SP2010 Performance
 
Webinar: Fusion 3.1 - What's New
Webinar: Fusion 3.1 - What's NewWebinar: Fusion 3.1 - What's New
Webinar: Fusion 3.1 - What's New
 

Share point 2013 enterprise search (public)

  • 1. What we have learned about SharePoint 2013 and Enterprise Search Petter Skodvin-Hvammen Tallak Hellebust
  • 2. Agenda • How to run a successful search project • Architecture and infrastructure learning's • User experience and search customizations • How can you crawl thousands of file shares • Discover associations and enrich indexed content • What about search relevancy?
  • 3. HOW TO RUN A SUCCESSFUL PROJECT
  • 4. Sprint 0 – goal Best Solution Business Goals Technology User Needs
  • 5. Sprint 0 – process Analysis • User Interviews • Stakeholder interviews • Search Logs • Existing work and documentation Technology Assessment • Sources • Information Model • Technology components • Architecture • Scaling Concept Development • Problem Solving • Information modus • Mockups • Clickable concept demo • Best practices • Concept testing Enterprise Strategy • Information Marketplace • Achieving business goals Final Report • Presentations • Recommendations • Project plan • Quickwins
  • 6.
  • 7. How to run a successful search project • Sprint 0 • Planning • Development • Testing • Demo • Deployment
  • 8. One sprint ahead • Let the UX-work be one sprint ahead of the technical team • Produce a clickable prototype each sprint • The prototype are a visual presentation of the product backlog • The technical team implements the prototype in the next sprint Sprint 3 UX (Sprint 4) Sprint 2 UX (Sprint 3) Sprint n UX (Sprint n+1) Sprint 1 UX (Sprint 2) UX (Sprint 0)
  • 9. Infrastructure Needs Is Microsoft moving into server hardware business?
  • 10. Index-0 Query WFE Doc Proc Crawling Central Admin Enrichment FRONT Query WFE FRONT Index-2 Index-1 Index-3 Index-0 Index-2 Index-1 Index-3 Doc Proc Doc Proc Doc Proc Doc Proc Doc Proc Doc Proc Doc Proc Crawling Analytics Admin Admin Enrichment Enrichment Enrichment Enrichment Enrichment Enrichment Enrichment Analytics Doc Proc Enrichment Doc Proc Enrichment 40 Million Documents 10 Queries / Second SQL Server SQL Server • Admin DB • Analytics DB • Crawl DB • Link DB • Other SP DBs
  • 11. Infrastructure Investments What Spec Count Total SharePoint Server Virtual Machine 12 12 VMs CPU 8 cores 12 96 cores Memory 16 GB 12 192 GB System Disk 150 GB 12 1,8 TB Data Disk 450 GB 12 5,4 TB Disk IO 200 (Indexer) 10 2 000 IOPS • Physical Servers • Database Servers • Load Balancer • SAN or local disk arrays • Domain Controller • Other networking • Licenses for • SharePoint Server • SQL Server • Windows Server • CALs/eCALs • Visual Studio • Comperio FRONT • UAT Env • QA/Test Env • Dev Envs
  • 12. We have learned that… You will need: • Funding! • Time • Documentation • Network • To automate. Performance will get you: • Add more CPU • Add more memory • Optimize disk IO • Balance load wisely • Tune the distributed cache • Know your antivirus
  • 13. Capacity Test Findings • Crawl rate declines 1% per million items indexed • Query latency increases exponentially from 12 million items indexed per partition • Database latency is insignificant during crawling • Successfully crawled file shares via symbolic directory links • Disk space usage is significantly lower than expected
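To see what a 1% decline per million items means for a full crawl, here is a rough Python sketch. It assumes the decline is linear relative to the initial crawl rate (the slide does not say whether it is linear or compounding), and the initial rate of 100 items per second is a made-up figure for illustration only.

```python
def crawl_rate(initial_rate, millions_indexed, decline_per_million=0.01):
    """Items per second after `millions_indexed` million items (assumed linear decline)."""
    return initial_rate * max(0.0, 1.0 - decline_per_million * millions_indexed)


def estimated_crawl_hours(initial_rate, total_millions):
    """Sum the time for each block of one million items at the rate in effect for that block."""
    seconds = 0.0
    for millions_done in range(total_millions):
        rate = crawl_rate(initial_rate, millions_done)
        if rate <= 0:
            raise ValueError("the linear model predicts a zero crawl rate at this size")
        seconds += 1_000_000 / rate
    return seconds / 3600


# Hypothetical initial rate; the 30 million items match the capacity test size.
hours_with_decline = estimated_crawl_hours(100, 30)
hours_without_decline = 30_000_000 / 100 / 3600
print(round(hours_with_decline, 1), "hours vs", round(hours_without_decline, 1), "hours at constant rate")
```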
  • 14. Crawl Rate / Indexed Items
  • 15. Disk Space Usage (used space / free space / capacity per volume). Admin, Crawler, Content Processing, Analytics Processing: System Volume (C:) 33.3 / 116 / 149, Data Volume (E:) 42 / 807 / 849. Query Processing, Index Partition 0: System Volume (C:) 34.4 / 115 / 149, Data Volume (E:) 270 / 579 / 849. Query Processing, Index Partition 1: System Volume (C:) 34.5 / 115 / 149, Data Volume (E:) 268 / 581 / 849. Crawler, Content Processing, Analytics Processing: System Volume (C:) 34.5 / 115 / 149, Data Volume (E:) 55 / 794 / 849. Disk volume totals: Number of servers 4; Data 52; Index 1 077 248; Logs 24 576; total 1 101 876 MB = 1 076 GB. We reduced the data volume from 850 GB to 450 GB. Huge savings in storage costs! The table above shows measured disk space usage for 31 million items indexed.
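The decision to shrink the data volume can be checked against the measured numbers. The sketch below assumes the per-server figures are in GB (the slide does not state the unit, but the 849 capacity matches the 850 GB volume mentioned) and also re-adds the totals row.

```python
# Used space on the data volume (E:) per server, as listed on the slide.
# Unit assumed to be GB.
data_volume_used = {
    "Admin, Crawler, Content Processing, Analytics Processing": 42,
    "Query Processing, Index Partition 0": 270,
    "Query Processing, Index Partition 1": 268,
    "Crawler, Content Processing, Analytics Processing": 55,
}
REDUCED_CAPACITY = 450  # the new data volume size from the slide

peak_used = max(data_volume_used.values())
headroom = REDUCED_CAPACITY - peak_used
utilization = peak_used / REDUCED_CAPACITY

# Totals row: the three categories add up to the MB total on the slide.
totals_mb = {"data": 52, "index": 1_077_248, "logs": 24_576}
total_mb = sum(totals_mb.values())
total_gb = total_mb / 1024

print(f"peak {peak_used}, headroom {headroom}, utilization {utilization:.0%}")
print(f"total {total_mb} MB = {total_gb:.0f} GB")
```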
  • 16. Database Space Usage (capacity test). Number of searchable items (in millions): 30. Search Service Application: 156; Analytics Reporting: 6; Crawl Store: 19 151; Links Store: 24 316; total 43 628 MB = 43 GB. The table shows measured database space usage for 31 million items indexed.
  • 18. Display templates • Content search webpart – Control, item • Refinement webpart – Control, item • Search result webpart – Control, group, item, hover
  • 23. FRONT Search • Advanced query and result processing • Highly customizable business logic represented through reusable tasks and flows • Lightweight development environment • Lightweight deployment • Fully integrated with SharePoint result presentation and display templates • Fully integrated with SharePoint security
  • 25. FRONT Search in SP2013 • Front webpart – Handles communication between Front and UI • Front app – Handles claims security • Front webservice – Flow engine
  • 26. FRONT Search in SP2013 • JavaScript events – QueryIssuingEvent – ResultReadyEvent • Search REST API – Query, postquery and suggestions – JSON and XML result – Windows security / claims – http://host/site/_api/search
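As an illustration of the Search REST API mentioned above, here is a small Python sketch that only builds the request URLs for the query and suggest endpoints. The site URL is the placeholder from the slide, the helper names are made up, and authentication and the HTTP call itself are left out. Query text containing single quotes would need extra escaping, which is not handled here.

```python
from urllib.parse import quote, urlencode

# Header that asks SharePoint for JSON instead of the default XML.
JSON_HEADERS = {"Accept": "application/json;odata=verbose"}


def search_query_url(site_url, query_text, row_limit=10,
                     select_properties=("Title", "Path")):
    """Build a GET URL for the /_api/search/query endpoint."""
    params = {
        "querytext": f"'{query_text}'",  # string parameters are wrapped in single quotes
        "rowlimit": str(row_limit),
        "selectproperties": "'" + ",".join(select_properties) + "'",
    }
    return f"{site_url.rstrip('/')}/_api/search/query?{urlencode(params, quote_via=quote)}"


def search_suggest_url(site_url, query_text):
    """Build a GET URL for the /_api/search/suggest endpoint."""
    params = {"querytext": f"'{query_text}'"}
    return f"{site_url.rstrip('/')}/_api/search/suggest?{urlencode(params, quote_via=quote)}"


query_url = search_query_url("http://host/site", "sharepoint 2013")
suggest_url = search_suggest_url("http://host/site/", "share")
print(query_url)
print(suggest_url)
```

Because the API is plain HTTP, the resulting URL can be pasted into a browser or any HTTP client for testing.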
  • 27. FRONT Search <=> Query rules. FRONT Search: • Conditions – Analyze query – Analyze request – Full flexibility • Tasks (Actions) – Change query model – Perform parallel queries – Full flexibility • Publishing – Special conditions case • Result processing – Analyze result from a query – Perform new queries based on result – Change order/grouping/content of result. Query rules: • Conditions – Six types • Actions – Add promoted result – Add blocked result – Change query • Publishing – When is the rule active
  • 28. FRONT Search <=> Result sources. FRONT Search: • Source system – SP 2013 – SP 2010 – FAST ESP – Lucene/Solr – … • Query transformation – Full control of query model. Result sources: • Source system – Local SP 2013 index – Remote SP 2013 index – OpenSearch • Query transformation – Subset of content
  • 29. [Search architecture diagram. Labels: Crawl, Admin, Link, Analytics Reporting. Legend: Public API; Unit of scale/role boundary; Custom components. Content sources: HTTP, File shares, SharePoint, User profiles, Lotus Notes, Documentum, Exchange folders, Custom - BCS]
  • 30. Search UX examples have been removed from the presentation to preserve client IP. Please contact Petter or Tallak if you would like to discuss search user experience.
  • 31. How do you index millions of documents in thousands of file shares in hundreds of locations? Bonus! Support governance and operations
  • 32. Challenges • Max 50 content sources per service application • Max 100 start addresses per content source • Max 20 concurrent crawls per service application • Limit bandwidth usage for specific server locations • Limit crawler impact within local business hours • Grant read access to crawler per file share • Avoid token bloat issues with more than 1000 groups per account • Manage indexing and crawling of each file share with minimum manual effort
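The first three limits put a hard ceiling on the naive approach of one start address per file share. A minimal Python sketch of that arithmetic follows; the function names and the example share count are illustrative, and only the three limits come from the slide.

```python
import math

# Limits quoted on the slide.
MAX_CONTENT_SOURCES = 50
MAX_START_ADDRESSES_PER_SOURCE = 100
MAX_CONCURRENT_CRAWLS = 20


def content_sources_needed(file_share_count):
    """Content sources required if every file share is its own start address."""
    return math.ceil(file_share_count / MAX_START_ADDRESSES_PER_SOURCE)


def fits_one_address_per_share(file_share_count):
    """True if the naive layout stays within the content source limit."""
    return content_sources_needed(file_share_count) <= MAX_CONTENT_SOURCES


def crawl_waves(content_source_count):
    """Number of rounds needed to crawl all content sources under the concurrency limit."""
    return math.ceil(content_source_count / MAX_CONCURRENT_CRAWLS)


max_shares = MAX_CONTENT_SOURCES * MAX_START_ADDRESSES_PER_SOURCE
print(max_shares, content_sources_needed(3000), crawl_waves(30))
```

Even when the share count fits, every share would still be a manually managed start address, which is the last challenge on the list.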
  • 33. A Proven Approach • Symbolic links in a smart folder structure: \\impact\files\source\impact\account\symlink • Content sources per region with smart start addresses: file://impact/files/source/impact • Content enrichment to fix file paths in results • Custom application for managing file shares and granting access to the crawler • Host aliases for crawler impact • Custom timer job that syncs custom lists from the custom app • Custom timer job that creates/removes symbolic links • Custom list: Locations – Map server prefix to content source – Map location to schedule and impact • Custom list: File shares – Map share to crawl account – Map UNC to symlink – Map share-specific metadata
  • 34. Example Solution. Files in Norway: • Incremental crawl every 6 hours • Start address: file://default/files/norway/default. Files in India: • Incremental crawl every night at 21:00 IST • Start address: file://reduced/files/india/reduced. Crawl rules: • file://*/user1/* account=user1 • file://*/user2/* account=user2. Crawler impact rules: • Server name: default • Server name: reduced, wait 60 secs. Folders: • files/norway/default/user1/symlink1 • files/norway/default/user1/symlink2 • files/norway/default/user2/symlink3 • files/india/reduced/user1/symlink4 • files/india/reduced/user1/symlink5 • files/india/reduced/user2/symlink6. Custom list: Locations • Server Prefix: osl • Content Source: norway • Crawler Impact: default. Custom list: File Shares • UNC Path: osl-file01share1hr • Crawl Account: user2 • Symlink: files/norway/default/user2/symlink3
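The naming scheme in the example can be expressed as a few string-building functions. The Python sketch below derives the symlink folder, start address and crawl rule from the two custom lists, and shows how content enrichment could map a crawled URL back to the real share. All class and function names are hypothetical, and the UNC path is an assumed reconstruction because the backslashes are missing from the slide text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Location:
    server_prefix: str    # e.g. "osl"
    content_source: str   # e.g. "norway"
    crawler_impact: str   # host alias, e.g. "default" or "reduced"


@dataclass(frozen=True)
class FileShare:
    unc_path: str
    crawl_account: str
    symlink_name: str


def symlink_folder(location, share):
    """Folder under which the timer job would create the symbolic link."""
    return (f"files/{location.content_source}/{location.crawler_impact}/"
            f"{share.crawl_account}/{share.symlink_name}")


def start_address(location):
    """Start address of the content source for this location."""
    return (f"file://{location.crawler_impact}/files/"
            f"{location.content_source}/{location.crawler_impact}")


def crawl_rule(share):
    """Crawl rule pattern that selects the crawl account for this share."""
    return f"file://*/{share.crawl_account}/*"


def fix_result_path(crawled_url, location, share):
    """Map a crawled symlink URL back to the original UNC path."""
    prefix = f"{start_address(location)}/{share.crawl_account}/{share.symlink_name}"
    if crawled_url != prefix and not crawled_url.startswith(prefix + "/"):
        return crawled_url  # not under this symlink, leave untouched
    rest = crawled_url[len(prefix):].replace("/", "\\")
    return share.unc_path + rest


norway = Location("osl", "norway", "default")
# Assumed UNC path; the slide text has lost its backslashes.
hr_share = FileShare(r"\\osl-file01\share1\hr", "user2", "symlink3")

print(symlink_folder(norway, hr_share))
print(start_address(norway))
print(crawl_rule(hr_share))
```

Keeping the account and the impact alias in the path is what lets a handful of crawl rules and crawler impact rules cover thousands of shares.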
  • 35. Discover associations in your indexed data using custom entity extractors Explore how your indexed data is associated with terms often used by your business • Examples – Organization – Projects – Customers – Products
  • 36. Add metadata or clean up your indexed data using custom content enrichment • Based on where the items are located, add info about – Department – Information owner – Security classification • Look up name based on user account • Remove company name from title for all web pages • Normalize names • Normalize phone numbers • Fix search result link
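Two of these clean-up steps are plain string transformations. The Python sketch below shows only the transformation logic, not the content enrichment web service that would host it; the company name "Contoso", the title separators and the default country code are assumptions made for the example.

```python
import re


def strip_company_suffix(title, company):
    """Remove a trailing ' - Company' or ' | Company' from a page title."""
    pattern = r"\s*[-|]\s*" + re.escape(company) + r"\s*$"
    return re.sub(pattern, "", title, flags=re.IGNORECASE)


def normalize_phone(raw, default_country_code="+47"):
    """Reduce a phone number to '+<country code><digits>' form."""
    digits = re.sub(r"[^\d+]", "", raw)      # drop spaces, dashes, brackets
    if digits.startswith("00"):
        digits = "+" + digits[2:]            # 0047... becomes +47...
    if not digits.startswith("+"):
        digits = default_country_code + digits
    return digits


print(strip_company_suffix("Travel policy - Contoso", "Contoso"))
print(normalize_phone("22 33 44 55"))
print(normalize_phone("0047 22 33 44 55"))
```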
  • 37. Synchronize Terms with Search Spelling and Synonyms Dictionaries. [Diagram: one «Custom Timer Job» synchronizes the spelling inclusion list and another «Custom Timer Job» synchronizes the thesaurus in the SSA]
  • 38. How fast can you find what you are searching for? • What should be indexed? • What should be searchable? • What should be displayed? - Relevancy - Recall – Precision - • How to weight a managed property? • How to change the ranking model? • How to tune ranking?
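Recall and precision have standard definitions that are easy to compute once you know which documents are relevant for a query. A minimal Python sketch with made-up document IDs:

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one query.

    precision = relevant results returned / all results returned
    recall    = relevant results returned / all relevant documents
    """
    retrieved_set, relevant_set = set(retrieved), set(relevant)
    hits = len(retrieved_set & relevant_set)
    precision = hits / len(retrieved_set) if retrieved_set else 0.0
    recall = hits / len(relevant_set) if relevant_set else 0.0
    return precision, recall


# Four results returned, two of them relevant; three relevant documents exist.
precision, recall = precision_recall(["a", "b", "c", "d"], ["a", "c", "e"])
print(precision, recall)
```

Indexing more content tends to raise recall, while restricting what is searchable and tuning ranking is how precision is protected.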
  • 39. Managed Property Weighting These are not ordered by importance!
  • 40. Change Ranking Model • The default ranking model in SP 2013 did not fit us! – PowerPoint files always won – Complete matches in site titles and document titles were outranked by the number of partial matches in the body – Community sites were weighted lower than discussions and posts. We replaced the SP 2013 ranking model with the SP 2010 ranking model.
  • 41. Tune Ranking Model Microsoft will soon release a tool for tuning ranking models! 1. Select ranking model to tune 2. Select result source to search 3. Add judgement sets 4. Add queries to judgement sets 5. Run queries and evaluate results 6. Add and tune features 7. Save and publish model
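The idea behind judgement sets can be illustrated without the tool. The Python sketch below scores a ranking against graded judgements using NDCG, a common ranking metric chosen here for illustration; the slide does not say which metric the Microsoft tool uses, and the function names, grades and document IDs are all made up.

```python
import math


def dcg(grades):
    """Discounted cumulative gain: higher grades count more near the top."""
    return sum(g / math.log2(rank + 1) for rank, g in enumerate(grades, start=1))


def ndcg(ranked_doc_ids, judgements, k=10):
    """NDCG@k for one query; `judgements` maps document id to a relevance grade."""
    gains = [judgements.get(doc, 0) for doc in ranked_doc_ids[:k]]
    ideal = sorted(judgements.values(), reverse=True)[:k]
    ideal_dcg = dcg(ideal)
    return dcg(gains) / ideal_dcg if ideal_dcg > 0 else 0.0


def evaluate(judgement_set, run_query, k=10):
    """Average NDCG over all judged queries; `run_query` returns ranked document ids."""
    if not judgement_set:
        return 0.0, {}
    scores = {q: ndcg(run_query(q), judged, k) for q, judged in judgement_set.items()}
    return sum(scores.values()) / len(scores), scores


judged = {"d1": 3, "d2": 2, "d3": 0}
perfect = ndcg(["d1", "d2", "d3"], judged)
reversed_order = ndcg(["d3", "d2", "d1"], judged)
print(perfect, round(reversed_order, 3))
```

Re-running the same judgement sets after each change to the model gives a repeatable measure of whether tuning helped or hurt.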
  • 42. THE END Petter Skodvin-Hvammen psh@adgruppen.no @pettersh Tallak Hellebust tallak.hellebust@comperiosearch.com @titakker

Editor's notes

  1. Look at what users need and which challenges arise in their daily work. Which technical possibilities and limitations exist? What are the company's goals?
  2. Development Environment. OS: Windows Server 2008 R2 SP1. CPU: 4 cores. Memory: 8 GB -> 16 GB. Disk: Fast disks. Visual Studio 2012. SQL Server 2012 (Max server memory: 1500 MB).
  3. Dedicated search farm for 40 million searchable items and 10 queries per second. Front end server to host your search UI. One index server per 10 million items (10, 20, 30 and 40 million items). Server to host crawling, analytics processing, Central Administration and other SharePoint application services. Query and results processing, search administration, document processing. Database server. Load-balanced front end and redundant admin and query processing. Index replicas for redundancy and increased throughput. Extra crawl component per 20 million items and for redundancy. Cluster or mirror the database server for fault tolerance. Multiple data centers for disaster scenarios. For advanced query and result processing, put Comperio FRONT between your search center and the REST API. For advanced content enrichment, deploy your content enrichment web services.
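The sizing rules in this note can be turned into a quick estimate. The Python sketch below uses the two ratios from the note (one index partition per 10 million items, one crawl component per 20 million items); the default of two replicas per partition is an assumption based on the diagram, and the function name is illustrative.

```python
import math

ITEMS_PER_INDEX_PARTITION = 10_000_000   # from the note
ITEMS_PER_CRAWL_COMPONENT = 20_000_000   # from the note


def size_search_farm(total_items, replicas_per_partition=2):
    """Estimate index and crawl component counts for a given item count."""
    partitions = max(1, math.ceil(total_items / ITEMS_PER_INDEX_PARTITION))
    crawl_components = max(1, math.ceil(total_items / ITEMS_PER_CRAWL_COMPONENT))
    return {
        "index_partitions": partitions,
        # Each partition is hosted by one index component per replica.
        "index_components": partitions * replicas_per_partition,
        "crawl_components": crawl_components,
    }


farm_40m = size_search_farm(40_000_000)
print(farm_40m)
```

This covers component counts only; query throughput, redundancy beyond replicas and database sizing need their own assessment.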
  4. 7,2 TB of disk in total (1,8 TB system disk + 5,4 TB data disk)
  5. Funding: System requirements have increased. Infrastructure investments are massive. There must be a significant PAIN to solve. Time: to analyse requirements; to purchase and set up the infrastructure; to get to know all the new stuff; to build and deploy your customizations. Documentation: We were early adopters, so there was not much to find on Google, MSDN or TechNet. Network: Knowing someone who knows something… Automation: You will need to re-install SharePoint. You will re-deploy your solutions. AutoSPInstaller, custom cmdlets and scripts. Performance. CPU: increased from 4 to 8 cores on the dev env. Memory: increased from 8 GB to 16 GB on the dev env (paging); increased from 16 GB per SQL Server to 16 GB per database instance. Disk IO: You need enough disk spindles to handle the IO. You need to configure your SAN correctly. Opt out of dynamic disk solutions. Load balancer: Turn off sticky sessions and trust the distributed cache. Test and tune timeouts. Distributed cache: Configure enough memory. Antivirus: Turn it off. Exclude the index folder ++
  6. The purpose of the search capacity test is to validate the documented and undocumented soft boundaries in Microsoft SharePoint Server 2013, with focus on: the maximum number of documents in a search partition; the maximum number of documents in a crawl database; the architecture for crawling a large number of file shares; getting an initial picture of search and crawl performance. Crawled 30 million documents from file shares via symbolic links on the crawler server. Tested 20,000 searches per day using the top 300 search queries from the search statistics. 4-server farm with 2 index partitions, 2 crawl components and 1 crawl database.
  7. Slide shows actual numbers with 31 million items indexed
  8. Display templates control which managed properties are shown in the search results, and how they appear in the Web Part. Each display template is made of two files: an HTML version of the display template that you can edit in your HTML editor, and a .js file that SharePoint uses. Control templates determine the overall structure of how the results are presented. This includes lists, lists with paging, and slide shows. Item templates determine how each result in the set is displayed. This includes images, text, video, and other items. Group templates are specific to search results and are used for the HTML surrounding grouped items. Hover templates are used for presenting more information on a search result hit. An item template and a hover template are connected.
  9. How display templates are structured: Control, Group, Item
  10. Hover
  11. API: A simple interface for querying SharePoint without needing the SharePoint libraries. Easy to test and consume.
  12. Query rules conditions: Query matches string exactly; Query contains string; Query matches dictionary exactly; Query more common in source; Result type commonly clicked; Advanced query matching
  13. What should be indexed? Content sources and start addresses; content types / file types; crawl rules for exclusions. What parts of the indexed content should be searchable? Full-text index; fielded search; refiners. What should be displayed? In search suggestions; in search results; in search flyouts.