Examining the new
Search core in SP2013
Marcus Johansson




OSLO STOCKHOLM   LONDON   BOSTON   SINGAPORE
Marcus Johansson
• Senior Consultant, Comperio
• V-TSP Enterprise Search, Microsoft




Email:      marcus.johansson@comperiosearch.com
Twitter:    @marcjoha
Blog:       http://blog.comperiosearch.com
LinkedIn:   http://www.linkedin.com/in/marcusjohansson
End of an era, birth of a New age
• FAST now “fully integrated”
   – True, but there’s more!

• No longer a “FAST license”
  – SP2013 contains everything
  – Enterprise version

• Migration from FS4SP?
  – Brr…

1997 – 2013
The evolution of FAST
[Timeline diagram. Labels: FDS, ESP, FSIS, FSIA, FS4SP, Search in SP2010, Secret sauce (incl. Mars), Search in SP2013]
All this talk about the new Sheriff…
• Search in SP2013 gets a lot of attention
   – Revamped user/admin interface
   – Hover panels, previews
   – Query rules, result blocks
   – Result types, display templates
   – “You’ve seen this result before”
   – Query Builder
   – Content Search web part
   – Etc.
• Notice the pattern?
…what Search in SP13 really is

• Empowering the whole SharePoint experience
• Better, more powerful extensibility
• Major user experience overhaul
• Finally a single search architecture
• Vastly improved search core




• How come most of the buzz is about the UX?
For the first time, search isn’t defined by its nuts and bolts,
but by the user experience and the high-level tools around it.
Examining SharePoint 2013’s

NEW SEARCH CORE
Search architecture
[Architecture diagram. Legend: public API, extensibility points, unit of scale/role boundary. Surviving component labels: Crawl, Link, Analytics, Reporting, Admin]
The search components
• A “node” is an instance of a component
• Scale by adding nodes
RESTful interfaces
• Directly interact with SharePoint artifacts by
  using any technology supporting REST




• Also:
  – CSOM: JavaScript, Silverlight
  – SSOM: managed code
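
As a rough illustration of the REST point above, here is a minimal sketch that builds a query URL for the search REST endpoint (`/_api/search/query` with the `querytext` and `rowlimit` parameters). The host name and the helper function are hypothetical, and the sketch only builds the URL; authentication and the actual HTTP call are left out.

```python
from urllib.parse import quote


def build_search_url(site_url, query_text, row_limit=10):
    """Build a GET URL for the SharePoint 2013 search REST endpoint.

    site_url and this helper are illustrative assumptions, not part of the deck.
    """
    # The query text is wrapped in single quotes and percent-encoded.
    encoded = quote(query_text, safe="")
    base = site_url.rstrip("/")
    return f"{base}/_api/search/query?querytext='{encoded}'&rowlimit={row_limit}"


# Hypothetical intranet host, used only for the example.
url = build_search_url("http://intranet.example.com", "search core", row_limit=5)
print(url)
```

Any client that can issue an authenticated HTTP GET against such a URL can consume the results, which is the point of the slide: no SharePoint-specific library is required.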
The new Search Service Application
Keeping it all together
            Services                                 Processes




Process name                Description
hostcontrollerservice.exe   Process controller. Monitors and restarts children.
noderunner.exe              A search component (except the crawl component)
mssearch.exe                The crawl component.
Crawl component
[Diagram: the crawl component, running as mssearch.exe]
• Changes from SP2010
  – Only crawling
    • No indexing
  – Continuous crawl
    • Improves freshness
  – Crawl Log
    • More details
    • Document removal
  – Crawl Health Report
    • Huge improvement!
Continuous crawls
• Not event-driven indexing
• Starts crawl regardless of prior crawl session
• Large change sets no longer bad for freshness

[Timeline diagram: full and incremental crawls compared with continuous crawls, which start at a default interval of 15 min]



• Only available for content sources of type SharePoint Sites
  – Possible to crawl SP 2010 and 2007
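
The difference in scheduling can be sketched with a toy model. It assumes, as a simplification, that an incremental crawl has to wait for the previous crawl to finish and then starts on the next schedule tick, while a continuous crawl starts on every tick regardless. The function names and durations are made up for illustration; this is not how SharePoint schedules crawls internally.

```python
def incremental_starts(interval, durations):
    """Start times when each crawl waits for the previous one to finish.

    Assumes positive integer durations, in the same unit as interval.
    """
    starts = []
    t = 0
    for duration in durations:
        starts.append(t)
        end = t + duration
        # Round up to the next schedule tick at or after the crawl has ended.
        t = -(-end // interval) * interval
    return starts


def continuous_starts(interval, count):
    """Start times when a new crawl starts on every tick, regardless of prior crawls."""
    return [i * interval for i in range(count)]


# A large change set makes the first crawl take 40 minutes.
print(incremental_starts(15, [40, 10, 10]))  # [0, 45, 60]
print(continuous_starts(15, 3))              # [0, 15, 30]
```

In the toy model, the long first crawl delays every later incremental crawl, while the continuous schedule keeps picking up new changes every 15 minutes.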
Crawl health reports
• Crawl rate per type, crawl load
• Rate, latency, freshness, CPU and memory load, content processing activity, etc.
Crawl component performance
• Anecdotal: feels faster, more stable
• Bound by CPU and network
  – Documents per second
  – Link discovery
• Some I/O – files temporarily stored on disk
• Adjust performance by:
  – Crawler impact rules
  – Performance level (number of threads)

  Set-SPEnterpriseSearchService -PerformanceLevel <Reduced | PartlyReduced | Maximum>
Content processing component
• Schema mapping
   – Crawled → managed properties
• Entity extraction
   – Companies and custom
• Advanced Filter Pack is gone
   – Though PDFs are out of the box
• Extensible through web service
• Internally: processing flows
   – Replaces pipeline in FS4SP
   – Based on FSIS/CTS. Hidden
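
Schema mapping can be pictured as a lookup from crawled property names to managed property names. The sketch below is only an illustration of that idea: the property names, the mapping table and the helper are hypothetical, and real mappings are configured in the search schema rather than in code.

```python
# Hypothetical crawled-to-managed property mappings, for illustration only.
MAPPINGS = {
    "ows_Title": "Title",
    "ows_Author": "Author",
}


def map_properties(crawled, mappings):
    """Keep only crawled properties that have a mapping, renamed to their managed property."""
    managed = {}
    for name, value in crawled.items():
        target = mappings.get(name)
        if target is not None:
            managed[target] = value
    return managed


# An unmapped crawled property ("ows_Internal") does not reach the index.
item = {"ows_Title": "Budget 2013", "ows_Author": "A. Author", "ows_Internal": "x"}
print(map_properties(item, MAPPINGS))
```

The design point is that only managed properties end up in the index; crawled properties without a mapping are dropped along the way.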
Content processing flows
• Hidden in SP2013. In FSIS, flows could be
  created, modified and debugged in real-time.

• Why on earth
  was this not
  included in
  SP2013!?



[Screenshot: the flow designer in FSIS, not available in SP2013.]
Processing flow execution
Index component
            • Disk-based and atomic(!)

            • Divided into partitions
            • One partition per 10M docs

            • 1 partition contains 1+ replicas
               – fault-tolerance
               – query volume
            • 1 replica, 1 server

            • Indexing partially in-memory
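
The sizing rules above lend themselves to a quick back-of-the-envelope calculation. This sketch assumes the figures from the slide (one partition per 10M documents, one replica per server); the function name and the example numbers are illustrative only.

```python
import math

ITEMS_PER_PARTITION = 10_000_000  # "One partition per 10M docs" from the slide


def index_layout(total_items, replicas_per_partition=1):
    """Estimate partitions and index replicas (one server each) for a corpus size."""
    partitions = max(1, math.ceil(total_items / ITEMS_PER_PARTITION))
    replicas = partitions * replicas_per_partition
    return {"partitions": partitions, "replicas": replicas}


# 25M documents with two replicas per partition for fault tolerance.
print(index_layout(25_000_000, replicas_per_partition=2))
# {'partitions': 3, 'replicas': 6}
```

Adding partitions grows capacity, while adding replicas per partition adds fault tolerance and query throughput, so the two numbers are tuned independently.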
Example: Partitions and replicas




[Diagram: replicas of one partition hold the same content; different partitions hold different content]
Query processing component
• Prepares the queries
   – Query rules
   – Result sources
   – Linguistics/dictionaries
   – Etc.
• Manipulates the results
   – Display templates
   – Late security trimming
   – Etc.

• Internally: processing flows
   – Derived from FSIS/IMS. Again, this is hidden
   – Still MAJOR improvement compared to FS4SP
Query rules
• For a certain term → trigger a certain action:
   – Add/change query terms
   – Use alternate sorting/relevance
   – Hybrid search (or other federated results)
   – Etc.
• Replaces search keywords in SP2010
• Configure at farm, site collection or site-level
• Warning: Triggering the query rules engine comes
  with a penalty
   – Anecdotal tests: roughly 70 ms of added latency, excluding any parallel queries
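
The "term triggers action" idea can be sketched as a small rule table. Everything here is hypothetical (the rule format, the field names and the example rules); real query rules are configured through the SharePoint administration UI, and this only mimics two of the listed actions: changing the query terms and switching the sort order.

```python
def apply_rules(query, rules):
    """Apply every rule whose trigger term appears in the query (toy model)."""
    terms = query.lower().split()
    result = {"query": query, "sort": "relevance"}
    for rule in rules:
        if rule["term"] in terms:
            if "append" in rule:
                # Action: add query terms.
                result["query"] = f"{result['query']} {rule['append']}"
            if "sort" in rule:
                # Action: use alternate sorting.
                result["sort"] = rule["sort"]
    return result


# Hypothetical rules, for illustration only.
RULES = [
    {"term": "budget", "append": "filetype:xlsx"},
    {"term": "news", "sort": "date"},
]

print(apply_rules("budget 2013", RULES))
# {'query': 'budget 2013 filetype:xlsx', 'sort': 'relevance'}
```

Even in this toy form, every query is checked against every rule, which hints at why triggering the rules engine carries a latency cost.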
Query builder
• Easily builds KQL
  – CSWP, result sources, query rules, etc.
Query client types
• Adjust throttling per client type
Query health reports
• Latency per processing node in SharePoint flow
• Trend, overall, latency in main flow, latency in each subflow, index times, etc.
Analytics processing component
• Analyzes crawled items and search usage
• Updates index without re-indexing documents
• Result: relevance becomes self-learning
  – Also: search reports and recommendations



Type 1: Search analytics
Influences relevance
Type                Description
Anchor processing   Comparable to Google PageRank.
Click Distance      Number of clicks to an authoritative page.
Search clicks       Keeps track of how users click in the results.

Used in search center
Type                Description
Social tags         Tags that users apply to content. Not used by default,
                    but could be integrated as e.g. refiners.
Social distance     Used for sorting in People search.
Deep links          Subsites that users click on are added as deep links on
                    the top-site result.
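
Click distance, as described in the table above, is a shortest-path count from an authoritative page. A minimal sketch using breadth-first search follows; the link graph and page names are invented for the example, and the actual analytics implementation is not described in the deck.

```python
from collections import deque


def click_distances(links, authoritative):
    """Breadth-first search: number of clicks from the nearest authoritative page."""
    dist = {page: 0 for page in authoritative}
    queue = deque(authoritative)
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in dist:
                dist[target] = dist[page] + 1
                queue.append(target)
    return dist


# Hypothetical intranet link graph: page -> pages it links to.
LINKS = {
    "home": ["hr", "it"],
    "hr": ["policies"],
    "it": ["policies", "helpdesk"],
    "policies": ["vacation"],
}

print(click_distances(LINKS, ["home"]))
# {'home': 0, 'hr': 1, 'it': 1, 'policies': 2, 'helpdesk': 2, 'vacation': 3}
```

Pages with a small click distance sit close to an authoritative page and can be treated as more relevant than pages buried deep in the site.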
Type 2: SP usage analytics
• Usage counts
  – Opened and viewed items
  – From all of SharePoint, not just search results
  – Improves relevance
• Activity ranking
  – Looks for trends and boosts “hot” items
• Recommendations
  – Looks for usage patterns within a site
  – “People who viewed this also viewed…”
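
The "people who viewed this also viewed" idea can be approximated by counting co-views. The sketch below is a simple co-occurrence count over hypothetical view sessions; it illustrates the concept only and makes no claim about how the analytics component computes recommendations.

```python
from collections import Counter


def also_viewed(sessions, item, top=3):
    """Rank other items by how often they were viewed together with the given item."""
    counts = Counter()
    for viewed in sessions:
        if item in viewed:
            counts.update(set(viewed) - {item})
    # Sort by count (descending), then by name, so the result is deterministic.
    ranked = sorted(counts, key=lambda name: (-counts[name], name))
    return ranked[:top]


# Hypothetical usage data: the items each user viewed within one site.
SESSIONS = [
    ["a.docx", "b.docx", "c.docx"],
    ["a.docx", "b.docx"],
    ["b.docx", "c.docx"],
    ["a.docx", "d.docx"],
]

print(also_viewed(SESSIONS, "a.docx", top=2))
# ['b.docx', 'c.docx']
```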
Search reports
• Self-learning relevance aside,
  never underestimate manual effort!
   – Query rules, synonyms, boosts, etc.
• Automatic reports:
   – Number of queries
   – Top queries
   – Abandoned queries
   – No-result queries
   – Query rule usage
Search administration component
• Provisions other search components
• Talks to Admin database on behalf of:
  Crawl, Content and Query processing
  components
• In previous FAST products, it was impossible to
  make the admin component redundant
   – Not the case in SP2013!
   – Scale appropriately
Hardware properties
Component              CPU      Memory   Disk I/O   Network
Crawl                  Medium   Medium   Medium     High
Content processing     High     High                Medium
Index                  High     High     High       Medium
Query processing       Low      Medium              Medium
Analytics processing   Medium   Medium   Medium     High
Search administration  Low      Low      Low        Low

• Special cases
   – Crawler temporarily stores files on disk
   – Memory usage of admin component increases
     with topology size
Changes in HW requirements


Before (FS4SP):
• I/O-bound, lots of IOPS!
• VMs not recommended
• Often issues with SANs
• Thresholds:
   – 15M items/server
   – Tested at 500M items

Now (SP2013):
• Still I/O-bound, but:
   – VMs are fine!
   – SANs are fine!
• More RAM required, but:
   – Lower indexing latency
   – Lower search times
• Thresholds:
   – 10M items/server
   – Tested at 500M items
A note on RAM consumption
• Search is a BIG thief of RAM in SP13
• Memory limit configurable in:
  <15 hive>\Search\Runtime\1.0\noderunner.exe.config


  – Warning: Components may crash at limit
• Safer options:
  – Decrease memory limit for the
    Distributed Cache service.
  – Tell your boss:
    “RAM is cheap. I’m not!”
Questions?




Thank you!

Email:      marcus.johansson@comperiosearch.com
Twitter:    @marcjoha
Blog:       http://blog.comperiosearch.com
LinkedIn:   http://www.linkedin.com/in/marcusjohansson

Examining the new search core in SharePoint 2013

  • 1. Examining the new Search core in SP2013 Marcus Johansson OSLO STOCKHOLM LONDON BOSTON SINGAPORE
  • 2. Marcus Johansson • Senior Consultant, Comperio • V-TSP Enterprise Search, Microsoft Email: marcus.johansson@comperiosearch.com Twitter: @marcjoha Blog: http://blog.comperiosearch.com LinkedIn: http://www.linkedin.com/in/marcusjohansson
  • 3. End of an era, birth of a New age • FAST now “fully integrated” – True, but there’s more! • No longer a “FAST license” – SP2013 contains everything – Enterprise version • Migration from FS4SP? – Brr…  1997 – 2013
  • 4. The evolution of FAST Secret sauce (incl. Mars) FSIS Search in FDS ESP FS4SP SP2013 FSIA Search in SP2010
  • 5. All this talk about the new Sheriff… • Search in SP2013 gets a lot of attention – Revamped user/admin interface – Hover panels, previews – Query rules, result blocks – Result types, display templates – “You’ve seen this result before” – Query Builder – Content Search web part – Etc. • Notice the pattern?
  • 6. …what Search in SP13 really is Empowering Better, more Major user the whole powerful experience SharePoint extensibility overhaul experience Finally a Vastly single search improved architecture search core • How come most of the buzz is about the UX?
  • 7. For the first time, Search isn’t defined by the nuts and bolts, but from the User Experience and high-level tools around it.
  • 9. Search architecture Public API Extensibility Points Unit of scale/role boundary Crawl Link Analytics Admin Reporting
  • 10. The search components • A “node” is an instance of a component • Scale by adding nodes
  • 11. RESTful interfaces • Directly interact with SharePoint artifacts by using any technology supporting REST • Also: – CSOM JavaScript, Silverlight – SSOM Managed code
  • 12. The new Search Service Application
  • 13. Keeping it all together Services Processes Process name Description hostcontrollerservice.exe Process controller. Monitors and restarts children. noderunner.exe A search component (except the crawl component) mssearch.exe The crawl component.
  • 14. Crawl component • Changes from SP2010 mssearch.exe – Only crawling • No indexing – Continuous crawl • Improves freshness – Crawl Log • More details • Document removal Crawl – Crawl Health Report • Huge improvement!
  • 15. Continuous crawls • Not event-driven indexing • Starts a crawl regardless of the prior crawl session • Large change sets no longer bad for freshness • Default interval: 15 min • Only available for SharePoint content sources – Possible to crawl SP 2010 and 2007 [Timeline figure: full and incremental crawls vs. continuous crawls over time]
  • 16. Crawl health reports • Rate • Latency • Freshness • CPU and memory load • Content processing activity • Etc. [Screenshots: crawl rate per type, crawl load]
  • 17. Crawl component performance • Anecdotal: feels faster, more stable • Bound by CPU and network – Documents per second – Link discovery • Some I/O – files temporarily stored on disk • Adjust performance by: – Crawler impact rules – Performance level (number of threads) Set-SPEnterpriseSearchService -PerformanceLevel X
  • 18. Content processing component • Schema mapping – Crawled → managed properties • Entity extraction – Companies and custom • Advanced Filter Pack is gone – Though PDFs are supported out of the box • Extensible through web service • Internally: processing flows – Replaces the pipeline in FS4SP – Based on FSIS/CTS. Hidden.
  • 19. Content processing flows • Hidden in SP2013. In FSIS, flows could be created, modified and debugged in real-time. • Why on earth was this not included in SP2013!? [Figure: the flow designer in FSIS, not available in SP2013]
  • 21. Index component • Disk-based and atomic(!) • Divided into partitions • One partition per 10M docs • 1 partition contains 1+ replicas – fault-tolerance – query volume • 1 replica, 1 server • Indexing partially in-memory
  • 22. Example: Partitions and replicas [Figure labels: same content (replicas), different content (partitions)]
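The sizing rules from the index component slide ("one partition per 10M docs", "1 replica, 1 server") can be turned into a small back-of-the-envelope calculation. This is only a sketch under those two rules; the function name `plan_index` is made up for the example, and real topologies need to account for hardware and query load as well.

```python
import math

# "One partition per 10M docs", as stated on the index component slide.
ITEMS_PER_PARTITION = 10_000_000


def plan_index(total_items, replicas_per_partition):
    """Return (partitions, index_servers) for a corpus of total_items.

    Each replica lives on its own server ("1 replica, 1 server"),
    so servers = partitions * replicas per partition.
    """
    if total_items < 0 or replicas_per_partition < 1:
        raise ValueError("need a non-negative item count and at least one replica")
    partitions = max(1, math.ceil(total_items / ITEMS_PER_PARTITION))
    return partitions, partitions * replicas_per_partition


if __name__ == "__main__":
    partitions, servers = plan_index(25_000_000, replicas_per_partition=2)
    print(partitions, "partitions on", servers, "index servers")
```

Adding replicas raises fault tolerance and query capacity without changing the number of partitions; adding partitions raises the amount of content the index can hold.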
  • 23. Query processing component • Prepares the queries – Query rules – Result sources – Linguistics/dictionaries – Etc. • Manipulates the results – Display templates – Late security trimming – Etc. • Internally: processing flows – Derived from FSIS/IMS. Again, this is hidden. – Still a MAJOR improvement compared to FS4SP
  • 24. Query rules • For a certain term → trigger a certain action: – Add/change query terms – Use alternate sorting/relevance – Hybrid search (or other federated results) – Etc. • Replaces search keywords in SP2010 • Configure at farm, site collection or site level • Warning: triggering the query rules engine comes with a penalty – Anecdotal tests: ~70 ms or more, excluding parallel queries
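The "term triggers action" idea can be modelled in a few lines. The sketch below is a toy model written for this transcript, not the SharePoint query rules engine: the `QueryRule` class, its fields and the sample rule are all hypothetical, and only the "add query terms" action is covered.

```python
from dataclasses import dataclass


@dataclass
class QueryRule:
    trigger: str      # term that fires the rule (hypothetical field)
    extra_terms: str  # KQL appended to the query when the rule fires


def apply_rules(query, rules):
    """Return the query after every matching rule has added its terms."""
    # Match on whole words of the original query, case-insensitively.
    words = query.lower().split()
    for rule in rules:
        if rule.trigger.lower() in words:
            query = f"({query}) AND {rule.extra_terms}"
    return query


if __name__ == "__main__":
    rules = [QueryRule(trigger="handbook", extra_terms="filetype:pdf")]
    print(apply_rules("employee handbook", rules))
```

Queries that contain no trigger term pass through unchanged, which mirrors why the rule check itself is the only fixed cost for them.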
  • 25. Query builder • Easily builds KQL – CSWP, result sources, query rules, etc.
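To give a feel for the kind of KQL the Query Builder produces, here is a rough sketch that combines free text with property restrictions of the form `property:value`. The helper `build_kql` is an invention for this example; the Query Builder itself is a UI tool and supports far more than this.

```python
def build_kql(free_text="", **property_filters):
    """Combine free text and property restrictions into one KQL string.

    Property names are passed as keyword arguments, e.g. author="...".
    """
    parts = []
    text = free_text.strip()
    if text:
        # Group multi-word free text so the AND operators apply to all of it.
        parts.append(f"({text})" if " " in text else text)
    for name, value in property_filters.items():
        value = str(value)
        # Values containing spaces must be quoted in KQL.
        if " " in value:
            value = f'"{value}"'
        parts.append(f"{name}:{value}")
    return " AND ".join(parts)


if __name__ == "__main__":
    print(build_kql("search core", author="Marcus Johansson", filetype="pptx"))
```

The same string could be pasted into a result source, a query rule or a Content Search web part, which is where the slide says the builder is used.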
  • 26. Query client types • Adjust throttling per client type
  • 27. Query health reports • Trend • Overall • Latency in main flow • Latency in each subflow • Index times • Etc. [Screenshot: latency per processing node in SharePoint flow]
  • 28. Analytics processing component • Analyzes crawled items and search usage • Updates index without re-indexing documents • Result: relevance becomes self-learning – Also: search reports and recommendations [Databases in diagram: Link, Analytics Reporting]
  • 29. Type 1: Search analytics
    • Influences relevance
      – Anchor processing: Comparable to Google PageRank.
      – Click distance: Number of clicks to an authoritative page.
      – Search clicks: Keeps track of how users click in the results.
    • Used in search center
      – Social tags: Tags that users apply to content. Not used by default, but could be integrated as e.g. refiners.
      – Social distance: Used for sorting in People search.
      – Deep links: Subsites that users click on are added as deep links on the top-site result.
  • 30. Type 2: SP usage analytics • Usage counts – Opened and viewed items – From all of SharePoint, not just search results – Improves relevance • Activity ranking – Looks for trends and boosts “hot” items • Recommendations – Looks for usage patterns within a site – “People who viewed this also viewed…”
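The "people who viewed this also viewed…" idea can be sketched with simple co-occurrence counting. This is an illustration of the general technique only; how SharePoint actually computes recommendations is not described on the slide, and the session data and function names below are made up.

```python
from collections import Counter, defaultdict
from itertools import permutations


def co_view_counts(sessions):
    """Count how often each pair of items was viewed by the same user."""
    counts = defaultdict(Counter)
    for items in sessions:
        # set() so repeated views of one item in a session count once.
        for a, b in permutations(set(items), 2):
            counts[a][b] += 1
    return counts


def recommend(counts, item, top_n=3):
    """Items most often viewed together with item, most frequent first."""
    ranked = sorted(counts.get(item, Counter()).items(),
                    key=lambda kv: (-kv[1], kv[0]))
    return [other for other, _ in ranked[:top_n]]


if __name__ == "__main__":
    sessions = [["a.docx", "b.docx", "c.docx"],
                ["a.docx", "b.docx"],
                ["a.docx", "d.docx"]]
    print(recommend(co_view_counts(sessions), "a.docx", top_n=2))
```

Ties are broken alphabetically here purely to keep the output deterministic.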
  • 31. Search reports • Self-learning relevance aside, never underestimate manual effort! – Query rules, synonyms, boosts, etc. • Automatic reports: – Number of queries – Top queries – Abandoned queries – No-result queries – Query rule usage
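The automatic reports listed above are essentially aggregations over a query log. The sketch below shows that idea on a made-up log format of `(query_text, result_count, clicked)` tuples; the format, the function name and the definition of "abandoned" (results were returned but nothing was clicked) are assumptions for the example, not SharePoint's own definitions.

```python
from collections import Counter


def search_report(log):
    """Summarize an iterable of (query_text, result_count, clicked) tuples."""
    log = list(log)
    queries = [q.lower() for q, _, _ in log]
    return {
        "number_of_queries": len(log),
        "top_queries": Counter(queries).most_common(3),
        # No-result queries: the engine returned nothing.
        "no_result_queries": sorted({q.lower() for q, n, _ in log if n == 0}),
        # Abandoned queries (assumed meaning): results shown, none clicked.
        "abandoned_queries": sorted(
            {q.lower() for q, n, clicked in log if n > 0 and not clicked}
        ),
    }


if __name__ == "__main__":
    log = [("SharePoint", 12, True), ("sharepoint", 12, False),
           ("fast esp", 0, False), ("holiday", 3, False)]
    for name, value in search_report(log).items():
        print(name, value)
```

No-result and abandoned queries are the natural starting point for the manual work the slide recommends: synonyms, query rules and boosts.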
  • 32. Search administration component • Provisions other search components • Talks to the Admin database on behalf of the Crawl, Content processing and Query processing components • In previous FAST products, it was impossible to make the admin component redundant – Not the case in SP2013! – Scale appropriately
  • 33. Hardware properties
    • Load per component (columns: CPU, Memory, Disk I/O, Network)
      – Crawl: Medium, Medium, Medium, High
      – Content processing: High, High, Medium (three values only; one column is blank on the slide)
      – Index: High, High, High, Medium
      – Query processing: Low, Medium, Medium (three values only; one column is blank on the slide)
      – Analytics processing: Medium, Medium, Medium, High
      – Search administration: Low, Low, Low, Low
    • Special cases
      – Crawler temporarily stores files on disk
      – Memory usage of admin component increases with topology size
  • 34. Changes in HW requirements
    • Previously:
      – I/O bound, lots of IOPS!
      – VMs not recommended
      – Often issues with SANs
      – Thresholds: 15M items/server, tested at 500M items
    • In SP2013:
      – Still I/O-bound, but VMs are fine and SANs are fine!
      – More RAM required, but lower indexing latency and lower search times
      – Thresholds: 10M items/server, tested at 500M items
  • 35. A note on RAM consumption • Search is a BIG thief of RAM in SP13 • Memory limit configurable in: <15 hive>\Search\Runtime\1.0\noderunner.exe.config – Warning: Components may crash at limit • Safer options: – Decrease memory limit for the Distributed Cache service. – Tell your boss: “RAM is cheap. I’m not!”
  • 36. Questions? Thank you! Email: marcus.johansson@comperiosearch.com Twitter: @marcjoha Blog: http://blog.comperiosearch.com LinkedIn: http://www.linkedin.com/in/marcusjohansson