Duplicate Content & Multiple Site Issues
Sasi Parthasarathy
Program Manager, Microsoft
Topics covered

• Duplicate content
   – Internal content -> URL Canonicalization
   – External content -> Spam, Geo-targeting
• Content Syndication
• Good practices
• Examples Examples Examples
URL canonicalization

•   Less is more - expose only one URL per piece of content – pretty
    please
•   The practice of consolidating all versions of a page under one URL is
    referred to as "canonicalization"
•   Helps the search engine; at the same time does not split your rank juice
•   Having too many duplicate URLs will waste crawl time – the crawler might
    spend time indexing duplicate URLs and miss good content
•   4 ways to get to microsoft.com but we need only one
     1. microsoft.com
     2. www.microsoft.com
     3. www.microsoft.com/en/us/default.aspx
     4. www.microsoft.com/en/us/
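
The four variants above can be collapsed into one URL with a small normalization routine. The following is a minimal sketch, not how any search engine or microsoft.com actually does it. The choice of `www.microsoft.com` as the preferred host, `default.aspx` as the only default filename, and `/en/us/` as the page the bare domain lands on are all assumptions made for illustration.

```python
from urllib.parse import urlsplit, urlunsplit

# Assumptions for illustration only: preferred host, known aliases,
# default filenames, and the path the bare domain is assumed to serve.
CANONICAL_HOST = "www.microsoft.com"
HOST_ALIASES = {"microsoft.com", "www.microsoft.com"}
DEFAULT_FILENAMES = {"default.aspx"}
HOME_PATH = "/en/us/"


def canonicalize(url: str) -> str:
    """Map any known variant of a URL to a single canonical form."""
    if "://" not in url:
        url = "http://" + url  # tolerate scheme-less input such as "microsoft.com"
    parts = urlsplit(url)

    # Fold every host alias into the preferred host.
    host = parts.netloc.lower()
    if host in HOST_ALIASES:
        host = CANONICAL_HOST

    # Trim a trailing default filename, keeping the directory slash.
    path = parts.path or "/"
    head, _, last = path.rpartition("/")
    if last.lower() in DEFAULT_FILENAMES:
        path = head + "/"

    # Hypothetical rule: the site root serves the same content as HOME_PATH.
    if path == "/":
        path = HOME_PATH

    return urlunsplit((parts.scheme.lower(), host, path, parts.query, ""))
```

Calling `canonicalize` on each of the four variants listed above returns the same string, which is the form to link to internally and to redirect the other variants to.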
A few recommendations for canonicalization

• Select WWW or Non-WWW, then redirect the other option to your
  preferred version
• Remove the default filename from the end of your URLs
    – Most web servers let you select one or more default filenames to serve when
      the browser requests a directory. Check whether the default filename appears at
      the end of your URLs and, if it does, trim it off
• Link internally to the canonical form of your URL
    – Make sure you always link to the proper canonical form of your URLs from within
      your site
• Remove query string variables or rewrite to readable URLs
    – http://www.mysite.com/downloads/details.aspx?FamilyID=ab99&displaylang=en
      to
      http://www.mysite.com/downloads/en/family/ab99
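
The query-string rewrite above can be sketched as a pure function. This is only an illustration of the mapping shown on the slide: the function name `rewrite_download_url` is hypothetical, and the rule that the last path segment (`details.aspx`) is dropped is inferred from the single example given.

```python
from urllib.parse import parse_qs, urlsplit, urlunsplit


def rewrite_download_url(url: str) -> str:
    """Turn ?FamilyID=...&displaylang=... into a readable path.

    URLs that lack either parameter are returned unchanged.
    """
    parts = urlsplit(url)
    query = parse_qs(parts.query)
    if "FamilyID" not in query or "displaylang" not in query:
        return url

    family_id = query["FamilyID"][0]
    lang = query["displaylang"][0]

    # Drop the script name (e.g. "details.aspx") and keep its directory.
    base = parts.path.rsplit("/", 1)[0]
    path = f"{base}/{lang}/family/{family_id}"
    return urlunsplit((parts.scheme, parts.netloc, path, "", ""))
```

In practice the old query-string URL would also be redirected permanently to the new path, so that only the readable form is exposed.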
Why duplicate content?

• Your intention is the key
• If your intent is to manipulate the search engine, you will
  be penalized
  Example 1: Multiple domains with very little or no
  difference in content and no clear reason why these
  domains exist
  Example 2: Passing off someone else's original
  content as your own (please report any issues with
  copied content to Live Search support)
Going International – Help Search Engines

You may have similar pages for different regions.
Problems for search engines with geo-targeting:
• No standardized way to tell a search engine which region or
   language your content is targeted for
• Top-level domains may not indicate the intended audience. For
   example, http://ma.tt/ is an English-language site on Trinidad and
   Tobago's .tt domain, and Orange.com is a French telecom site hosted
   in France.
• Search-unfriendly redirection techniques
A few indicators - Help Live Search while Geo-targeting
• Country code top-level domain (ccTLD). For example, .ca
  specifically targets users in Canada
• Add all your domains to Live Search webmaster tools and make the
  target region explicit for each

These indicators will help us show the correct page for the correct
  market
Content Syndication

• Syndicate with caution: For sites that syndicate their content on
  other sites
• From our perspective, we always want to show the version we think
  is appropriate to the user. This may not be the version you want or
  prefer.
• Tip:
   Ask your partner to use robots.txt to stop us from indexing the syndicated material
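
A partner can check that their robots.txt really blocks the syndicated copies before relying on it. The sketch below uses Python's standard `urllib.robotparser`. The robots.txt content, the `/syndicated/` path, the `partner.example` host and the `examplebot` user agent are all hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt a syndication partner might serve; the
# /syndicated/ directory is an assumption for illustration.
PARTNER_ROBOTS_TXT = """\
User-agent: *
Disallow: /syndicated/
"""


def is_crawlable(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)
```

With the rules above, `is_crawlable(PARTNER_ROBOTS_TXT, "examplebot", "http://partner.example/syndicated/article-1")` is false, while the partner's own pages outside `/syndicated/` remain crawlable.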
General tips to help the Search Engine


• Dynamic URLs – if the content is not changing, don’t have too many
  parameters
• 301 (permanent) redirects are your best friend – use them when you can
• No 302 hijacking!
• When you do a site update, don’t have links to expired pages
• Use robots.txt for anything you don’t want crawlers to crawl
• Consistent naming convention – easy for search engines to
  understand
• Follow standard URL formation practices
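
As an illustration of the 301 tip, here is a minimal WSGI sketch that permanently redirects any non-preferred host to the preferred one. It is an assumption-laden example rather than a recommended deployment: `www.mysite.com` as the preferred host, the `http` scheme and the function name `redirect_app` are chosen only for illustration, and most sites would configure this in the web server instead.

```python
from http import HTTPStatus

# Assumption: the preferred (canonical) host for the site.
CANONICAL_HOST = "www.mysite.com"


def redirect_app(environ, start_response):
    """WSGI app: 301 to the canonical host, otherwise serve the page."""
    host = environ.get("HTTP_HOST", "")
    path = environ.get("PATH_INFO", "/")
    query = environ.get("QUERY_STRING", "")

    if host.lower() != CANONICAL_HOST:
        location = f"http://{CANONICAL_HOST}{path}"
        if query:
            location += "?" + query
        status = HTTPStatus.MOVED_PERMANENTLY  # 301, not 302
        start_response(
            f"{status.value} {status.phrase}",
            [("Location", location), ("Content-Length", "0")],
        )
        return [b""]

    body = b"canonical page\n"
    start_response(
        "200 OK",
        [("Content-Type", "text/plain"), ("Content-Length", str(len(body)))],
    )
    return [body]
```

The app can be served locally with `wsgiref.simple_server.make_server` to confirm that a request for the non-preferred host returns a 301 status with the expected `Location` header.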

Duplicate Content SES NY 2009
