What is Content Analytics - MeasureCamp London 2016

We usually say "Content is king", but in our analytics tools we mostly talk about traffic and conversion. There are many Natural Language Processing metrics that we could use to better understand our actual content. Let's dive into that.

Published in: Internet

  1. 1. What is Content Analytics?
  2. 2. Content is King
  3. 3. ...and yet, what content metrics and dimensions do you use?
  4. 4. On Google Analytics
     Some dimensions:
     ● Title
     ● URL
     ● Keywords (or what is left of them)
     No actual metrics directly related to content.
  5. 5. What should we get?
  6. 6. NLP Data
     ● Natural Language Processing statistics
     New data:
     – How many times do the main keywords appear in my content?
     – How many times are these keywords the subject of a sentence?
     – How relevant are the words I am using?
  7. 7. Quick poll: Who has heard of the TF-IDF metric?
  8. 8. Metric: TF-IDF
     A numerical statistic intended to reflect how important a word is to a document in a corpus.
     ● TF: the frequency of a word (or series of words) in a document.
     ● IDF: to discount words that are common everywhere rather than specific to a document, that frequency is weighed against how often the word appears across the corpus.
  9. 9. Quick poll: Who knows what an n-gram is?
  10. 10. N-gram: What is an n-gram? An n-gram is a contiguous sequence of n items from a given sequence of text.
  11. 11. Example of 2-grams
      "I am attending Measure Camp in London"
      ● I am
      ● am attending
      ● attending Measure
      ● Measure Camp
      ● Camp in
      ● in London
  12. 12. If you remove useless words (here "I", "am" and "in")
      ● attending Measure
      ● Measure Camp
      ● Camp London
  13. 13. Let's say you want to be as relevant as possible (and therefore rank on Google) for « Measure Camp »
  14. 14. 1st step: Analyse your content with an n-gram analysis
  15. 15. 2nd step: topic corpus
      Now, create a topic corpus around your keyword (basically, the pages that rank in Google).
      Let's get the top 100 results for each of these keywords:
      ● Analytics event
      ● Analytics conference
      ● Measure Camp
      Get the n-grams from all the documents (around 200 documents once you remove duplicates).
      Calculate TF-IDF for each n-gram.
  16. 16. YAY!!! My first relevant content metrics :)
      ● measure camp: 100 (very frequent)
      ● analytics conference: 60 (quite frequent)
      ● Peter O'Neill: 50 (quite frequent)
      ● Stay (in) London: 30 (somewhat frequent)
      * Not actual data. Simplified version of TF-IDF.
  17. 17. 3rd step: topic-neutral corpus
      Now, create a topic-neutral corpus (basically, take thousands and thousands of random web pages and build a corpus from them).
      Get the n-grams out of it. Extract:
      ● Click here (very frequent)
      ● Stay London (appears a few times)
      ● Peter O'Neill (nowhere to be found)
      ● Measure Camp (1 time in the corpus)
  18. 18. 4th step: now let's compare
      ● Stay London: somewhat frequent in both corpora, so not very relevant for your content
      ● Peter O'Neill: Yay! (quite frequent in the topic corpus, nowhere to be found in the neutral one)
      ● Measure Camp: not so frequent in English, very frequent in our topic corpus, so I shall use it
  19. 19. ● Big data: very frequent in the topic corpus, not so frequent → Oh, sounds like something people want to hear about. Let's write content about it.
  20. 20. 5th step: optimize your content
      Proofread your content with these new relevant expressions in mind.
      Can I add more value for the user?
      Can it help improve my organic ranking?
  21. 21. Let's discuss: What other content metrics or dimensions could we use?
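
The 2-gram example from slides 11 and 12 can be reproduced with a few lines of Python. This is only a minimal sketch, not the tooling used in the talk: the `ngrams` helper and the `STOP_WORDS` set are hypothetical names, and the stop-word list is an assumption chosen to match the three words dropped on slide 12.

```python
# Hypothetical stop-word list, chosen only to match the slide example.
STOP_WORDS = {"i", "am", "in"}


def ngrams(tokens, n=2):
    """Return every contiguous run of n tokens, joined by a space."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


sentence = "I am attending Measure Camp in London"
tokens = sentence.split()

# Slide 11: 2-grams over the raw sentence.
all_bigrams = ngrams(tokens)

# Slide 12: drop the stop words first, then build 2-grams from what is left.
content_tokens = [t for t in tokens if t.lower() not in STOP_WORDS]
content_bigrams = ngrams(content_tokens)

print(all_bigrams)
print(content_bigrams)
```

Filtering before building the 2-grams is what produces "Camp London" on slide 12: once "in" is gone, "Camp" and "London" become neighbours.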

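
The comparison in slides 15 to 18 (score n-grams in a topic corpus, then check them against a topic-neutral corpus) could be sketched as follows. Everything here is an assumption for illustration: the two tiny corpora are invented, `bigrams`, `tf_idf` and `doc_share` are hypothetical helper names, `tf_idf` is the textbook formula rather than the simplified score shown on slide 16, and the comparison uses the share of documents containing an expression as a stand-in for "frequent".

```python
import math
from collections import Counter


def bigrams(text):
    """Lower-case, split on whitespace, and return the contiguous 2-grams."""
    tokens = text.lower().split()
    return [" ".join(tokens[i:i + 2]) for i in range(len(tokens) - 1)]


def tf_idf(term, doc, corpus):
    """Textbook TF-IDF: term frequency in doc times log(N / document frequency)."""
    grams = bigrams(doc)
    if not grams:
        return 0.0
    tf = Counter(grams)[term] / len(grams)
    df = sum(1 for d in corpus if term in bigrams(d))
    return tf * math.log(len(corpus) / df) if df else 0.0


def doc_share(term, corpus):
    """Fraction of the documents in corpus that contain term."""
    return sum(1 for d in corpus if term in bigrams(d)) / len(corpus)


# Invented toy corpora, for illustration only (not data from the talk).
topic_corpus = [
    "measure camp is an analytics conference in london",
    "join measure camp and talk about big data",
    "stay in london for the analytics conference",
]
neutral_corpus = [
    "click here to stay in london this summer",
    "click here for cheap flights",
    "click here to read the news",
    "the weather in london is mild",
]

for term in ("measure camp", "in london", "click here"):
    topic = doc_share(term, topic_corpus)
    neutral = doc_share(term, neutral_corpus)
    print(f"{term}: topic={topic:.2f} neutral={neutral:.2f}")

print(tf_idf("measure camp", topic_corpus[0], topic_corpus))
```

In this toy data, "measure camp" shows up in the topic corpus but never in the neutral one, while "in london" is common in both, which mirrors the reasoning on slide 18. Note that textbook IDF computed inside the topic corpus alone drops to zero for an expression that appears in every topic document, which is one reason the neutral corpus is a useful second reference point.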