Targeting the Interest Graph:
  Personalization of content and ad selection
           using the Inform Service



 Marc Hadfield
 CTO, Inform
 Semantic Technology Conference, 2011
Introduction
       Marc Hadfield is CTO of Inform Technologies.

       Interests:
       Natural Language Processing, Semantics, Life Science
       Graph Algorithms, Machine Learning, Big Data

       Inform Technologies is a semantic technology company.

       Inform provides semantic technology (NLP and analytics) to
            publishers, and operates the user-generated forum site
            Yuku.com.
       We at Inform have been extending our technology to the
            user-generated content space. We’ve adapted it to
            different kinds of content such as informal text, photos, videos,
            and questions.
       We’ve recently addressed Ad Selection, Video Selection, and
            Personalization.
       I’ll discuss some of our results with the Interest Graph.

                                                                                2
Inform Service
   Semantic Software-as-a-Service for Publishers

   Advantage: ~30% boost in engagement in “traditional” publisher
     websites.

   Tracks 4,000+ Subjects and 320,000+ Entities: Inform Topics

   Inform Service:
       –    In-Article links to Topics Pages
       –    Related Articles from the Archive
       –    Related Articles around the Web
       –    Related Photos
       –    Related Videos
       –    Topic Pages including mix of content sources
       –    Tools (Publishing Tools, etc.)




                                                                    3
Inform Publisher Customers




                             4
Yuku Forums
       Forum Content
         –    “Old School” user generated content
         –    ~40,000 forums
         –    Top 100 forums account for about 50% of traffic
         –    ~1 Billion short form content pieces
         –    ~1 Million monthly unique users
         –    ~150K new content objects per day
         –    ~1 Million Page Views per Day
       Subscription / Advertising Revenue
       Inform is adapting / integrating our Semantic Tech

       Great laboratory for testing algorithms / theories
         –    Apply more broadly than Yuku platform
       Nice A/B testing environment
       Testing new algorithms on our ForumFind search engine
         –    And embedded widgets in Yuku

       Good reason to improve Ad Selection


                                                                5
Today: Personalization for Enhanced Targeting
 •  Capturing the Interest Graph
 •  Personalized experience
        Help People find interesting content
        Make Ads relevant




                                                Occam




                                                        6
Inform Content & Analytics Platform

 [Architecture diagram; components shown:]
        Licensed / Crawled Content
        3rd Party / Activity Data
        Content / Data Ingestion
        Core Engine (Occam)
        Text Analysis
        Algorithms
        Categorization / Personalization
        Content Distribution: Publisher site, Yuku, Widgets
                                                                           7
Inform “Occam” Architecture
Example Workflow:
       Receive Message
                  • REST Webservice Call
                  • Queue

       Extract
                  • Get URL
                  • Extract Document Features
                  • Extract Text

       NLP
                  • NLP Features (Machine Learning)
                  • Inference Engine (Prolog / Frame Logic)
                  • Discourse / Behavior / Sentiment Models (Prolog / Frame Logic) (New)

       Analysis
                  • Trend Analysis (incremental data)
                  • Graph Analysis (incremental data)

       Reply
                  • Store in Semantic Repository (if needed)
                  • Send Reply Message (via Queue or Webservice)
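
The five workflow stages above can be sketched as a chain of functions that each enrich one message. This is only an illustration, not Occam itself: the `Message` shape, the stage function names, and the capitalized-token "entity" spotting are all assumptions standing in for the real queue, extraction, and NLP components.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Message:
    """One unit of work moving through the pipeline (hypothetical shape)."""
    url: str
    text: str = ""
    features: Dict[str, object] = field(default_factory=dict)
    trace: List[str] = field(default_factory=list)


def receive(msg: Message) -> Message:
    # Stand-in for taking a message off a REST call or queue.
    msg.trace.append("receive")
    return msg


def extract(msg: Message) -> Message:
    # Stand-in for fetching the URL and extracting its text.
    msg.text = msg.text or f"placeholder text for {msg.url}"
    msg.features["length"] = len(msg.text)
    msg.trace.append("extract")
    return msg


def nlp(msg: Message) -> Message:
    # Toy "entity" spotting: keep capitalized tokens. Not real NLP.
    tokens = [w.strip(".,") for w in msg.text.split()]
    msg.features["entities"] = sorted({t for t in tokens if t[:1].isupper()})
    msg.trace.append("nlp")
    return msg


def analysis(msg: Message) -> Message:
    # Stand-in for trend / graph analysis over the extracted features.
    msg.features["entity_count"] = len(msg.features["entities"])
    msg.trace.append("analysis")
    return msg


def reply(msg: Message) -> Message:
    # Stand-in for storing the result and sending the reply message.
    msg.trace.append("reply")
    return msg


STAGES: List[Callable[[Message], Message]] = [receive, extract, nlp, analysis, reply]


def run_pipeline(msg: Message) -> Message:
    for stage in STAGES:
        msg = stage(msg)
    return msg


result = run_pipeline(
    Message(url="http://example.com/a", text="Barack Obama visited Chicago today.")
)
print(result.trace)
print(result.features["entities"])
```

Keeping each stage a plain function over the same message type makes it easy to insert or reorder stages, which is the property the workflow slide relies on.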



                                                                                           8
Inform API
       REST Based
       Queue for high volume content exchange
       Returns data in RDF, XML, or JSON
       All Content has a URI
       All Inform Topics have URIs (can be dereferenced)
       Insert Content, Update Content, Delete Content
       Login / Logout
       Change Status of Content (Published, Unpublished)
       Content can be “GET”
         –    Associated Topics (Subjects and Entities) returned
         –    Include scores
       Search Inform Topics
       Semantic Search
         –    Simplified queries (not full SPARQL)
         –    Typical Query: Get Content of Type “Article” about “Barack Obama”
              ranked by score
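
As a rough sketch of what the typical query above asks for, the following filters content by type and topic and ranks it by topic score. It runs against an in-memory list rather than the Inform API; the `Content` class, the `semantic_search` function, and the example URIs and scores are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Content:
    uri: str                        # every content item has a URI
    content_type: str               # e.g. "Article", "Photo"
    topic_scores: Dict[str, float]  # topic name -> relevance score


def semantic_search(items: List[Content], content_type: str, topic: str) -> List[Content]:
    """Return items of the given type that are about the topic, best score first."""
    matches = [
        c for c in items
        if c.content_type == content_type and topic in c.topic_scores
    ]
    return sorted(matches, key=lambda c: c.topic_scores[topic], reverse=True)


# Illustrative data only.
items = [
    Content("urn:example:a1", "Article", {"Barack Obama": 0.62, "Economy": 0.40}),
    Content("urn:example:a2", "Article", {"Barack Obama": 0.91}),
    Content("urn:example:p1", "Photo", {"Barack Obama": 0.99}),
    Content("urn:example:a3", "Article", {"Golf": 0.80}),
]

results = semantic_search(items, "Article", "Barack Obama")
print([c.uri for c in results])
```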



                                                                                  9
Inform API (2)
       Related Content
         –    Articles, Messages, Photos, Videos, Questions, Web

       AdContext™ (new)
         –    URL → IAB Topics + Inform Topics

       VideoContext™ (new)
         –    URL → Inform Topics
         –    Related Videos

       InterestGraph (new)
         –    Parameters: user-id / session-id → Inform Topics

       Personalized AdContext™ (new)
         –    URL + session-id / user-id (anonymized) → IAB Topics + Inform Topics




                                                                                     10
AdContext™: IAB Ad Standards
   An IAB (Interactive Advertising Bureau) standard for returning a set of
     metadata about a website, webpage, or section of a webpage to
     assist advertising within web content.

   Defines how a Topic may be associated with web content.

   Defines a set of standard upper level Topics such as “Science”,
     “Sports”, and “Business”, and mid-level Topics such as “Golf” and
     “Fashion”. These are tier-1 and tier-2.

   Inform has aligned the IAB Topics with Inform’s Topics. Inform can
      deliver more specific Topics (the full set of Inform Topics) as “tier-3”
      IAB Topics.

   The AdContext™ service returns this metadata. Ad Networks may
     use the service to assist in ad selection.

   Semantic Ad Selection may improve yield 2X – 5X (as per various
     external studies).
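
The tiering described above can be illustrated by walking a topic up its parent chain: a tier-3 topic resolves to its tier-2 and tier-1 ancestors. This is a sketch under assumptions: the `PARENT` table, the "Tiger Woods" entity, and the output fields are illustrative and do not reproduce the actual AdContext™ response format.

```python
from typing import Dict, List, Optional

# Hypothetical slice of an aligned taxonomy: topic -> parent topic.
PARENT: Dict[str, Optional[str]] = {
    "Sports": None,         # tier-1
    "Golf": "Sports",       # tier-2
    "Tiger Woods": "Golf",  # more specific topic, delivered as "tier-3"
}


def tier_path(topic: str) -> List[str]:
    """Return the chain from the tier-1 topic down to `topic`.

    Raises KeyError for topics missing from the table.
    """
    path: List[str] = []
    current: Optional[str] = topic
    while current is not None:
        path.append(current)
        current = PARENT[current]
    return list(reversed(path))


def ad_context(topic_scores: Dict[str, float]) -> List[dict]:
    """Attach tier information to each scored topic, best score first."""
    rows = []
    for topic, score in sorted(topic_scores.items(), key=lambda kv: kv[1], reverse=True):
        path = tier_path(topic)
        rows.append({"topic": topic, "tier": len(path), "path": path, "score": score})
    return rows


rows = ad_context({"Golf": 0.7, "Tiger Woods": 0.9})
for row in rows:
    print(row)
```

An ad network that only understands tier-1 and tier-2 can read the first two entries of `path` and ignore the rest.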
                                                                                 11
Aside: rNews RDFa Standard
 rNews: embedding metadata in online news
 rNews is a proposed standard for using RDFa to annotate
   news-specific metadata in HTML documents. The rNews
   proposal has been developed by the IPTC, a consortium
   of the world's major news agencies, news publishers and
   news industry vendors. rNews is currently in draft form
   and the IPTC welcomes feedback on how to improve the
   standard in the rNews Forum.

    http://dev.iptc.org/rNews

    Why?
      SEO, Rich Snippets, reduced “scraper” error, better metadata.

    The Inform API returns rNews metadata ready to embed in
       news articles (in testing).

                                                                      12
Publisher Customer Example:
 Inform automatically tags entities (people, places, companies,
 and organizations) and provides related topics, articles, and media.

 The Related News Widget pulls in the most relevant and recent
 articles from within the New York Daily News Archive.




                                                           13
Customer Example:


 Inform’s tags can be brought together in numerous ways to create a
 richer experience for consumers.

 Inform also generates highly engaging and relevant slideshows.




                                  14
Demo Inform API w/Facebook




How to connect Inform to the social graph?
                                             15
Demo Inform API w/Facebook


 Inform Topics mapped to Wikipedia Pages and thus to other Concepts –
 including the Facebook “Like” Graph




                                                18
Interest Graph
 •  Inform Topics
        4,000+ Subjects in Hierarchy (SKOS)
        320,000+ Entities
        Wikipedia Pages
        Wikipedia Categories
        Inform “same-as” links to Wikipedia

 •  1 Million+ Monthly Unique Users

 •  ~1 Billion content pieces total
        Forum Messages, Replies, Photos, Videos

 •  150K new content pieces per day

 •  1 Million+ PageViews per Day

 •  ~5 Million ads serviced per Day

     Goal: Link Users to Topics for selection of content and ads



                                                                           19
Personalization Signals
 •  Content is “about” a Topic (subject or entity)
 •  User submits Content (“write”)
        Message, Reply, Photo, Video, Question, …

 •  User reads Content (“view”)
        Message, Reply, Photo, Video, Question


 Trends / Global Aggregation:
 •  Importance Metric
 •  Bursty / Velocity
 •  Sentiment ( “:-)”, “LOL”, …)
        “Like” the topic? “Dislike” the topic? Context?
          –    e.g. dislike a Football Team, so “likes” to hear when they lose (negative
               sentiment)

 •  Other features…
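
One way to turn the "write" and "view" signals above into user-to-topic links is to add up a per-action weight multiplied by how strongly the content is "about" each topic. The sketch below assumes made-up action weights (`write` counts three times a `view`), made-up content ids, and a hypothetical function name; the slides do not give the actual weighting.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

# Assumed weights: writing content is treated as a stronger signal than viewing it.
SIGNAL_WEIGHT = {"write": 3.0, "view": 1.0}

Event = Tuple[str, str, str]  # (user, action, content_id)


def build_interest_edges(
    events: Iterable[Event],
    content_topics: Dict[str, Dict[str, float]],
) -> Dict[Tuple[str, str], float]:
    """Accumulate a weight for every (user, topic) pair seen in the events."""
    edges: Dict[Tuple[str, str], float] = defaultdict(float)
    for user, action, content_id in events:
        # Each content item is "about" zero or more topics, with a score.
        for topic, about_score in content_topics.get(content_id, {}).items():
            edges[(user, topic)] += SIGNAL_WEIGHT[action] * about_score
    return dict(edges)


# Illustrative data only.
content_topics = {
    "msg-1": {"Sneakers": 1.0},
    "msg-2": {"Sneakers": 0.5, "Basketball": 1.0},
}
events = [("jjb", "write", "msg-1"), ("jjb", "view", "msg-2")]

edges = build_interest_edges(events, content_topics)
print(edges)
```

Sentiment could enter the same sum as a signed multiplier, though the football example above shows why negative sentiment does not always mean lack of interest.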
                                                                                           20
Interest Graph Algorithms
 Criteria:
 •  Near Real-Time
 •  Highly parallel to allow for scaling
 •  Fuzzy Data, Flexible data model
 Implementation:
 •  General Graph Representation
         Node Weights, Edge Weights, Node Types, Edge Types

 •  Graph walk to extract a User’s Interest Graph
 •  Parallel Message-Passing Algorithms for Graph Analysis
         Importance, PageRank, Centrality
         Spreading Activation
         Pregel-like implementation (Signal/Collect)

 •  Add Graph Analytics to Workflow
                                                               21
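
A minimal sketch of spreading activation written in a signal/collect style: in each step every active node signals along its weighted edges, then each receiving node collects (sums) what arrived. The graph, node names, decay factor, and step count are assumptions for illustration; the slides do not describe the production algorithm in this detail.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# Adjacency list: node -> [(neighbor, edge_weight), ...]
Graph = Dict[str, List[Tuple[str, float]]]


def spreading_activation(
    graph: Graph, seed: str, steps: int = 2, decay: float = 0.5
) -> Dict[str, float]:
    """Spread activation outward from `seed` for a fixed number of steps."""
    activation: Dict[str, float] = {seed: 1.0}
    frontier: Dict[str, float] = {seed: 1.0}
    for _ in range(steps):
        inbox: Dict[str, float] = defaultdict(float)
        # Signal phase: each active node sends energy along its edges.
        for node, energy in frontier.items():
            for neighbor, weight in graph.get(node, []):
                inbox[neighbor] += energy * weight * decay
        # Collect phase: each receiver adds what it was sent.
        for node, received in inbox.items():
            activation[node] = activation.get(node, 0.0) + received
        frontier = dict(inbox)
    return activation


# Illustrative graph: a user linked to topics, topics linked to a related topic.
graph: Graph = {
    "user:jjb": [("topic:Sneakers", 1.0), ("topic:Basketball", 0.5)],
    "topic:Sneakers": [("topic:Nike", 0.8)],
    "topic:Basketball": [("topic:Nike", 0.4)],
}

scores = spreading_activation(graph, "user:jjb")
print(scores)
```

Because the signal phase only reads the previous frontier and the collect phase only sums messages per node, both phases can be partitioned across workers, which is what makes this style of algorithm fit the "highly parallel" criterion.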
Neighborhood around JJB User




                               22
Niketalk User Interest Graph (local)
             Without global importance metric:




                                                 23
Niketalk User Interest Graph (global)
 With global importance metric:
 Recommendations can be made reflecting the shifting interests of the
 global community.
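
The difference between the local and global views can be sketched as a blend of two scores per topic: the user's own interest and a community-wide importance value. The linear blend, the `alpha` parameter, and the example topics and numbers are assumptions; the slides do not state how the global importance metric is actually combined.

```python
from typing import Dict, List, Tuple


def rank_topics(
    local: Dict[str, float],
    global_importance: Dict[str, float],
    alpha: float = 0.3,
) -> List[Tuple[str, float]]:
    """Blend local interest with global importance; alpha=0 ignores the global metric."""
    topics = set(local) | set(global_importance)
    blended = {
        t: (1.0 - alpha) * local.get(t, 0.0) + alpha * global_importance.get(t, 0.0)
        for t in topics
    }
    # Highest score first; ties broken alphabetically for stable output.
    return sorted(blended.items(), key=lambda kv: (-kv[1], kv[0]))


# Illustrative scores only.
local = {"Sneakers": 0.9, "Basketball": 0.4}
global_importance = {"Basketball": 0.8, "Royal Wedding": 1.0}

local_only = rank_topics(local, global_importance, alpha=0.0)
with_global = rank_topics(local, global_importance, alpha=0.3)
print([t for t, _ in local_only])
print([t for t, _ in with_global])
```

With the global term switched on, a topic the user has never touched can still receive a non-zero score, which is how community trends reach an individual's recommendations.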




                                                              24
Example Yuku Forum - Gymnastics




                                  25
ForumFind – “laboratory”




                           26
ForumFind – Topic, Ad, Content




                                 27
ForumFind – MyForumFind (user: jjb2 )




                                        28
Interest Graph – User Insights
 •  “Everybody Lies” (“House” TV Show)
          –    The only way to know the user’s interests is to have an implicit channel
               to detect interests without impacting user behavior

 •  People have broad / dynamic interests
 •  People read “trash”
          –    e.g. everyone reads Celebrity Gossip
          –    If convenient / no one looking

 •  Global Data can be used to make recommendations
        No surprise, but nice to have confirmation

 •  People move on
        “Likes” need to expire

 •  Recommendations for content and ads can be
   implemented in a highly dynamic and parallel fashion
   running in real time with reasonable resources using
   graph analysis
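
The point that “Likes” need to expire can be sketched with exponential decay: each interest weight halves after a fixed period without new activity and is dropped once it falls below a floor. The 30-day half-life, the floor value, and the function names are assumptions chosen for the example, not figures from the talk.

```python
import math
from typing import Dict, Tuple


def decayed_weight(weight: float, age_days: float, half_life_days: float = 30.0) -> float:
    """Halve the weight for every `half_life_days` that have passed."""
    return weight * math.pow(0.5, age_days / half_life_days)


def current_interests(
    edges: Dict[str, Tuple[float, float]],
    now_day: float,
    half_life_days: float = 30.0,
    floor: float = 0.05,
) -> Dict[str, float]:
    """Apply decay to (weight, last_activity_day) pairs and drop expired topics."""
    result: Dict[str, float] = {}
    for topic, (weight, last_day) in edges.items():
        w = decayed_weight(weight, now_day - last_day, half_life_days)
        if w >= floor:
            result[topic] = w
    return result


# Illustrative data: topic -> (weight, day of last activity).
edges = {"Sneakers": (1.0, 120.0), "Royal Wedding": (1.0, 0.0)}

interests = current_interests(edges, now_day=150.0)
print(interests)
```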
                                                                                         29
Interest Graph – Conclusion

  •  Using a User’s Graph of Interests can
    dramatically improve the user’s engagement
         Data still being gathered within Inform as to percentage
          increase, but so far very encouraging numbers!



  •  The Inform Service can be used to implement a
    more personalized content and ad experience
    with minimal implementation effort.


  •  Talk to me about using our API!



                                                                     30
Thank You!


   Questions?

   Marc Hadfield
   CTO, Inform Technologies
   marc@inform.com




                              31
Example CMS Integration




                          32
Published Article:




                     33

Más contenido relacionado

Similar a Inform: Targeting the Interest Graph

Social media radar | Artur Karda
Social media radar | Artur KardaSocial media radar | Artur Karda
Social media radar | Artur KardaArtur Karda
 
Kiev congress social media radar - artur karda 2011-10-21
Kiev congress   social media radar - artur karda 2011-10-21Kiev congress   social media radar - artur karda 2011-10-21
Kiev congress social media radar - artur karda 2011-10-21Oksana Kushnir
 
Discovery & Reuse of Content
Discovery & Reuse of ContentDiscovery & Reuse of Content
Discovery & Reuse of ContentKlaris IP
 
Social Media Data Collection & Analysis
Social Media Data Collection & AnalysisSocial Media Data Collection & Analysis
Social Media Data Collection & AnalysisScott Sanders
 
Mariana Alupului Inventions
Mariana Alupului InventionsMariana Alupului Inventions
Mariana Alupului Inventionsmalupului
 
Painless XML Authoring?: How DITA Simplifies XML
Painless XML Authoring?: How DITA Simplifies XMLPainless XML Authoring?: How DITA Simplifies XML
Painless XML Authoring?: How DITA Simplifies XMLScott Abel
 
Linked services for the Web of Data
Linked services for the Web of DataLinked services for the Web of Data
Linked services for the Web of DataJohn Domingue
 
Jeremy cabral search marketing summit - scraping data-driven content (1)
Jeremy cabral   search marketing summit - scraping data-driven content (1)Jeremy cabral   search marketing summit - scraping data-driven content (1)
Jeremy cabral search marketing summit - scraping data-driven content (1)Jeremy Cabral
 
Gilbane SF - Content Convergence Strategies
Gilbane SF - Content Convergence StrategiesGilbane SF - Content Convergence Strategies
Gilbane SF - Content Convergence StrategiesEric Barroca
 
Oracle Multichannel Content Management
Oracle Multichannel Content ManagementOracle Multichannel Content Management
Oracle Multichannel Content ManagementOracle
 
PoolParty Thesaurus Management Quick Overview
PoolParty Thesaurus Management Quick OverviewPoolParty Thesaurus Management Quick Overview
PoolParty Thesaurus Management Quick OverviewAndreas Blumauer
 
Content Management, Metadata and Semantic Web
Content Management, Metadata and Semantic WebContent Management, Metadata and Semantic Web
Content Management, Metadata and Semantic WebAmit Sheth
 
Content Management, Metadata and Semantic Web
Content Management, Metadata and Semantic WebContent Management, Metadata and Semantic Web
Content Management, Metadata and Semantic WebAmit Sheth
 
SPSTCDC - Managed Metadata and Taxonomies in SharePoint 2010 - Playing Tag
SPSTCDC - Managed Metadata and Taxonomies in SharePoint 2010 - Playing TagSPSTCDC - Managed Metadata and Taxonomies in SharePoint 2010 - Playing Tag
SPSTCDC - Managed Metadata and Taxonomies in SharePoint 2010 - Playing TagKnowledge Management Associates, LLC
 

Similar a Inform: Targeting the Interest Graph (20)

PFI Corporate Profile
PFI Corporate ProfilePFI Corporate Profile
PFI Corporate Profile
 
Social media radar | Artur Karda
Social media radar | Artur KardaSocial media radar | Artur Karda
Social media radar | Artur Karda
 
Kiev congress social media radar - artur karda 2011-10-21
Kiev congress   social media radar - artur karda 2011-10-21Kiev congress   social media radar - artur karda 2011-10-21
Kiev congress social media radar - artur karda 2011-10-21
 
Discovery & Reuse of Content
Discovery & Reuse of ContentDiscovery & Reuse of Content
Discovery & Reuse of Content
 
People aggregator
People aggregatorPeople aggregator
People aggregator
 
Maruti gollapudi cv
Maruti gollapudi cvMaruti gollapudi cv
Maruti gollapudi cv
 
1377 impact v9_final_2
1377 impact v9_final_21377 impact v9_final_2
1377 impact v9_final_2
 
Social Media Data Collection & Analysis
Social Media Data Collection & AnalysisSocial Media Data Collection & Analysis
Social Media Data Collection & Analysis
 
Mariana Alupului Inventions
Mariana Alupului InventionsMariana Alupului Inventions
Mariana Alupului Inventions
 
Painless XML Authoring?: How DITA Simplifies XML
Painless XML Authoring?: How DITA Simplifies XMLPainless XML Authoring?: How DITA Simplifies XML
Painless XML Authoring?: How DITA Simplifies XML
 
Linked services for the Web of Data
Linked services for the Web of DataLinked services for the Web of Data
Linked services for the Web of Data
 
Jeremy cabral search marketing summit - scraping data-driven content (1)
Jeremy cabral   search marketing summit - scraping data-driven content (1)Jeremy cabral   search marketing summit - scraping data-driven content (1)
Jeremy cabral search marketing summit - scraping data-driven content (1)
 
Fundamentals Of Search
Fundamentals Of SearchFundamentals Of Search
Fundamentals Of Search
 
Mark logic for dita
Mark logic for ditaMark logic for dita
Mark logic for dita
 
Gilbane SF - Content Convergence Strategies
Gilbane SF - Content Convergence StrategiesGilbane SF - Content Convergence Strategies
Gilbane SF - Content Convergence Strategies
 
Oracle Multichannel Content Management
Oracle Multichannel Content ManagementOracle Multichannel Content Management
Oracle Multichannel Content Management
 
PoolParty Thesaurus Management Quick Overview
PoolParty Thesaurus Management Quick OverviewPoolParty Thesaurus Management Quick Overview
PoolParty Thesaurus Management Quick Overview
 
Content Management, Metadata and Semantic Web
Content Management, Metadata and Semantic WebContent Management, Metadata and Semantic Web
Content Management, Metadata and Semantic Web
 
Content Management, Metadata and Semantic Web
Content Management, Metadata and Semantic WebContent Management, Metadata and Semantic Web
Content Management, Metadata and Semantic Web
 
SPSTCDC - Managed Metadata and Taxonomies in SharePoint 2010 - Playing Tag
SPSTCDC - Managed Metadata and Taxonomies in SharePoint 2010 - Playing TagSPSTCDC - Managed Metadata and Taxonomies in SharePoint 2010 - Playing Tag
SPSTCDC - Managed Metadata and Taxonomies in SharePoint 2010 - Playing Tag
 

Más de Vital.AI

Optimizing the
 Data Supply Chain
 for Data Science
Optimizing the
 Data Supply Chain
 for Data ScienceOptimizing the
 Data Supply Chain
 for Data Science
Optimizing the
 Data Supply Chain
 for Data ScienceVital.AI
 
Vital AI MetaQL: Queries Across NoSQL, SQL, Sparql, and Spark
Vital AI MetaQL: Queries Across NoSQL, SQL, Sparql, and SparkVital AI MetaQL: Queries Across NoSQL, SQL, Sparql, and Spark
Vital AI MetaQL: Queries Across NoSQL, SQL, Sparql, and SparkVital.AI
 
Vital AI: Big Data Modeling
Vital AI: Big Data ModelingVital AI: Big Data Modeling
Vital AI: Big Data ModelingVital.AI
 
Vital.AI Creating Intelligent Apps
Vital.AI Creating Intelligent AppsVital.AI Creating Intelligent Apps
Vital.AI Creating Intelligent AppsVital.AI
 
Natural Language Processing & Semantic Models in an Imperfect World
Natural Language Processing & Semantic Modelsin an Imperfect WorldNatural Language Processing & Semantic Modelsin an Imperfect World
Natural Language Processing & Semantic Models in an Imperfect WorldVital.AI
 
Building the Inform Semantic Publishing Ecosystem: from Author to Audience
Building the Inform Semantic Publishing Ecosystem: from Author to AudienceBuilding the Inform Semantic Publishing Ecosystem: from Author to Audience
Building the Inform Semantic Publishing Ecosystem: from Author to AudienceVital.AI
 

Más de Vital.AI (6)

Optimizing the
 Data Supply Chain
 for Data Science
Optimizing the
 Data Supply Chain
 for Data ScienceOptimizing the
 Data Supply Chain
 for Data Science
Optimizing the
 Data Supply Chain
 for Data Science
 
Vital AI MetaQL: Queries Across NoSQL, SQL, Sparql, and Spark
Vital AI MetaQL: Queries Across NoSQL, SQL, Sparql, and SparkVital AI MetaQL: Queries Across NoSQL, SQL, Sparql, and Spark
Vital AI MetaQL: Queries Across NoSQL, SQL, Sparql, and Spark
 
Vital AI: Big Data Modeling
Vital AI: Big Data ModelingVital AI: Big Data Modeling
Vital AI: Big Data Modeling
 
Vital.AI Creating Intelligent Apps
Vital.AI Creating Intelligent AppsVital.AI Creating Intelligent Apps
Vital.AI Creating Intelligent Apps
 
Natural Language Processing & Semantic Models in an Imperfect World
Natural Language Processing & Semantic Modelsin an Imperfect WorldNatural Language Processing & Semantic Modelsin an Imperfect World
Natural Language Processing & Semantic Models in an Imperfect World
 
Building the Inform Semantic Publishing Ecosystem: from Author to Audience
Building the Inform Semantic Publishing Ecosystem: from Author to AudienceBuilding the Inform Semantic Publishing Ecosystem: from Author to Audience
Building the Inform Semantic Publishing Ecosystem: from Author to Audience
 

Último

IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsEnterprise Knowledge
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationRadu Cotescu
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking MenDelhi Call girls
 
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure serviceWhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure servicePooja Nehwal
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slidevu2urc
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024The Digital Insurer
 
Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024The Digital Insurer
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)Gabriella Davis
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonetsnaman860154
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024The Digital Insurer
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Enterprise Knowledge
 
Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Paola De la Torre
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking MenDelhi Call girls
 
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountPuma Security, LLC
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...Neo4j
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdfhans926745
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Drew Madelung
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityPrincipled Technologies
 

Último (20)

IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men
 
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure serviceWhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024
 
Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonets
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...
 
Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101
 

Inform: Targeting the Interest Graph

  • 1. Targeting the Interest Graph: Personalization of content and ad selection using the Inform Service Marc Hadfield CTO, Inform Semantic Technology Conference, 2011
  • 2. Introduction: Marc Hadfield is CTO of Inform Technologies. Interests: Natural Language Processing, Semantics, Life Science, Graph Algorithms, Machine Learning, Big Data. Inform Technologies is a semantic technology company. Inform provides semantic technology (NLP and analytics) to publishers, and operates the user-generated forum site Yuku.com. We at Inform have been extending our technology to the user-generated content space. We’ve adapted it to different kinds of content such as informal text, photos, videos, and questions. We’ve recently addressed Ad Selection, Video Selection, and Personalization. I’ll discuss some of our results with the Interest Graph.
  • 3. Inform Service: Semantic Software-as-a-Service for Publishers. Advantage: ~30% boost in engagement on “traditional” publisher websites. Tracks 4,000+ Subjects and 320,000+ Entities (Inform Topics). Inform Service: – In-article links to Topic Pages – Related Articles from the Archive – Related Articles around the Web – Related Photos – Related Videos – Topic Pages including a mix of content sources – Tools (Publishing Tools, etc.)
  • 5. Yuku Forums. Forum Content: – “Old School” user-generated content – ~40,000 forums – Top 100 forums account for about 50% of traffic – ~1 Billion short-form content pieces – ~1 Million monthly unique users – ~150K new content objects per day – ~1 Million Page Views per Day. Subscription / Advertising Revenue. Inform is adapting / integrating our Semantic Tech. Great laboratory for testing algorithms / theories – they apply more broadly than the Yuku platform. Nice A/B testing environment. Testing new algorithms on our ForumFind search engine – and in embedded widgets in Yuku. Good reason to improve Ad Selection.
  • 6. Today: Personalization for Enhanced Targeting • Capturing the Interest Graph • Personalized experience: – Help people find interesting content – Make ads relevant. (Diagram label: Occam)
  • 7. Inform Content & Analytics Platform [diagram]: Content / Data Ingestion (Licensed / 3rd Party Content, Crawled Content, Activity Data) → Core Engine “Occam” (Text Analysis, Algorithms, Categorization / Personalization) → Content Distribution (Publisher site, Yuku, Widgets)
  • 8. Inform “Occam” Architecture. Example Workflow: Receive: • REST Webservice Call • Queue Message. Extract: • Get URL • Extract Document Features • Extract Text. NLP: • NLP Features (Machine Learning) • Inference Engine (Prolog / Frame Logic) • Discourse / Behavior / Sentiment Models (Prolog / Frame Logic) (New). Analysis: • Trend Analysis (incremental data) • Graph Analysis (incremental data) • Store in Semantic Repository (if needed). Reply: • Send Reply Message (via Queue or Webservice)
  • 9. Inform API: REST based. Queue for high-volume content exchange. Returns data in RDF, XML, or JSON. All Content has a URI. All Inform Topics have URIs (can be dereferenced). Insert Content, Update Content, Delete Content. Login / Logout. Change Status of Content (Published, Unpublished). Content can be “GET”: – Associated Topics (Subjects and Entities) returned – Scores included. Search Inform Topics. Semantic Search: – Simplified queries (not full SPARQL) – Typical Query: Get Content of Type “Article” about “Barack Obama” ranked by score
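
The typical query on slide 9 (content of type “Article” about “Barack Obama”, ranked by score) can be illustrated with a small in-memory model. This is only a sketch of the query semantics, not the Inform API itself: the `ContentItem` class, the `semantic_search` function, the URIs, and the scores are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class ContentItem:
    """Hypothetical content record: a URI, a type, and scored topics."""
    uri: str
    content_type: str
    topic_scores: dict = field(default_factory=dict)  # topic name -> score


def semantic_search(items, content_type, topic):
    """Return items of the given type tagged with the topic, best score first."""
    matches = [
        item for item in items
        if item.content_type == content_type and topic in item.topic_scores
    ]
    return sorted(matches, key=lambda item: item.topic_scores[topic], reverse=True)


# Made-up sample data, for illustration only.
items = [
    ContentItem("urn:example:a1", "Article", {"Barack Obama": 0.62, "Economy": 0.40}),
    ContentItem("urn:example:a2", "Article", {"Barack Obama": 0.91}),
    ContentItem("urn:example:p1", "Photo", {"Barack Obama": 0.99}),
    ContentItem("urn:example:a3", "Article", {"Golf": 0.80}),
]

results = semantic_search(items, "Article", "Barack Obama")
ranked_uris = [item.uri for item in results]
```

The photo is excluded by the type filter even though it has the highest topic score, which is the point of combining a type constraint with a topic ranking.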
  • 10. Inform API (2). Related Content: – Articles, Messages, Photos, Videos, Questions, Web. AdContext™ (new): – URL → IAB Topics + Inform Topics. VideoContext™ (new): – URL → Inform Topics – Related Videos. InterestGraph (new): – Parameters: user-id / session-id → Inform Topics. Personalized AdContext™ (new): – URL + session-id / user-id (anonymized) → IAB Topics + Inform Topics
  • 11. AdContext™: IAB Ad Standards. The IAB (Interactive Advertising Bureau) standard returns a set of metadata about a website, webpage, or section of a webpage to assist advertising within web content. It defines how a Topic may be associated with web content. It defines a set of standard upper-level Topics such as “Science”, “Sports”, and “Business”, and mid-level Topics such as “Golf” and “Fashion”. These are tier-1 and tier-2. Inform has aligned the IAB Topics with Inform’s Topics. Inform can deliver more specific Topics (the full set of Inform Topics) as “tier-3” IAB Topics. The AdContext™ service returns this metadata. Ad Networks may use the service to assist in ad selection. Semantic Ad Selection may improve yield 2X – 5X (as per various external studies).
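
The tiering described on slide 11 can be sketched as a lookup from a specific Inform Topic to its tier-1 and tier-2 IAB Topics, with the Inform Topic itself reported as tier-3. The alignment table, the `ad_context` function, and the record layout below are assumptions made for illustration; the slides do not show the actual alignment or the response format.

```python
# Hypothetical alignment table: Inform Topic -> (tier-1, tier-2) IAB Topics.
# The entries are illustrative; the real alignment is not given in the slides.
IAB_ALIGNMENT = {
    "Tiger Woods": ("Sports", "Golf"),
    "Masters Tournament": ("Sports", "Golf"),
}


def ad_context(topic_scores):
    """Map scored Inform Topics to tiered records, highest score first.

    Topics missing from the alignment table keep their tier-3 value and
    get None for tier-1 and tier-2.
    """
    records = []
    for topic, score in topic_scores.items():
        tier1, tier2 = IAB_ALIGNMENT.get(topic, (None, None))
        records.append(
            {"tier1": tier1, "tier2": tier2, "tier3": topic, "score": score}
        )
    return sorted(records, key=lambda record: record["score"], reverse=True)


# Made-up topic scores for a single page.
page_topics = {"Masters Tournament": 0.55, "Tiger Woods": 0.87}
metadata = ad_context(page_topics)
```

An ad network that only understands tier-1 and tier-2 can ignore the `tier3` field, while one that supports finer targeting can use it directly.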
  • 12. Aside: rNews RDFa Standard. rNews: embedding metadata in online news. rNews is a proposed standard for using RDFa to annotate news-specific metadata in HTML documents. The rNews proposal has been developed by the IPTC, a consortium of the world's major news agencies, news publishers and news industry vendors. rNews is currently in draft form and the IPTC welcomes feedback on how to improve the standard in the rNews Forum. http://dev.iptc.org/rNews Why? SEO, Rich Snippets, reduced “scraper” error, better metadata. The Inform API returns rNews metadata ready to embed in news articles (in testing).
  • 13. Publisher Customer Example: Inform automatically tags entities (people, places, companies, and organizations) and provides related topics, articles, and media. The Related News Widget pulls in the most relevant and recent articles from within the New York Daily News Archive.
  • 14. Customer Example: Inform also generates highly engaging and relevant slideshows. Inform’s tags can be brought together in numerous ways to create a richer experience for consumers.
  • 15. Demo Inform API w/Facebook: How to connect Inform to the social graph?
  • 16. Demo Inform API w/Facebook
  • 17. Demo Inform API w/Facebook
  • 18. Demo Inform API w/Facebook: Inform Topics mapped to Wikipedia Pages and thus to other Concepts – including the Facebook “Like” Graph
  • 19. Interest Graph • Inform Topics: – 4,000+ Subjects in Hierarchy (SKOS) – 320,000+ Entities – Wikipedia Pages – Wikipedia Categories – Inform “same-as” links to Wikipedia • ~1 Billion content pieces total: – Forum Messages, Replies, Photos, Videos • 150K new content pieces per day • 1 Million+ PageViews per Day • 1 Million+ Monthly Unique Users • ~5 Million ads served per Day. Goal: Link Users to Topics for selection of content and ads
  • 20. Personalization Signals • Content is “about” a Topic (subject or entity) • User submits Content (“write”): Message, Reply, Photo, Video, Question, … • User reads Content (“view”): Message, Reply, Photo, Video, Question. Trends / Global Aggregation: • Importance Metric • Bursty / Velocity • Sentiment (“:-)”, “LOL”, …): “Like” the topic? “Dislike” the topic? Context? – e.g. a user dislikes a Football Team, so “likes” to hear when they lose (negative sentiment) • Other features…
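
One way to read the “write” and “view” signals on slide 20 is as weighted evidence linking a user to the topics a piece of content is about. The sketch below assumes a “write” counts three times as much as a “view”; that ratio, the `SIGNAL_WEIGHTS` table, and the `accumulate_interest` function are illustrative assumptions, not values or names from the talk.

```python
from collections import defaultdict

# Assumed signal weights: writing about a topic counts more than viewing it.
SIGNAL_WEIGHTS = {"write": 3.0, "view": 1.0}


def accumulate_interest(events, content_topics):
    """Build user -> topic -> weight from activity events.

    events: iterable of (user_id, signal, content_id)
    content_topics: content_id -> {topic: "aboutness" score}
    """
    interest = defaultdict(lambda: defaultdict(float))
    for user_id, signal, content_id in events:
        weight = SIGNAL_WEIGHTS[signal]
        for topic, score in content_topics.get(content_id, {}).items():
            interest[user_id][topic] += weight * score
    return interest


# Made-up sample data: two forum messages and one user's activity.
content_topics = {
    "m1": {"Gymnastics": 0.5},
    "m2": {"Gymnastics": 0.5, "Olympics": 0.25},
}
events = [("u1", "write", "m1"), ("u1", "view", "m2")]

interest = accumulate_interest(events, content_topics)
```

Because the user is never asked anything, this stays an implicit channel: the weights come entirely from observed activity.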
  • 21. Interest Graph Algorithms. Criteria: • Near Real-Time • Highly parallel to allow for scaling • Fuzzy Data, Flexible data model. Implementation: • General Graph Representation: Node Weights, Edge Weights, Node Types, Edge Types • Graph walk to extract a User’s Interest Graph • Parallel Message-Passing Algorithms for Graph Analysis: Importance, PageRank, Centrality; Spreading Activation; Pregel-like implementation (Signal/Collect) • Add Graph Analytics to Workflow
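
Spreading activation in a signal/collect style, as named on slide 21, can be sketched in a few lines. This is a sequential toy version under stated assumptions (a fixed decay factor, a fixed number of steps, a plain dictionary graph); the function name and node labels are hypothetical, and Inform's parallel implementation is not shown in the slides.

```python
def spread_activation(edges, seeds, decay=0.5, steps=2):
    """Propagate activation outward from seed nodes.

    edges: node -> {neighbor: edge weight}
    seeds: node -> initial activation
    Each step has a signal phase (active nodes send
    decay * activation * edge weight to their neighbors) and a
    collect phase (each node adds the signals it received).
    """
    activation = dict(seeds)
    frontier = dict(seeds)
    for _ in range(steps):
        incoming = {}
        for node, value in frontier.items():  # signal phase
            for neighbor, weight in edges.get(node, {}).items():
                incoming[neighbor] = incoming.get(neighbor, 0.0) + decay * value * weight
        for node, value in incoming.items():  # collect phase
            activation[node] = activation.get(node, 0.0) + value
        frontier = incoming
    return activation


# Made-up graph: a user linked to a topic, which is linked to a related topic.
edges = {
    "user:u1": {"topic:Gymnastics": 1.0},
    "topic:Gymnastics": {"topic:Olympics": 0.5},
}
activation = spread_activation(edges, {"user:u1": 1.0})
```

Within a step each node's outgoing signals depend only on its own state, which is what makes the approach straightforward to run in parallel across nodes.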
  • 23. Niketalk User Interest Graph (local): without global importance metric
  • 24. Niketalk User Interest Graph (global): with global importance metric. Recommendations can be made reflecting the shifting interests of the global community.
  • 25. Example Yuku Forum - Gymnastics
  • 27. ForumFind – Topic, Ad, Content
  • 28. ForumFind – MyForumFind (user: jjb2)
  • 29. Interest Graph – User Insights • “Everybody Lies” (“House” TV Show) – The only way to know a user’s interests is to have an implicit channel that detects interests without impacting user behavior • People have broad / dynamic interests • People read “trash” – e.g. everyone reads Celebrity Gossip – if convenient / no one is looking • Global Data can be used to make recommendations – no surprise, but nice to have confirmation • People move on – “Likes” need to expire • Recommendations for content and ads can be implemented in a highly dynamic and parallel fashion, running in real time with reasonable resources, using graph analysis
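
The observation that “Likes” need to expire can be sketched as exponential decay of interest weights. The slides do not say how expiry is implemented; the half-life, the pruning threshold, and both function names below are assumptions chosen for illustration.

```python
def decayed_weight(weight, age_days, half_life_days=30.0):
    """Halve the weight for every half-life that has elapsed."""
    return weight * 0.5 ** (age_days / half_life_days)


def prune_interests(interests, now_day, half_life_days=30.0, threshold=0.1):
    """Decay each interest by its age and drop those below the threshold.

    interests: topic -> (weight, day the interest was last observed)
    """
    current = {}
    for topic, (weight, last_seen_day) in interests.items():
        decayed = decayed_weight(weight, now_day - last_seen_day, half_life_days)
        if decayed >= threshold:
            current[topic] = decayed
    return current


# Made-up data: one interest seen today (day 90), one last seen on day 0.
interests = {"Sneakers": (1.0, 90), "Celebrity Gossip": (0.5, 0)}
current = prune_interests(interests, now_day=90)
```

A fresh signal resets the last-seen day, so a topic the user keeps returning to never decays away, while a one-off interest drops out after a few half-lives.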
  • 30. Interest Graph – Conclusion • Using a User’s Graph of Interests can dramatically improve the user’s engagement – data is still being gathered within Inform as to the percentage increase, but the numbers so far are very encouraging! • The Inform Service can be used to implement a more personalized content and ad experience with minimal implementation effort. • Talk to me about using our API!
  • 31. Thank You! Questions? Marc Hadfield, CTO, Inform Technologies, marc@inform.com

Editor's notes

  1. Content / Activity Ingestion: diversity of content sources; data / activity ingestion. Occam: Big Data processing and scale; search, storage, archive; text analysis for categorization and organization; algorithms drive content discovery; intersection of content and activity data yields trends and personalization. Content Distribution: dynamic content assignment and publishing; cross-platform publishing via apps and APIs; emphasis on integration with emerging data & content standards.