design for interaction
   Daniel Tunkelang
   Chief Scientist, Endeca

      © 2009 Endeca Technologies, Inc. All rights reserved.
about me




    Organizing SIGIR ’09 Industry Track in Boston on July 22nd!


about endeca


     leading provider of
     search applications




     250M+ end users per month
     600+ customers
     $100M+ annual sales




what i hope you learn from this talk




     the db and ir perspectives have a common thread



              convergence may be upon us



         but we need interaction to make it work



overview




          don't put all your eggs in one basket



                 design for interaction



         human-computer information retrieval



don’t put all your eggs in one basket




              Still Life with Basket and Broken Eggs by Michael Edwards, 2008




the db approach: perfection in, perfection out




              http://www.storeitfoodsblog.com/category/food-preparation/meat-grinder/




db usability researchers recognize the pain




sql is hard


    Making Database Systems Usable
    [Jagadish et al., SIGMOD 2007]


    • labor-intensive query construction

    • lengthy query evaluation

    • high query reformulation cost
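
To make the first bullet concrete, here is a minimal sketch of what a user has to write to ask a simple question of a normalized database. The `item` and `item_color` tables, their columns, and the sample rows are hypothetical and do not come from the cited paper.

```python
import sqlite3

# Hypothetical normalized schema: colors are multi-valued,
# so they live in a separate table from the items.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE item (sku INTEGER PRIMARY KEY, kind TEXT, price REAL, sale_price REAL);
    CREATE TABLE item_color (sku INTEGER REFERENCES item(sku), color TEXT);
    INSERT INTO item VALUES (1234, 'shirt', 39.99, 29.99), (1579, 'trousers', 59.99, NULL);
    INSERT INTO item_color VALUES (1234, 'Blue'), (1234, 'Yellow'), (1234, 'White'), (1579, 'Khaki');
""")

# "which items are on sale and come in blue?" already needs a join,
# a NULL test, and knowledge of how both tables are laid out.
ON_SALE_BLUE = """
    SELECT DISTINCT i.sku
    FROM item AS i
    JOIN item_color AS c ON c.sku = i.sku
    WHERE i.sale_price IS NOT NULL AND c.color = 'Blue'
"""
on_sale_blue_skus = [row[0] for row in conn.execute(ON_SALE_BLUE)]
print(on_sale_blue_skus)
```

Changing the question slightly (say, to a different attribute stored in yet another table) means rewriting the join, which is one way to read the reformulation cost in the last bullet.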




data sucks and users are lazy


     Extracting Problems for Database
     and IR Researchers
     [Naughton, Spring 2008 North East DB/IR Day]

     • real data is
        – incomplete
        – inconsistent
        – incorrect


     • users don’t want to learn
        – data schemas
        – structured query languages

     we’re not gonna take it!



the ir way: don’t worry, be happy




                http://adsoftheworld.com/media/print/mcdonalds_burger_mysteries



ir for db people: what would google do?


 SYSTEM: rank using IR model (tf-idf, PageRank)

 USER: information need → query → select from results
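
As a rough illustration of the "rank using IR model" step, the sketch below scores documents with a basic textbook form of tf-idf. The function name `rank_tf_idf`, the whitespace tokenizer, and the toy documents are assumptions made for this example; PageRank is not modeled, and no real search engine's scoring is implied.

```python
import math
from collections import Counter

def rank_tf_idf(query, docs):
    """Return document indices sorted by a basic tf-idf score, best first."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    scores = []
    for tokens in tokenized:
        tf = Counter(tokens)
        # Sum tf * idf over the query terms that occur in the collection.
        score = sum(
            tf[term] * math.log(n_docs / df[term])
            for term in query.lower().split()
            if term in df
        )
        scores.append(score)
    return sorted(range(n_docs), key=lambda i: scores[i], reverse=True)

# Toy collection, invented for illustration.
docs = [
    "database systems the complete book",
    "a book about gardening",
    "introduction to information retrieval",
]
print(rank_tf_idf("database book", docs))
```

The user's only lever here is the query string: the ordering comes back with no explanation and no way to adjust it, which is the one-shot model the diagram describes.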


assumptions of relevance-centric ir approach



                                              • self-awareness

                                              • self-expression

                                              • model knows best

                                              • answer is a document

                                              • one-shot query


life is not a batch


     • db approach expects too much of user
     • ir approach expects too much of system



              both approaches act as if it all
              comes down to a single query




                  is that your final question?


design for interaction




                   The Future of Social Interaction by Jim Stoten




changes assumptions about what to optimize




                                                                           precision
                                                                                         recall
          complexity                                                                   relevance




                            communication


how do we optimize communication?




           transparency

                                                                                  guidance




                  control

ir offers a black box




           this is the box. the sheep you want is inside.




db / set retrieval offers 2 out of 3




            transparency

                                                                                   guidance




                   control

but we need it all!


     • set retrieval is a failure in the ir world
        – though quite successful in the db world!


     • but ranked retrieval is inherently crippled
        – no transparency, control, or guidance!




        how do we optimize for communication?




human-computer information retrieval



                          “Toward Human-Computer
                           Information Retrieval”

                          Gary Marchionini


     • don’t just guess the user’s intent
     • increase user responsibility and control
     • require and reward human intellectual effort




great idea




                                  how?




treat query construction as a process


     A Case for Interaction
     [Koenemann and Belkin, 1996]

     • used term feedback to improve alerting queries

     • users select from suggested terms

     • 17 – 34% improvement in precision @ 30

     • users liked the feedback interface
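
For readers unfamiliar with the metric in the third bullet, precision at 30 is the fraction of the top 30 ranked results that are relevant. The sketch below computes it; the function name `precision_at_k` and the toy document ids are made up for illustration and are not data from the study.

```python
def precision_at_k(ranked_ids, relevant_ids, k=30):
    """Fraction of the top-k ranked results that are relevant."""
    top_k = ranked_ids[:k]
    if not top_k:
        return 0.0
    # Dividing by k (not by len(top_k)) penalizes short result lists.
    return sum(1 for doc_id in top_k if doc_id in relevant_ids) / k

# Toy ranking of document ids 1..30 and an invented set of relevant ids.
ranked = list(range(1, 31))
relevant = {1, 2, 3, 5, 8, 13, 21, 34, 55}
print(precision_at_k(ranked, relevant))       # precision at 30
print(precision_at_k(ranked, relevant, k=5))  # precision at 5
```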


expose the facets of semistructured content




success in the lab and the field


     • favored in user studies by Marti Hearst
        – http://flamenco.berkeley.edu/


     • ubiquitous in ecommerce
        – amazon.com
        – eBay
        – endeca powers 42 of top 100 online retailers


     • taking over media, libraries, enterprise, etc.




even a few db folks have drunk the kool-aid


     DataGuides
     [Goldman and Widom, VLDB 1997]
     • user-friendly schema summaries


     Magnet
     [Sinha and Karger, SIGMOD 2005]
     • navigation and refinement options

             common theme: semistructured


what is semistructured data?




                                             • one universe

                                             • self-describing

                                             • blends data / meta-data




data modeling flexibility


     • no a-priori schema
        – integrated sources without up-front schema design


     • richer modeling capabilities tame data complexity
        – hierarchy, multi-valued fields, sparse fields


     • schema flexibility eases schema evolution
        – new entity types, new data source




     Databases · WWW / Internet · File Systems · SOA, ESB, Web Service · Groupware and Collaboration · ERP · Content Management




semantically direct queries


     which on-sale items are available in blue?

     which attributes characterize on-sale blue items?
        – price, sleeve, color, salePrice, brand, fabric, …

           <shirt>
                 <sku>1234</sku>
                 <sleeve>Long</sleeve>
                 <desc>Classic end-on-end shirt</desc>
                 <price>39.99</price>
                 <salePrice>29.99</salePrice>
                 <color>Blue</color>
                 <color>Yellow</color>
                 <color>White</color>
                 ...
           </shirt>

           <buyingGuide>
                 <title>Selecting the right ski coat for you.</title>
                 <file>skiguide.pdf</file>
                 <keyword>ski</keyword>
                 <keyword>coat</keyword>
                 ...
           </buyingGuide>

           <trousers>
                 <sku>1579</sku>
                 <price>59.99</price>
                 <color>Khaki</color>
                 ...
           </trousers>
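
The two questions on this slide can be sketched against an in-memory version of such records. In the example below, the dictionary layout, the `type` key, the helper names, and the third record (sku 2000, brand "Acme") are assumptions added for illustration; only the shirt and trousers values come from the slide.

```python
from collections import Counter

items = [
    {"type": "shirt", "sku": "1234", "sleeve": "Long", "price": 39.99,
     "salePrice": 29.99, "color": ["Blue", "Yellow", "White"]},
    {"type": "trousers", "sku": "1579", "price": 59.99, "color": ["Khaki"]},
    # Invented record, so that the second question has more than one match.
    {"type": "shirt", "sku": "2000", "price": 45.00, "salePrice": 35.00,
     "color": ["Blue"], "brand": "Acme"},
]

def on_sale_in(items, color):
    """Which on-sale items are available in the given color?"""
    return [it for it in items if "salePrice" in it and color in it.get("color", [])]

def characterizing_attributes(matches):
    """Which attributes occur on the matching items, and on how many of them?"""
    return Counter(attr for it in matches for attr in it if attr != "type")

blue_on_sale = on_sale_in(items, "Blue")
attrs = characterizing_attributes(blue_on_sale)
print([it["sku"] for it in blue_on_sale])
print(attrs.most_common())
```

The second query returns attribute names rather than records, which is what lets an interface offer them back to the user as refinement options.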


but let’s make this concrete


                         Uh oh, I’m presenting at
                        SIGMOD! Better find a good
                          book about databases!




quick, to the goog-mobile!




                                                                         not quite…




i know, i’ll go to the library!




                                                                               #%@$!




let’s try a little hcir…




hcir works for news too




life in a semistructured world


     • search is a great starting point
        – users can’t / won’t initiate structured queries


     • ranked lists are an inadequate ending point
        – search queries are lossy projections of intent


     • hcir leads users down a garden path to structure
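
The three bullets above can be read as a loop: start from a keyword search, show the user which structured refinements exist in the current results, and let the user pick one. The sketch below is one possible shape for that loop; the catalog, the facet names, and the functions `search`, `refinements`, and `refine` are all hypothetical.

```python
from collections import Counter

# Invented catalog entries for illustration.
catalog = [
    {"title": "database systems primer", "subject": "Computer Science", "format": "Book"},
    {"title": "database of birdsong", "subject": "Biology", "format": "Audio"},
    {"title": "relational database design", "subject": "Computer Science", "format": "Book"},
    {"title": "database law review", "subject": "Law", "format": "Journal"},
]

def search(records, text):
    """Starting point: plain keyword match on the title."""
    terms = text.lower().split()
    return [r for r in records if all(t in r["title"].split() for t in terms)]

def refinements(results, facets=("subject", "format")):
    """Guidance: count the values each facet takes within the current results."""
    return {f: Counter(r[f] for r in results) for f in facets}

def refine(results, facet, value):
    """Control: the user narrows the set by picking one suggested value."""
    return [r for r in results if r[facet] == value]

results = search(catalog, "database")
options = refinements(results)
narrowed = refine(results, "subject", "Computer Science")
print(len(results), dict(options["subject"]), [r["title"] for r in narrowed])
```

The user never writes a structured query, yet the final result set is defined by one: a keyword match plus an explicit `subject` constraint that the interface can display and the user can undo.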




lots of trade-offs


     “everything should be made as simple
      as possible, but no simpler”

     “speed of thought” vs. “going nowhere quickly”

     “to err is human, but to really foul
      things up requires a computer”

                   simple interfaces don’t
                  always yield satisfaction


users want the triumvirate


     • transparency
     • control
     • guidance



           transparency and control are easy

              guidance requires cleverness




in closing




      all of us want to help people access information



        the best help is to help them help themselves



                design for interaction through
              transparency, control, guidance


thank you…and come to SIGIR!


                communication 1.0
               email: dt@endeca.com

                 communication 2.0
          blog: http://thenoisychannel.com
        twitter: http://twitter.com/dtunkelang

            SIGIR: July 19-23 in Boston
            Industry Track on July 22nd!


39                 © 2009 Endeca Technologies, Inc. All rights reserved.

Más contenido relacionado

La actualidad más candente

Intel Corporation - BA401
Intel Corporation - BA401Intel Corporation - BA401
Intel Corporation - BA401guest3ea4529f
 
Emc - Journey to the Cloud - Business Agility Seminar
Emc - Journey to the Cloud - Business Agility SeminarEmc - Journey to the Cloud - Business Agility Seminar
Emc - Journey to the Cloud - Business Agility SeminarExponential_e
 
Novell Tour Europe and South Africa 2012
Novell Tour Europe and South Africa 2012Novell Tour Europe and South Africa 2012
Novell Tour Europe and South Africa 2012Werner Luetkemeier
 
Bbx Biz Plan Presentation
Bbx Biz Plan PresentationBbx Biz Plan Presentation
Bbx Biz Plan PresentationPaul Brisson
 
MDD: Models, frameworks, & code generation
MDD: Models, frameworks, & code generationMDD: Models, frameworks, & code generation
MDD: Models, frameworks, & code generationPedro J. Molina
 
Cloud Communications: Top 5 Advantages for Your Enterprise
Cloud Communications: Top 5 Advantages for Your EnterpriseCloud Communications: Top 5 Advantages for Your Enterprise
Cloud Communications: Top 5 Advantages for Your EnterpriseXO Communications
 
Discovering Computers: Chapter 03
Discovering Computers: Chapter 03Discovering Computers: Chapter 03
Discovering Computers: Chapter 03Anna Stirling
 
Fun and games for profit
Fun and games for profitFun and games for profit
Fun and games for profitVenu Vasudevan
 
Code Generation for Conceptual User Interface Patterns
Code Generation for Conceptual User Interface PatternsCode Generation for Conceptual User Interface Patterns
Code Generation for Conceptual User Interface PatternsPedro J. Molina
 
Irfan Ur Rehman
Irfan Ur RehmanIrfan Ur Rehman
Irfan Ur Rehmanmrcool2002
 
Dispelling the mystery around resource planning revc
Dispelling the mystery around resource planning revcDispelling the mystery around resource planning revc
Dispelling the mystery around resource planning revckdelcol
 
HP Open Stack Keynote 4 18_2012 final
HP Open Stack Keynote 4 18_2012 finalHP Open Stack Keynote 4 18_2012 final
HP Open Stack Keynote 4 18_2012 finallaurabeckcahoon
 
Business made Social - How social technologies and behaviour are changing the...
Business made Social - How social technologies and behaviour are changing the...Business made Social - How social technologies and behaviour are changing the...
Business made Social - How social technologies and behaviour are changing the...Stefan Pfeiffer
 
Exploring the future of the IT industry and the next generation CIO
Exploring the future of the IT industry and the next generation CIOExploring the future of the IT industry and the next generation CIO
Exploring the future of the IT industry and the next generation CIOJessvin Thomas
 
Congressional it reform-roadmap_2011
Congressional it reform-roadmap_2011Congressional it reform-roadmap_2011
Congressional it reform-roadmap_2011John Weiler
 
Mobile advisor zenprise-pitch - lars
Mobile advisor zenprise-pitch - larsMobile advisor zenprise-pitch - lars
Mobile advisor zenprise-pitch - larsLars Bodenhoff
 

La actualidad más candente (19)

Intel Corporation - BA401
Intel Corporation - BA401Intel Corporation - BA401
Intel Corporation - BA401
 
Emc - Journey to the Cloud - Business Agility Seminar
Emc - Journey to the Cloud - Business Agility SeminarEmc - Journey to the Cloud - Business Agility Seminar
Emc - Journey to the Cloud - Business Agility Seminar
 
Emc expoesymposium
Emc expoesymposiumEmc expoesymposium
Emc expoesymposium
 
Novell Tour Europe and South Africa 2012
Novell Tour Europe and South Africa 2012Novell Tour Europe and South Africa 2012
Novell Tour Europe and South Africa 2012
 
Bbx Biz Plan Presentation
Bbx Biz Plan PresentationBbx Biz Plan Presentation
Bbx Biz Plan Presentation
 
MDD: Models, frameworks, & code generation
MDD: Models, frameworks, & code generationMDD: Models, frameworks, & code generation
MDD: Models, frameworks, & code generation
 
Cloud Communications: Top 5 Advantages for Your Enterprise
Cloud Communications: Top 5 Advantages for Your EnterpriseCloud Communications: Top 5 Advantages for Your Enterprise
Cloud Communications: Top 5 Advantages for Your Enterprise
 
Discovering Computers: Chapter 03
Discovering Computers: Chapter 03Discovering Computers: Chapter 03
Discovering Computers: Chapter 03
 
Curated Computing
Curated Computing Curated Computing
Curated Computing
 
Fun and games for profit
Fun and games for profitFun and games for profit
Fun and games for profit
 
Code Generation for Conceptual User Interface Patterns
Code Generation for Conceptual User Interface PatternsCode Generation for Conceptual User Interface Patterns
Code Generation for Conceptual User Interface Patterns
 
EMC Overview
EMC OverviewEMC Overview
EMC Overview
 
Irfan Ur Rehman
Irfan Ur RehmanIrfan Ur Rehman
Irfan Ur Rehman
 
Dispelling the mystery around resource planning revc
Dispelling the mystery around resource planning revcDispelling the mystery around resource planning revc
Dispelling the mystery around resource planning revc
 
HP Open Stack Keynote 4 18_2012 final
HP Open Stack Keynote 4 18_2012 finalHP Open Stack Keynote 4 18_2012 final
HP Open Stack Keynote 4 18_2012 final
 
Business made Social - How social technologies and behaviour are changing the...
Business made Social - How social technologies and behaviour are changing the...Business made Social - How social technologies and behaviour are changing the...
Business made Social - How social technologies and behaviour are changing the...
 
Exploring the future of the IT industry and the next generation CIO
Exploring the future of the IT industry and the next generation CIOExploring the future of the IT industry and the next generation CIO
Exploring the future of the IT industry and the next generation CIO
 
Congressional it reform-roadmap_2011
Congressional it reform-roadmap_2011Congressional it reform-roadmap_2011
Congressional it reform-roadmap_2011
 
Mobile advisor zenprise-pitch - lars
Mobile advisor zenprise-pitch - larsMobile advisor zenprise-pitch - lars
Mobile advisor zenprise-pitch - lars
 

Similar a Design for Interaction

Cloud Technology to Facilitate Growth
Cloud Technology to Facilitate GrowthCloud Technology to Facilitate Growth
Cloud Technology to Facilitate GrowthIconnyx
 
Cloud Computing, Business Models, Geilo April 2009
Cloud Computing, Business Models, Geilo April 2009Cloud Computing, Business Models, Geilo April 2009
Cloud Computing, Business Models, Geilo April 2009Francis D'Silva
 
"You don't need a bigger boat": serverless MLOps for reasonable companies
"You don't need a bigger boat": serverless MLOps for reasonable companies"You don't need a bigger boat": serverless MLOps for reasonable companies
"You don't need a bigger boat": serverless MLOps for reasonable companiesData Science Milan
 
Ibm software network2012 claudio cinquepalmi #ibmsocialbiz
Ibm software network2012 claudio cinquepalmi  #ibmsocialbiz Ibm software network2012 claudio cinquepalmi  #ibmsocialbiz
Ibm software network2012 claudio cinquepalmi #ibmsocialbiz Claudio Cinquepalmi
 
The Elements Of User Experience
The Elements Of User ExperienceThe Elements Of User Experience
The Elements Of User ExperienceJohn Chen, Jun
 
Day of data: skills for the future
Day of data: skills for the futureDay of data: skills for the future
Day of data: skills for the futureSteven Miller
 
Rio Info 2015 - Projetos de Big Data no Setor Público - Karin Breitman
Rio Info 2015 - Projetos de Big Data no Setor Público - Karin BreitmanRio Info 2015 - Projetos de Big Data no Setor Público - Karin Breitman
Rio Info 2015 - Projetos de Big Data no Setor Público - Karin BreitmanRio Info
 
17h25_closing_keynote_stefano_stinchi_-_innovation_story.pdf
17h25_closing_keynote_stefano_stinchi_-_innovation_story.pdf17h25_closing_keynote_stefano_stinchi_-_innovation_story.pdf
17h25_closing_keynote_stefano_stinchi_-_innovation_story.pdfBrunoAtti1
 
Cw13 cloud meets big data by ibrahim alloub-emc
Cw13 cloud meets big data by ibrahim alloub-emcCw13 cloud meets big data by ibrahim alloub-emc
Cw13 cloud meets big data by ibrahim alloub-emcinevitablecloud
 
Adobe flash platform java
Adobe flash platform javaAdobe flash platform java
Adobe flash platform javaCh'ti JUG
 
Adobe flash platform java
Adobe flash platform javaAdobe flash platform java
Adobe flash platform javaMichael Chaize
 
Who Made This Mess?
Who Made This Mess?Who Made This Mess?
Who Made This Mess?mmiddaugh
 
Ds roi tc_world
Ds roi tc_worldDs roi tc_world
Ds roi tc_worldvsrtwin
 
Mobile Monday - WebServices on the iPhone - 05/2008
Mobile Monday - WebServices on the iPhone - 05/2008Mobile Monday - WebServices on the iPhone - 05/2008
Mobile Monday - WebServices on the iPhone - 05/2008Roland Tritsch
 
Php In The Enterprise 01 24 2010
Php In The Enterprise 01 24 2010Php In The Enterprise 01 24 2010
Php In The Enterprise 01 24 2010phptechtalk
 
Support as a Leader in Innovation: A Case Study with Cisco
Support as a Leader in Innovation: A Case Study with CiscoSupport as a Leader in Innovation: A Case Study with Cisco
Support as a Leader in Innovation: A Case Study with CisconoHold, Inc.
 

Similar a Design for Interaction (20)

Jobs in the Cloud
 Jobs in the Cloud Jobs in the Cloud
Jobs in the Cloud
 
Cloud Technology to Facilitate Growth
Cloud Technology to Facilitate GrowthCloud Technology to Facilitate Growth
Cloud Technology to Facilitate Growth
 
Cloud Computing, Business Models, Geilo April 2009
Cloud Computing, Business Models, Geilo April 2009Cloud Computing, Business Models, Geilo April 2009
Cloud Computing, Business Models, Geilo April 2009
 
"You don't need a bigger boat": serverless MLOps for reasonable companies
"You don't need a bigger boat": serverless MLOps for reasonable companies"You don't need a bigger boat": serverless MLOps for reasonable companies
"You don't need a bigger boat": serverless MLOps for reasonable companies
 
Ibm software network2012 claudio cinquepalmi #ibmsocialbiz
Ibm software network2012 claudio cinquepalmi  #ibmsocialbiz Ibm software network2012 claudio cinquepalmi  #ibmsocialbiz
Ibm software network2012 claudio cinquepalmi #ibmsocialbiz
 
The Elements Of User Experience
The Elements Of User ExperienceThe Elements Of User Experience
The Elements Of User Experience
 
101 ab 1445-1515
101 ab 1445-1515101 ab 1445-1515
101 ab 1445-1515
 
101 ab 1445-1515
101 ab 1445-1515101 ab 1445-1515
101 ab 1445-1515
 
Day of data: skills for the future
Day of data: skills for the futureDay of data: skills for the future
Day of data: skills for the future
 
Rio Info 2015 - Projetos de Big Data no Setor Público - Karin Breitman
Rio Info 2015 - Projetos de Big Data no Setor Público - Karin BreitmanRio Info 2015 - Projetos de Big Data no Setor Público - Karin Breitman
Rio Info 2015 - Projetos de Big Data no Setor Público - Karin Breitman
 
17h25_closing_keynote_stefano_stinchi_-_innovation_story.pdf
17h25_closing_keynote_stefano_stinchi_-_innovation_story.pdf17h25_closing_keynote_stefano_stinchi_-_innovation_story.pdf
17h25_closing_keynote_stefano_stinchi_-_innovation_story.pdf
 
Cw13 cloud meets big data by ibrahim alloub-emc
Cw13 cloud meets big data by ibrahim alloub-emcCw13 cloud meets big data by ibrahim alloub-emc
Cw13 cloud meets big data by ibrahim alloub-emc
 
Adobe flash platform java
Adobe flash platform javaAdobe flash platform java
Adobe flash platform java
 
Adobe flash platform java
Adobe flash platform javaAdobe flash platform java
Adobe flash platform java
 
Who Made This Mess?
Who Made This Mess?Who Made This Mess?
Who Made This Mess?
 
Ds roi tc_world
Ds roi tc_worldDs roi tc_world
Ds roi tc_world
 
Mobile Monday - WebServices on the iPhone - 05/2008
Mobile Monday - WebServices on the iPhone - 05/2008Mobile Monday - WebServices on the iPhone - 05/2008
Mobile Monday - WebServices on the iPhone - 05/2008
 
Php In The Enterprise 01 24 2010
Php In The Enterprise 01 24 2010Php In The Enterprise 01 24 2010
Php In The Enterprise 01 24 2010
 
EMC & Techno Vision
EMC & Techno VisionEMC & Techno Vision
EMC & Techno Vision
 
Support as a Leader in Innovation: A Case Study with Cisco
Support as a Leader in Innovation: A Case Study with CiscoSupport as a Leader in Innovation: A Case Study with Cisco
Support as a Leader in Innovation: A Case Study with Cisco
 

Más de Daniel Tunkelang

Query Understanding and Ecommerce
Query Understanding and EcommerceQuery Understanding and Ecommerce
Query Understanding and EcommerceDaniel Tunkelang
 
Semantic Equivalence of e-Commerce Queries
Semantic Equivalence of e-Commerce QueriesSemantic Equivalence of e-Commerce Queries
Semantic Equivalence of e-Commerce QueriesDaniel Tunkelang
 
Helping Searchers Satisfice through Query Understanding
Helping Searchers Satisfice through Query UnderstandingHelping Searchers Satisfice through Query Understanding
Helping Searchers Satisfice through Query UnderstandingDaniel Tunkelang
 
Query Understanding: A Manifesto
Query Understanding: A ManifestoQuery Understanding: A Manifesto
Query Understanding: A ManifestoDaniel Tunkelang
 
Where should you put your data scientists?
Where should you put your data scientists?Where should you put your data scientists?
Where should you put your data scientists?Daniel Tunkelang
 
Data Science: A Mindset for Productivity
Data Science: A Mindset for ProductivityData Science: A Mindset for Productivity
Data Science: A Mindset for ProductivityDaniel Tunkelang
 
My Three Ex’s: A Data Science Approach for Applied Machine Learning
My Three Ex’s: A Data Science Approach for Applied Machine LearningMy Three Ex’s: A Data Science Approach for Applied Machine Learning
My Three Ex’s: A Data Science Approach for Applied Machine LearningDaniel Tunkelang
 
Web science - How is it different?
Web science - How is it different?Web science - How is it different?
Web science - How is it different?Daniel Tunkelang
 
Better Search Through Query Understanding
Better Search Through Query UnderstandingBetter Search Through Query Understanding
Better Search Through Query UnderstandingDaniel Tunkelang
 
Social Search in a Professional Context
Social Search in a Professional ContextSocial Search in a Professional Context
Social Search in a Professional ContextDaniel Tunkelang
 
Find and be Found: Information Retrieval at LinkedIn
Find and be Found: Information Retrieval at LinkedInFind and be Found: Information Retrieval at LinkedIn
Find and be Found: Information Retrieval at LinkedInDaniel Tunkelang
 
Search as Communication: Lessons from a Personal Journey
Search as Communication: Lessons from a Personal JourneySearch as Communication: Lessons from a Personal Journey
Search as Communication: Lessons from a Personal JourneyDaniel Tunkelang
 
Enterprise Search: How do we get there from here?
Enterprise Search: How do we get there from here?Enterprise Search: How do we get there from here?
Enterprise Search: How do we get there from here?Daniel Tunkelang
 
Big Data, We Have a Communication Problem
Big Data, We Have a Communication Problem Big Data, We Have a Communication Problem
Big Data, We Have a Communication Problem Daniel Tunkelang
 
How to Interview a Data Scientist
How to Interview a Data ScientistHow to Interview a Data Scientist
How to Interview a Data ScientistDaniel Tunkelang
 
Information, Attention, and Trust: A Hierarchy of Needs
Information, Attention, and Trust: A Hierarchy of NeedsInformation, Attention, and Trust: A Hierarchy of Needs
Information, Attention, and Trust: A Hierarchy of NeedsDaniel Tunkelang
 
Data By The People, For The People
Data By The People, For The PeopleData By The People, For The People
Data By The People, For The PeopleDaniel Tunkelang
 
Content, Connections, and Context
Content, Connections, and ContextContent, Connections, and Context
Content, Connections, and ContextDaniel Tunkelang
 

Más de Daniel Tunkelang (20)

Query Understanding and Ecommerce
Query Understanding and EcommerceQuery Understanding and Ecommerce
Query Understanding and Ecommerce
 
Semantic Equivalence of e-Commerce Queries
Semantic Equivalence of e-Commerce QueriesSemantic Equivalence of e-Commerce Queries
Semantic Equivalence of e-Commerce Queries
 
Helping Searchers Satisfice through Query Understanding
Helping Searchers Satisfice through Query UnderstandingHelping Searchers Satisfice through Query Understanding
Helping Searchers Satisfice through Query Understanding
 
MMM, Search!
MMM, Search!MMM, Search!
MMM, Search!
 
Enterprise Intelligence
Enterprise IntelligenceEnterprise Intelligence
Enterprise Intelligence
 
Query Understanding: A Manifesto
Query Understanding: A ManifestoQuery Understanding: A Manifesto
Query Understanding: A Manifesto
 
Where should you put your data scientists?
Where should you put your data scientists?Where should you put your data scientists?
Where should you put your data scientists?
 
Design for Interaction

  • 1. design for interaction Daniel Tunkelang Chief Scientist, Endeca © 2009 Endeca Technologies, Inc. All rights reserved.
  • 2. about me
      Organizing SIGIR ’09 Industry Track in Boston on July 22nd!
  • 3. about endeca
      leading provider of search applications
      250M+ end users per month
      600+ customers
      $100M+ annual sales
  • 4. what i hope you learn from this talk
      the db and ir perspectives have a common thread
      convergence may be upon us
      but we need interaction to make it work
  • 5. overview
      don't put all your eggs in one basket
      design for interaction
      human-computer information retrieval
  • 6. don’t put all your eggs in one basket
      Still Life with Basket and Broken Eggs by Michael Edwards, 2008
  • 7. the db approach: perfection in, perfection out
      http://www.storeitfoodsblog.com/category/food-preparation/meat-grinder/
  • 8. db usability researchers recognize the pain
  • 9. sql is hard
      Making Database Systems Usable [Jagadish et al., SIGMOD 2007]
      • labor-intensive query construction
      • lengthy query evaluation
      • high query reformulation cost
  • 10. data sucks and users are lazy
      Extracting Problems for Database and IR Researchers [Naughton, Spring 2008 North East DB/IR Day]
      • real data is
          – incomplete
          – inconsistent
          – incorrect
      • users don’t want to learn
          – data schemas
          – structured query languages
      we’re not gonna take it!
  • 11. the ir way: don’t worry, be happy
      http://adsoftheworld.com/media/print/mcdonalds_burger_mysteries
  • 12. ir for db people: what would google do?
      USER: information need, query, select from results
      SYSTEM: rank using IR model (tf-idf, PageRank)
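The "rank using IR model" step on slide 12 can be illustrated with a toy tf-idf scorer. This is a minimal sketch, not Endeca's or Google's ranking: the function name `tfidf_rank`, the whitespace tokenizer, the unsmoothed `log(N / df)` weighting, and the three sample documents are all assumptions made for illustration.

```python
import math
from collections import Counter


def tfidf_rank(query, docs):
    """Return (score, doc_index) pairs, best score first (hypothetical helper)."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term appears.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    scores = []
    for index, tokens in enumerate(tokenized):
        tf = Counter(tokens)
        score = 0.0
        for term in query.lower().split():
            if term in tf:
                # Rare terms get a larger weight than common ones.
                score += tf[term] * math.log(n_docs / df[term])
        scores.append((score, index))
    # Sort by descending score; ties are broken by document order.
    return sorted(scores, key=lambda pair: (-pair[0], pair[1]))


# Made-up sample collection.
docs = [
    "database systems concepts",
    "cooking with eggs",
    "database usability and database design",
]
ranking = tfidf_rank("database design", docs)
print([index for _, index in ranking])
```

The user sees only the resulting order, which is the black-box behaviour the later slides criticize: nothing in the output explains why one document outranks another.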
  • 13. assumptions of relevance-centric ir approach
      • self-awareness
      • self-expression
      • model knows best
      • answer is a document
      • one-shot query
  • 14. life is not a batch
      • db approach expects too much of user
      • ir approach expects too much of system
      both approaches act as if it all comes down to a single query
      is that your final answer question?
  • 15. design for interaction
      The Future of Social Interaction by Jim Stoten
  • 16. changes assumptions about what to optimize
      precision, recall, complexity, relevance, communication
  • 17. how do we optimize communication?
      transparency, guidance, control
  • 18. ir offers a black box
      this is the crate. the sheep you want is inside.
  • 19. db / set retrieval offers 2 out of 3
      transparency, guidance, control
  • 20. but we need it all!
      • set retrieval is a failure in the ir world
          – though quite successful in the db world!
      • but ranked retrieval is inherently crippled
          – no transparency, control, or guidance!
      how do we optimize for communication?
  • 21. human-computer information retrieval
      “Toward Human-Computer Information Retrieval”, Gary Marchionini
      • don’t just guess the user’s intent
      • increase user responsibility and control
      • require and reward human intellectual effort
  • 22. great idea
      how?
  • 23. treat query construction as a process
      A Case for Interaction [Koenemann and Belkin, 1996]
      • used term feedback to improve alerting queries
      • users select from suggested terms
      • 17–34% improvement in precision @ 30
      • users liked the feedback interface
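For readers unfamiliar with the "precision @ 30" metric on slide 23, a minimal sketch of precision at k follows. The document ids, relevance judgments, and both rankings are invented for illustration and are not data from the Koenemann and Belkin study; `precision_at_k` is a hypothetical helper name.

```python
def precision_at_k(ranked_ids, relevant_ids, k):
    """Fraction of the top-k ranked results that are relevant."""
    if k <= 0:
        raise ValueError("k must be positive")
    top_k = ranked_ids[:k]
    hits = sum(1 for doc_id in top_k if doc_id in relevant_ids)
    # Dividing by k (not by len(top_k)) penalizes short result lists.
    return hits / k


# Made-up judgments and rankings, before and after a user picks feedback terms.
relevant = {"d1", "d2", "d4"}
baseline = ["d3", "d1", "d7", "d2", "d9"]
with_feedback = ["d1", "d2", "d4", "d3", "d9"]

print(precision_at_k(baseline, relevant, 5))
print(precision_at_k(with_feedback, relevant, 5))
```

The study reported the metric at k = 30; the sketch uses k = 5 only to keep the sample lists short.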
  • 24. expose the facets of semistructured content
  • 25. success in the lab and the field
      • favored in user studies by Marti Hearst
          – http://flamenco.berkeley.edu/
      • ubiquitous in ecommerce
          – amazon.com
          – eBay
          – endeca powers 42 of top 100 online retailers
      • taking over media, libraries, enterprise, etc.
  • 26. even a few db folks have drunk the kool-aid
      DataGuides [Goldman and Widom, VLDB 1997]
      • user-friendly schema summaries
      Magnet [Sinha and Karger, SIGMOD 2005]
      • navigation and refinement options
      common theme: semistructured
  • 27. what is semistructured data?
      • one universe
      • self-describing
      • blends data / meta-data
  • 28. data modeling flexibility
      • no a-priori schema
          – integrated sources without up-front schema design
      • richer modeling capabilities tame data complexity
          – hierarchy, multi-valued fields, sparse fields
      • schema flexibility eases schema evolution
          – new entity types, new data source
      [diagram of data sources: WWW, Internet, SOA, ESB, Web Service, Groupware and Collaboration, Content Management, Databases, ERP, File Systems]
  • 29. semantically direct queries
      which on-sale items are available in blue?
      which attributes characterize on-sale blue items?
          price, sleeve, color, salePrice, brand, fabric, …
      <shirt>
        <sku>1234</sku>
        <sleeve>Long</sleeve>
        <desc>Classic end-on-end shirt</desc>
        <price>39.99</price>
        <salePrice>29.99</salePrice>
        <color>Blue</color>
        <color>Yellow</color>
        <color>White</color>
        ...
      </shirt>
      <buyingGuide>
        <title>Selecting the right ski coat for you.</title>
        <file>skiguide.pdf</file>
        <keyword>ski</keyword>
        <keyword>coat</keyword>
        ...
      </buyingGuide>
      <trousers>
        <sku>1579</sku>
        <price>59.99</price>
        <color>Khaki</color>
        ...
      </trousers>
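The two questions on slide 29 can be sketched over schema-less records. This is an illustrative sketch only, not Endeca's query engine: the three records transcribe the visible fields of the slide's XML (elided fields are left out), while the `type` key, the dict/list representation, and the helper names `values`, `on_sale_in`, and `attribute_counts` are assumptions.

```python
from collections import Counter

# Records of different shapes live in one collection: no shared schema,
# sparse fields (salePrice), and multi-valued fields (color, keyword).
records = [
    {"type": "shirt", "sku": "1234", "sleeve": "Long",
     "desc": "Classic end-on-end shirt", "price": 39.99, "salePrice": 29.99,
     "color": ["Blue", "Yellow", "White"]},
    {"type": "buyingGuide", "title": "Selecting the right ski coat for you.",
     "file": "skiguide.pdf", "keyword": ["ski", "coat"]},
    {"type": "trousers", "sku": "1579", "price": 59.99, "color": ["Khaki"]},
]


def values(record, field):
    """Return a field's values as a list, whether absent, single, or multi-valued."""
    value = record.get(field, [])
    return value if isinstance(value, list) else [value]


def on_sale_in(items, color):
    """Which on-sale items are available in the given color?"""
    return [r for r in items if "salePrice" in r and color in values(r, "color")]


def attribute_counts(items):
    """Which attributes characterize these items? Count records per attribute."""
    return Counter(field for r in items for field in r if field != "type")


blue_on_sale = on_sale_in(records, "Blue")
print([r["sku"] for r in blue_on_sale])
print(sorted(attribute_counts(blue_on_sale)))
```

The second function returns attribute names instead of records, which is what lets an interface offer those attributes back to the user as refinement options.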
  • 30. but let’s make this concrete
      Uh oh, I’m presenting at SIGMOD! Better find a good book about databases!
  • 31. quick, to the goog-mobile!
      not quite…
  • 32. i know, i’ll go to the library!
      #%@$!
  • 33. let’s try a little hcir…
  • 34. hcir works for news too
  • 35. life in a semistructured world
      • search is a great starting point
          – users can’t / won’t initiate structured queries
      • ranked lists are an inadequate ending point
          – search queries are lossy projections of intent
      • hcir leads users down a garden path to structure
  • 36. lots of trade-offs
      “everything should be made as simple as possible, but no simpler”
      “speed of thought” vs. “going nowhere quickly”
      “to err is human, but to really foul things up requires a computer”
      simple interfaces don’t always yield satisfaction
  • 37. users want the triumvirate
      • transparency
      • control
      • guidance
      transparency and control are easy
      guidance requires cleverness
  • 38. in closing
      all of us want to help people access information
      the best help is to help them help themselves
      design for interaction through transparency, control, guidance
  • 39. thank you…and come to SIGIR!
      communication 1.0
          email: dt@endeca.com
      communication 2.0
          blog: http://thenoisychannel.com
          twitter: http://twitter.com/dtunkelang
      SIGIR: July 19-23 in Boston
      Industry Track on July 22nd!