Faceted Navigation with django-haystack and solr
DjangoTO 2012-02-09
Craig Nagy @nagyman
What is Faceted Navigation/Browsing?

• Visible options for narrowing a set of items based on metadata; interactive
  query building


• Important for e-commerce sites with a significant number of products; shared
  classifications/metadata


• Browse vs Search, Links vs. Forms
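
The idea of narrowing by shared metadata can be sketched without any search engine. The catalogue, field names, and helper names below are hypothetical and only illustrate the mechanism: count the values of one field, apply a selection, and recount on what is left.

```python
from collections import Counter

# Hypothetical catalogue; each item carries shared metadata (the facets).
PRODUCTS = [
    {"name": "Trail Runner", "brand": "Acme", "size": 9},
    {"name": "Road Racer", "brand": "Acme", "size": 10},
    {"name": "City Walker", "brand": "Zenith", "size": 9},
]


def facet_counts(items, field):
    """Count how many items carry each value of one metadata field."""
    return Counter(item[field] for item in items)


def narrow(items, **selected):
    """Keep only the items that match every selected facet value."""
    return [i for i in items if all(i[k] == v for k, v in selected.items())]


brand_counts = facet_counts(PRODUCTS, "brand")
size_nine = narrow(PRODUCTS, size=9)
# Counts are recomputed on the narrowed set, so every link shown to the
# user leads to a non-empty result.
remaining_brands = facet_counts(size_nine, "brand")
```

Because the counts come from the current result set, the interface only ever offers combinations that exist, which is what separates browsing by links from submitting a search form.
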
Engines with Faceted Search

• Commercial: Endeca, FAST Search Server (Microsoft, SharePoint)


• Open Source: Solr (backed by Apache Lucene), Whoosh, Sphinx (“multi-queries”)


• SQL - COUNT, GROUP BY, JOINs, oh my
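
To make the SQL option concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table, columns, and helper name are hypothetical. Each facet needs its own COUNT/GROUP BY query, and every one of them has to be re-run with the current filters.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE product (name TEXT, brand TEXT, category TEXT)")
conn.executemany(
    "INSERT INTO product VALUES (?, ?, ?)",
    [
        ("Trail Runner", "Acme", "shoes"),
        ("Road Racer", "Acme", "shoes"),
        ("Day Pack", "Zenith", "bags"),
    ],
)


def sql_facet(conn, column, where="1=1", params=()):
    """Return (value, count) pairs for one facet column."""
    # `column` and `where` are interpolated into the SQL text, so only
    # trusted, hard-coded strings may be passed here.
    sql = (
        f"SELECT {column}, COUNT(*) FROM product "
        f"WHERE {where} GROUP BY {column} ORDER BY {column}"
    )
    return conn.execute(sql, params).fetchall()


brand_facet = sql_facet(conn, "brand")
# Once a brand is selected, every other facet is recounted with that filter.
category_facet = sql_facet(conn, "category", "brand = ?", ("Acme",))
```

With many-to-many metadata each of these queries also picks up JOINs, which is where the approach stops being pleasant.
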
Solr

• Full text, faceted, distributed, amazing performance (Java)


• Lucene syntax - title:"The Right Way" AND text:go, text:swim?ing, test~, craig nagy^4


• JSON/XML API - transparent with Haystack


• Run in a servlet container (e.g. Tomcat); Haystack provides easy local testing
  (Jetty) - manage.py solr --start


• Data is denormalized; documents based on a schema
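
Haystack hides the HTTP API, but it can help to see roughly what a faceted request looks like. The sketch below only builds a URL. The endpoint and the brand_exact field are assumptions; q, fq, wt, facet, facet.field and facet.mincount are standard Solr request parameters.

```python
from urllib.parse import urlencode

# Assumed endpoint of a local test instance; adjust to your deployment.
SOLR_SELECT = "http://localhost:8983/solr/select"


def facet_query_url(query, facet_fields, filters=()):
    """Build a Solr select URL that returns results plus facet counts."""
    params = [
        ("q", query),             # Lucene-syntax query string
        ("wt", "json"),           # ask for a JSON response
        ("facet", "true"),        # switch faceting on
        ("facet.mincount", "1"),  # hide values with no matching documents
    ]
    params += [("facet.field", name) for name in facet_fields]
    params += [("fq", fq) for fq in filters]  # narrowing filters
    return SOLR_SELECT + "?" + urlencode(params)


url = facet_query_url(
    'title:"The Right Way" AND text:go',
    facet_fields=["brand_exact"],
    filters=['brand_exact:"Acme"'],
)
```
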
Solr Schema

• All fields to index are declared in schema.xml. Haystack has tools to help generate this file
  (manage.py build_solr_schema)


• _exact fields for facets, multiValued="true" for lists of data, data types
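
As a rough illustration of those three points, the sketch below generates a few field declarations of the kind found in schema.xml. The field names and the "text" type name are hypothetical (type names differ between Solr versions). The part that matters is the pairing of a tokenized field for searching with an untokenized string copy for faceting.

```python
import xml.etree.ElementTree as ET


def field(name, type_, multi_valued=False):
    """Create one <field> declaration as it might appear in schema.xml."""
    return ET.Element("field", {
        "name": name,
        "type": type_,
        "indexed": "true",
        "stored": "true",
        "multiValued": "true" if multi_valued else "false",
    })


fields = ET.Element("fields")
fields.append(field("text", "text"))          # main full-text document field
fields.append(field("brand", "text"))         # tokenized, for searching
fields.append(field("brand_exact", "string")) # untokenized, for facet counts
fields.append(field("categories_exact", "string", multi_valued=True))  # list

schema_fragment = ET.tostring(fields, encoding="unicode")
```

Faceting on the tokenized field would count individual words ("north", "face") rather than whole values ("North Face"), which is why the exact copy exists.
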
Haystack

• Provides QuerySet-like API to a number of search backends
  haystack.query.SearchQuerySet


  • sqs = SearchQuerySet().filter(content='foo',
    pub_date__lte=datetime.date(2012, 1, 1))


• Mirror your models, and any value you want to index


• Automatic content_type handling


• Performance Tip: Don’t access unindexed data
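
Building on the filter call above, a faceted query might look like the following sketch. It assumes an index with a faceted brand field and a pub_date field, neither of which appears in the slides. The Haystack import sits inside the function because it only works within a configured Django project.

```python
import datetime


def faceted_results(text, brand=None):
    """Return (results, facet counts) for a text query, optionally narrowed."""
    # Deferred import: Haystack needs Django settings to be loaded first.
    from haystack.query import SearchQuerySet

    sqs = SearchQuerySet().filter(
        content=text,
        pub_date__lte=datetime.date(2012, 1, 1),
    ).facet("brand")  # assumed field, declared with faceted=True

    if brand is not None:
        # narrow() filters on the exact facet field without affecting
        # relevance scoring. The value is not escaped in this sketch.
        sqs = sqs.narrow('brand_exact:"%s"' % brand)

    # facet_counts() returns a dict with "fields", "dates" and "queries"
    # keys; "fields" maps each facet name to (value, count) pairs.
    return sqs, sqs.facet_counts()
```

Reading stored fields from each result, rather than result.object, keeps the request from touching the database, which is the point of the performance tip.
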
Haystack - Indexing

• Subclass haystack.indexes.SearchIndex


• prepare_<fieldname> or prepare callbacks


• manage.py update_index or real-time
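
The prepare_<fieldname> convention can be pictured with a toy stand-in. MiniIndex below is not Haystack's implementation and every name in it is hypothetical. It only shows the dispatch idea: a per-field hook wins if it exists, otherwise the model attribute is mirrored unchanged.

```python
class MiniIndex:
    """Toy stand-in for a search index; not Haystack's implementation."""

    fields = ()

    def prepare(self, obj):
        """Build the flat document that would be sent to the search engine."""
        doc = {}
        for name in self.fields:
            hook = getattr(self, "prepare_" + name, None)
            # A per-field hook wins; otherwise mirror the attribute as-is.
            doc[name] = hook(obj) if hook else getattr(obj, name)
        return doc


class Product:
    def __init__(self, title, categories):
        self.title = title
        self.categories = categories


class ProductIndex(MiniIndex):
    fields = ("title", "categories")

    def prepare_categories(self, obj):
        # Denormalize: flatten related records into plain strings.
        return [c["name"] for c in obj.categories]


doc = ProductIndex().prepare(
    Product("Trail Runner", [{"name": "shoes"}, {"name": "outdoor"}])
)
```
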
[Code slide: a SearchIndex definition with three callouts: Full text indexing, Prepare Fields, Register]
Not Included

• Not included in Haystack


  • Views, templates, utilities, generating URLs, handling special data types


• Custom Helpers: Searcher, FacetList, Facet, FacetItem, templatetags


• Extras: Facet Landing Pages (e.g. Etsy, Zappos, G Adventures)
Facet Parsing
• Haystack returns dictionaries of facet data; we parse into custom objects
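
A minimal sketch of that parsing step follows. Facet and FacetItem are the helper names from the previous slide, but their fields and the parse_facets function are guesses made for illustration. The input mirrors the dictionary shape returned by facet_counts().

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class FacetItem:
    value: str
    count: int
    selected: bool = False


@dataclass
class Facet:
    name: str
    items: List[FacetItem] = field(default_factory=list)


def parse_facets(facet_counts, selected=None):
    """Turn a facet_counts()-style dict into objects a template can loop over."""
    selected = selected or {}
    facets = []
    for name, pairs in sorted(facet_counts.get("fields", {}).items()):
        chosen = selected.get(name, set())
        # Drop empty values so the UI never links to zero results.
        items = [FacetItem(v, c, v in chosen) for v, c in pairs if c > 0]
        facets.append(Facet(name, items))
    return facets


raw = {
    "fields": {"brand": [("Acme", 2), ("Zenith", 0)], "size": [("9", 3)]},
    "dates": {},
    "queries": {},
}
facets = parse_facets(raw, selected={"brand": {"Acme"}})
```
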
Template Tags




templatetags for arranging the faceted navigation
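
Most of the work in such a tag is generating the link for each facet value. Below is a framework-free sketch of a helper that a template tag could wrap. The function name, the page parameter, and the /trips/ path are assumptions.

```python
from urllib.parse import parse_qsl, urlencode


def toggle_facet_url(path, query_string, facet, value):
    """Return a URL that adds the facet value, or removes it if already set."""
    pairs = parse_qsl(query_string)
    if (facet, value) in pairs:
        pairs.remove((facet, value))
    else:
        pairs.append((facet, value))
    # Changing the filters invalidates the current page number.
    pairs = [(k, v) for k, v in pairs if k != "page"]
    qs = urlencode(pairs)
    return path + ("?" + qs if qs else "")


add_url = toggle_facet_url("/trips/", "q=hiking&page=3", "region", "Asia")
remove_url = toggle_facet_url("/trips/", "q=hiking&region=Asia", "region", "Asia")
```

Keeping the facet state in the query string means every combination is a plain link that can be bookmarked and crawled.
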
UI Pattern - Integrated Faceted Breadcrumbs (IFB)

• http://boxesandarrows.com/view/faceted-finding-with (by Greg Nudelman)
Integrated Faceted Breadcrumbs
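
One reading of the pattern is that every selected facet becomes a breadcrumb that can be removed independently of the others. The sketch below builds that data for a template. The function name, dictionary keys, and example path are hypothetical.

```python
from urllib.parse import urlencode


def facet_breadcrumbs(path, selected):
    """Build one breadcrumb per selected facet, each with its own remove link."""
    crumbs = []
    for i, (facet, value) in enumerate(selected):
        # The remove link keeps every selection except this one.
        remaining = selected[:i] + selected[i + 1:]
        qs = urlencode(remaining)
        crumbs.append({
            "label": f"{facet}: {value}",
            "remove_url": path + ("?" + qs if qs else ""),
        })
    return crumbs


crumbs = facet_breadcrumbs("/trips/", [("region", "Asia"), ("style", "Active")])
```
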
Refs

• http://www.flickr.com/photos/morville/collections/72157603789246885/


• http://haystacksearch.org/


• https://github.com/toastdriven/django-haystack


• http://boxesandarrows.com/view/faceted-finding-with


• http://lucene.apache.org/solr/


• http://django-haystack.readthedocs.org/en/latest/tutorial.html


• http://www.gadventures.com/trips/
Fin.

Editor's notes

  2. • Metadata such as size, price, brand, type, categories
     • Improving findability
     • Like products (e.g. computers), not unlike products (books & cars)
  3. AirBnb, Zappos, ToysRUs
  4. Browse vs Search. Links vs Forms. Often little or “No Results”. Unpredictable combinations.
  5. SQL - Don’t torture yourself; not made for search
  6. • Lucene: boosting, boolean, wildcards, ranges, grouping, etc. But Haystack takes care of this, unless you need more advanced searching
     • Avoid hitting your DB
  7. Starting schema provided with Haystack - django_ct, django_id important for differentiating content types. Expected text field for full-text search.
  8. filter, exclude, order_by, highlight
  9. Similar to django model definition. If fields mirror model exactly, no extra work required. Otherwise use
  10. • SearchIndex, a la django model definitions. Denormalize your data here.
      • Text field template
      • Prepare fields for indexing