SPHINX AND THINKING SPHINX
10 Minute Intro

HAYES DAVIS
Founder, Appozite
cheaptweet.com | @cheaptweet
@hayesdavis

SPHINX

• Open Source full-text search engine
• Designed around SQL
• Standalone daemon (searchd)

Image: http://guardians.net/hawass/images/sphinx3.jpg

THINKING SPHINX

• Rails plugin
• Integrates Active Record with Sphinx
• Makes talking to Sphinx basically painless

BASIC IDEA

• Configure your indexes
• Index
• Query
• Repeat

CONFIGURING INDEXES

• Add indexes on your AR classes using define_index
• Fields (indexes) contain text you can search
• Attributes (has) allow you to sort and constrain your searches
• Careful! Column names aren’t symbols

class Article < ActiveRecord::Base
  define_index do
    # fields
    indexes subject, :sortable => true
    indexes content
    indexes author.name, :as => :author, :sortable => true

    # attributes
    has author_id, created_at, updated_at
  end
end

Run the indexer
rake thinking_sphinx:index
MORE ABOUT INDEXING
Thinking Sphinx generates a config file for Sphinx, where the indexes (and their “sources”) are defined. It’s a little complicated.

source twitterer_core_0
{
  type = mysql
  sql_host = 127.0.0.1
  sql_user = cheaptweet
  sql_pass = cheaptweet
  sql_db = cheaptweet_development2
  sql_query_pre = UPDATE `twitterer` SET `delta` = 0
  sql_query_pre = SET NAMES utf8
  sql_query = SELECT `twitterer`.`id` * 1 + 0 AS `id`, \
    CAST(`twitterer`.`screen_name` AS CHAR) AS `screen_name`, \
    CAST(`twitterer`.`name` AS CHAR) AS `name`, \
    CAST(`twitterer`.`description` AS CHAR) AS `description`, \
    CAST(`twitterer`.`url` AS CHAR) AS `url`, \
    CAST(`twitterer`.`location` AS CHAR) AS `location`, \
    `twitterer`.`id` AS `sphinx_internal_id`, \
    283224142 AS `class_crc`, \
    '283224142' AS `subclass_crcs`, \
    0 AS `sphinx_deleted` \
    FROM twitterer \
    WHERE `twitterer`.`id` >= $start AND `twitterer`.`id` <= $end AND `twitterer`.`delta` = 0 \
    GROUP BY `twitterer`.`id` ORDER BY NULL
  sql_query_range = SELECT IFNULL(MIN(`id`), 1), IFNULL(MAX(`id`), 1) FROM `twitterer` WHERE `twitterer`.`delta` = 0
  sql_attr_uint = sphinx_internal_id
  sql_attr_uint = class_crc
  sql_attr_uint = sphinx_deleted
  sql_attr_multi = uint subclass_crcs from field
  sql_query_info = SELECT * FROM `twitterer` WHERE `id` = (($id - 0) / 1)
}

index twitterer_core
{
  source = twitterer_core_0
  path = /Users/hayesdavis/Appozite/workspace/CheapTweet/data/sphinx/development/twitterer_core
  morphology = stem_en
  charset_type = utf-8
}

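
The `id * 1 + 0` expression in sql_query, and its inverse `($id - 0) / 1` in sql_query_info, look odd in isolation. One plausible reading (an assumption; the slides do not say this) is that the document id handed to Sphinx is the row id multiplied by the number of indexed models, plus a per-model offset, so that rows from different models never share an id inside Sphinx. A minimal Ruby sketch of that arithmetic, using hypothetical method names that are not part of Thinking Sphinx:

```ruby
# Sketch of the id arithmetic seen in the generated config.
# Assumption: model_count is the number of indexed models and offset is
# this model's position among them (1 and 0 in the config shown).
# Both method names are hypothetical, chosen for illustration only.

# Mirrors `id * 1 + 0 AS id` in sql_query.
def sphinx_document_id(row_id, model_count, offset)
  row_id * model_count + offset
end

# Mirrors `($id - 0) / 1` in sql_query_info.
def database_row_id(document_id, model_count, offset)
  (document_id - offset) / model_count
end

# With a single indexed model the mapping is the identity.
puts sphinx_document_id(42, 1, 0)   # 42
puts database_row_id(42, 1, 0)      # 42

# With three models, offsets 0..2 would keep document ids distinct.
puts sphinx_document_id(42, 3, 1)   # 127
puts database_row_id(127, 3, 1)     # 42
```

With only one indexed model, as in the config shown, both directions leave the id unchanged, which is why the generated SQL multiplies by 1 and adds 0.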
Start Sphinx
rake thinking_sphinx:start
SEARCHING
Use the search method on AR classes

# Searches all fields for "pants"
Article.search "pants"

# Conditions are allowed on fields but must be a hash
Article.search "pants", :conditions => {
  :subject => "How To Wear"
}

# Query attributes using :with
Article.search "pants", :with => {
  :author_id => 1, :created_at => 1.week.ago..Time.now
}

BUT WAIT
HOW DO I KEEP INDEXES (ESPECIALLY BIG ONES) UP TO DATE?

DELTA INDEXES TO THE RESCUE

• Mini index of only rows that have been updated
• Must merge into “core” index periodically or it’ll get slow
• Simplest approach: add delta boolean column to model
• Add set_property :delta => true to define_index block
• Delta index is rebuilt on model saves, which can cause a performance hit
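
To make the core/delta split concrete, here is a toy in-memory model in plain Ruby. It is only a sketch of the workflow described in the bullets: it does not reflect how Sphinx stores or merges data, and `ToyIndex` and its methods are hypothetical names invented for this illustration.

```ruby
# Toy in-memory model of a core index plus a delta index.
# NOT how Sphinx works internally; it only illustrates the workflow.
class ToyIndex
  def initialize
    @core  = {}  # id => text, changed only by a merge (or full reindex)
    @delta = {}  # id => text, rows saved since the last merge
  end

  # Called when a record is saved: only the small delta index changes.
  def record_saved(id, text)
    @delta[id] = text
  end

  # A search consults both indexes; the delta copy of a row wins.
  def search(term)
    @core.merge(@delta).select { |_id, text| text.include?(term) }.keys.sort
  end

  # Periodic merge: fold the delta into the core, then empty the delta.
  def merge!
    @core.merge!(@delta)
    @delta.clear
  end

  def delta_size
    @delta.size
  end
end

index = ToyIndex.new
index.record_saved(1, "how to wear pants")
index.record_saved(2, "how to wear hats")
p index.search("pants")   # [1]
p index.delta_size        # 2
index.merge!
p index.delta_size        # 0
p index.search("pants")   # [1]
```

The point of the sketch is the growth of the delta between merges: every save adds to it, so the longer the gap between merges, the more work each delta rebuild has to do.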
DEPLOYMENT & PRODUCTION

• Must schedule full re-indexing periodically
• Have god or monit keep an eye on things
• Consider adding some cap tasks to help out with reindexing and restarting

TIPS, TRICKS, GOTCHAS

• Simplest delta indexing can lead to performance issues
• Indexer assumes you have sequential ids on your DB rows and iterates through them in chunks - very bad if you have big gaps
• Run full indexing as often as you can without hurting performance - it’s usually pretty fast
• You can hand-edit config files if you need to tune - but be careful not to regenerate them and overwrite your changes
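
The gap problem can be sketched in a few lines of Ruby. The indexer takes the minimum and maximum id from sql_query_range and then runs sql_query once per `$start`..`$end` chunk. The simulation below counts how many of those chunked queries would return no rows. The step of 1024 is an illustrative assumption (Sphinx has a sql_range_step setting for this), and `ranged_queries` is a hypothetical helper, not part of Sphinx or Thinking Sphinx.

```ruby
# Simulates range-based indexing: walk from the minimum id to the maximum id
# in fixed-size steps, one query per step, and count the empty ones.
# Returns [total_queries, empty_queries].
def ranged_queries(ids, step)
  return [0, 0] if ids.empty?

  total = 0
  empty = 0
  start = ids.min
  max   = ids.max

  while start <= max
    finish = start + step - 1
    total += 1
    # A chunk is wasted work if no row id falls inside start..finish.
    empty += 1 if ids.none? { |id| id >= start && id <= finish }
    start = finish + 1
  end

  [total, empty]
end

dense  = (1..10_000).to_a                # sequential ids, no gaps
sparse = [1, 5_000_000, 10_000_000]      # three rows, huge gaps

p ranged_queries(dense, 1024)    # [10, 0]
p ranged_queries(sparse, 1024)   # [9766, 9763]
```

Three rows spread over ten million ids cost thousands of queries that find nothing, which is the "very bad" case the bullet warns about.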
RESOURCES


Sphinx http://www.sphinxsearch.com/

Thinking Sphinx http://freelancing-god.github.com/ts/en/

Railscast http://railscasts.com/episodes/120-thinking-sphinx

Quick Introduction to Sphinx and Thinking Sphinx
