Briefly describes typical text search tasks (e.g. full-text search, phrase match) and possible solutions.
Provides a review of the most popular text search gems.
5. FULLTEXT SEARCH
• Search for a phrase within a document
• Partial matches
• Order by relevance
• Phrase highlights
• Similar matches
• Typo correction
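To make a few of these features concrete, here is a toy in-memory sketch of partial matching, ordering by relevance, and highlighting. It is only an illustration: the names `DOCS`, `search` and `highlight` are hypothetical, and a real search engine uses an index rather than scanning every document.

```ruby
# Toy in-memory search. DOCS, search and highlight are hypothetical names
# used for illustration only.
DOCS = {
  1 => 'Full text search in Rails',
  2 => 'Searching text with PostgreSQL',
  3 => 'Deploying Rails applications'
}.freeze

# Returns [id, score] pairs for documents that contain at least one query
# term as a substring (partial match), best score first (relevance).
def search(query, docs = DOCS)
  terms = query.downcase.split
  scored = docs.map do |id, text|
    body = text.downcase
    [id, terms.count { |term| body.include?(term) }]
  end
  scored.select { |_, score| score > 0 }.sort_by { |id, score| [-score, id] }
end

# Wraps every occurrence of a query term in asterisks (highlight).
def highlight(text, query)
  pattern = Regexp.union(query.split.map { |term| /#{Regexp.escape(term)}/i })
  text.gsub(pattern) { |match| "*#{match}*" }
end
```

For example, `search('rails text')` ranks document 1 first because it matches both terms, and `highlight(DOCS[1], 'text rails')` marks both matched words.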
7. RDBMS SEARCH
• No external dependencies
• Relatively slow
• Provides only basic FTS features
(PostgreSQL)
8. EXAMPLE
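Requirement: Search within document's title and content
Implementation:
ALTER TABLE documents ADD COLUMN fts_col tsvector;
CREATE INDEX fts_idx ON documents USING GIN (fts_col);
-- coalesce keeps a NULL title or content from turning the whole vector into NULL
UPDATE documents SET fts_col = to_tsvector(coalesce(title, '') || ' ' || coalesce(content, ''));
-- plainto_tsquery accepts free-form text; to_tsquery requires operators such as & between words
SELECT * FROM documents WHERE fts_col @@ plainto_tsquery('text to find');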
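Note that to_tsquery expects operators (for example & or |) between words, so raw user input containing spaces has to be prepared first; plainto_tsquery does this on the database side. As an application-side alternative, here is a minimal sketch that ANDs the words together. The helper name `to_tsquery_string` is hypothetical, and dropping all punctuation is a simplifying assumption.

```ruby
# Hypothetical helper: turns free-form user input into a string that
# to_tsquery can parse, by joining the individual words with the AND operator.
# Punctuation (including tsquery operators typed by the user) is discarded.
def to_tsquery_string(input)
  words = input.to_s.scan(/[[:alnum:]]+/)
  words.join(' & ')
end

to_tsquery_string('text to find') # => "text & to & find"
```

The resulting string should still be passed to the database as a bound parameter rather than interpolated into the SQL.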
11. QUICK START
Single model search:
class Document < ActiveRecord::Base
  include PgSearch
  pg_search_scope :search_full_text, against: {
    title: 'A',
    content: 'B'
  }
end
Document.search_full_text('text to find')
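The 'A' and 'B' labels in the scope are PostgreSQL weight labels: matches in an 'A' field count for more than matches in a 'B' field when ranking. Here is a rough pure-Ruby sketch of that idea. The weight values mirror PostgreSQL's default ts_rank weights for labels A to D; the scoring formula itself is a deliberate simplification, and `weighted_score` is a hypothetical name.

```ruby
# Weights per label, mirroring PostgreSQL's default ts_rank weights.
WEIGHTS = { 'A' => 1.0, 'B' => 0.4, 'C' => 0.2, 'D' => 0.1 }.freeze

# Field-to-label mapping, same as in the pg_search_scope definition.
FIELD_LABELS = { title: 'A', content: 'B' }.freeze

# Simplified score: count the query-term occurrences in each field and
# multiply by that field's weight. Real ts_rank is more elaborate.
def weighted_score(doc, query)
  terms = query.downcase.split
  FIELD_LABELS.sum do |field, label|
    words = doc.fetch(field).downcase.scan(/[[:alnum:]]+/)
    hits = words.count { |word| terms.include?(word) }
    hits * WEIGHTS.fetch(label)
  end
end

docs = [
  { id: 1, title: 'Cooking pasta', content: 'Search for a good recipe' },
  { id: 2, title: 'Search basics', content: 'An introduction' }
]

# Highest score first: the title match outranks the content match.
ranked_ids = docs.sort_by { |doc| -weighted_score(doc, 'search') }.map { |doc| doc[:id] }
```

Here the second document ranks first, because its match is in the title.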
12. REVIEW
Pros:
• Simple setup
• No external dependencies
• AR-compatible output
• PostgreSQL extensions
• Order by relevance
Cons:
• PostgreSQL only :)
• Multi-model indexes need to be rebuilt
• Only basic FTS features
15. THINKING_SPHINX HIGHLIGHTS
• Very mature (~10 years) project
• Supports ActiveRecord 3.1+
• Well documented
• Requires mysql gem to be installed
16. QUICK START
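# app/indexes/document_index.rb
ThinkingSphinx::Index.define :document, with: :real_time do
  indexes title
  indexes content
end
# In the Document model: keep the real-time index in sync on save
after_save ThinkingSphinx::RealTime.callback_for(:document)
# Shell: rebuild the index
rake ts:regenerate
# Search
Document.search('text to find')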
17. REVIEW
Pros:
• Field weights
• Facets
• Advanced filters
• Different indexing strategies (realtime and SQL)
• Deltas for SQL-backed indexes
Cons:
• Delta indexes may cause data inconsistency
21. QUICK START
class Document < ActiveRecord::Base
  searchkick
end
Document.reindex
Document.search('text to find')
22. REVIEW
Pros:
• Tons of features
• Bulk document updates
• Autocomplete
• Facets
Cons:
• Very opinionated development
• Documentation issues
• Default setup doesn't match any practical requirements...
• ... therefore a reconfiguration is a must
24. CONCLUSION
• Use RDBMS search for simple search within a small, well-defined set of documents
• If you need to scale or want advanced features, try Elasticsearch or Sphinx
27. RANSACK HIGHLIGHTS
• Works on top of RDBMS search/filtering
• Case-insensitive match by default
• Able to build search forms
• Rails 3-5.1 compatible
28. QUICK START
Controller:
def index
  @q = Document.ransack(params[:q])
  @documents = @q.result(distinct: true)
end
View:
<%= search_form_for @q do |f| %>
  <%= f.label :title_cont %>
  <%= f.search_field :title_cont %>
  <%= f.submit %>
<% end %>
Executes:
SELECT * FROM documents WHERE title ILIKE '%title%';
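The link between a form field name such as title_cont and the SQL condition can be sketched as follows. This is a simplified, hypothetical model of the attribute-plus-predicate naming convention, not Ransack's actual implementation; `PREDICATES` and `condition_for` are made-up names, and only three predicates are covered.

```ruby
# Simplified model of a "<attribute>_<predicate>" search key.
# Not Ransack's real code; names and structure are assumptions.
PREDICATES = {
  'cont'  => ->(value) { ['ILIKE', "%#{value}%"] }, # contains
  'start' => ->(value) { ['ILIKE', "#{value}%"] },  # starts with
  'eq'    => ->(value) { ['=', value] }             # equals
}.freeze

# Returns [sql_fragment, bind_value]. The value stays a bind parameter so
# it is never interpolated into the SQL string. A real implementation would
# also check the attribute against a whitelist of columns.
def condition_for(key, value)
  attribute, _, predicate = key.to_s.rpartition('_')
  builder = PREDICATES.fetch(predicate) do
    raise ArgumentError, "unknown predicate: #{predicate}"
  end
  operator, bound = builder.call(value)
  ["#{attribute} #{operator} ?", bound]
end

condition_for(:title_cont, 'title') # => ["title ILIKE ?", "%title%"]
```

Splitting on the last underscore keeps multi-word attributes such as first_name intact.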