MariaDB with SphinxSE
          Colin Charles, Monty Program Ab
             colin@montyprogram.com
http://www.montyprogram.com / http://mariadb.org
   http://bytebot.net/blog / @bytebot on Twitter
   Sphinx Search Day 2012, Santa Clara, CA, USA
                   13 April 2012
whoami

• MariaDB guy at Monty Program
• Formerly at MySQL AB/Sun Microsystems
• Past lives include FESCO (Fedora Project),
  OpenOffice.org
MariaDB/MySQL used
interchangeably in this talk
The old days
• Download MySQL, including sources
• Download SphinxSE for compiling
• Download Sphinx to compile with MySQL
  support
• Documented: http://www.howtoforge.com/sphinx-as-mysql-storage-engine-sphinxse
Today

• Install sphinx from your distribution
• Install MariaDB 5.5 from your distribution
  or from http://mariadb.org/
• Get started!
Getting started

mysql> INSTALL PLUGIN sphinx SONAME 'ha_sphinx.so';
Query OK, 0 rows affected (0.01 sec)
Another engine appears
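A quick way to confirm that the plugin registered is to ask the server which engines it knows about. This is only a sketch using standard statements; the exact rows and comment text depend on the MariaDB build, so no output is shown.

```sql
-- List every storage engine; SPHINX should appear once the plugin is loaded.
SHOW ENGINES;

-- Or narrow the listing down via information_schema.
SELECT ENGINE, SUPPORT, COMMENT
FROM information_schema.ENGINES
WHERE ENGINE = 'SPHINX';
```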
What is SphinxSE?
• SphinxSE is just the storage engine; it still
  depends on the Sphinx daemon
• It doesn’t store any data itself
• It’s just a built-in client that lets MariaDB
  talk to Sphinx searchd, run queries, and obtain
  results
• Indexing and searching are performed by Sphinx
Configure sphinx!
• /usr/local/sphinx/sphinx.conf
• Source (can be multiple; include a mysql one, with
  connection info)
• Set up indexer (esp. if it’s on localhost):
  mem_limit, max_iops, max_iosize
• Set up searchd (where to listen, query
  log, etc.)
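For orientation, a minimal sphinx.conf along these lines might look like the sketch below. Everything specific in it is an assumption for illustration: the credentials, the file paths, the limit values, the source name src_documents, and the choice to index the content, group_id and date_added columns of a test.documents table under an index called test.

```conf
# Hypothetical source: pulls rows from a MariaDB/MySQL table.
source src_documents
{
    type               = mysql
    sql_host           = localhost
    sql_user           = sphinx_user    # assumed account
    sql_pass           = secret         # assumed password
    sql_db             = test
    sql_port           = 3306
    # The first selected column must be the unique document ID.
    sql_query          = SELECT id, group_id, UNIX_TIMESTAMP(date_added) AS date_added, content FROM documents
    sql_attr_uint      = group_id
    sql_attr_timestamp = date_added
}

# Index named "test", matching sphinx://localhost:9312/test.
index test
{
    source = src_documents
    path   = /var/lib/sphinx/test       # assumed location
}

# Throttle the indexer when it shares a host with the database.
indexer
{
    mem_limit  = 128M
    max_iops   = 40
    max_iosize = 1048576
}

searchd
{
    listen    = 9312
    log       = /var/log/sphinx/searchd.log
    query_log = /var/log/sphinx/query.log
    pid_file  = /var/run/sphinx/searchd.pid
}
```

After editing the file, the usual sequence is to build the index with indexer and then start searchd.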
Use case scenarios

• Already have an existing application that
  makes use of full-text search in MyISAM?
  Porting should be easier
• Have a programming language without a
  native API for Sphinx? Surely there’s a
  connector for MariaDB ;-)
Use case scenarios

• Results from Sphinx itself almost always
  require additional work involving MariaDB
  • E.g. to pull out a text column that the Sphinx
    index doesn’t store
  • JOIN with another table (using a different
    engine)
An example
CREATE TABLE t1
(
    id             INTEGER UNSIGNED NOT NULL,
    weight         INTEGER NOT NULL,
    query          VARCHAR(3072) NOT NULL,
    group_id       INTEGER,
    INDEX(query)
) ENGINE=SPHINX CONNECTION="sphinx://localhost:9312/test";


SELECT * FROM t1 WHERE query='test it;mode=any';
Sphinx search tables
• 1st column: INTEGER UNSIGNED or
  BIGINT (document ID)
• 2nd column: match weight
• 3rd column: VARCHAR or TEXT (your
  query)
• The query column must be indexed; no other
  column needs to be
What actually happens

• SELECT passes a Sphinx query as the query
  column in the WHERE clause
• searchd returns the results
• SphinxSE translates and returns the results
  to MariaDB
SHOW ENGINE SPHINX STATUS
•   Per-query & per-word statistics that searchd returns are accessible via SHOW ENGINE SPHINX STATUS

    mysql> SHOW ENGINE SPHINX STATUS;
    +--------+-------+-------------------------------------------------+
    | Type   | Name  | Status                                          |
    +--------+-------+-------------------------------------------------+
    | SPHINX | stats | total: 25, total found: 25, time: 126, words: 2 |
    | SPHINX | words | sphinx:591:1256 soft:11076:15945                |
    +--------+-------+-------------------------------------------------+
    2 rows in set (0.00 sec)
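According to the Sphinx documentation, the same statistics are also published as status variables, which can be read without the privileges that SHOW ENGINE needs. A minimal sketch, reusing the t1 table from the earlier example; the values depend on the search that was just run, so no output is shown.

```sql
-- Run a search first so there is something to report.
SELECT * FROM t1 WHERE query='test it;mode=any';

-- Status variables published by SphinxSE.
SHOW STATUS LIKE 'sphinx_%';
```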
What queries are supported?
•   Most of the Sphinx API is exposed to SphinxSE
•   query, mode, sort, offset, limit, index, minid,
    maxid, weights, filter, !filter, range, !range,
    maxmatches, groupby, groupsort, indexweights,
    comment, select
•   Sphinx virtual attributes can also be bound to
    columns, using a _sph_ prefix in place of @
    •   obtain value of @groupby? use ‘_sph_groupby’
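As an illustration of how these options combine, the sketch below reuses the t1 table and its group_id column from the earlier example. The second table, t1_grouped, is a hypothetical name, and both queries assume the Sphinx index exposes group_id as an attribute.

```sql
-- Options are appended to the query text, separated by semicolons.
SELECT * FROM t1
WHERE query='test it;mode=any;filter=group_id,1,5,19;sort=attr_desc:group_id;offset=0;limit=10';

-- Hypothetical table whose extra columns are bound to the
-- @groupby and @count virtual attributes via the _sph_ prefix.
CREATE TABLE t1_grouped
(
    id           INTEGER UNSIGNED NOT NULL,
    weight       INTEGER NOT NULL,
    query        VARCHAR(3072) NOT NULL,
    group_id     INTEGER,
    _sph_groupby INTEGER,
    _sph_count   INTEGER,
    INDEX(query)
) ENGINE=SPHINX CONNECTION="sphinx://localhost:9312/test";

-- Group matches by group_id, largest groups first.
SELECT group_id, _sph_count FROM t1_grouped
WHERE query='test it;mode=any;groupby=attr:group_id;groupsort=@count desc';
```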
Efficiency
•   Allow Sphinx to perform sorting, filtering, and
    slicing of result set
    •   ... as opposed to using WHERE, ORDER BY,
        LIMIT clauses on MariaDB
•   Why?
    •   Sphinx optimises and performs better on these
        tasks
    •   Less data packed by searchd, and transferred and
        unpacked by SphinxSE
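To make the contrast concrete, here is a sketch of the same request written both ways, reusing t1 and group_id from the earlier example. The filter value 5 and the limit of 10 are arbitrary, and the second form assumes group_id is an attribute in the Sphinx index.

```sql
-- MariaDB does the work: searchd returns its matches first, then the
-- server filters, sorts and trims the rows it received.
SELECT * FROM t1
WHERE query='test it;mode=any' AND group_id=5
ORDER BY weight DESC
LIMIT 10;

-- Sphinx does the work: filter, sort order and limit travel inside the
-- query string, so only the final rows cross the wire.
SELECT * FROM t1
WHERE query='test it;mode=any;filter=group_id,5;sort=extended:@weight desc;limit=10';
```

The two forms are not strictly interchangeable: in the first one, searchd still applies its own default limit before MariaDB sees any rows, so the server-side WHERE and ORDER BY only operate on that first slice of matches.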
JOINs
•   Perform JOINs on a SphinxSE search table using tables from other engines

    mysql> SELECT content, date_added FROM test.documents docs
        -> JOIN t1 ON (docs.id=t1.id)
        -> WHERE query="one document;mode=any";
    +-------------------------------------+---------------------+
    | content                             | date_added          |
    +-------------------------------------+---------------------+
    | this is my test document number two | 2006-06-17 14:04:28 |
    | this is my test document number one | 2006-06-17 14:04:28 |
    +-------------------------------------+---------------------+
    2 rows in set (0.00 sec)
Why MariaDB?

• We keep up to date with Sphinx releases
• In MariaDB 5.5.21 we upgraded SphinxSE to 2.0.4,
  the latest upstream release
• MariaDB 5.5.23 is GA and ready for use
  today
Why MariaDB II?
• Engineering and furthering MySQL happens
  with MariaDB
• Benefit from a better built-in optimizer
  (that can materialize subqueries), XtraDB,
  microsecond precision, more statistics,
  NoSQL-like features (dynamic columns),
  GIS functionality (which works for
  geo-distance type searches in Sphinx)
Warning

• If Sphinx itself is not set up, SphinxSE will
  still accept statements like CREATE TABLE
• Try a SELECT, though, and you’ll see it
  fail
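The behaviour can be sketched as follows, assuming nothing is listening on localhost:9312. The table name t_unreachable is hypothetical, and the exact error text is not reproduced here because the slides do not show it.

```sql
-- Accepted even though searchd is not running: per the warning above,
-- creating the table does not require a working Sphinx setup.
CREATE TABLE t_unreachable
(
    id     INTEGER UNSIGNED NOT NULL,
    weight INTEGER NOT NULL,
    query  VARCHAR(3072) NOT NULL,
    INDEX(query)
) ENGINE=SPHINX CONNECTION="sphinx://localhost:9312/test";

-- Expected to fail: this is where SphinxSE actually has to reach searchd.
SELECT * FROM t_unreachable WHERE query='test it;mode=any';
```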
We have extensive documentation
• http://kb.askmonty.org/en/sphinx-storage-engine
• http://sphinxsearch.com/docs/1.10/sphinxse-using.html
• Introduction to Search with Sphinx by
  Andrew Aksyonoff (O’Reilly)
Q&A?
       email: colin@montyprogram.com
http://montyprogram.com/ | http://mariadb.org/
twitter: @bytebot / url: http://bytebot.net/blog/
