THE DB AND THE INDEX: WHY OUR NEED FOR SEARCH IS SHAPING OUR TECHNOLOGY



SPEAKER: Grant Ingersoll
CTO
LucidWorks



Thursday, March 28, 2013
The DB and the Index

Grant Ingersoll (@gsingers)
CTO

Confidential © Copyright 2012

Confidential and Proprietary
© 2012 LucidWorks

Search: NoSQL before NoSQL was cool

• Search is a system building block
  - Text is only a part of the story
• If the algorithms fit, use them!
• Embrace fuzziness!
• Scoring features are everywhere
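
The slide does not say which kind of fuzziness it has in mind. One common reading is approximate term matching by edit distance, which is what Lucene's fuzzy queries are based on. The sketch below is a minimal illustration under that assumption; `edit_distance`, `fuzzy_lookup`, the `max_edits` parameter, and the sample vocabulary are hypothetical names chosen for this example, not part of the talk or of any Lucene/Solr API.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance via the standard dynamic-programming recurrence."""
    prev = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute (or match)
        prev = curr
    return prev[-1]


def fuzzy_lookup(query, vocabulary, max_edits=2):
    """Return (term, distance) pairs within max_edits, closest first."""
    hits = ((term, edit_distance(query, term)) for term in vocabulary)
    return sorted((h for h in hits if h[1] <= max_edits),
                  key=lambda h: (h[1], h[0]))


if __name__ == "__main__":
    vocab = ["lucene", "solr", "index", "search"]  # illustrative vocabulary
    print(fuzzy_lookup("lucine", vocab))
```

A production engine would not scan the whole vocabulary like this; the brute-force loop is only meant to show what "within N edits" means.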


Reference Architecture

[Architecture diagram; component labels as recovered from the slide, top to bottom]

• Access APIs
• Search View (Shards 1, 2, 3 ... N)
• Analytic Services
• Personalization & Machine Learning
  - Classification
  - Models
  - In memory
  - Replicated
  - Multi-tenant
• Discovery & Enrichment
• Document Store
  - Documents
  - Users
  - Logs
• Content Acquisition
  - ETL, batch or near real-time
• Data
  - LucidWorks Search connectors
  - Push
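
The diagram shows the search view split across shards 1 through N but does not describe how queries are served. A common pattern for a sharded index is scatter-gather: each shard returns its own top-k hits and a coordinator merges them. The sketch below assumes that pattern; `search_shard`, `scatter_gather`, the term-count scoring, and the sample documents are all hypothetical and are not taken from the talk or from LucidWorks Search.

```python
import heapq


def search_shard(shard, query_terms, k):
    """Score one shard's docs by counting query-term occurrences; return top-k."""
    scored = []
    for doc_id, text in shard.items():
        tokens = text.lower().split()
        score = sum(tokens.count(term) for term in query_terms)
        if score > 0:
            scored.append((score, doc_id))
    return heapq.nlargest(k, scored)


def scatter_gather(shards, query, k=3):
    """Send the query to every shard, then merge the partial top-k lists."""
    terms = query.lower().split()
    partials = [search_shard(shard, terms, k) for shard in shards]
    return heapq.nlargest(k, (hit for partial in partials for hit in partial))


if __name__ == "__main__":
    # Two illustrative shards, each a dict of doc_id -> text.
    shards = [
        {"d1": "search is a building block", "d2": "text is part of the story"},
        {"d3": "search search everywhere", "d4": "scale out the index"},
    ]
    print(scatter_gather(shards, "search index", k=2))
```

Raw term counts are comparable across shards, which keeps the merge trivial. Scores that depend on per-shard statistics (for example, document frequencies) would need extra care before merging.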


Search (R)evolution

• Search use leads to search abuse
  - Denormalization frees your mind
  - Scoring is just a sparse matrix multiply

• Lucene/Solr evolution
  - Non-free-text uses abound
  - Many DB-like features
  - Flexible indexing
  - Finite State Transducers FTW!

• Scale

• “This ain’t your father’s relevance anymore”
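
To make the "sparse matrix multiply" bullet concrete: if the inverted index is read as a sparse term-by-document matrix M and the query as a sparse weight vector q, then the score of document d is the sum over query terms t of q[t] * M[t][d]. The sketch below is a toy illustration of that view only. It uses raw term frequencies as matrix entries, which is a simplifying assumption, and `build_index` and `score` are hypothetical names, not Lucene/Solr functions.

```python
from collections import defaultdict


def build_index(docs):
    """Sparse term-by-document matrix: index[term][doc_id] = term frequency."""
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        for token in text.lower().split():
            index[token][doc_id] = index[token].get(doc_id, 0) + 1
    return index


def score(query_weights, index):
    """Sparse vector-matrix product: scores[d] = sum over t of q[t] * M[t][d]."""
    scores = defaultdict(float)
    for term, q_weight in query_weights.items():
        # Only the non-zero entries of row `term` are ever visited.
        for doc_id, tf in index.get(term, {}).items():
            scores[doc_id] += q_weight * tf
    return dict(scores)


if __name__ == "__main__":
    docs = {  # illustrative documents
        "a": "search is a building block",
        "b": "scoring is a sparse matrix multiply",
        "c": "sparse sparse index",
    }
    index = build_index(docs)
    print(score({"sparse": 1.0, "index": 2.0}, index))
```

Because only the rows for the query's terms are touched, the cost depends on the length of those posting lists rather than on the size of the whole collection.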

Resources

• http://www.lucidworks.com
• http://www.lucidworks.com/products/lucidworks-big-data
• grant@lucidworks.com
• @gsingers





