Building Search Engines - Lucene, SolR and Elasticsearch

Learn how Lucene runs more than just search indexes, how to build a proper search engine, and how to decide between SolR, Elasticsearch, Amazon CloudSearch, or Azure Search.

Published in: Internet

  1. Building Search Engines
     Research & Development – Comparing Lucene / SolR / Elastic & Cloud Search Providers
     www.anant.us | solutions@anant.us | 202.905.2818
     1010 Wisconsin Ave, NW | Suite 250 | Washington, DC 20007
  2. What do we do? Streamline, Organize & Unify Business Information
  3. Agenda
     • Challenge - Why does this matter?
     • Info Retrieval - Retrieval / Routing
     • Lucene - More than meets the eye ...
     • Search Engine - 30k Foot View
     • On Premise - Lucene / SolR / Elastic
     • Cloud Providers - Amazon / Azure
  4. Challenge – Why does this matter?
     [Diagram]
     • Information types: Knowledge, Project Information, Client Service Information, Corporate Guides, Collaborative Documents, Assets & Files, Corporate Resources
     • Tools shown: Appleseed Framework (Portal, Base, Search), G Drive, Delta, Dropbox, Nutshell, Freshbooks, G Sites (KB), Workflowy, Evernote, OwnCloud, Pocket, Leaves, AIC (WP), Anant (WP)
  5. Information Retrieval
     Document Retrieval
     • Google Search
     • Amazon Search
     • LinkedIn Search
     • CMS Search *
     • Portal Search *
     • CRM Search *
     • Search *
     Document Routing
     • Google Alerts
     • Amazon Recommendations
     • Netflix Recommendations
     • LinkedIn Recommendations
  6. Lucene – Inverted Index
  7. Lucene – More than meets the eye
     Who Next?
     Think of it like a “NoSQL” database that has great indexing ... everywhere.
  8. Search Engine – 30 Thousand Foot View
     The search index is only as good as your processed data. If you put everything you find in your index, you are going to spend a lot of time telling people how to search.
  9. On Premise – Lucene / ES / SolR
     Lucene
     • Library
     • File System
     • Format
     • Fast
     • Embeddable*
     • Indexing Anywhere
     • Need to really know Lucene
     • No Interface
     • No server
     • Lots of housekeeping
     SolR
     • Server
     • Admin / REST Interface
     • Configurable
     • Scalable
     • Great at Text*
     • Truly Open
     • 10+ Years
     • Good ecosystem
     • Too customizable
     • Schemas*
     • Zookeeper Needed
     Elasticsearch
     • Server
     • Configurable
     • Scalable
     • Good ecosystem
     • Built-in Clustering
     • Grouping / Filtering
     • Great for Logs
     • Started as a Cloud Tool
     • No great OTS Interface
     • Only REST Interface
  10. Cloud Search – Amazon / Azure
      Amazon
      • SolRCloud*
      • AWS* Ecosystem
      • 5 QParsers
      • Dynamic Fields
      • 100% Completely Managed
      • Been Around for a While
      • Data / Read Writes
      • No nested Objects
      Azure
      • ElasticSearch*
      • Azure* Ecosystem
      • 2 QParsers
      • 100% Completely Managed
      • Good SDK
      • Few Years Old
      • Data / Read Writes
      • No nested Objects
      • Not so Dynamic Fields
  11. Questions & Contact
      Rahul Singh, CEO & Founder
      rahul@anant.us | linkedin.com/in/xingh
      @anantcorp | facebook.com/anantCorp | linkedin.com/company/anant
      • Modern Enterprise
      • Mastering Services in the Service of Others
      • Hybrid Agile Project Management
      • Building Search Engines
      • CICD / DevOps
      • Connecting Internet Software
  12. Streamlined Data Integration / Data Pipelines
      Organized Knowledge Search / Data Warehouses
      Unified Interfaces Portals / Dashboards / Mobile
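
Slide 6 refers to Lucene's inverted index without showing what one looks like. As a rough illustration only, here is a minimal Python sketch of the idea: each term maps to the list of documents that contain it, and a query intersects those lists. This is not Lucene's implementation or API; the names `analyze`, `build_index`, and `search`, the stop-word list, and the sample documents are all assumptions made for the example. The small `analyze` step also echoes slide 8: what ends up in the index depends on how the data is processed first.

```python
import re
from collections import defaultdict

# Hypothetical stop-word list, for illustration only.
STOP_WORDS = {"a", "an", "and", "the", "of", "to", "is"}


def analyze(text):
    """Lowercase the text, split on non-alphanumerics, drop stop words."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return [t for t in tokens if t not in STOP_WORDS]


def build_index(docs):
    """Map each term to the sorted list of document ids that contain it."""
    postings = defaultdict(set)
    for doc_id, text in docs.items():
        for term in analyze(text):
            postings[term].add(doc_id)
    return {term: sorted(ids) for term, ids in postings.items()}


def search(index, query):
    """Return ids of documents containing every query term (AND semantics)."""
    terms = analyze(query)
    if not terms:
        return []
    result = set(index.get(terms[0], []))
    for term in terms[1:]:
        result &= set(index.get(term, []))
    return sorted(result)


# Made-up sample documents (assumption, not from the slides).
docs = {
    1: "Lucene is a search library",
    2: "SolR is a search server built on Lucene",
    3: "Elasticsearch is a search server",
}
index = build_index(docs)
print(search(index, "search server"))  # [2, 3]
print(search(index, "Lucene"))         # [1, 2]
```

A real engine adds relevance scoring, term positions for phrase queries, and on-disk index segments, none of which are modeled here.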
