Chicago Solr Meetup - June 10th: Exploring Hadoop with Search

Published in: Technology
  1. Exploring Hadoop with Search
     Pritesh Patel, Principal Architect, Search and Big Data Analytics @ Avalon Consulting, LLC
  2. Hadoop Ecosystem
  3. Possible Integration Points
  4. Why Search + Big Data?
     • What Hadoop is good at: distributed file storage, storing large data sets, distributed processing
     • What Search is good at: free text retrieval, indexing large data sets, textual analysis, filtering and sorting
     • Combined = an intelligence discovery system for large textual data sets
  5. How We Integrated Search and Big Data
     • HBase Replication Facade
     • Takes advantage of the results of analytical Pig and Hive jobs in Hadoop to make retrieval more intelligent
     • Done with built-in replication, and it scales
     • Fast access, since the data is in memory
     • Push architecture, so it's near real time
     • CRUD
     • Store in HDFS and search in LW/Solr
     • Gives a reference to the source when integrated this way
     • HBase has a RESTful API to retrieve data given the ID that Solr would have after replication/indexing
  6. Our Demo Architecture
     Diagram by Varun Rao @ Avalon Consulting, LLC
  7. A Use Case of this Architecture
     • Monitor tweets containing the words “Hadoop”, “Lucidworks”, and “Big Data”
     • Automatically extract the URLs mentioned alongside these terms
     • Visualize, in near real time, which URLs are being mentioned with these terms
     • Discover the URLs that are becoming the most popular when mentioned with the topics “Big Data”, “Lucidworks”, and “Hadoop”; those might be URLs you want to read
  8. Demo
     • Anyone want to send a tweet? Just use one or more of the words “Hadoop”, “Lucidworks”, “Big Data”
     • Add any URL you'd like to share to the tweet. Try: or
  9. So much potential
     • You can apply this to so many things.
     • Do intelligent entity extraction to discover topics with the UIMA integration of Solr
     • Do similar analysis of popular mentions and people for the topics of your choice
     • Endless …
     • Any questions?
  10. Team
     • Client Implementation done by Kevin Risden @ Avalon (
     • Demo Architecture Team
       • Varun Rao @ Avalon (
       • Pritesh Patel @ Avalon (
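
The use case on slide 7 (filter tweets by tracked terms, extract the URLs they mention, and surface the most popular ones) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the code behind the demo: the regular expression, the function names, and the idea of counting each URL once per tweet are all hypothetical choices, and the slides do not say how the real pipeline extracts or ranks URLs.

```python
import re
from collections import Counter

# Tracked terms taken from the slide; matching is case-insensitive (an assumption).
TRACKED_TERMS = ("hadoop", "lucidworks", "big data")

# Deliberately simple URL pattern (an assumption); a real pipeline would
# more likely rely on the URL entities supplied with each tweet.
URL_PATTERN = re.compile(r"https?://\S+")


def mentions_tracked_term(text):
    """Return True if the tweet text contains any tracked term."""
    lowered = text.lower()
    return any(term in lowered for term in TRACKED_TERMS)


def extract_urls(text):
    """Return the URLs found in the text, with trailing punctuation stripped."""
    return [url.rstrip(".,;:!?)") for url in URL_PATTERN.findall(text)]


def rank_urls(tweets, top_n=10):
    """Count URLs across tweets that mention a tracked term, most popular first."""
    counts = Counter()
    for text in tweets:
        if mentions_tracked_term(text):
            # Count each URL at most once per tweet.
            counts.update(set(extract_urls(text)))
    return counts.most_common(top_n)
```

Calling `rank_urls` on a batch of tweet texts returns `(url, count)` pairs, which is the shape a popularity chart would need. In the architecture the slides describe, this kind of counting would more plausibly be done by Solr facets or Hadoop jobs than in application code.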
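
Slide 5 notes that HBase exposes a RESTful API for fetching a record by the ID that Solr holds after indexing. A minimal sketch of that lookup follows. It assumes the HBase REST server's row endpoint has the form `/<table>/<row key>` and that, when JSON is requested, the response carries base64-encoded column names and values under `Row` / `Cell` / `$`. The host, port, table name, and column names used with it are hypothetical, and the sketch only builds the URL and decodes a response body; it makes no network call.

```python
import base64
import json
from urllib.parse import quote


def row_url(base_url, table, row_key):
    """Build the REST URL for one row (assumed layout: /<table>/<row key>)."""
    return "{}/{}/{}".format(
        base_url.rstrip("/"), quote(table, safe=""), quote(row_key, safe="")
    )


def decode_row(payload):
    """Decode a JSON row response into a {column: value} dictionary.

    Assumes column names and values arrive base64-encoded, with the
    value stored under the "$" key of each cell.
    """
    doc = json.loads(payload)
    cells = {}
    for row in doc.get("Row", []):
        for cell in row.get("Cell", []):
            column = base64.b64decode(cell["column"]).decode("utf-8")
            value = base64.b64decode(cell["$"]).decode("utf-8")
            cells[column] = value
    return cells
```

A search front end could take the `id` field from a Solr hit, pass it to `row_url`, fetch that URL with an `Accept: application/json` header, and hand the body to `decode_row` to recover the source record.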