
Webinar: Fusion for Data Science


  1. Fusion for data science: Scalable search and analytics in one
     Grant Ingersoll, CTO, Lucidworks (@gsingers)
  2. Get Started
     https://github.com/LucidWorks/fusion-examples/tree/master/fusion-for-datascience-webinar
  3. Fusion Foundations
     • Best-in-breed search solution built on Apache Lucene and Solr
     • Easily capture signals like clicks, shares, ratings, etc. and make them actionable
     • Powerful data ingestion and analysis capabilities enabling machine learning, recommendations and positive user feedback loops
     • Effortless scale leveraging proven frameworks and algorithms
     • Easy integration with big data tools like Hadoop
  4. Fusion Architecture
     [Architecture diagram. Labels recoverable from the slide: Millions of Users; Billions of Docs; REST; Proxy/LB; Security woven throughout; Worker services (Recs, Pipes, Metrics, NLP, Sched., Blobs, Msging, Connectors); Cluster Mgr.; Spark; Solr shards; HDFS (optional); Signals; ZooKeeper ensemble (ZK 1 to ZK N) for shared config management, leader election and load balancing.]
  5. Fusion Data Science Use Cases
     • Data exploration and visualization
     • Easy ingestion, feature selection and data reduction
     • REST APIs for easy integration with commonly used tools
     • Quick and dirty: classification, clustering
     • Powerful and scalable aggregations, math/stats framework leveraging Apache Spark
     • Out-of-the-box NLP tools for part of speech, sentence detection, named entity and more
     • OOTB recommenders plus Mahout extensions
  6. Fusion: Standing on the shoulders of giants.
     • Lucene: Core search, pluggable ranking, advanced storage, sparse matrix
     • Solr: Faceting, function queries, basic stats, scaling, easy setup, UIMA, basic NLP, search clustering
     • Fusion: Pipelines, Connectors/Crawlers, Dashboards/UI, Spark integration, advanced stats, large scale aggregations
  7. Data Exploration Demo
  8. Ingestion, Selection, Reduction
     • Ingestion
       • 60+ connectors, plus easily push data in using REST APIs
     • Feature Selection
       • Analyzers for all types
       • Easily get/calculate weights for terms and attach payloads
       • Term Vectors/Term Dictionary
     • Data Reduction
       • Filters
       • Analyzers
       • Data quality tools
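To make "calculate weights for terms" on slide 8 concrete, here is a minimal sketch of a textbook tf-idf weighting in plain Python. It is an illustration only: the documents are invented, whitespace tokenization stands in for a real analyzer, and the formula is the classic tf * log(N/df), which is not claimed to be the weighting Lucene, Solr or Fusion actually apply.

```python
import math
from collections import Counter

# Hypothetical toy corpus, invented for this sketch.
docs = {
    "d1": "solr search search engine",
    "d2": "spark engine",
    "d3": "solr spark signals",
}

def tf_idf(docs):
    """Return {doc_id: {term: weight}} using tf * log(N / df)."""
    # Whitespace split stands in for a real analyzer chain.
    tokenized = {doc_id: text.split() for doc_id, text in docs.items()}
    n_docs = len(tokenized)
    # Document frequency: number of documents containing each term.
    df = Counter(term for tokens in tokenized.values() for term in set(tokens))
    weights = {}
    for doc_id, tokens in tokenized.items():
        tf = Counter(tokens)
        weights[doc_id] = {t: tf[t] * math.log(n_docs / df[t]) for t in tf}
    return weights

weights = tf_idf(docs)
```

In this toy corpus, "search" is both frequent in d1 and rare elsewhere, so it outweighs "solr" and "engine" in that document.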
  9. Aggregations and Math
     • Math:
       • Search is essentially Vector * Matrix
     • Aggregations
       • Enable advanced computation over both core content as well as Fusion's signals
       • Make it easy to try out by leveraging Solr
       • Ship with prebuilt "named" aggregations to cover common scenarios
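The remark on slide 9 that search is essentially Vector * Matrix can be sketched in a few lines of plain Python. This is a toy model, not how Lucene stores or scores an index: the vocabulary, the term-document matrix and the use of raw term counts as weights are all assumptions made for the illustration.

```python
# Hypothetical vocabulary and term-document matrix, invented for this sketch.
vocabulary = ["fusion", "solr", "spark", "signals"]

# Rows are terms (in vocabulary order), columns are documents,
# entries are raw term counts.
term_doc_matrix = [
    [2, 0, 1],  # fusion
    [1, 3, 0],  # solr
    [0, 1, 2],  # spark
    [0, 0, 1],  # signals
]

def query_vector(terms):
    """Encode a set of query terms as a 0/1 vector over the vocabulary."""
    return [1 if term in terms else 0 for term in vocabulary]

def score(query, matrix):
    """Multiply the query row vector by the matrix: one score per document."""
    n_docs = len(matrix[0])
    return [
        sum(query[t] * matrix[t][d] for t in range(len(matrix)))
        for d in range(n_docs)
    ]

def rank(scores):
    """Return document indices ordered from best to worst score."""
    return sorted(range(len(scores)), key=lambda d: scores[d], reverse=True)

q = query_vector({"solr", "spark"})
scores = score(q, term_doc_matrix)
ranking = rank(scores)
```

Real engines exploit the sparsity of this matrix through an inverted index, which is consistent with the "sparse matrix" item listed under Lucene on slide 6.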
  10. Spark FTW!
     • Effortless scale, integrated with Fusion and Solr
     • Leverage existing libraries like:
       • Mahout
       • Deeplearning4j
       • GraphX, MLlib
     • As easy as:
       • bin/spark start
       • http://.../aggregator/jobs/twitter/hashtags_per_author?spark=true
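Slide 10 names an aggregation job, hashtags_per_author, without showing what it computes. Judging only from the name, a plausible reading is a count of hashtags grouped by tweet author. The sketch below shows that kind of aggregation locally in plain Python; the record layout, field names and sample tweets are hypothetical, and it does not call Fusion, Solr or Spark.

```python
from collections import Counter, defaultdict

# Hypothetical tweet records, invented for this sketch.
tweets = [
    {"author": "alice", "text": "Indexing with #solr and #spark"},
    {"author": "bob", "text": "Trying #fusion today"},
    {"author": "alice", "text": "More #solr tips"},
]

def hashtags_per_author(records):
    """Count hashtag occurrences grouped by author."""
    counts = defaultdict(Counter)
    for record in records:
        for token in record["text"].split():
            # Treat any whitespace-delimited token starting with '#' as a hashtag.
            if token.startswith("#") and len(token) > 1:
                counts[record["author"]][token[1:].lower()] += 1
    return {author: dict(tags) for author, tags in counts.items()}

result = hashtags_per_author(tweets)
```

A distributed engine would run the same group-and-count logic in parallel across partitions of the data, which is the part the slide attributes to Spark.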
  11. Aggregations Demo
  12. Recommendations
     • Fusion powers recommendation use cases such as:
       • People who bought this, bought that
       • Related searches, spellings and more
       • Session analysis
     • Fusion ships with several built-in recommendation options
       - Graph and collaborative filtering based approaches
     • Easily enable multi-modal recommendations that combine:
       - Content
       - Collaborative Filtering
       - Spatial
       - Historic/Context
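The "people who bought this, bought that" use case on slide 12 is commonly approximated with item co-occurrence counts. The following is a minimal sketch of that generic technique, not Fusion's built-in recommender: the baskets, function names and tie-breaking rule are assumptions made for the illustration.

```python
from collections import Counter
from itertools import permutations

# Hypothetical purchase baskets, invented for this sketch.
baskets = [
    {"laptop", "mouse", "bag"},
    {"laptop", "mouse"},
    {"laptop", "bag"},
    {"mouse", "pad"},
]

def cooccurrence(baskets):
    """Count, for each item, how often every other item shares a basket with it."""
    counts = {}
    for basket in baskets:
        # permutations yields both (a, b) and (b, a), so counts are symmetric.
        for a, b in permutations(sorted(basket), 2):
            counts.setdefault(a, Counter())[b] += 1
    return counts

def also_bought(item, counts, k=2):
    """Return up to k items most often bought with `item` (ties broken by name)."""
    ranked = sorted(counts.get(item, {}).items(), key=lambda kv: (-kv[1], kv[0]))
    return [other for other, _ in ranked[:k]]

counts = cooccurrence(baskets)
recs = also_bought("laptop", counts)
```

Raw counts favor popular items; a production system would typically normalize them or blend in the content, spatial and context signals that the slide lists.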
  13. What's Next
     • Spark
       • APIs for running non-Lucid Spark jobs
       • Integration with 3rd party Spark instances (from major Hadoop distros)
       • Solr RDD extensions for term dictionary, term vectors
     • UI for managing Aggregations
     • Full-fledged Graph API
     • More Math: matrices, functions, etc.
  14. Resources
     • Lucidworks: http://www.lucidworks.com
     • Me: grant@lucidworks.com
     • Key Docs:
       • https://docs.lucidworks.com
       • https://docs.lucidworks.com/display/fusion/Signals+Aggregator+API
       • https://docs.lucidworks.com/display/fusion/Aggregator+Functions
       • https://docs.lucidworks.com/display/fusion/Signals+Aggregations+and+Recommendations
