
#GeodeSummit: Democratizing Fast Analytics with Ampool (Powered by Apache Geode)


Today, if events change the decision model, we wait until the next batch model build for new insights. By extending fast “time-to-decisions” into the world of Big Data Analytics to get fast “time-to-insights”, apps will get what used to be batch insights in near real time. The technology enabling this includes smart in-memory data storage, new storage class memory, and products designed to do one or more parts of an analysis pipeline very well. In this talk we describe how Ampool is building on Apache Geode so that Big Data analysis solutions can work together with a scalable, smart storage class memory layer. This allows fast and complex end-to-end pipelines to be built, closing the loop and dramatically lowering the time to critical insights.


  1. Democratizing Fast Analytics with Ampool (Powered by Apache Geode-incubating). Avinash Dongre and Robert Geiger, Ampool Inc.
  2. Analytics Needs to be Faster! Analytics needs to work in CLOSED LOOP with Apps. [Diagram labels: Analytics, Apps, Multi-Device Testing]
  3. What are the CHALLENGES? [Diagram labels: Analytics, Apps, Multi-Device Testing]
     ⚠ Many data users/stakeholders
     ⚠ Disparate tools & processing needs
     ⚠ Long time to insights
  4. Meet the Challenges: DISTRIBUTED MEMORY LAYER … Smart Distributed In-Memory Object Store
  5. Smart Distributed In-Memory Object Store: CHOICE in Best of Breed Engines, …
  6. Smart Distributed In-Memory Object Store … PLUGGABLE distributed memory layer …
  7. … FAST OBJECT ACCESS across the pipeline
     Pipeline stages: Ingest, ETL, Analytics, App Use
     Stakeholders: Data Architect, Data Developers, Business Analysts, Data Scientists
  8. What ENABLERS can help here?
     In-Memory Fabric Technology
     • Apache Geode!
     • Flexible, stable, and proven distributed in-memory technology
     New memory technologies and fast network fabrics
     • Storage Class Memory
     • Low latency, high throughput, persistent
     • Initially exposed via file system interface
     • Regular or memory mapped
  9. Emerging Storage Class Memory (SCM) is DISRUPTIVE
     Challenges the value proposition of in-memory solutions
     Near DRAM latency and throughput at lower cost
     Based on one of several types of memory technology
     • MRAM (magnetic)
     • ReRAM (resistive)
     • FRAM (ferroelectric), PCM (phase change)
     • 3D XPoint™ (Intel/Micron)
     Accessible via Java and C/C++ libraries
     • Mnemonic (Java)
     • Pmem.io (C++)
  10. SCM is ATTRACTIVE in the Memory/Storage Hierarchy
  11. In-Memory Technology CHALLENGES
      Line between memory and storage is blurring
      File systems getting really fast, so the speed gap is closing
      • SCM file systems will also be low latency
      • File system overhead still limits latency improvements
      • Before: disk based vs. in-memory
      • After: file system vs. byte addressable object store
      Managing multiple layers and types of memory
  12. Fast Closed Loop Analytics, Powered by a Smart, Distributed In-Memory Fabric…
      High throughput and large data handling matters
      • Throughput, latency, and capacity: each pipeline stage values these differently
      Common interfaces, multiple region types
      • Meet the needs of many types of best of breed engines
      Managing multiple layers of memory and storage
      • Speed (latency, throughput) differentiator will diminish
      More classifications for data now
      • Hot, cold => hot, warm, lukewarm, cold
  13. …must handle MULTIPLE needs in one fabric
      • Need for high throughput: early stages (ingest, ETL)
      • Need for low latency: later stages (data-driven insights & actions)
  14. What Matters for App, DB, and Compute?
      • The flexibility, suitability, and ease of use of the interfaces
      • Memory & storage are managed transparently to provide QoS
      • The service guarantee abstractions are provided
      • Conflicts are managed and prevented
      • Freeing developers from re-inventing the wheel
  15. Introducing… A Distributed, Memory-Centric Object Store for Closed Loop Analytics
  16. Smart Distributed In-Memory Object Store: PLUGGABLE distributed memory layer … + 3D XPoint™ …
  17. Smart Distributed In-Memory Object Store … for MANAGED FLEXIBILITY … + 3D XPoint™ …
      ✅ Flexible regions and interfaces for ‘Best of breed’ engines
      ✅ Extensible Core
      ✅ Pluggable stores
  18. …and FAST OBJECT ACCESS across the pipeline
      Pipeline stages: Ingest, ETL, Analytics, App Use
      Stakeholders: Data Architect, Data Developers, Business Analysts, Data Scientists
  19. Building on PROVEN In-memory Technology
      [Architecture diagram labels, in extraction order: In-Memory Distributed Sys; Low-latency Comms.; Key-Value Store; Function Pushdown + High Throughput Table Store; Native Interface; Pluggable Store Manager; Java API; MASH (CLI Ext); Java API; Smart Data Tiering; Mature Event Model; Tunable Consistency; Metadata/Catalog; Security AuthZ]
  20. First release covers MULTIPLE analytical needs… [Diagram labels: ampool + … ORC …]
  21. …and deliver VALUE to all Analytics stakeholders
      • No change in data application code: config. changes only
      • No change in user experience: performance benefits
      • No added hassles: current mgmt. tools
      Stakeholders: Data Architect, Data Developers, Business Analysts, Data Scientists, Data Admins, Infra/Sys Admins
  22. Contributing Back
      Plan for contributions back to Apache Geode:
      • Storage pluggability layer
      • Off-heap memory pluggability
      • SCM plugin (Mnemonic)
      • Impersonation support for security
      • Region type pluggability
  23. Thank You!
      Avinash Dongre, Architect, Ampool India Pvt. Limited, avinash@ampool.io
      Robert Geiger, Chief Architect & VP Engineering, Ampool Inc., robert@ampool.io
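
Slide 8 notes that storage class memory is initially exposed through a file system interface, either regular or memory mapped. The following is a minimal Python sketch of that distinction, using an ordinary temporary file as a stand-in. It does not use Ampool, Apache Geode, Mnemonic, or real SCM hardware, and the function names and the file name `region.bin` are hypothetical, chosen only for illustration.

```python
import mmap
import os
import tempfile


def write_regular(path: str, payload: bytes) -> None:
    """Regular file interface: explicit write, flush, and fsync calls."""
    with open(path, "wb") as f:
        f.write(payload)
        f.flush()
        os.fsync(f.fileno())


def update_mapped(path: str, offset: int, patch: bytes) -> bytes:
    """Memory-mapped interface: patch bytes in place, return the new contents."""
    with open(path, "r+b") as f:
        # Length 0 maps the whole (non-empty) file into the address space.
        with mmap.mmap(f.fileno(), 0) as view:
            # Slice assignment must keep the length unchanged.
            view[offset:offset + len(patch)] = patch
            view.flush()  # ask the OS to write dirty pages back to the file
            return bytes(view)


def demo() -> bytes:
    """Create a small file the regular way, then update it through a mapping."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "region.bin")  # hypothetical file name
        write_regular(path, b"hot-data-0000")
        return update_mapped(path, 9, b"4242")


if __name__ == "__main__":
    print(demo())
```

The mapped path touches bytes by offset with no read or write call per access. This is the byte-addressable style that slide 11 contrasts with going through the file system.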
