Strangeloop2012

GridGain Systems - In-Memory Computing

StrangeLoop 2012 presentation slides for "Making

Making Hadoop Real Time with
Scala & GridGain

Nikita Ivanov, Founder & CEO GridGain Systems
September 2012 www.gridgain.com

1065 East Hillsdale Blvd, Suite 230
Foster City, CA 94404

#gridgain

Table Of Contents:

> 30%
  > Why Real Time Hadoop?
  > GridGain Overview
  > In-Memory Computing
  > Compute & Data Grids

> 70%
  > Live Coding
  > Real Time & Streaming Word Count

Real Time Hadoop:

Why?


GridGain Overview:

> Scalable In-Memory Data Platform
  In-Memory Compute Grid + In-Memory Data Grid
  Real Time & Streaming MapReduce, CEP

> Full ACID
  Fully distributed ACID transactions

> Simplicity and Productivity
  Dramatically reduces cost of application development
  Demonstrably faster time-to-market

> Three Editions:
  > “Compute Grid” Edition
    Targeted at HPC market
  > “Data Grid” Edition
    Targeted at Transactional Data Caching market
  > “Big Data” Edition
    Targeted at Real Time Big Data market

> Language support:
  Server: Java, Scala, Groovy
  Clients: .NET, PHP, REST, C++

> Mobile platforms support:
  iOS/ObjectiveC, Android clients

Example:
  Full source code in Scala of world’s shortest real time MapReduce app built with GridGain.
  Works on one or thousands of computers with no code changes required.
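
The slide's Scala source is not reproduced in this transcript. As a rough illustration of the same split/map/reduce idea (counting the non-space characters of a string), here is a minimal Java sketch. It assumes local threads stand in for remote grid nodes; the class name `NonSpaceCharCount` and its methods are hypothetical and are not GridGain API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical stand-in: a local thread pool plays the role of remote grid nodes.
final class NonSpaceCharCount {

    // Split the input into words, map each word to its length on a worker
    // thread, then reduce the partial results by summing them.
    static int count(String input, int workers) {
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        try {
            List<CompletableFuture<Integer>> parts = new ArrayList<>();
            for (String word : input.trim().split("\\s+")) {
                // Map step: one job per word.
                parts.add(CompletableFuture.supplyAsync(() -> word.length(), pool));
            }
            int total = 0;
            for (CompletableFuture<Integer> part : parts) {
                total += part.join(); // Reduce step: sum the partial counts.
            }
            return total;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(count("World shortest MapReduce application", 4));
    }
}
```

A real compute grid would ship each job to a different machine and handle failover and load balancing; this sketch only shows the shape of the computation.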

In-Memory Computing: Why Now?

“In-memory will have an industry impact
comparable to web and cloud. RAM is a new disk,
and disk is a new tape.”

Technology & Cost:

> 64-bit CPU can address 16 exabytes
  Entire active data set on the planet is addressable by just 1 CPU.

> Disk up to 10^5 times slower than DRAM
  SSD drives are up to 10^3 times slower

> Super effective in-memory parallelization
  Enabled by modern multicore CPUs

> DRAM prices drop 30% every 18 months
  1TB RAM & 48 cores cluster ~ $40K (< $20K in 3 years)

Performance & Scalability Matters:

> Citi: 100ms == $1M loss
  Forex trading

> Google: 500ms == 20% traffic drop
  Dropping 20% of revenue

> SAP sees +206% in profit in Q1 2012
  For in-memory SAP HANA products

> Software AG sees 3x revenue in 2012
  For in-memory Terracotta products
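
Two of the figures on this slide can be checked with short arithmetic. The block below is a sketch that assumes "16 exabytes" means binary exabytes (EiB) and that the 30% price drop compounds over two 18-month periods. Both readings are assumptions, not statements from the slide.

```latex
\begin{align*}
2^{64}\ \text{bytes} &= 2^{4} \cdot 2^{60}\ \text{bytes} = 16\ \text{EiB} \\
\$40\text{K} \times (1 - 0.30)^{36/18} &= \$40\text{K} \times 0.49 = \$19.6\text{K} < \$20\text{K}
\end{align*}
```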

GridGain: In-Memory Compute Grid

> Direct API for split and aggregation
> Pluggable failover, topology and collision resolution
> Distributed task session
> Distributed continuations & recursive split
> Support for Streaming MapReduce
> Support for Complex Event Processing (CEP)
> Node-local cache
> AOP-based, OOP/FP-based execution modes
> Direct closure distribution in Java, Scala and Groovy
> Cron-based scheduling
> Direct redundant mapping support
> Zero deployment with P2P class loading
> Partial asynchronous reduction
> Direct support for weighted and adaptive mapping
> State checkpoints for long running tasks
> Early and late load balancing
> Affinity routing with data grid

GridGain: In-Memory Data Grid
> Zero deployment for data
> Local, fully replicated and partitioned cache types
> Pluggable expiration policies (LRU, LIRS, random, time-based)
> Read-through and write-through logic with pluggable cache store
> Synchronous and asynchronous cache operations
> MVCC-based concurrency
> Pluggable data overflow storage via new swap space SPI
> PESSIMISTIC, OPTIMISTIC transactions
> Standard isolation levels, JTA/JCA integration
> Master/Master data replication/invalidation
> Write-behind cache store support
> Concurrent and transactional data preloading
> Delayed preloading support
> Affinity routing with compute grid
> Partitioned cache with active replicas
> Structured and unstructured data
> Datacenter replication
> JDBC driver for in-memory object data store
> Off-heap memory support
> Pluggable indexing via Indexing SPI
> Tiered storage with on-heap, off-heap, swap space, SQL, and Hadoop
> Distributed in-memory query capability
> SQL, H2, Lucene, predicate-based affinity co-located queries
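
To make a few of the bullets above concrete (LRU expiration, read-through and write-through with a pluggable cache store), here is a minimal single-node Java sketch. The names `BackingStore`, `MapStore` and `ReadWriteThroughLruCache` are hypothetical and are not GridGain's interfaces. The sketch assumes a simple capacity-based LRU policy built on the standard `LinkedHashMap`.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical pluggable store; not GridGain's cache store interface.
interface BackingStore<K, V> {
    V load(K key);
    void save(K key, V value);
}

// Simple in-memory store used for illustration; counts how often it is read.
final class MapStore<K, V> implements BackingStore<K, V> {
    final Map<K, V> data = new HashMap<>();
    int loads = 0;

    public V load(K key) { loads++; return data.get(key); }
    public void save(K key, V value) { data.put(key, value); }
}

final class ReadWriteThroughLruCache<K, V> {
    private final BackingStore<K, V> store;
    private final Map<K, V> entries;

    ReadWriteThroughLruCache(int capacity, BackingStore<K, V> store) {
        this.store = store;
        // accessOrder = true keeps the least recently used entry first.
        this.entries = new LinkedHashMap<K, V>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                return size() > capacity; // evict once over capacity
            }
        };
    }

    // Read-through: a miss is loaded from the store and then cached.
    synchronized V get(K key) {
        V value = entries.get(key);
        if (value == null) {
            value = store.load(key);
            if (value != null) {
                entries.put(key, value);
            }
        }
        return value;
    }

    // Write-through: the store is updated before the cache.
    synchronized void put(K key, V value) {
        store.save(key, value);
        entries.put(key, value);
    }

    // containsKey does not count as an access, so it leaves the LRU order alone.
    synchronized boolean isCached(K key) {
        return entries.containsKey(key);
    }
}
```

A write-behind variant would queue the `save` calls and flush them asynchronously instead of calling the store inside `put`.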


Live Coding: GridGain + Scala

> 100% Live Coding:
> Nothing pre-built
> Every line & character
> Everything from the start



Thank You!

#gridgain

Editor's Notes

Slide 4 comments:

Example:
Counts the number of non-space characters in the given argument by spreading the processing across multiple remote nodes.
This is 100% of the source code. It can be written in Java or Groovy just as easily.
Compile and run it. It works transparently on one computer or on thousands.
Fully distributed and parallelized processing.
No deployment or provisioning required: 100% zero deployment.
Develop and start in the cloud in minutes.
It will automatically load balance and fail over, if necessary.
It will automatically scale up and down based on dynamically available resources.

Slide 5 comments:

- In-memory computing is a high-growth industry:
- SAP saw a 206% increase in profit in Q1 2012, in large part due to SAP HANA, its in-memory processing product
- IBM grew in-memory computing revenue 80% in 2011
- Software AG: 3x revenue in 2012 for in-memory computing products
- VMware is looking at a 30% increase in in-memory computing products

Gartner References:
- Top 10 Strategic Technology Trends: In-Memory Computing. Massimo Pezzini, Carl Claunch, Joseph Unsworth (G00230658)
- Insight: Invest in In-Memory Computing for Breakthrough Competitive Advantage. Massimo Pezzini (G00226070)
- What CIOs Need to Know About In-Memory Database Management Systems. Roxane Edjlali, Donald Feinberg (G00215735)
- Need for Speed Powers In-Memory Business Intelligence. James Richardson (G00212168)
- Predicts 2012: Cloud and In-Memory Drive Innovation in Application Platforms. Massimo Pezzini, Yefim Natis, Eric Knipp (G00226073)

Links:
- http://slidesha.re/O0htOV
- http://nyti.ms/Ngr4qF
- http://bit.ly/Ngr8qh
- http://bit.ly/NZu0Dw
