P.K.Gupta, Megh Computing
Accelerating Real Time
Analytics with Spark
Streaming and FPGAaaS
#HWCSAIS17
Agenda
• Using Spark Streaming for Real Time Analytics
• Why FPGA: Low Latency and High Throughput
– Inline Processing
– Offload Processing
• Challenges in Using FPGA accelerators
• Megh Platform
– Arka Runtime
– Sira AFUs
• Demo Applications
• Conclusion
Using Spark Streaming with ML / DL
for Real Time Analytics
ETL Data Processing
ML DL
Streams
Application
Social Media
Operations
Transportation
Marketing
Sensors
Web
Queries
Alerts
Analysis
Real Time vs. Batch Insights
[Figure: value of data to decision making (vertical axis) against time (horizontal axis: real time, secs, mins, hours, days, months). The curve marks the information half-life in decision making, running from time-critical decisions to traditional "batch" business intelligence. Region labels: predictive/preventive, actionable, reactive, historical.]
Real Time Insights
Hard Real
Time
Regular
Trading
Fraud
Prevention
Edge
Computing
Dashboard
(Inference)
Operational
Insights
Latency scale: < 1 us, 10s us, ms, 10s ms, 100s ms, seconds
Real Time Analytics platform:
using Heterogeneous CPU+FPGA computing
Data Processing
CPU+FPGA Platform
Social Media
Operations
Transportation
Marketing
Sensors
Web
Queries
Alerts
Analysis
Batch Mode
Real Time
Mode
Public Cloud Private Cloud Edge Cloud
Application
In-Line Stream Processing:
using heterogeneous CPU+FPGA platform
Worker Node
Executor
Filter #1
Task
System NIC
Worker Node
Executor
FPGA NIC
Filter # 1
FPGA
The FPGA terminates the network connection and dynamically chains filters,
transparently providing pre-processed, low-latency DStreams to Spark apps
Filter #2
Task
MLLib
Task
MLLib
Task
Filter # 2
In-Line Stream Processing:
FPGA Architecture
Data input
Packet
Processing
Engine
Filter
Filter
Filter
Filter
Filter
Filter
Streaming Engine
RDDs
FPGA
Sequencer
In-Line Stream Performance
Lower Latency Higher Throughput
Source: An FPGA Based Low Latency Network Processing for Spark Streaming, K. Nakamura et al.,
Proceedings - 2016 IEEE International Conference on Big Data, Big Data 2016
Off-load Processing (ML/DL):
using heterogeneous CPU+FPGA platform
Worker Node
SQL
Executor
Task
DLLib
Task
Worker Node
Executor
SQL
Task
DLLib
FPGA
Accelerates ML/DL algorithms transparently by providing
Spark bindings to FPGA implementations of ML/DL libraries
Off-load ML/DL Processing:
FPGA Architecture
Source: CAN FPGAS BEAT GPUS IN ACCELERATING NEXT-GENERATION DEEP LEARNING?
The Next Platform, March 21, 2017
Off-load ML/DL Performance
Source: Accelerating Persistent Neural Networks at Datacenter Scale
Eric Chung et al., Hot Chips, 2017
Lower Latency Higher Throughput
10X
500fps
Challenges in using FPGA
1. Programming FPGAs
2. Managing FPGAs in the data center
3. Integrating FPGAs into applications
Spark Driver
Client
Application
Worker Node
FPGA Runtime
Executor
Task Shell
AFU AFU
FPGA
FPGA Runtime
Task
Spark Streaming Architecture
using CPU+FPGA platform
Cluster Resource
Manager
Spark Context
Driver
Master Node
Megh Platform:
abstracts the complexity of the FPGA
Packet RX
Streaming
Functions
ML / DL
Functions
Packet TX
FPGA
FPGA Driver
Arka Runtime
Java / C++ Library Adaptors
Other App Frameworks
Sira Accelerator Function Units (AFU)
CPU
1. In-line Processing
2. Off-load Processing
Application
Application:
• uses standard APIs
• And/or custom APIs
Arka Runtime:
• FPGA management
• SW fallback
• Expose AFaaS
Sira Accelerators:
• Downloaded at runtime
• Bare metal or exposed to VMs via VMM
Infrastructure
Components
Megh
Components
Virtualized Real Time Analytics Stack
CPU
FPGA Kernel Driver
VMM VFIO (or Windows equivalent) or PCIe passthrough
Spark Driver/Task
Custom Package/Lib
Arka JNI Access
Utilities: Resource Manager, Scheduler, etc.
ML Package/Lib
ML adapter
. . .
Megh Arka JAVA/SCALA
Arka Runtime
Low Level FPGA Access Lib
VMs
JVMs
JVM
Threads
Application:
• uses standard APIs
• And/or custom APIs
Runtime:
• FPGA management
• SW fallback
• Expose AFaaS
Accelerators:
• Downloaded at runtime
• Exposed to VMs via VMM
FPGAs
Shell
AFU AFU…
In-Line Processing:
Smart rx/tx adaptor architecture
CPU
FPGA
Kernel
Space
User
Space
Spark DStream Adapter
DMA (VirtIO)
Packet
Processor Filters
Streaming
Processor
Arka Runtime
FPGA Kernel Driver
Shell
Infrastructure
Components
Megh
Components
• Packet Processor: Intercepts network packets destined for Spark
• Filters: Perform data cleaning, resizing, and layout transforms (ETL operations)
• Streaming Processor: Creates DStream packets for Spark
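The filter stage described above can be pictured in software as an ordered chain of small transforms applied to each record before it reaches Spark. The following is only a minimal plain-Java sketch of that idea, not the FPGA implementation: `FilterChainSketch`, `cleaningChain`, and the three string filters are hypothetical names chosen for illustration, and the records are assumed to be text lines.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

/** Software stand-in for a dynamically chained filter pipeline (hypothetical names). */
final class FilterChainSketch {
    private final List<Function<String, String>> filters = new ArrayList<>();

    /** Appends a filter stage; stages run in the order they were added. */
    FilterChainSketch add(Function<String, String> filter) {
        filters.add(filter);
        return this;
    }

    /** Runs one record through every stage in order. */
    String apply(String record) {
        String out = record;
        for (Function<String, String> f : filters) {
            out = f.apply(out);
        }
        return out;
    }

    /** Applies the chain to a batch, dropping records that end up empty. */
    List<String> applyAll(List<String> batch) {
        List<String> result = new ArrayList<>();
        for (String record : batch) {
            String out = apply(record);
            if (!out.isEmpty()) {
                result.add(out);
            }
        }
        return result;
    }

    /** Example chain: trim whitespace, lower-case, collapse runs of whitespace. */
    static FilterChainSketch cleaningChain() {
        return new FilterChainSketch()
                .add(String::trim)
                .add(String::toLowerCase)
                .add(s -> s.replaceAll("\\s+", " "));
    }
}
```

Because stages are appended at run time, the chain can be rebuilt per stream, which mirrors the "dynamically chains filters" idea without claiming anything about how the hardware does it.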
import java.util.regex.Pattern;
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.StorageLevels;
import org.apache.spark.streaming.Durations;
import org.apache.spark.streaming.api.java.JavaDStream;
import org.apache.spark.streaming.api.java.JavaReceiverInputDStream;
import org.apache.spark.streaming.api.java.JavaStreamingContext;

public final class JavaSqlNetworkWordCount {
  private static final Pattern SPACE = Pattern.compile(" ");

  public static void main(String[] args) throws Exception {
    if (args.length < 2) {
      System.err.println("Usage: JavaNetworkWordCount <hostname> <port>");
      System.exit(1);
    }
    StreamingExamples.setStreamingLogLevels();
    // Create the context with a 1 second batch size
    SparkConf sparkConf = new SparkConf().setAppName("JavaSqlNetworkWordCount");
    JavaStreamingContext ssc = new JavaStreamingContext(sparkConf, Durations.seconds(1));
    // Create a JavaReceiverInputDStream on target ip:port and count the
    // words in input stream of \n delimited text (e.g. generated by 'nc')
    JavaReceiverInputDStream<String> lines = ssc.socketTextStream(
        args[0], Integer.parseInt(args[1]), StorageLevels.MEMORY_AND_DISK_SER);
    // Split2Words() is supplied by the ETL library jar (CPU or FPGA implementation)
    JavaDStream<String> words = lines.flatMap(Split2Words());
    ..
  }
  ..
}
Inline sample implementation
CPU IMPLEMENTATION
1. Sets up the DStream CPU adapter connected to the system NIC.
2. Configures IP/port on the CPU NIC.
3. etlLibCPU.jar (CPU implementation)
• split2Words()
• split2Sort()
• split2Count()
FPGA IMPLEMENTATION
1. Sets up the DStream FPGA adapter connected to the FPGA NIC.
2. Configures IP/port on the FPGA NIC.
3. etlLibCPU.jar (FPGA implementation)
• split2Words()
• split2Sort()
• split2Count()
The FPGA is set up to stream and filter data before passing it to Spark as a DStream object.
* Full implementation
https://github.com/apache/spark/blob/master/examples/src/main/java/org/apache/spark/examples/streaming/JavaSqlNetworkWordCount.java
Off-load Processing:
Low latency off-Load of ML/DL libraries
CPU
FPGA
Kernel
Space
User
Space
Spark DStream Adapter
DMA (VirtIO)
ML Libraries DL Libraries
FPGA Kernel Driver
Shell
Infrastructure
Components
Megh
Components
Arka Runtime
Inter-FPGA Network
• Machine Learning Libraries: Optimized libraries for K-Means, SVM, etc.
• Deep Learning Libraries: Optimized libraries for DNN-based inference engines.
• Inter-FPGA Network: FPGA network for sharing FPGA resources for larger DNN topologies
public class JavaKMeansExample {
  public static void main(String[] args) {
    SparkConf conf = new SparkConf().setAppName("JavaKMeansExample");
    JavaSparkContext jsc = new JavaSparkContext(conf);
    ..
    // Cluster the data into two classes using KMeans
    int numClusters = 2;
    int numIterations = 20;
    KMeansModel clusters = KMeans.train(parsedData.rdd(), numClusters, numIterations);
    ..
    double cost = clusters.computeCost(parsedData.rdd());
    System.out.println("Cost: " + cost);
    // Evaluate clustering by computing Within Set Sum of Squared Errors
    double WSSSE = clusters.computeCost(parsedData.rdd());
    System.out.println("Within Set Sum of Squared Errors = " + WSSSE);
    ..
    jsc.stop();
  }
}
Offload Sample Implementation
mlib.jar
(CPU library implementation)
• KMeans.train()
• KMeansModel.computeCost()
mlibFPGA.jar
(FPGA accelerated library implementation)
• KMeans.train()
• KMeansModel.computeCost()
The CPU and FPGA libraries share the same function signatures,
so swapping in the FPGA library accelerates the application transparently
* Full implementation:
https://github.com/apache/spark/blob/master/examples/src/main/java/org/apache/spark/examples/mllib/JavaKMeansExample.java
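One way to picture "same function signature, different library" is a shared interface with a CPU implementation that also serves as the software fallback. The sketch below is plain Java under stated assumptions: `CostFunction`, `CpuCostFunction`, and `CostFunctionSelector` are hypothetical names, the availability probe is supplied by the caller, and the cost is the within-set sum of squared errors that the example above prints. It does not reproduce the Megh or Spark MLlib internals.

```java
import java.util.function.BooleanSupplier;

/** Common signature shared by the CPU and (hypothetical) FPGA back ends. */
interface CostFunction {
    /** Within-set sum of squared errors of the points against their nearest center. */
    double computeCost(double[][] points, double[][] centers);
}

/** Plain-Java reference implementation, used here as the software fallback. */
final class CpuCostFunction implements CostFunction {
    @Override
    public double computeCost(double[][] points, double[][] centers) {
        double total = 0.0;
        for (double[] p : points) {
            double best = Double.POSITIVE_INFINITY;
            for (double[] c : centers) {
                double d = 0.0;
                for (int i = 0; i < p.length; i++) {
                    double diff = p[i] - c[i];
                    d += diff * diff;
                }
                best = Math.min(best, d);
            }
            total += best;
        }
        return total;
    }
}

/** Picks a back end at run time; the calling code stays the same either way. */
final class CostFunctionSelector {
    static CostFunction select(BooleanSupplier fpgaAvailable, CostFunction fpgaImpl) {
        // Assumption: the caller knows how to probe for an accelerator;
        // the real runtime check is not shown in the slides.
        if (fpgaImpl != null && fpgaAvailable.getAsBoolean()) {
            return fpgaImpl;
        }
        return new CpuCostFunction();
    }
}
```

A caller would write `CostFunctionSelector.select(probe, fpgaImpl).computeCost(points, centers)` and get the accelerated path only when one is present, which is the transparency property the slide describes.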
public static void main(String[] args) throws Exception {
  System.out.println(" Java: NumAdd Spark Demo.\n");
  Long total = null;
  SparkConf sparkConf = new SparkConf().setAppName("NumAdd");
  JavaSparkContext ctx = new JavaSparkContext(sparkConf);
  JavaRDD<String> lines = ctx.textFile(args[0], 1);
  JavaRDD<Long> sums = lines.map(new sumOneString());
  total = sums.reduce((a, b) -> (a + b));
  System.out.println("Total is -> " + total);
  ctx.stop();
}
numAdd Demo:
Implementation details
numAdd is a slight variation of the popular WordCount sample:
the numbers in the files are parsed and added up using Spark
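For reference, the map-then-reduce shape of numAdd can be sketched without Spark or an FPGA. This is a plain-Java approximation under assumptions: each line is taken to hold whitespace-separated integers, `NumAddSketch` and `total` are hypothetical names, and `sumOneString` borrows only its name from the slides, not its actual implementation.

```java
import java.util.List;

/** Plain-Java sketch of numAdd: parse the numbers on each line, then add everything up. */
final class NumAddSketch {
    /** Sums the whitespace-separated integers on one line (a CPU version of the per-line step). */
    static long sumOneString(String line) {
        long sum = 0L;
        for (String token : line.trim().split("\\s+")) {
            if (!token.isEmpty()) {
                sum += Long.parseLong(token);
            }
        }
        return sum;
    }

    /** Maps each line to its sum, then reduces the per-line sums to one total. */
    static long total(List<String> lines) {
        return lines.stream()
                .map(NumAddSketch::sumOneString)
                .reduce(0L, Long::sum);
    }
}
```

The per-line function is the natural unit to accelerate, since the reduce step only combines one number per line.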
Accelerated Operation:
sumOneString
AFU.Factory fpgaFactory = new AFU.Factory();
AFU wc = fpgaFactory.createAFU("meghna");
TransferBuffer inbuf = wc.getTransferBuffer( input1.length() );
wc.queueInputBuffer( inbuf );
// Reuse buffer 1 for the output. AFU design ensures this is safe.
wc.queueOutputBuffer( inbuf ); // Arka permits it.
wc.startFunction(); // The real work starts here
TransferBuffer obuff = wc.waitOnOutputQueue();
return ( obuff.getByteBuffer().asLongBuffer().get(0) );
Instantiate AFU as a Service. Enables multiple distinct
implementations to co-exist and be selected dynamically:
specifically, an FPGA implementation and a CPU-based
fallback implementation.
Buffer Queue based model
• (Register interface available but not shown)
AFU optimized Transfer Buffers allow for:
• Zero copy to HW and efficient access.
• Efficient access from Java/Scala
• AFU specific implementation.
• May use direct byte buffers, SVM, Netty, Apache Arrow, etc.
Start operation.
Wait for results in output queue.
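The buffer-queue flow annotated above (get a transfer buffer, queue it, start the function, wait on the output queue) can be imitated in software with two blocking queues and a worker thread. The sketch below is a stand-in under assumptions, not the Arka API: `QueueModelSketch` is a hypothetical class, its method names only echo the ones in the slide, the worker thread plays the role of the AFU, and the payload is assumed to be packed 64-bit values.

```java
import java.nio.ByteBuffer;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

/** Software stand-in for a queue-driven accelerator; all names here are hypothetical. */
final class QueueModelSketch {
    private final BlockingQueue<ByteBuffer> inputQueue = new ArrayBlockingQueue<>(4);
    private final BlockingQueue<ByteBuffer> outputQueue = new ArrayBlockingQueue<>(4);

    /** Direct buffers live outside the Java heap, which is what makes a zero-copy hand-off possible. */
    ByteBuffer getTransferBuffer(int capacityBytes) {
        return ByteBuffer.allocateDirect(capacityBytes);
    }

    /** Queues a filled buffer; throws IllegalStateException if the queue is full. */
    void queueInputBuffer(ByteBuffer buf) {
        inputQueue.add(buf);
    }

    /** Starts a worker thread that plays the role of the hardware function. */
    void startFunction() {
        Thread worker = new Thread(() -> {
            try {
                ByteBuffer buf = inputQueue.take();
                long sum = 0L;
                // Treat the payload as packed 64-bit values and add them up.
                for (int pos = 0; pos + Long.BYTES <= buf.limit(); pos += Long.BYTES) {
                    sum += buf.getLong(pos);
                }
                // Reuse the input buffer for the result, as the slide's example does.
                buf.putLong(0, sum);
                outputQueue.put(buf);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        worker.setDaemon(true);
        worker.start();
    }

    /** Blocks until the worker has published a result buffer. */
    ByteBuffer waitOnOutputQueue() {
        try {
            return outputQueue.take();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for output", e);
        }
    }
}
```

Keeping submission and completion on separate queues lets the caller overlap other work with the accelerated call, and the same shape works whether the worker is a thread or a device.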
Demo: NumAdd Offload Profiling
[Chart: NumAdd execution time (s, axis 0 to 200) versus file size (1M, 2M, 4M); series: FPGA Offload, Spark Streaming]
* Executor/task on the worker node restricted to 1 thread
In Summary
• Megh CPU+FPGA platform optimized for Real
Time Analytics
• Arka Runtime supports different streaming
frameworks
• Sira AFUs deliver low latency and high
throughput for inline and offload processing
Thank You
info@meghcomputing.com

Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
 
Just Call Vip call girls kakinada Escorts ☎️9352988975 Two shot with one girl...
Just Call Vip call girls kakinada Escorts ☎️9352988975 Two shot with one girl...Just Call Vip call girls kakinada Escorts ☎️9352988975 Two shot with one girl...
Just Call Vip call girls kakinada Escorts ☎️9352988975 Two shot with one girl...
 
5CL-ADBA,5cladba, Chinese supplier, safety is guaranteed
5CL-ADBA,5cladba, Chinese supplier, safety is guaranteed5CL-ADBA,5cladba, Chinese supplier, safety is guaranteed
5CL-ADBA,5cladba, Chinese supplier, safety is guaranteed
 
Just Call Vip call girls roorkee Escorts ☎️9352988975 Two shot with one girl ...
Just Call Vip call girls roorkee Escorts ☎️9352988975 Two shot with one girl ...Just Call Vip call girls roorkee Escorts ☎️9352988975 Two shot with one girl ...
Just Call Vip call girls roorkee Escorts ☎️9352988975 Two shot with one girl ...
 
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
 
➥🔝 7737669865 🔝▻ Ongole Call-girls in Women Seeking Men 🔝Ongole🔝 Escorts S...
➥🔝 7737669865 🔝▻ Ongole Call-girls in Women Seeking Men  🔝Ongole🔝   Escorts S...➥🔝 7737669865 🔝▻ Ongole Call-girls in Women Seeking Men  🔝Ongole🔝   Escorts S...
➥🔝 7737669865 🔝▻ Ongole Call-girls in Women Seeking Men 🔝Ongole🔝 Escorts S...
 
Call Girls In Nandini Layout ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Nandini Layout ☎ 7737669865 🥵 Book Your One night StandCall Girls In Nandini Layout ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Nandini Layout ☎ 7737669865 🥵 Book Your One night Stand
 
Predicting Loan Approval: A Data Science Project
Predicting Loan Approval: A Data Science ProjectPredicting Loan Approval: A Data Science Project
Predicting Loan Approval: A Data Science Project
 
➥🔝 7737669865 🔝▻ Bangalore Call-girls in Women Seeking Men 🔝Bangalore🔝 Esc...
➥🔝 7737669865 🔝▻ Bangalore Call-girls in Women Seeking Men  🔝Bangalore🔝   Esc...➥🔝 7737669865 🔝▻ Bangalore Call-girls in Women Seeking Men  🔝Bangalore🔝   Esc...
➥🔝 7737669865 🔝▻ Bangalore Call-girls in Women Seeking Men 🔝Bangalore🔝 Esc...
 
Abortion pills in Doha Qatar (+966572737505 ! Get Cytotec
Abortion pills in Doha Qatar (+966572737505 ! Get CytotecAbortion pills in Doha Qatar (+966572737505 ! Get Cytotec
Abortion pills in Doha Qatar (+966572737505 ! Get Cytotec
 

Accelerating Real Time Analytics with Spark Streaming and FPGAaaS with Prabhat Gupta

  • 1. P.K.Gupta, Megh Computing Accelerating Real Time Analytics with Spark Streaming and FPGAaaS #HWCSAIS17
  • 2. Agenda: Using Spark Streaming for Real Time Analytics; Why FPGA: Low Latency and High Throughput (Inline Processing, Offload Processing); Challenges in Using FPGA Accelerators; Megh Platform (Arka Runtime, Sira AFUs); Demo Applications; Conclusion
  • 3. Using Spark Streaming with ML / DL for Real Time Analytics. [Diagram labels: Streams (Social Media, Operations, Transportation, Marketing, Sensors, Web); Data Processing (ETL, ML, DL); Application (Queries, Alerts, Analysis)]
  • 4. Real Time vs. Batch Insights. [Chart: Value of Data to Decision Making against Time (Real Time, Secs, Mins, Hours, Days, Months); labels: Information Half-Life in Decision Making; Time Critical Decisions; Traditional “Batch” Business Intelligence; Predictive/Preventive, Actionable, Reactive, Historical]
  • 5. Real Time Insights. [Chart labels: Hard Real Time, Regular; Trading, Fraud Prevention, Edge Computing, Dashboard (Inference), Operational Insights; latency scale: < 1 us, 10s us, ms, 10s ms, 100s ms, seconds]
  • 6. Real Time Analytics platform: using Heterogeneous CPU+FPGA computing. [Diagram labels: Social Media, Operations, Transportation, Marketing, Sensors, Web; Data Processing on the CPU+FPGA Platform (Batch Mode, Real Time Mode); Public Cloud, Private Cloud, Edge Cloud; Application (Queries, Alerts, Analysis)]
  • 7. In-Line Stream Processing: using heterogeneous CPU+FPGA platform. The FPGA terminates the network connection and dynamically chains filters to provide pre-processed, low-latency DStreams to Spark apps transparently. [Diagram labels: Worker Node with System NIC and Executor (Filter #1 Task, Filter #2 Task, MLLib Task); Worker Node with FPGA NIC, FPGA (Filter #1, Filter #2), and Executor (MLLib Task)]
  • 8. In-Line Stream Processing: FPGA Architecture. [Diagram labels: Data input, Packet Processing Engine, Filters, Sequencer, Streaming Engine, RDDs, FPGA]
  • 9. In-Line Stream Performance: Lower Latency, Higher Throughput. Source: K. Nakamura et al., "An FPGA Based Low Latency Network Processing for Spark Streaming," Proceedings - 2016 IEEE International Conference on Big Data, Big Data 2016
  • 10. Off-load Processing (ML/DL): using heterogeneous CPU+FPGA platform. Accelerates ML/DL algorithms transparently by providing Spark bindings to FPGA implementations of ML/DL libraries. [Diagram labels: Worker Node with Executor (SQL Task, DLLib Task); Worker Node with Executor (SQL Task, DLLib) and FPGA]
  • 11. Off-load ML/DL Processing: FPGA Architecture. Source: "Can FPGAs Beat GPUs in Accelerating Next-Generation Deep Learning?", The Next Platform, March 21, 2017
  • 12. Off-load ML/DL Performance: Lower Latency, Higher Throughput (figure callouts: 10X, 500fps). Source: Eric Chung et al., "Accelerating Persistent Neural Networks at Datacenter Scale," HotChips, 2017
  • 13. Challenges in using FPGA: (1) Programming FPGAs; (2) Managing FPGAs in the data center; (3) Integrating FPGAs into applications
  • 14. Spark Streaming Architecture using CPU+FPGA platform. [Diagram labels: Client Application; Master Node (Spark Driver, Spark Context); Cluster Resource Manager; Worker Node (Executor, Tasks, FPGA Runtime); FPGA (Shell, AFUs)]
  • 15. Megh Platform: abstracts the complexity of the FPGA. Application: uses standard APIs and/or custom APIs. Arka Runtime: FPGA management; SW fallback; exposes AFaaS. Sira Accelerators (Accelerator Function Units, AFUs): downloaded at runtime; bare metal or exposed to VMs via the VMM. [Diagram labels: CPU (Application, Adaptors, Other App Frameworks, Java / C++ Library, Arka Runtime, FPGA Driver); FPGA (Packet RX, Streaming Functions, ML / DL Functions, Packet TX); paths: 1 In-line Processing, 2 Off-load Processing; legend: Infrastructure Components, Megh Components]
  • 16. Virtualized Real Time Analytics Stack. Application: uses standard APIs and/or custom APIs. Runtime: FPGA management; SW fallback; exposes AFaaS. Accelerators: downloaded at runtime; exposed to VMs via the VMM. [Diagram labels: VMs, JVMs, JVM Threads; Spark Driver/Task; Custom Package/Lib, Arka JNI Access; ML Package/Lib, ML adapter; Utilities: Resource Manager, Scheduler, etc.; Megh Arka JAVA/SCALA; Arka Runtime; Low Level FPGA Access Lib; FPGA Kernel Driver; VMM with VFIO (or Windows equivalent) or PCIe passthrough; CPU; FPGAs (Shell, AFUs)]
  • 17. In-Line Processing: Smart rx/tx adaptor architecture. Packet Processor: intercepts network packets destined for Spark. Filters: perform data cleaning, resizing, and layout transforms (ETL operations). Streaming Processor: creates DStream packets for Spark. [Diagram labels: CPU (User Space: Spark, DStream Adapter, Arka Runtime; Kernel Space: FPGA Kernel Driver, DMA (VirtIO)); FPGA (Shell, Packet Processor, Filters, Streaming Processor); legend: Infrastructure Components, Megh Components]
  • 18. Inline sample implementation

        public final class JavaSqlNetworkWordCount {
          private static final Pattern SPACE = Pattern.compile(" ");

          public static void main(String[] args) throws Exception {
            if (args.length < 2) {
              System.err.println("Usage: JavaNetworkWordCount <hostname> <port>");
              System.exit(1);
            }

            StreamingExamples.setStreamingLogLevels();

            // Create the context with a 1 second batch size
            SparkConf sparkConf = new SparkConf().setAppName("JavaSqlNetworkWordCount");
            JavaStreamingContext ssc = new JavaStreamingContext(sparkConf, Durations.seconds(1));

            // Create a JavaReceiverInputDStream on target ip:port and count the
            // words in input stream of \n delimited text (eg. generated by 'nc')
            JavaReceiverInputDStream<String> lines = ssc.socketTextStream(
                args[0], Integer.parseInt(args[1]), StorageLevels.MEMORY_AND_DISK_SER);
            JavaDStream<String> words = lines.flatMap(Split2Words());
            ..
          }
          ..
        }

    CPU implementation: (1) sets up the DStream CPU adapter connected to the System NIC; (2) configures IP/port on the CPU NIC; (3) etlLibCPU.jar (CPU implementation) provides split2Words(), split2Sort(), split2Count().
    FPGA implementation: (1) sets up the DStream FPGA adapter connected to the FPGA NIC; (2) configures IP/port on the FPGA NIC; (3) etlLibCPU.jar (FPGA implementation) provides split2Words(), split2Sort(), split2Count().
    The FPGA is set up to stream and filter data before passing it to Spark as a DStream object.
    * Full implementation: https://github.com/apache/spark/blob/master/examples/src/main/java/org/apache/spark/examples/streaming/JavaSqlNetworkWordCount.java
  • 19. Off-load Processing: Low latency off-load of ML/DL libraries. Machine Learning Libraries: optimized libraries for K-Means, SVM, etc. Deep Learning Libraries: optimized libraries for DNN-based inference engines. Inter-FPGA Network: FPGA network for sharing FPGA resources for larger DNN topologies. [Diagram labels: CPU (User Space: Spark, DStream Adapter, Arka Runtime; Kernel Space: FPGA Kernel Driver, DMA (VirtIO)); FPGA (Shell, ML Libraries, DL Libraries, Inter-FPGA Network); legend: Infrastructure Components, Megh Components]
  • 20. Offload Sample Implementation

        public class JavaKMeansExample {
          public static void main(String[] args) {
            SparkConf conf = new SparkConf().setAppName("JavaKMeansExample");
            JavaSparkContext jsc = new JavaSparkContext(conf);
            ..
            // Cluster the data into two classes using KMeans
            int numClusters = 2;
            int numIterations = 20;
            KMeansModel clusters = KMeans.train(parsedData.rdd(), numClusters, numIterations);
            ..
            double cost = clusters.computeCost(parsedData.rdd());
            System.out.println("Cost: " + cost);

            // Evaluate clustering by computing Within Set Sum of Squared Errors
            double WSSSE = clusters.computeCost(parsedData.rdd());
            System.out.println("Within Set Sum of Squared Errors = " + WSSSE);
            ..
            jsc.stop();
          }
        }

    mlib.jar (CPU library implementation): KmeansModel.train(), KmeansModel.computeCost().
    mlibFPGA.jar (FPGA accelerated library implementation): KmeansModel.train(), KmeansModel.computeCost().
    The CPU and FPGA libraries share the same function signatures, so swapping in the FPGA library accelerates the application transparently.
    * Full implementation: https://github.com/apache/spark/blob/master/examples/src/main/java/org/apache/spark/examples/mllib/JavaKMeansExample.java
  • 21. numAdd Demo: Implementation details

        public static void main(String[] args) throws Exception {
          System.out.println(" Java: NumAdd Spark Demo.\n");
          Long total = null;
          SparkConf sparkConf = new SparkConf().setAppName("NumAdd");
          JavaSparkContext ctx = new JavaSparkContext(sparkConf);
          JavaRDD<String> lines = ctx.textFile(args[0], 1);
          JavaRDD<Long> sums = lines.map(new sumOneString());
          total = sums.reduce((a, b) -> (a + b));
          System.out.println("Total is -> " + total);
          ctx.stop();
        }

    numAdd is a slight variation of the popular WordCount sample, in which the numbers in the files are parsed and added up using Spark.
  • 22. Accelerated Operation: sumOneString

        AFU.Factory fpgaFactory = new AFU.Factory();
        AFU wc = fpgaFactory.createAFU("meghna");
        TransferBuffer inbuf = wc.getTransferBuffer(input1.length());
        wc.queueInputBuffer(inbuf);
        // Reuse buffer 1 for the output. AFU design ensures this is safe.
        wc.queueOutputBuffer(inbuf); // Arka permits it.
        wc.startFunction(); // The real work starts here
        TransferBuffer obuff = wc.waitOnOutputQueue();
        return (obuff.getByteBuffer().asLongBuffer().get(0));

    Instantiate AFU as a Service: enables multiple distinct implementations to co-exist and be selected dynamically, specifically an FPGA implementation and a CPU-based fallback implementation.
    Buffer-queue-based model (a register interface is available but not shown).
    AFU-optimized Transfer Buffers allow for: zero copy to HW and efficient access; efficient access from Java/Scala; AFU-specific implementation; may use direct byte buffers, SVM, Netty, Apache Arrow, etc.
    Start operation, then wait for results in the output queue.
  • 23. Demo: NumAdd Offload Profiling. [Chart: NumAdd Execution Time (s), axis 0 to 200, against File Size (1M, 2M, 4M); series: FPGA Offload, Spark Streaming] * Executor/task on the worker node restricted to 1 thread
  • 24. In Summary: the Megh CPU+FPGA platform is optimized for Real Time Analytics; the Arka Runtime supports different streaming frameworks; Sira AFUs deliver low latency and high throughput for inline and offload processing
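
Slide 21 describes numAdd as parsing the numbers in each line and adding them up with a map followed by a reduce. The slides do not show the body of sumOneString, so the following is only a minimal sketch of that computation in plain Java, without Spark. The class name NumAddSketch, the helper total, and the assumption that lines hold whitespace-separated integers are all hypothetical.

```java
import java.util.Arrays;
import java.util.List;

public class NumAddSketch {
    // Hypothetical stand-in for the slide's sumOneString: parse every
    // whitespace-separated integer on one line and return their sum.
    public static long sumOneString(String line) {
        long sum = 0L;
        for (String token : line.trim().split("\\s+")) {
            if (!token.isEmpty()) {
                sum += Long.parseLong(token);
            }
        }
        return sum;
    }

    // Mirrors the map-then-reduce shape of the Spark job using Java streams
    // instead of RDDs (an assumption made to keep the sketch self-contained).
    public static long total(List<String> lines) {
        return lines.stream()
                .map(NumAddSketch::sumOneString)
                .reduce(0L, Long::sum);
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("1 2 3", "40 50", "");
        System.out.println("Total is -> " + total(lines));
    }
}
```

In the Spark version on slide 21, the per-line function runs inside lines.map(...) and the addition inside sums.reduce(...); only the per-line function is the candidate for FPGA offload.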
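
Slides 15 and 22 describe a buffer-queue model in which an AFU is requested as a service and a CPU-based fallback can be selected when no FPGA implementation is available. The sketch below illustrates that idea under stated assumptions: the Afu interface, SoftwareSumAfu, tryLoadFpgaAfu, and createAfu are hypothetical names and are not the actual Arka API. Only the method names queueInputBuffer, queueOutputBuffer, startFunction, and waitOnOutputQueue echo the slide. Unlike the slide, it uses separate input and output buffers rather than reusing one.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.ArrayDeque;
import java.util.Queue;

public class AfuFallbackSketch {
    /** Hypothetical accelerator interface; not the actual Arka API. */
    public interface Afu {
        void queueInputBuffer(ByteBuffer in);
        void queueOutputBuffer(ByteBuffer out);
        void startFunction();
        ByteBuffer waitOnOutputQueue();
    }

    /** CPU fallback: sums whitespace-separated integers found in the input text. */
    public static class SoftwareSumAfu implements Afu {
        private final Queue<ByteBuffer> inputs = new ArrayDeque<>();
        private final Queue<ByteBuffer> outputs = new ArrayDeque<>();
        private final Queue<ByteBuffer> done = new ArrayDeque<>();

        @Override public void queueInputBuffer(ByteBuffer in) { inputs.add(in); }
        @Override public void queueOutputBuffer(ByteBuffer out) { outputs.add(out); }

        @Override public void startFunction() {
            // Pair each queued input with a queued output and compute synchronously.
            while (!inputs.isEmpty() && !outputs.isEmpty()) {
                ByteBuffer in = inputs.remove();
                ByteBuffer out = outputs.remove();
                String text = StandardCharsets.US_ASCII.decode(in.duplicate()).toString();
                long sum = 0L;
                for (String token : text.trim().split("\\s+")) {
                    if (!token.isEmpty()) sum += Long.parseLong(token);
                }
                out.clear();
                out.putLong(sum);
                out.flip();
                done.add(out);
            }
        }

        @Override public ByteBuffer waitOnOutputQueue() { return done.remove(); }
    }

    /** Hypothetical probe for an FPGA-backed AFU; always absent in this sketch. */
    static Afu tryLoadFpgaAfu(String name) { return null; }

    /** Prefers the FPGA implementation and falls back to software otherwise. */
    public static Afu createAfu(String name) {
        Afu fpga = tryLoadFpgaAfu(name);
        return (fpga != null) ? fpga : new SoftwareSumAfu();
    }

    /** Same call sequence as slide 22, driven through the hypothetical interface. */
    public static long sumOneString(String line) {
        Afu afu = createAfu("numadd");
        ByteBuffer in = ByteBuffer.wrap(line.getBytes(StandardCharsets.US_ASCII));
        ByteBuffer out = ByteBuffer.allocate(Long.BYTES);
        afu.queueInputBuffer(in);
        afu.queueOutputBuffer(out);
        afu.startFunction();
        return afu.waitOnOutputQueue().asLongBuffer().get(0);
    }

    public static void main(String[] args) {
        System.out.println("Sum is -> " + sumOneString("1 2 3"));
    }
}
```

Because the caller only sees the interface, the same sumOneString code path works whether the factory returns a hardware-backed or a software implementation, which is the transparency property the slides attribute to the runtime.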