We took TensorFlow, Google's recently open-sourced deep learning library, which ships as a single-machine implementation, and built a distributed implementation on Spark using an abstraction layer called the Distributed DataFrame (DDF). TensorFlow started as an internal research project at Google and has since spread across the organization. Google open-sourced the project in November 2015 but kept its distributed version closed. Other deep learning frameworks have been implemented on Spark, but TensorFlow's API is elegant and powerful and has proven to work at Google scale. Putting TensorFlow on Spark makes it easier for companies to harness the power of deep learning in their own businesses without sacrificing scalability.
KEY TAKEAWAYS: (1) Google's TensorFlow release is a single-machine implementation; (2) we have built a distributed implementation on Spark that allows TensorFlow to scale horizontally.
This talk was presented at the Spark Summit East 2016; watch it here: https://www.youtube.com/watch?v=-QtcP3yRqyM&index=5&list=PLMpF-Q2WyoGudT3z-VPGs-6nnv45cod_R
Distributed TensorFlow: Scaling Google's Deep Learning Library on Spark
1. @pentagoniac @arimoinc #SparkSummit #DDF arimo.com
Christopher Nguyen, PhD. | Chris Smith | Ushnish De | Vu Pham | Nanda Kishore
DISTRIBUTED TENSORFLOW: SCALING GOOGLE'S DEEP LEARNING LIBRARY ON SPARK
Timeline:
★ Nov 2015: Distributed DL on Spark + Tachyon (Strata NYC)
★ Feb 2016: Distributed TensorFlow on Spark (Spark Summit East)
★ June 2016: TensorFlow Ensembles on Spark (Hadoop Summit)
Why TensorFlow?
Google uses TensorFlow for the following:
★ Search
★ Advertising
★ Speech Recognition
★ Photos
★ Maps
★ StreetView (blurring out house numbers)
★ Translate
★ YouTube
DistBelief
(Large Scale Distributed Deep Networks, Jeff Dean et al., NIPS 2012)
1. Compute gradients
2. Update params (descent)
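The two-step DistBelief cycle, compute gradients on data shards, then apply a descent update to shared parameters, can be illustrated with a minimal NumPy sketch. This is a toy simulation (a noiseless linear-regression problem with made-up names like `compute_gradients` and `update_params`), not the talk's actual code or the DistBelief implementation:

```python
import numpy as np

# Toy loss: L(w) = ||X w - y||^2 / n  (noiseless linear regression)
def compute_gradients(w, X, y):
    """Step 1: each replica computes gradients on its data shard."""
    n = len(y)
    return 2.0 / n * X.T @ (X @ w - y)

def update_params(w, grad, lr=0.1):
    """Step 2: the parameter server applies the descent update."""
    return w - lr * grad

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
true_w = np.array([1.0, -2.0, 0.5])
y = X @ true_w

# Data-parallel loop: shards take turns pushing gradients to shared params.
w = np.zeros(3)
shards = np.array_split(np.arange(100), 4)
for _ in range(200):
    for idx in shards:
        g = compute_gradients(w, X[idx], y[idx])
        w = update_params(w, g)

print(np.round(w, 2))  # converges toward true_w
```

In the real system the shards push gradients asynchronously rather than in turn, which is what introduces the staleness issues discussed later in the deck.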
Arimo's 3 Pillars
App Development: Big Apps
★ Predictive Analytics
★ Natural Interfaces
★ Collaboration
Big Data + Big Compute
Spark & Tachyon Architectural Options
★ Spark-Only: model as broadcast variable
★ Tachyon-Storage: model stored as Tachyon file
★ Param-Server: model hosted in HTTP server
★ Tachyon-CoProc: model stored and updated by Tachyon
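To make the Param-Server option concrete, here is a minimal, hypothetical sketch using only the Python standard library: the model lives in an HTTP server, workers GET parameters and POST gradients, and the server applies the descent step. The handler, route shapes, and learning rate are all illustrative assumptions, not the tensorspark implementation:

```python
import json
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.request import Request, urlopen

PARAMS = {"w": [0.0, 0.0]}  # the "model hosted in HTTP server"
LOCK = threading.Lock()
LR = 0.5  # illustrative learning rate

class ParamHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Workers fetch the current parameters.
        with LOCK:
            body = json.dumps(PARAMS).encode()
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def do_POST(self):
        # Workers push a JSON gradient; apply one descent step server-side.
        length = int(self.headers["Content-Length"])
        grad = json.loads(self.rfile.read(length))["w"]
        with LOCK:
            PARAMS["w"] = [p - LR * g for p, g in zip(PARAMS["w"], grad)]
        self.send_response(200)
        self.send_header("Content-Length", "0")
        self.end_headers()

    def log_message(self, *args):  # silence per-request logging
        pass

server = HTTPServer(("127.0.0.1", 0), ParamHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()
url = f"http://127.0.0.1:{server.server_port}"

def worker_step(grad):
    """A worker pushes one (fake) gradient to the server."""
    urlopen(Request(url, data=json.dumps({"w": grad}).encode())).read()

worker_step([1.0, -1.0])
latest = json.loads(urlopen(url).read())
server.shutdown()
print(latest)  # parameters after one update: w = [-0.5, 0.5]
```

The tradeoff this option buys, relative to Spark broadcast variables, is that parameters can be updated in place between tasks instead of being re-broadcast each iteration, at the cost of running and scaling a separate service.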
GPU vs. CPU
★ On a single machine: GPU is 8–10x faster than CPU
★ Distributed on Spark: GPU is 2–3x faster, with larger gains on larger datasets
★ GPU memory is limited; hitting the limit hurts performance
★ AWS offers older GPUs, so TensorFlow must be built from source
Distributed Gradients
★ An initial warm-up phase on the parameter server is needed
★ A "wobble" effect in convergence may still occur due to mini-batches; mitigations:
✦ Drop "obsolete" (stale) gradients
✦ Reduce the learning rate and initial parameters
✦ Reduce the mini-batch size
★ Work to maximize network throughput, e.g., model compression
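The "drop obsolete gradients" mitigation can be sketched as version tracking on the parameter server: each gradient carries the model version it was computed against, and gradients that are too stale are discarded. The class, field names, and staleness threshold below are illustrative assumptions, not the talk's code:

```python
MAX_STALENESS = 2  # illustrative threshold

class ParamServer:
    """Toy parameter server that drops stale (obsolete) gradients."""

    def __init__(self, dim, lr=0.1):
        self.w = [0.0] * dim
        self.version = 0   # bumped on every applied update
        self.lr = lr
        self.dropped = 0

    def push(self, grad, computed_at_version):
        staleness = self.version - computed_at_version
        if staleness > MAX_STALENESS:
            self.dropped += 1  # obsolete gradient: skip the update
            return False
        self.w = [p - self.lr * g for p, g in zip(self.w, grad)]
        self.version += 1
        return True

ps = ParamServer(dim=2)
ps.push([1.0, 1.0], computed_at_version=0)  # staleness 0: applied
ps.push([1.0, 1.0], computed_at_version=0)  # staleness 1: applied
ps.push([1.0, 1.0], computed_at_version=0)  # staleness 2: applied
ps.push([1.0, 1.0], computed_at_version=0)  # staleness 3 > 2: dropped
print(ps.version, ps.dropped)  # 3 applied, 1 dropped
```

Dropping stale gradients damps the wobble because an update computed against long-outdated parameters can point away from the current descent direction.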
Deployment Configuration
★ Run only one TensorFlow instance per machine
★ Compress gradients/parameters (e.g., as binary blobs) to reduce network overhead
★ Batch-size tradeoff: convergence vs. communication overhead
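One simple way to realize the compression point is to cast gradients to a lower-precision dtype and zlib-compress the raw bytes before sending them over the wire. This is a generic sketch of the idea, under the assumption that a small loss of gradient precision is tolerable, not necessarily the scheme the talk used:

```python
import zlib
import numpy as np

def encode(grad: np.ndarray) -> bytes:
    """Cast float64 -> float16, then zlib-compress the raw bytes."""
    return zlib.compress(grad.astype(np.float16).tobytes())

def decode(blob: bytes) -> np.ndarray:
    """Inverse of encode: decompress and cast back up to float64."""
    return np.frombuffer(zlib.decompress(blob), dtype=np.float16).astype(np.float64)

rng = np.random.default_rng(0)
grad = rng.normal(scale=1e-2, size=100_000)  # fake gradient vector

blob = encode(grad)
restored = decode(blob)

print(f"{grad.nbytes} -> {len(blob)} bytes")          # at least 4x smaller
print("max abs error:", np.abs(grad - restored).max())
```

The dtype cast alone gives a 4x reduction; zlib adds more when gradients are sparse or repetitive. The precision loss is one reason this interacts with the batch-size tradeoff above: fewer, larger, slightly lossy messages versus many exact ones.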
Conclusions
★ This implementation is best for large datasets and small models
✦ For large models, we're looking into a) model parallelism and b) model compression
★ If the data fits on a single machine, a single-machine GPU is best (if you have one)
★ For large datasets, leverage both speed and scale using a Spark cluster with GPUs
★ Future work: a) Tachyon, b) Ensembles (Hadoop Summit, June 2016)
Top 10 World's Most Innovative Companies in Data Science
We're open sourcing our work at https://github.com/adatao/tensorspark
Visit us at Booth K8