Scalable Learning in Computer Vision

•Download as PPTX, PDF•

0 likes•535 views

butest

Scalable Learningin Computer Vision Adam Coates Honglak Lee RajatRaina Andrew Y. Ng Stanford University

Introduction One reason for difficulty: small datasets.

Introduction But the world is complex. Hard to get extremely high accuracy on real images if we haven’t seen enough examples. AUC Training Set Size

Introduction Small datasets: Clever features Carefully design to be robust to lighting, distortion, etc. Clever models Try to use knowledge of object structure. Some machine learning on top. ,[object Object]

Favor speed over invariance and expressive power.

Rely on machine learning to solve everything else.,[object Object]

The Learning Pipeline Need to scale up each part of the learning process to really large datasets. Image Data Learning Algorithm Low-level features

Synthetic Data Not enough labeled data for algorithms to learn all the knowledge they need. Lighting variation Object pose variation Intra-class variation Synthesize positive examples to include this knowledge. Much easier than building this knowledge into the algorithms.

Synthetic Data Collect images of object on a green-screen turntable. Green Screen image Synthetic Background Segmented Object Photometric/Geometric Distortion

Synthetic Data: Example Claw hammers: Synthetic Examples (Training set) Real Examples (Test set)

The Learning Pipeline Feature computations can be prohibitive for large numbers of images. E.g., 100 million examples x 1000 features.  100 billion feature values to compute. Image Data Learning Algorithm Low-level features

Features on CPUs vs. GPUs Difficult to keep scaling features on CPUs. CPUs are designed for general-purpose computing. GPUs outpacing CPUs dramatically. (nVidia CUDA Programming Guide)

Features on GPUs Features: Cross-correlation with image patches. High data locality; high arithmetic intensity. Implemented brute-force. Faster than FFT for small filter sizes. Orders of magnitude faster than FFT on CPU. 20x to 100x speedups (depending on filter size).

The Learning Pipeline Large number of feature vectors on disk are too slow to access repeatedly. E.g., Can run an online algorithm on one machine, but disk access is a difficult bottleneck. Image Data Learning Algorithm Low-level features

Distributed Training Solution: must store everything in RAM. No problem! RAM as low as $20/GB Our cluster with 120GB RAM: Capacity of >100 million examples. For 1000 features, 1 byte each.

Distributed Training Algorithms that can be trained from sufficient statistics are easy to distribute. Decision tree splits can be trained using histograms of each feature. Histograms can be computed for small chunks of data on separate machines, then combined. Split = + x x x Slave 2 Slave 1 Master Master

The Learning Pipeline We’ve scaled up each piece of the pipeline by a large factor over traditional approaches: > 1000x 20x – 100x > 10x Image Data Learning Algorithm Low-level features

Traditional supervised learning Testing: What is this? Cars Motorcycles

Self-taught learning Testing: What is this? Motorcycle Car Natural scenes

Learning representations Image Data Learning Algorithm Low-level features Where do we get good low-level representations?

Computer vision features SIFT Spin image HoG RIFT GLOH Textons

Unsupervised feature learning Higher layer (Combinations of edges; cf.V2) “Sparse coding” (edges; cf. V1) Input image (pixels) DBN (Hinton et al., 2006) with additional sparseness constraint. [Related work: Hinton, Bengio, LeCun, and others.] Note: No explicit “pooling.”

Unsupervised feature learning Higher layer (Model V3?) Higher layer (Model V2?) Model V1 Input image ,[object Object], > 1 million examples. >1 million parameters.

What's hot

Deep Recurrent Neural Networks for Sequence Learning in Spark by Yves MabialaSpark Summit

(Paper Review)A versatile learning based 3D temporal tracker - scalable, robu...MYEONGGYU LEE

Hussein Mehanna, Engineering Director, ML Core - Facebook at MLconf ATL 2016MLconf

20181128 satellogic @ barcelona aiAlbert Pujol Torras

Image Classification Done Simply using Keras and TensorFlow Rajiv Shah

AI and Deep Learning for On-Board Satellite Image Analysis, OW2con'19, June 1...OW2

Meetup 18/10/2018 - Artificiële intelligentie en mobiliteitDigipolis Antwerpen

Energy Monitoring With Self-taught Deep NetworkYiqun Hu

Distributed deep learning optimizationsgeetachauhan

Tensorflowmarwa Ayad Mohamed

Computational decision makingBoris Adryan

AI powered emotion recognition: From Inception to Production - Global AI Conf...Vandana Kannan

Big data app meetup 2016-06-15Illia Polosukhin

Reinforcement Learning for Self Driving CarsSneha Ravikumar

Autoencoder Forest for Anomaly Detection from IoT Time SeriesYiqun Hu

Google Developer Groups Talk - TensorFlowHarini Gunabalan

Azure ML - November 2014 David Green

Practical deep learning for computer visionEran Shlomo

Deep learning from scratch Eran Shlomo

Nikhil Garg, Engineering Manager, Quora at MLconf SF 2016MLconf

What's hot (20)

Deep Recurrent Neural Networks for Sequence Learning in Spark by Yves Mabiala

(Paper Review)A versatile learning based 3D temporal tracker - scalable, robu...

Hussein Mehanna, Engineering Director, ML Core - Facebook at MLconf ATL 2016

20181128 satellogic @ barcelona ai

Image Classification Done Simply using Keras and TensorFlow

AI and Deep Learning for On-Board Satellite Image Analysis, OW2con'19, June 1...

Meetup 18/10/2018 - Artificiële intelligentie en mobiliteit

Energy Monitoring With Self-taught Deep Network

Distributed deep learning optimizations

Tensorflow

Computational decision making

AI powered emotion recognition: From Inception to Production - Global AI Conf...

Big data app meetup 2016-06-15

Reinforcement Learning for Self Driving Cars

Autoencoder Forest for Anomaly Detection from IoT Time Series

Google Developer Groups Talk - TensorFlow

Azure ML - November 2014

Practical deep learning for computer vision

Deep learning from scratch

Nikhil Garg, Engineering Manager, Quora at MLconf SF 2016

Viewers also liked

Building Realtime Javascript Apps with PubNubTomomi Imura

PubNub EON Realtime Dashboard FrameworkIan Jennings

Watson DevCon 2016: Why Audience Intelligence Requires a Modular AI ApproachIBM Watson

Watson DevCon 2016 - Watson Beat: Making Music CognitiveIBM Watson

Building with Watson - Interpreting Language Using the Natural Language Class...IBM Watson

Refinery Advisor IBM Watson

SenchaCon 2016: An Ext JS Dashboard for IoT Data - Dan Gallo Sencha

SenchaCon 2016: Using Ext JS to Turn Big Data into Intelligence - Olga Petrov...Sencha

SenchaCon 2016: Using Ext JS 6 for Cross-Platform Development on Mobile - And...Sencha

Building with Watson - Advanced Integrations with Watson ConversationIBM Watson

Ext JS Architecture Best Practices - Mitchell SimeonsSencha

Building Cognitive Applications with Watson APIs Dev_Events

SenchaCon 2016: LinkRest - Modern RESTful API Framework for Ext JS Apps - Rou...Sencha

Watson DevCon 2016 - The Future of Discovery: Deep Insights with Cognitive APIsIBM Watson

Building with Watson - Serverless Chatbots with PubNub and ConversationIBM Watson

Natural Language Classifier - Handbook (IBM)Davi Couto

Application Developer Predictions 2017 - It's All About CognitiveIBM Watson

Building with Watson - Training and Preparing Your Conversational SystemIBM Watson

SenchaCon 2016: Keynote Presentation - Art Landro, Gautam Agrawal, Mark BrocatoSencha

Viewers also liked (19)

Building Realtime Javascript Apps with PubNub

PubNub EON Realtime Dashboard Framework

Watson DevCon 2016: Why Audience Intelligence Requires a Modular AI Approach

Watson DevCon 2016 - Watson Beat: Making Music Cognitive

Building with Watson - Interpreting Language Using the Natural Language Class...

Refinery Advisor

SenchaCon 2016: An Ext JS Dashboard for IoT Data - Dan Gallo

SenchaCon 2016: Using Ext JS to Turn Big Data into Intelligence - Olga Petrov...

SenchaCon 2016: Using Ext JS 6 for Cross-Platform Development on Mobile - And...

Building with Watson - Advanced Integrations with Watson Conversation

Ext JS Architecture Best Practices - Mitchell Simeons

Building Cognitive Applications with Watson APIs

SenchaCon 2016: LinkRest - Modern RESTful API Framework for Ext JS Apps - Rou...

Watson DevCon 2016 - The Future of Discovery: Deep Insights with Cognitive APIs

Building with Watson - Serverless Chatbots with PubNub and Conversation

Natural Language Classifier - Handbook (IBM)

Application Developer Predictions 2017 - It's All About Cognitive

Building with Watson - Training and Preparing Your Conversational System

SenchaCon 2016: Keynote Presentation - Art Landro, Gautam Agrawal, Mark Brocato

Similar to Scalable Learning in Computer Vision

Learn to Build an App to Find Similar Images using Deep Learning- Piotr TeterwakPyData

Making Machine Learning Scale: Single Machine and DistributedTuri, Inc.

Computer Vision for BeginnersSanghamitra Deb

"Energy-efficient Hardware for Embedded Vision and Deep Convolutional Neural ...Edge AI and Vision Alliance

Big data 2.0, deep learning and financial UsecasesArvind Rapaka

B4UConference_machine learning_deeplearningHoa Le

Netflix machine learningAmer Ather

Artificial intelligence - A Teaser to the Topic.Dr. Kim (Kyllesbech Larsen)

(Im2col)accelerating deep neural networks on low power heterogeneous architec...Bomm Kim

DLD meetup 2017, Efficient Deep LearningBrodmann17

Deep-learning-for-computer-vision-applications-using-matlab.pdfAubainYro1

A Platform for Accelerating Machine Learning ApplicationsNVIDIA Taiwan

Deep Learning Made Easy with Deep FeaturesTuri, Inc.

AI powered emotion recognition: From Inception to Production - Global AI Conf...Apache MXNet

Week3-Deep Neural Network (DNN).pptxfahmi324663

AI and Deep Learning Subrat Panda, PhD

The Frontier of Deep Learning in 2020 and BeyondNUS-ISS

Open power ddl and lmsGanesan Narayanasamy

Distributed DNN training: Infrastructure, challenges, and lessons learnedWee Hyong Tok

(CMP305) Deep Learning on AWS Made EasyCmp305Amazon Web Services

Similar to Scalable Learning in Computer Vision (20)

Learn to Build an App to Find Similar Images using Deep Learning- Piotr Teterwak

Making Machine Learning Scale: Single Machine and Distributed

Computer Vision for Beginners

"Energy-efficient Hardware for Embedded Vision and Deep Convolutional Neural ...

Big data 2.0, deep learning and financial Usecases

B4UConference_machine learning_deeplearning

Netflix machine learning

Artificial intelligence - A Teaser to the Topic.

(Im2col)accelerating deep neural networks on low power heterogeneous architec...

DLD meetup 2017, Efficient Deep Learning

Deep-learning-for-computer-vision-applications-using-matlab.pdf

A Platform for Accelerating Machine Learning Applications

Deep Learning Made Easy with Deep Features

AI powered emotion recognition: From Inception to Production - Global AI Conf...

Week3-Deep Neural Network (DNN).pptx

AI and Deep Learning

The Frontier of Deep Learning in 2020 and Beyond

Open power ddl and lms

Distributed DNN training: Infrastructure, challenges, and lessons learned

(CMP305) Deep Learning on AWS Made EasyCmp305

More from butest

EL MODELO DE NEGOCIO DE YOUTUBEbutest

1. MPEG I.B.P frame之不同butest

LESSONS FROM THE MICHAEL JACKSON TRIALbutest

Timeline: The Life of Michael Jacksonbutest

Popular Reading Last Updated April 1, 2010 Adams, Lorraine The ...butest

LESSONS FROM THE MICHAEL JACKSON TRIALbutest

Com 380, Summer IIbutest

PPTbutest

The MYnstrel Free Press Volume 2: Economic Struggles, Meet Jazzbutest

MICHAEL JACKSON.docbutest

Social Networks: Twitter Facebook SL - Slide 1butest

Facebook butest

Executive Summary Hare Chevrolet is a General Motors dealership ...butest

Welcome to the Dougherty County Public Library's Facebook and ...butest

NEWS ANNOUNCEMENTbutest

C-2100 Ultra Zoom.docbutest

MAC Printing on ITS Printers.doc.docbutest

Mac OS X Guide.docbutest

hierbutest

WEB DESIGN!butest

More from butest (20)

EL MODELO DE NEGOCIO DE YOUTUBE

1. MPEG I.B.P frame之不同

LESSONS FROM THE MICHAEL JACKSON TRIAL

Timeline: The Life of Michael Jackson

Popular Reading Last Updated April 1, 2010 Adams, Lorraine The ...

LESSONS FROM THE MICHAEL JACKSON TRIAL

Com 380, Summer II

PPT

The MYnstrel Free Press Volume 2: Economic Struggles, Meet Jazz

MICHAEL JACKSON.doc

Social Networks: Twitter Facebook SL - Slide 1

Facebook

Executive Summary Hare Chevrolet is a General Motors dealership ...

Welcome to the Dougherty County Public Library's Facebook and ...

NEWS ANNOUNCEMENT

C-2100 Ultra Zoom.doc

MAC Printing on ITS Printers.doc.doc

Mac OS X Guide.doc

hier

WEB DESIGN!

Scalable Learning in Computer Vision

1. Scalable Learningin Computer Vision Adam Coates Honglak Lee RajatRaina Andrew Y. Ng Stanford University

2. Computer Vision is Hard

3. Introduction One reason for difficulty: small datasets.

4. Introduction But the world is complex. Hard to get extremely high accuracy on real images if we haven’t seen enough examples. AUC Training Set Size

6. Simple features

7. Favor speed over invariance and expressive power.

8. Simple model

9. Generic; little human knowledge.

10.

11. The Learning Pipeline Need to scale up each part of the learning process to really large datasets. Image Data Learning Algorithm Low-level features

12. Synthetic Data Not enough labeled data for algorithms to learn all the knowledge they need. Lighting variation Object pose variation Intra-class variation Synthesize positive examples to include this knowledge. Much easier than building this knowledge into the algorithms.

13. Synthetic Data Collect images of object on a green-screen turntable. Green Screen image Synthetic Background Segmented Object Photometric/Geometric Distortion

14. Synthetic Data: Example Claw hammers: Synthetic Examples (Training set) Real Examples (Test set)

15. The Learning Pipeline Feature computations can be prohibitive for large numbers of images. E.g., 100 million examples x 1000 features.  100 billion feature values to compute. Image Data Learning Algorithm Low-level features

16. Features on CPUs vs. GPUs Difficult to keep scaling features on CPUs. CPUs are designed for general-purpose computing. GPUs outpacing CPUs dramatically. (nVidia CUDA Programming Guide)

17. Features on GPUs Features: Cross-correlation with image patches. High data locality; high arithmetic intensity. Implemented brute-force. Faster than FFT for small filter sizes. Orders of magnitude faster than FFT on CPU. 20x to 100x speedups (depending on filter size).

18. The Learning Pipeline Large number of feature vectors on disk are too slow to access repeatedly. E.g., Can run an online algorithm on one machine, but disk access is a difficult bottleneck. Image Data Learning Algorithm Low-level features

19. Distributed Training Solution: must store everything in RAM. No problem! RAM as low as $20/GB Our cluster with 120GB RAM: Capacity of >100 million examples. For 1000 features, 1 byte each.

20. Distributed Training Algorithms that can be trained from sufficient statistics are easy to distribute. Decision tree splits can be trained using histograms of each feature. Histograms can be computed for small chunks of data on separate machines, then combined. Split = + x x x Slave 2 Slave 1 Master Master

21. The Learning Pipeline We’ve scaled up each piece of the pipeline by a large factor over traditional approaches: > 1000x 20x – 100x > 10x Image Data Learning Algorithm Low-level features

22. Size Matters AUC Training Set Size

23. UNSUPERVISED FEATURE LEARNING

24. Traditional supervised learning Testing: What is this? Cars Motorcycles

25. Self-taught learning Testing: What is this? Motorcycle Car Natural scenes

26. Learning representations Image Data Learning Algorithm Low-level features Where do we get good low-level representations?

27. Computer vision features SIFT Spin image HoG RIFT GLOH Textons

28. Unsupervised feature learning Higher layer (Combinations of edges; cf.V2) “Sparse coding” (edges; cf. V1) Input image (pixels) DBN (Hinton et al., 2006) with additional sparseness constraint. [Related work: Hinton, Bengio, LeCun, and others.] Note: No explicit “pooling.”

29.

30. Learning Large RBMs on GPUs 2 weeks Dual-core CPU 1 week 35 hours 72x faster Learning time for 10 million examples (log scale) 1 day 8 hours 2 hours 5 hours GPU 1 hour ½ hour 1 18 36 45 Millions of parameters (Rajat Raina, Anand Madhavan, Andrew Y. Ng)

31. Can now train very complex networks. Can learn increasingly complex features. Both more specific and more general-purpose than hand-engineered features. Object models Object parts (combination of edges) Edges Pixels Learning features

32. Conclusion Performance gains from large training sets are significant, even for very simple learning algorithms. Scalability of the system allows these algorithms to improve “for free” over time. Unsupervised algorithms promise high-quality features and representations without the need for hand-collected data. GPUs are a major enabling technology.

33. THANK YOU

Scalable Learning in Computer Vision

Recommended

Recommended

More Related Content

What's hot

What's hot (20)

Viewers also liked

Viewers also liked (19)

Similar to Scalable Learning in Computer Vision

Similar to Scalable Learning in Computer Vision (20)

More from butest

More from butest (20)

Scalable Learning in Computer Vision