Project Progress

sunnysomchok

What we’ve been doing (1)
• Hacking on the Hadoop API.
• Writing different kinds of programs to understand it (not computer-vision programs).
• AdaBoost
• SIFT, SURF
• Reading, reading

• Segmentation with overlap
• Get SIFT/SURF descriptors for the partial segments
• Reduce the number of descriptors by grouping them
• Regions of interest (positive and negative)
• Count the frequency of occurrence of visual words
• AdaBoost
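
The "count the frequency of occurrence of visual words" step can be sketched as below. This is only an illustration under assumptions: the vocabulary is taken as already computed by some grouping step, the 2-D tuples stand in for real SIFT/SURF descriptors, and the names `nearest_word` and `visual_word_histogram` are hypothetical, not part of the project code.

```python
from collections import Counter


def nearest_word(descriptor, vocabulary):
    """Return the index of the visual word closest to the descriptor."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(range(len(vocabulary)),
               key=lambda k: sq_dist(descriptor, vocabulary[k]))


def visual_word_histogram(descriptors, vocabulary):
    """Count how often each visual word occurs among the descriptors."""
    counts = Counter(nearest_word(d, vocabulary) for d in descriptors)
    return [counts[k] for k in range(len(vocabulary))]


# Toy data (assumed): 2-D points stand in for high-dimensional descriptors.
vocabulary = [(0.0, 0.0), (10.0, 10.0), (0.0, 10.0)]
descriptors = [(0.5, 0.2), (9.0, 9.5), (0.1, 0.3), (1.0, 9.0)]

histogram = visual_word_histogram(descriptors, vocabulary)
print(histogram)  # one count per visual word
```

A histogram like this, computed per region of interest, is the kind of fixed-length feature vector that could then be passed to AdaBoost.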

Methodology

• For simplicity, assume the same image is stored on all slave nodes.
• Use the ROI to run the algorithm.
• Hopefully this will make the “Reduce” step easier.

Map-Reduce???
• It’s just a framework.
• You can also implement it yourself by reading the paper [1]. :)
• Hadoop is one implementation (Apache + Yahoo).
• Google’s implementation has not been made public.
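
To make the "it's just a framework" point concrete, here is a minimal single-process sketch of the map, group-by-key, and reduce phases. It is not Hadoop's API; the function names (`map_reduce`, `word_mapper`, `sum_reducer`) and the word-count task are assumptions chosen for illustration.

```python
from collections import defaultdict


def map_reduce(records, mapper, reducer):
    """Apply mapper to every record, group values by key, reduce each group."""
    groups = defaultdict(list)
    for record in records:
        for key, value in mapper(record):
            groups[key].append(value)  # the "shuffle": collect values per key
    return {key: reducer(key, values) for key, values in groups.items()}


def word_mapper(line):
    """Emit (word, 1) for every word in the line."""
    for word in line.split():
        yield word.lower(), 1


def sum_reducer(key, values):
    """Add up all counts emitted for one word."""
    return sum(values)


lines = ["Hadoop is one implementation", "Hadoop is a framework"]
counts = map_reduce(lines, word_mapper, sum_reducer)
print(counts)
```

A real implementation distributes the mapper and reducer calls across machines and handles failures; the contract between the two user-supplied functions stays the same.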

Map-Reduce for Machine Learning on Multi-core

Introduction

• Algorithms that fit the Statistical Query Model may be written in a certain “summation form”.
• Divide the data set into as many pieces as the number of cores.
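
As a small illustration of the summation form, the sketch below computes a mean and variance from per-chunk partial sums. The choice of statistic, the helper names, and `num_cores = 4` are assumptions; the chunks are processed sequentially here, whereas each would go to its own core in practice.

```python
def split(data, pieces):
    """Split data into `pieces` nearly equal contiguous chunks."""
    size, extra = divmod(len(data), pieces)
    chunks, start = [], 0
    for i in range(pieces):
        end = start + size + (1 if i < extra else 0)
        chunks.append(data[start:end])
        start = end
    return chunks


def partial_stats(chunk):
    """Map step: count, sum, and sum of squares for one chunk."""
    return (len(chunk), sum(chunk), sum(x * x for x in chunk))


def combine(partials):
    """Reduce step: add the partial sums, then finish the computation."""
    n = sum(p[0] for p in partials)
    s = sum(p[1] for p in partials)
    ss = sum(p[2] for p in partials)
    mean = s / n
    variance = ss / n - mean ** 2
    return mean, variance


data = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
num_cores = 4  # assumed number of cores, one chunk per core
chunks = split(data, num_cores)
mean, variance = combine([partial_stats(c) for c in chunks])
print(mean, variance)
```

Because the result depends on the data only through sums, the chunks can be processed in any order and on any core without changing the answer.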

Algorithms (1)
• Locally Weighted Linear Regression
• Naive Bayes
• Gaussian Discriminant Analysis
• k-means
• Logistic Regression
• Neural Network

Algorithms (2)

• Principal Components Analysis
• Independent Components Analysis
• Expectation Maximization
• Support Vector Machine

Example (LWLR)

Divide the computation among different mappers, each of which computes partial values of A and b over its own subset of the data.

Two reducers sum up the partial values for A and b, and the solution is finally computed from the totals.
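
A minimal sketch of this scheme follows. It assumes the usual LWLR normal equations, A = sum of w_i x_i x_i^T and b = sum of w_i x_i y_i, solved as A theta = b, with a Gaussian weight around the query point. The bandwidth `tau`, the single-input feature vector [1, x], and all function names are illustrative assumptions, and the mappers run sequentially here.

```python
import math


def weight(x, query, tau):
    """Gaussian kernel weight of a training input relative to the query."""
    return math.exp(-((x - query) ** 2) / (2.0 * tau ** 2))


def mapper(chunk, query, tau):
    """Partial sums of A and b over one chunk; features are [1, x]."""
    A = [[0.0, 0.0], [0.0, 0.0]]
    b = [0.0, 0.0]
    for x, y in chunk:
        w = weight(x, query, tau)
        phi = (1.0, x)
        for r in range(2):
            b[r] += w * phi[r] * y
            for c in range(2):
                A[r][c] += w * phi[r] * phi[c]
    return A, b


def reduce_A(partials):
    """First reducer: element-wise sum of the partial A matrices."""
    return [[sum(p[r][c] for p in partials) for c in range(2)] for r in range(2)]


def reduce_b(partials):
    """Second reducer: element-wise sum of the partial b vectors."""
    return [sum(p[r] for p in partials) for r in range(2)]


def solve_2x2(A, b):
    """Solve A theta = b for a 2x2 system with Cramer's rule."""
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    theta0 = (b[0] * A[1][1] - A[0][1] * b[1]) / det
    theta1 = (A[0][0] * b[1] - A[1][0] * b[0]) / det
    return theta0, theta1


# Toy data lying exactly on y = 2x + 1, split into two chunks (two "cores").
chunks = [[(0.0, 1.0), (1.0, 3.0), (2.0, 5.0)],
          [(3.0, 7.0), (4.0, 9.0), (5.0, 11.0)]]
query, tau = 2.5, 1.0

partials = [mapper(chunk, query, tau) for chunk in chunks]
A = reduce_A([p[0] for p in partials])
b = reduce_b([p[1] for p in partials])
theta0, theta1 = solve_2x2(A, b)
prediction = theta0 + theta1 * query
print(theta0, theta1, prediction)
```

Since the toy data is exactly linear, any positive weighting should recover the same line, which makes the example easy to check by hand.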

Experiment Results
• Used the UCI Machine Learning Repository.
• Used only 2 cores.
• 1.9 times faster.
• 54 times speedup on 64 cores.
• The speedup is achieved only by “throwing cores” at the problem.
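
One way to read these numbers is as parallel efficiency, that is, speedup divided by the number of cores. The short sketch below does that arithmetic; pairing the 1.9 times figure with the 2-core run is an assumption based on the order of the bullets.

```python
def parallel_efficiency(speedup, cores):
    """Fraction of the ideal linear speedup that was actually achieved."""
    return speedup / cores


# Reported speedups keyed by core count (the 2-core pairing is assumed).
results = {2: 1.9, 64: 54.0}
efficiency = {cores: parallel_efficiency(s, cores) for cores, s in results.items()}

for cores, value in efficiency.items():
    print(f"{cores} cores: {value:.1%} of linear speedup")
```

An efficiency close to 1 means the speedup is nearly linear in the number of cores.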
