2. What we’ve been doing (1)
• Hacking on the Hadoop API.
• Writing various kinds of programs to
understand it (not CV programs).
• AdaBoost
• SIFT, SURF
• Reading, reading
4. Segmentation with overlap
• Get SIFT/SURF descriptors for the partial segments.
• Reduce the number of descriptors by grouping (clustering) them into visual words.
• Select regions of interest (positive & negative).
• Count the frequency of occurrence of visual words.
• Run AdaBoost on the resulting counts.
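The "count the frequency of visual words" step above is a bag-of-visual-words histogram: each descriptor is assigned to its nearest visual word, and the per-word counts describe the region. A minimal sketch with NumPy, using synthetic descriptors in place of real SIFT/SURF output and an assumed, pre-built vocabulary (in practice the vocabulary comes from clustering descriptors, e.g. with k-means):

```python
import numpy as np

def assign_visual_words(descriptors, vocabulary):
    # Assign each descriptor to its nearest visual word (Euclidean distance).
    dists = np.linalg.norm(descriptors[:, None, :] - vocabulary[None, :, :], axis=2)
    return dists.argmin(axis=1)

def bow_histogram(descriptors, vocabulary):
    # Count the frequency of occurrence of each visual word in one region.
    words = assign_visual_words(descriptors, vocabulary)
    return np.bincount(words, minlength=len(vocabulary))

rng = np.random.default_rng(0)
vocabulary = rng.normal(size=(8, 128))    # 8 visual words, 128-D like SIFT
descriptors = rng.normal(size=(50, 128))  # stand-in for one region's descriptors
hist = bow_histogram(descriptors, vocabulary)
```

These histograms (one per positive/negative region) are the kind of fixed-length feature vectors a classifier such as AdaBoost can then be trained on.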
5. Methodology
• For simplicity, assume the same image is
stored on all slave nodes.
• Run the algorithm on the ROIs.
• Hopefully this will make the “Reduce”
step easier.
6. Map-Reduce???
• It’s just a framework.
• You could also implement it yourself by
reading the paper [1]. :)
• Hadoop is one implementation (Apache +
Yahoo).
• Google’s implementation is not public.
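To make "it's just a framework" concrete, here is a minimal single-process sketch of the map-shuffle-reduce pattern (word count as the usual toy example); real frameworks like Hadoop add distribution, fault tolerance, and disk-backed shuffling on top of this same shape:

```python
from collections import defaultdict

def map_reduce(inputs, mapper, reducer):
    # Map phase: each input record yields (key, value) pairs.
    groups = defaultdict(list)
    for record in inputs:
        for key, value in mapper(record):
            groups[key].append(value)   # the "shuffle": group values by key
    # Reduce phase: combine each key's values into one result.
    return {key: reducer(key, values) for key, values in groups.items()}

def mapper(line):
    for word in line.split():
        yield word, 1

def reducer(word, counts):
    return sum(counts)

result = map_reduce(["hadoop map reduce", "map reduce map"], mapper, reducer)
# → {'hadoop': 1, 'map': 3, 'reduce': 2}
```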
8. Introduction
• Algorithms fitting the Statistical Query Model
may be written in a certain “summation
form”.
• Divide the data set into as many pieces as
the number of cores.
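A small sketch of the summation-form idea: split the data into one piece per core, let each piece compute a local sum independently, then combine the partial sums into the global statistic (here, the mean; the thread pool and core count are illustrative stand-ins):

```python
import numpy as np
from multiprocessing.dummy import Pool  # thread pool standing in for cores

def partial_sums(chunk):
    # Each "core" computes a local summation over its own slice of the data.
    return chunk.sum(axis=0), len(chunk)

data = np.arange(12.0).reshape(6, 2)    # toy data set
n_cores = 2                             # hypothetical core count
chunks = np.array_split(data, n_cores)  # one piece per core
with Pool(n_cores) as pool:
    parts = pool.map(partial_sums, chunks)

# Combining the partial sums recovers the global statistic.
total = sum(s for s, _ in parts)
count = sum(n for _, n in parts)
mean = total / count
```

Because the per-piece work is an independent summation, the combine step is a cheap reduction, which is what makes this form parallelize so naturally.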
12. Example (LWLR)
• Divide the computation among different mappers: each computes partial sums of A and b.
• 2 reducers sum up the partial values for A and b and finally compute the solution.
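A sketch of the LWLR map/reduce split described above, assuming the summation form from the referenced paper (A = Σᵢ wᵢ xᵢxᵢᵀ, b = Σᵢ wᵢ xᵢyᵢ, then solve Aθ = b); the data and the two-way split here are synthetic:

```python
import numpy as np

def lwlr_map(X, y, w):
    # One mapper's partial summation over its slice:
    #   A_part = sum_i w_i * x_i x_i^T,  b_part = sum_i w_i * x_i y_i
    A = X.T @ (w[:, None] * X)
    b = X.T @ (w * y)
    return A, b

def lwlr_reduce(parts):
    # Reducers sum the partial A's and b's, then solve A theta = b.
    A = sum(p[0] for p in parts)
    b = sum(p[1] for p in parts)
    return np.linalg.solve(A, b)

rng = np.random.default_rng(1)
X = rng.normal(size=(100, 3))
theta_true = np.array([0.5, -2.0, 1.0])
y = X @ theta_true
w = np.ones(len(X))                            # uniform weights for the check
chunks = np.array_split(np.arange(len(X)), 2)  # two mappers
parts = [lwlr_map(X[idx], y[idx], w[idx]) for idx in chunks]
theta = lwlr_reduce(parts)                     # recovers theta_true
```

Because A and b are plain sums over data points, splitting the data across mappers and summing the partial A's and b's in the reducers gives exactly the same solution as the serial computation.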
13. Experiment Result
• Used data sets from the UCI Machine Learning Repository.
• Used only 2 cores: 1.9x faster.
• 54x speed-up on 64 cores.
• Speed-up is achieved simply by “throwing
cores” at the problem.