SlideShare una empresa de Scribd logo
1 de 26
Distributed Computing … and introduction of Hadoop -  Ankit Minocha
Outline ,[object Object],[object Object]
Computer Speedup Moore’s Law: “ The density of transistors on a chip doubles every 18 months, for the same cost”  (1965)
Scope of problems ,[object Object],[object Object],[object Object]
Rendering multiple frames of high-quality animation
Parallelization Idea ,[object Object]
Parallelization Idea (2) In a parallel computation, we would like to have as many threads as we have processors. e.g., a four-processor computer would be able to run four threads at the same time.
Parallelization Idea (3)
Parallelization Idea (4)
Parallelization Pitfalls ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],What is the common theme of all of these problems?
Parallelization Pitfalls (2) ,[object Object],[object Object]
What is Wrong With This? ,[object Object],[object Object],[object Object],[object Object],[object Object],Thread 2: void bar() { y++; x+=3; } If the initial state is y = 0, x = 6, what happens after these threads finish running?
Multithreaded = Unpredictability ,[object Object],Thread 1: void foo() { eax = mem[x]; inc eax; mem[x] = eax; ebx = mem[x]; mem[y] = ebx; } Thread 2: void bar() { eax = mem[y]; inc eax; mem[y] = eax; eax = mem[x]; add eax, 3; mem[x] = eax; } ,[object Object]
Multithreaded = Unpredictability ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]
Synchronization Primitives ,[object Object],[object Object]
Semaphores ,[object Object],[object Object],Only one side of the semaphore can ever be red! (Can both be green?)
Semaphores ,[object Object],[object Object],[object Object]
The “corrected” example Thread 1: void foo() { sem.lock(); x++; y = x; sem.unlock(); } Thread 2: void bar() { sem.lock(); y++; x+=3; sem.unlock(); } Global var “Semaphore sem = new Semaphore();” guards access to x & y
Condition Variables ,[object Object],[object Object],[object Object]
The final example Thread 1: void foo() { sem.lock(); x++; y = x; fooDone = true; sem.unlock(); fooFinishedCV.notify(); } Thread 2: void bar() { sem.lock(); if(!fooDone) fooFinishedCV.wait(sem); y++; x+=3; sem.unlock(); } Global vars: Semaphore sem = new Semaphore(); ConditionVar fooFinishedCV = new ConditionVar(); boolean fooDone = false;
Too Much Synchronization? Deadlock Synchronization becomes even more complicated when multiple locks can be used Can cause entire system to “get stuck” Thread A: semaphore1.lock(); semaphore2.lock(); /* use data guarded by  semaphores */ semaphore1.unlock();  semaphore2.unlock(); Thread B: semaphore2.lock(); semaphore1.lock(); /* use data guarded by  semaphores */ semaphore1.unlock();  semaphore2.unlock(); (Image: RPI CSCI.4210 Operating Systems notes)
The Moral: Be Careful! ,[object Object],[object Object],[object Object],[object Object],[object Object]
Hadoop to the rescue!
Prelude to MapReduce ,[object Object],[object Object],[object Object]
Prelude to MapReduce ,[object Object],[object Object],[object Object]
Thank you so much “Clickable” for your sincere efforts.

Más contenido relacionado

Similar a Distributed computing presentation

Google: Cluster computing and MapReduce: Introduction to Distributed System D...
Google: Cluster computing and MapReduce: Introduction to Distributed System D...Google: Cluster computing and MapReduce: Introduction to Distributed System D...
Google: Cluster computing and MapReduce: Introduction to Distributed System D...
tugrulh
 
Introduction to Cluster Computing and Map Reduce (from Google)
Introduction to Cluster Computing and Map Reduce  (from Google)Introduction to Cluster Computing and Map Reduce  (from Google)
Introduction to Cluster Computing and Map Reduce (from Google)
Sri Prasanna
 
Low Level Exploits
Low Level ExploitsLow Level Exploits
Low Level Exploits
hughpearse
 
what every web and app developer should know about multithreading
what every web and app developer should know about multithreadingwhat every web and app developer should know about multithreading
what every web and app developer should know about multithreading
Ilya Haykinson
 
CS844 U1 Individual Project
CS844 U1 Individual ProjectCS844 U1 Individual Project
CS844 U1 Individual Project
ThienSi Le
 
Medical Image Processing Strategies for multi-core CPUs
Medical Image Processing Strategies for multi-core CPUsMedical Image Processing Strategies for multi-core CPUs
Medical Image Processing Strategies for multi-core CPUs
Daniel Blezek
 

Similar a Distributed computing presentation (20)

Google: Cluster computing and MapReduce: Introduction to Distributed System D...
Google: Cluster computing and MapReduce: Introduction to Distributed System D...Google: Cluster computing and MapReduce: Introduction to Distributed System D...
Google: Cluster computing and MapReduce: Introduction to Distributed System D...
 
Introduction to Cluster Computing and Map Reduce (from Google)
Introduction to Cluster Computing and Map Reduce  (from Google)Introduction to Cluster Computing and Map Reduce  (from Google)
Introduction to Cluster Computing and Map Reduce (from Google)
 
Lec1 Intro
Lec1 IntroLec1 Intro
Lec1 Intro
 
Lec1 Intro
Lec1 IntroLec1 Intro
Lec1 Intro
 
Multithreading Introduction and Lifecyle of thread
Multithreading Introduction and Lifecyle of threadMultithreading Introduction and Lifecyle of thread
Multithreading Introduction and Lifecyle of thread
 
Multi Threading
Multi ThreadingMulti Threading
Multi Threading
 
Low Level Exploits
Low Level ExploitsLow Level Exploits
Low Level Exploits
 
cs2110Concurrency1.ppt
cs2110Concurrency1.pptcs2110Concurrency1.ppt
cs2110Concurrency1.ppt
 
what every web and app developer should know about multithreading
what every web and app developer should know about multithreadingwhat every web and app developer should know about multithreading
what every web and app developer should know about multithreading
 
chap 7 : Threads (scjp/ocjp)
chap 7 : Threads (scjp/ocjp)chap 7 : Threads (scjp/ocjp)
chap 7 : Threads (scjp/ocjp)
 
[CCC-28c3] Post Memory Corruption Memory Analysis
[CCC-28c3] Post Memory Corruption Memory Analysis[CCC-28c3] Post Memory Corruption Memory Analysis
[CCC-28c3] Post Memory Corruption Memory Analysis
 
Java concurrency
Java concurrencyJava concurrency
Java concurrency
 
Multithreading 101
Multithreading 101Multithreading 101
Multithreading 101
 
Programming Language Memory Models: What do Shared Variables Mean?
Programming Language Memory Models: What do Shared Variables Mean?Programming Language Memory Models: What do Shared Variables Mean?
Programming Language Memory Models: What do Shared Variables Mean?
 
CS844 U1 Individual Project
CS844 U1 Individual ProjectCS844 U1 Individual Project
CS844 U1 Individual Project
 
Medical Image Processing Strategies for multi-core CPUs
Medical Image Processing Strategies for multi-core CPUsMedical Image Processing Strategies for multi-core CPUs
Medical Image Processing Strategies for multi-core CPUs
 
Multi-threaded Programming in JAVA
Multi-threaded Programming in JAVAMulti-threaded Programming in JAVA
Multi-threaded Programming in JAVA
 
04 threads-pbl-2-slots
04 threads-pbl-2-slots04 threads-pbl-2-slots
04 threads-pbl-2-slots
 
04 threads-pbl-2-slots
04 threads-pbl-2-slots04 threads-pbl-2-slots
04 threads-pbl-2-slots
 
Multithreading Presentation
Multithreading PresentationMultithreading Presentation
Multithreading Presentation
 

Último

Último (20)

Developing An App To Navigate The Roads of Brazil
Developing An App To Navigate The Roads of BrazilDeveloping An App To Navigate The Roads of Brazil
Developing An App To Navigate The Roads of Brazil
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
 
HTML Injection Attacks: Impact and Mitigation Strategies
HTML Injection Attacks: Impact and Mitigation StrategiesHTML Injection Attacks: Impact and Mitigation Strategies
HTML Injection Attacks: Impact and Mitigation Strategies
 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
 
Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
Tech Trends Report 2024 Future Today Institute.pdf
Tech Trends Report 2024 Future Today Institute.pdfTech Trends Report 2024 Future Today Institute.pdf
Tech Trends Report 2024 Future Today Institute.pdf
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
 

Distributed computing presentation

  • 1. Distributed Computing … and introduction of Hadoop - Ankit Minocha
  • 2.
  • 3. Computer Speedup Moore’s Law: “ The density of transistors on a chip doubles every 18 months, for the same cost” (1965)
  • 4.
  • 5. Rendering multiple frames of high-quality animation
  • 6.
  • 7. Parallelization Idea (2) In a parallel computation, we would like to have as many threads as we have processors. e.g., a four-processor computer would be able to run four threads at the same time.
  • 10.
  • 11.
  • 12.
  • 13.
  • 14.
  • 15.
  • 16.
  • 17.
  • 18. The “corrected” example Thread 1: void foo() { sem.lock(); x++; y = x; sem.unlock(); } Thread 2: void bar() { sem.lock(); y++; x+=3; sem.unlock(); } Global var “Semaphore sem = new Semaphore();” guards access to x & y
  • 19.
  • 20. The final example Thread 1: void foo() { sem.lock(); x++; y = x; fooDone = true; sem.unlock(); fooFinishedCV.notify(); } Thread 2: void bar() { sem.lock(); if(!fooDone) fooFinishedCV.wait(sem); y++; x+=3; sem.unlock(); } Global vars: Semaphore sem = new Semaphore(); ConditionVar fooFinishedCV = new ConditionVar(); boolean fooDone = false;
  • 21. Too Much Synchronization? Deadlock Synchronization becomes even more complicated when multiple locks can be used Can cause entire system to “get stuck” Thread A: semaphore1.lock(); semaphore2.lock(); /* use data guarded by semaphores */ semaphore1.unlock(); semaphore2.unlock(); Thread B: semaphore2.lock(); semaphore1.lock(); /* use data guarded by semaphores */ semaphore1.unlock(); semaphore2.unlock(); (Image: RPI CSCI.4210 Operating Systems notes)
  • 22.
  • 23. Hadoop to the rescue!
  • 24.
  • 25.
  • 26. Thank you so much “Clickable” for your sincere efforts.

Notas del editor

  1. There are multiple possible final states. Y is definitely a problem, because we don’t know if it will be “1” or “7”… but X can also be 7, 10, or 11!
  2. Inform students that the term we want here is “race condition”
  3. Ask: are there still any problems? (Yes – we still have two possible outcomes. We want some other system that allows us to serialize access on an event.)
  4. Go over wait() / notify() / broadcast() --- must be combined with a semaphore!