Overview
• Introduction
  - Synchronization
  - Non-blocking Synchronization
• Is Non-blocking Synchronization performance-beneficial for Parallel Applications?
• NOBLE: A Non-blocking Synchronization Interface.
  How can we make non-blocking synchronization accessible to the parallel programmer?
• Lock-free Skip lists
• Conclusions, Future Work

Systems: SMP
• Cache-coherent distributed shared memory multiprocessor systems:
  - UMA
  - NUMA

Synchronization
• Barriers
• Locks, semaphores,… (mutual exclusion)
"A significant part of the work performed by today's parallel applications is spent on synchronization."
...

Lock-Based Synchronization: Sequential

Non-blocking Synchronization
• Lock-Free Synchronization
  - Optimistic approach
    • Assumes it is alone and prepares an operation that later takes effect (unless interfered with) in one atomic step, using hardware atomic primitives
    • Interference is detected via shared memory
    • Retries until no other operation interferes
    • Can cause starvation
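
The optimistic pattern in the bullets above (prepare, commit in one atomic step, retry on interference) can be sketched with the C11 atomics library. This is only an illustration under stated assumptions, not code from the talk: the helper name `lockfree_store_max` and the "keep the maximum" operation are hypothetical, chosen because they fit in a few lines.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: raise *target to value if value is larger.
 * Returns true if this call changed the stored value. */
bool lockfree_store_max(_Atomic long *target, long value)
{
    assert(target != NULL);
    /* Prepare: read the current state without taking any lock. */
    long seen = atomic_load(target);
    while (seen < value) {
        /* Commit in one atomic step. If another thread changed *target
         * in the meantime, the CAS fails, reloads the current value into
         * seen, and the loop retries with the fresh value. */
        if (atomic_compare_exchange_weak(target, &seen, value))
            return true;
    }
    /* Some thread already stored a value that is at least as large. */
    return false;
}
```

No thread ever waits for a lock holder here, but a thread that keeps losing the CAS can in principle retry indefinitely, which is the starvation risk named in the last bullet.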

Example: Shared Queue (slide provided by Jim Anderson)
The usual approach is to implement operations using retry loops. Here's an example:

type Qtype = record v: valtype; next: pointer to Qtype end
shared var Tail: pointer to Qtype;
local var old, new: pointer to Qtype

procedure Enqueue (input: valtype)
    new := (input, NIL);
    repeat
        old := Tail
    until CAS2(&Tail, &(old->next), old, NIL, new, new)
    { CAS2 is a two-word compare-and-swap: if Tail = old and old->next = NIL, both are set to new in one atomic step }

[Diagram: the old, Tail and new pointers before and after a successful CAS2]

Non-blocking Synchronization
• Lock-Free Synchronization
  - Avoids problems that locks have
  - Fast
  - Starvation? (not in the context of HPC)
• Wait-Free Synchronization
  - Always finishes in a finite number of its own steps.
    • Complex algorithms
    • Memory consuming
    • Less efficient on average than lock-free

Overview
• Introduction
  - Synchronization
  - Non-blocking Synchronization
• Is Non-blocking Synchronization performance-beneficial for Parallel Scientific Applications?
• NOBLE: A Non-blocking Synchronization Interface.
  How can we make non-blocking synchronization accessible to the parallel programmer?
• Conclusions, Future Work

Non-blocking Synchronisation
Synchronisation:
• An alternative approach for synchronisation introduced 25 years ago
• Many theoretical results
Evaluation:
• Micro-benchmarks show better performance than mutual exclusion in real or simulated multiprocessor systems.

Practice
• Non-blocking synchronization is still not used in practical applications
• Non-blocking solutions are often
  - complex
  - exposed through non-standard or unclear interfaces
  - impractical

Practice
Question:
"How is the performance of parallel scientific applications affected by the use of non-blocking synchronisation rather than lock-based synchronisation?"

Answers
How is the performance of parallel scientific applications affected by the use of non-blocking synchronisation rather than lock-based synchronisation?
• The identification of the basic locking operations that parallel programmers use in their applications.
• The efficient non-blocking implementation of these synchronisation operations.
• The architectural implications on the design of non-blocking synchronisation.
• Comparison of the lock-based and lock-free versions of the respective applications

Applications
• Ocean: simulates eddy currents in an ocean basin.
• Radiosity: computes the equilibrium distribution of light in a scene using the radiosity method.
• Volrend: renders 3D volume data into an image using a ray-casting method.
• Water: evaluates forces and potentials that occur over time between water molecules.
• Spark98: a collection of sparse matrix kernels. Each kernel performs a sequence of sparse matrix vector product operations using matrices that are derived from a family of three-dimensional finite element earthquake applications.

Removing Locks in Applications
• Many locks are "Simple Locks".
  -> CAS, FAA and LL/SC can be used to implement non-blocking versions.
• Many critical sections contain shared floating-point variables.
  -> Floating-point synchronization primitives are needed. A Double-Fetch-and-Add primitive was designed.
• Large critical sections.
  -> Efficient non-blocking implementations of big ADTs are used.

Experimental Results: Speedup
[Speedup plots; panel labels: 58P, 58P, 32P, 24P, 24P, 58P, 58P]

SPARK98
Before:
spark_setlock(lockid);
w[col][0] += A[Anext][0][0]*v[i][0] + A[Anext][1][0]*v[i][1] + A[Anext][2][0]*v[i][2];
w[col][1] += A[Anext][0][1]*v[i][0] + A[Anext][1][1]*v[i][1] + A[Anext][2][1]*v[i][2];
w[col][2] += A[Anext][0][2]*v[i][0] + A[Anext][1][2]*v[i][1] + A[Anext][2][2]*v[i][2];
spark_unsetlock(lockid);
After:
dfad(&w[col][0], A[Anext][0][0]*v[i][0] + A[Anext][1][0]*v[i][1] + A[Anext][2][0]*v[i][2]);
dfad(&w[col][1], A[Anext][0][1]*v[i][0] + A[Anext][1][1]*v[i][1] + A[Anext][2][1]*v[i][2]);
dfad(&w[col][2], A[Anext][0][2]*v[i][0] + A[Anext][1][2]*v[i][1] + A[Anext][2][2]*v[i][2]);
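
The slides do not show how the Double-Fetch-and-Add primitive was built. One common way to obtain such an operation from an ordinary 64-bit CAS is sketched below. Everything in it is an assumption made for illustration: the name `dfad_sketch`, the `atomic_double_bits` cell type, and the choice to store the double as its bit pattern. It is not the `dfad` used in the SPARK98 experiment.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>
#include <string.h>

/* Assumption: the shared double is kept as its 64-bit pattern so that a
 * plain 64-bit compare-and-swap can replace it atomically. */
typedef _Atomic uint64_t atomic_double_bits;

static_assert(sizeof(double) == sizeof(uint64_t), "double must be 64 bits");

static uint64_t to_bits(double d)
{
    uint64_t b;
    memcpy(&b, &d, sizeof b);
    return b;
}

static double from_bits(uint64_t b)
{
    double d;
    memcpy(&d, &b, sizeof d);
    return d;
}

void atomic_double_init(atomic_double_bits *cell, double value)
{
    atomic_store(cell, to_bits(value));
}

double atomic_double_load(atomic_double_bits *cell)
{
    return from_bits(atomic_load(cell));
}

/* Hypothetical stand-in for dfad: atomically performs *cell += delta
 * and returns the value that was stored before the addition. */
double dfad_sketch(atomic_double_bits *cell, double delta)
{
    uint64_t old_bits = atomic_load(cell);
    for (;;) {
        double old_val = from_bits(old_bits);
        uint64_t new_bits = to_bits(old_val + delta);
        /* On failure old_bits is refreshed and the sum is recomputed. */
        if (atomic_compare_exchange_weak(cell, &old_bits, new_bits))
            return old_val;
    }
}
```

With a primitive of this shape, each locked `+=` in the "Before" version becomes a single call, as in the "After" version, and the three updates no longer serialize behind one lock.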

Overview
• Introduction
  - Synchronization
  - Non-blocking Synchronization
• Is Non-blocking Synchronization beneficial for Parallel Scientific Applications?
• NOBLE: A Non-blocking Synchronization Interface.
  How can we make non-blocking synchronization accessible to the parallel programmer?
• Conclusions, Future Work

NOBLE: Brings Non-blocking closer to Practice
• Create a non-blocking inter-process communication interface with the properties:
  - Attractive functionality
  - Programmer friendly
  - Easy to adapt existing solutions
  - Efficient
  - Portable
  - Adaptable for different programming languages

NOBLE Design: Portable
• Exported definitions (identical for all platforms)
  Noble.h
    #define NBL...
    #define NBL...
    #define NBL...
• Platform independent
  QueueLF.c, StackLF.c, ...
    #include "Platform/Primitives.h"
    …
• Platform dependent
  SunHardware.asm, IntelHardware.asm, ...
    CAS, TAS, Spin-Locks
    …

Using NOBLE
• First create a global variable handling the shared data object, for example a stack:
  Globals
    #include <noble.h>
    ...
    NBLStack* stack;
• Create the stack with the appropriate implementation:
  Main
    stack=NBLStackCreateLF(10000);
    ...
• When some thread wants to do some operation:
  Threads
    NBLStackPush(stack, item);
  or
    item=NBLStackPop(stack);

Using NOBLE
• When the data structure is not in use anymore:
  Globals
    #include <noble.h>
    ...
    NBLStack* stack;
  Main
    stack=NBLStackCreateLF(10000);
    ...
    NBLStackFree(stack);

Using NOBLE
• To change the synchronization mechanism, only one line of code has to be changed!
  Globals
    #include <noble.h>
    ...
    NBLStack* stack;
  Main
    stack=NBLStackCreateLB();
    ...
    NBLStackFree(stack);
  Threads
    NBLStackPush(stack, item);
  or
    item=NBLStackPop(stack);

Design: Attractive functionality
• Data structures for multi-threaded usage
  - FIFO Queues
  - Priority Queues
  - Dictionaries
  - Stacks
  - Singly linked lists
  - Snapshots
  - MWCAS
  - ...
• Clear specifications

Status
• Multiprocessor support
  - Sun Solaris (Sparc)
  - Win32 (Intel x86)
  - SGI (Mips)
  - Linux (Intel x86)
• Available for academic use: http://www.noble-library.org/

Did our Work have any Impact?
1) Industry has initiated contact and uses a test version of NOBLE.
2) Freeware developers have shown interest.
3) Interest from research organisations.
NOBLE is freely available for research and educational purposes.

A Lock-Free Skip list
• Presented as part of: H. Sundell, Ph. Tsigas, "Fast and Lock-Free Concurrent Priority Queues for Multi-Thread Systems". 17th IEEE/ACM International Parallel and Distributed Processing Symposium (IPDPS ´03), May 2003 (TR 2002). Best Paper Award
• A very similar lock-free skip list algorithm will be presented this August at the ACM Symposium on Principles of Distributed Computing (PODC 2004): "Lock-Free Linked Lists and Skip Lists", Mikhail Fomitchev, Eric Ruppert

Randomized Algorithm: Skip Lists
• William Pugh: "Skip Lists: A Probabilistic Alternative to Balanced Trees", 1990
  - Layers of ordered lists with different densities, achieves a tree-like behavior
  - Time complexity: O(log2 N) (probabilistic!)
[Diagram: a skip list from Head to Tail holding keys 1 to 7; the upper layers are labelled 50%, 25%, …]
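
The "different densities" of the layers come from choosing each node's height at random. A minimal sketch of that choice follows. It assumes a promotion probability of 1/2 per level (which would match the 50% and 25% labels in the diagram), a hypothetical cap `SKIP_MAX_LEVEL`, and a small xorshift generator standing in for whatever random source a real implementation uses.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { SKIP_MAX_LEVEL = 16 }; /* assumed cap on node height */

/* Small xorshift32 generator; the state must be non-zero. */
static uint32_t xorshift32(uint32_t *state)
{
    uint32_t x = *state;
    x ^= x << 13;
    x ^= x >> 17;
    x ^= x << 5;
    *state = x;
    return x;
}

/* Returns a height in [1, SKIP_MAX_LEVEL]. Every node is on level 1;
 * each further level is kept with probability 1/2, so roughly half of
 * the nodes reach level 2, a quarter reach level 3, and so on. */
int skip_random_level(uint32_t *rng_state)
{
    assert(rng_state != NULL && *rng_state != 0);
    int level = 1;
    while (level < SKIP_MAX_LEVEL && ((xorshift32(rng_state) >> 16) & 1u))
        level++;
    return level;
}
```

Because each layer holds about half the nodes of the layer below it, a search that starts at the top skips about half of the remaining candidates per layer, which is where the expected logarithmic search time comes from.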

Our Lock-Free Concurrent Skip List
• Define node state to depend on the insertion status at lowest level as well as a deletion flag
• Insert from lowest level going upwards
• Set deletion flag. Delete from highest level going downwards
[Diagram: nodes 1 to 7, each carrying a deletion flag D; a node p with levels 1 to 3]
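
Concurrent Insert vs. Delete operations
• Problem: a Delete (node 2) concurrent with an Insert (node 3)
  - both nodes are deleted!
• Solution (Harris et al): Use bit 0 of pointer to mark deletion status
[Diagram: list 1, 2, 4 with node 3 being inserted; steps a), b), c); the marked node is shown as 2 *]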
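
The pointer-marking idea can be sketched as follows, assuming nodes are at least 2-byte aligned so that bit 0 of a node address is always zero. The type and function names (`node_t`, `mark_deleted`, `insert_after`) are hypothetical and only illustrate the technique; they are not taken from the skip list implementation described in the talk.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef struct node {
    int key;
    _Atomic uintptr_t next; /* successor address; bit 0 is the deletion mark */
} node_t;

static inline bool is_marked(uintptr_t link)
{
    return (link & 1u) != 0;
}

static inline node_t *link_ptr(uintptr_t link)
{
    return (node_t *)(link & ~(uintptr_t)1);
}

/* Logically delete n by setting the mark in its own next field.
 * Returns true only for the call that actually set the mark. */
bool mark_deleted(node_t *n)
{
    uintptr_t link = atomic_load(&n->next);
    while (!is_marked(link)) {
        if (atomic_compare_exchange_weak(&n->next, &link, link | 1u))
            return true;
    }
    return false;
}

/* Link fresh between pred and succ. The CAS expects an unmarked link to
 * succ, so it fails if pred was marked for deletion in the meantime. */
bool insert_after(node_t *pred, node_t *succ, node_t *fresh)
{
    assert(((uintptr_t)fresh & 1u) == 0);
    atomic_store(&fresh->next, (uintptr_t)succ);
    uintptr_t expected = (uintptr_t)succ;
    return atomic_compare_exchange_strong(&pred->next, &expected,
                                          (uintptr_t)fresh);
}
```

Since the mark lives in the same word as the successor pointer, a single CAS checks "still points to succ" and "not deleted" together. An insert after a node that has just been marked therefore fails and must retry from a live predecessor, rather than being lost along with the deleted node.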

Dynamic Memory Management
• Problem: System memory allocation functionality is blocking!
• Solution (lock-free), IBM freelists:
  - Pre-allocate a number of nodes, link them into a dynamic stack structure, and allocate/reclaim using CAS
[Diagram: Head points to a stack of free nodes Mem 1, Mem 2, …, Mem n; Allocate pops from the head, Reclaim pushes a used node (Used 1) back]
© Ph. Tsigas 2003-2004
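
A freelist of this kind can be sketched as a CAS-managed stack over a pre-allocated pool. The sketch below rests on several assumptions that are not in the slides: slots are identified by array index rather than by pointer, the pool size `POOL_SIZE` is arbitrary, and a version counter is packed next to the head index so that a slot reused between a read and a CAS is detected. The talk itself deals with that reuse hazard through reference counting, so the version counter is a swapped-in technique here.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define POOL_SIZE 4u            /* assumed number of pre-allocated slots */
#define NIL_INDEX UINT32_MAX    /* marks the end of the free stack */

typedef struct {
    _Atomic uint32_t next[POOL_SIZE]; /* link of each pre-allocated slot */
    _Atomic uint64_t head;            /* (version << 32) | first free index */
} freelist_t;

static uint64_t pack(uint32_t version, uint32_t index)
{
    return ((uint64_t)version << 32) | index;
}

/* Link all slots into one stack: 0 -> 1 -> ... -> POOL_SIZE-1 -> NIL. */
void freelist_init(freelist_t *fl)
{
    for (uint32_t i = 0; i < POOL_SIZE; i++)
        atomic_store(&fl->next[i], i + 1 < POOL_SIZE ? i + 1 : NIL_INDEX);
    atomic_store(&fl->head, pack(0, 0));
}

/* Allocate: pop the first free slot, or return NIL_INDEX if none is left. */
uint32_t freelist_alloc(freelist_t *fl)
{
    uint64_t old = atomic_load(&fl->head);
    for (;;) {
        uint32_t index = (uint32_t)old;
        if (index == NIL_INDEX)
            return NIL_INDEX;
        uint32_t next = atomic_load(&fl->next[index]);
        uint64_t desired = pack((uint32_t)(old >> 32) + 1, next);
        if (atomic_compare_exchange_weak(&fl->head, &old, desired))
            return index;
    }
}

/* Reclaim: push a slot back onto the free stack. */
void freelist_reclaim(freelist_t *fl, uint32_t index)
{
    assert(index < POOL_SIZE);
    uint64_t old = atomic_load(&fl->head);
    for (;;) {
        atomic_store(&fl->next[index], (uint32_t)old);
        uint64_t desired = pack((uint32_t)(old >> 32) + 1, index);
        if (atomic_compare_exchange_weak(&fl->head, &old, desired))
            return;
    }
}
```

Neither operation calls the system allocator after initialization, so a thread that is pre-empted in the middle of an allocation cannot block the others.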

The ABA problem
• Problem: Because of concurrency (pre-emption in particular), the same pointer value does not always mean the same node (i.e. CAS succeeds)!!!
[Diagram: a two-step example (Step 1, Step 2) on a list of numbered nodes]

The ABA problem
• Solution: (Valois et al) Add reference counting to each node, in order to prevent nodes that are of interest to some thread from being reclaimed until all threads have left the node
[Diagram: New Step 2, in which the nodes carry reference counts and the CAS fails!]

Helping Scheme
• Threads need to traverse safely
• Need to remove marked-to-be-deleted nodes while traversing: Help!
• Finds the previous node, finishes the deletion and continues traversing from the previous node
[Diagram: list 1, 2 *, 4, where node 2 is marked (*) for deletion]

Overlapping operations on shared data
• Example: Insert operation
  - which of 2 or 3 gets inserted?
[Diagram: Insert 2 and Insert 3 both target the position between nodes 1 and 4]
• Solution: Compare-And-Swap atomic primitive:
CAS(p:pointer to word, old:word, new:word):boolean
    atomic do
        if *p = old then
            *p := new;
            return true;
        else return false;
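
In C11 the CAS primitive above corresponds to `atomic_compare_exchange_strong`. The sketch below wraps it to match the slide's signature and uses it for the overlapping-insert example. The names `item_t`, `cas_ptr` and `try_insert_between` are hypothetical, and the sketch ignores deletion and memory reclamation.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct item {
    int key;
    struct item *_Atomic next;
} item_t;

/* The slide's CAS(p, old, new) expressed with the C11 primitive:
 * stores new_value in *p only if *p still equals old. */
bool cas_ptr(item_t *_Atomic *p, item_t *old, item_t *new_value)
{
    return atomic_compare_exchange_strong(p, &old, new_value);
}

/* Try to link fresh between pred and the successor the caller observed.
 * Fails if another insert changed pred->next in the meantime. */
bool try_insert_between(item_t *pred, item_t *observed_succ, item_t *fresh)
{
    assert(pred != NULL && fresh != NULL);
    atomic_store(&fresh->next, observed_succ);
    return cas_ptr(&pred->next, observed_succ, fresh);
}
```

If two threads both observe 1 -> 4 and try to insert 2 and 3, exactly one CAS on node 1 succeeds. The loser sees `false`, re-reads the list and retries against the new neighbours, so neither insert is silently overwritten.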

Experiments
• 1-30 threads on platforms with different levels of real concurrency
• 10000 Insert vs. DeleteMin operations by each thread. 100 vs. 1000 initial inserts
• Compare with other implementations:
  - Lotan and Shavit, 2000
  - Hunt et al "An Efficient Algorithm for Concurrent Priority Queue Heaps", 1996

[Result plots, one slide each: Full Concurrency, Medium Pre-emption, High Pre-emption]

Lessons Learned
• The Non-Blocking Synchronization Paradigm can be suitable and beneficial to large scale parallel applications.
• Experimental Reproducible Work. Many results claimed by simulation are not consistent with what we observed.
• Applications gave us nice problems to look at and do theoretical work on. (IPDPS 2003 Algorithmic Best Paper Award)
• NOBLE helped programmers to trust our implementations.

Future Work
• Extend NOBLE for loosely coupled systems.
• Extend the set of data structures supported by NOBLE based on the needs of the applications.
• Reactive-Synchronisation

Questions?
• Contact Information:
  - Address: Philippas Tsigas, Computing Science, Chalmers University of Technology
  - Email: tsigas @ cs.chalmers.se
  - Web: http://www.cs.chalmers.se/~tsigas
         http://www.cs.chalmers.se/~dcs
         http://www.noble-library.org

Pointers:
• NOBLE: A Non-Blocking Inter-Process Communication Library. ACM Workshop on Languages, Compilers, and Run-time Systems for Scalable Computers (LCR ´02).
• Evaluating The Performance of Non-Blocking Synchronization on Shared Memory Multiprocessors. ACM SIGMETRICS 2001/Performance2001 Joint International Conference on Measurement and Modeling of Computer Systems (SIGMETRICS 2001).
• Integrating Non-blocking Synchronization in Parallel Applications: Performance Advantages and Methodologies. ACM Workshop on Software and Performance (WOSP ´01).
• A Simple, Fast and Scalable Non-Blocking Concurrent FIFO queue for Shared Memory Multiprocessor Systems. ACM Symposium on Parallel Algorithms and Architectures (SPAA ´01).
• Fast and Lock-Free Concurrent Priority Queues for Multi-Thread Systems. 17th IEEE/ACM International Parallel and Distributed Processing Symposium (IPDPS ´03).
• Fast, Reactive and Lock-free Multi-word Compare-and-swap Algorithms. 12th IEEE/ACM International Conference on Parallel Architectures and Compilation Techniques (PACT ´03).
• Scalable and Lock-free Concurrent Dictionaries. Proceedings of the 19th ACM Symposium on Applied Computing (SAC ’04).

DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platforms
 
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxA Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brand
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
What is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdfWhat is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdf
 
What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdf
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024
 
Sample pptx for embedding into website for demo
Sample pptx for embedding into website for demoSample pptx for embedding into website for demo
Sample pptx for embedding into website for demo
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko
 
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxThe Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
 

Role of locking- cds

  • 1.
  • 2. Overview: Introduction (Synchronization, Non-blocking Synchronization). Is non-blocking synchronization beneficial to the performance of parallel applications? NOBLE: A Non-blocking Synchronization Interface. How can we make non-blocking synchronization accessible to the parallel programmer? Lock-free skip lists. Conclusions, Future Work.
  • 3. Systems: SMP. Cache-coherent distributed shared memory multiprocessor systems: UMA and NUMA.
  • 4. Synchronization: barriers; locks, semaphores, … (mutual exclusion). “A significant part of the work performed by today’s parallel applications is spent on synchronization.” ...
  • 6. Non-blocking Synchronization. Lock-Free Synchronization takes an optimistic approach: an operation assumes it is alone and prepares its update, which later takes effect (unless interfered with) in one atomic step, using hardware atomic primitives. Interference is detected via shared memory. The operation retries until no other operation interferes. This can cause starvation.
  • 7. Example: Shared Queue (slide provided by Jim Anderson). The usual approach is to implement operations using retry loops. Here’s an example:
      type Qtype = record v: valtype; next: pointer to Qtype end
      shared var Tail: pointer to Qtype;
      local var old, new: pointer to Qtype
      procedure Enqueue (input: valtype)
        new := (input, NIL);
        repeat old := Tail
        until CAS2(&Tail, &(old->next), old, NIL, new, new)
      [Figure: queue nodes with the old, Tail, and new pointers.]
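The Enqueue pseudocode on slide 7 relies on CAS2, a two-location compare-and-swap that C11 does not provide. As a rough sketch only, the same retry-loop idea can be written with two single-word CAS steps in the style of the Michael and Scott queue, which is a different technique from the one on the slide. All names here (QNode, Queue, queue_init, enqueue, queue_demo) are hypothetical, and nodes are never freed in this sketch.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical types; only the enqueue side is sketched. */
typedef struct QNode {
    int v;
    _Atomic(struct QNode *) next;
} QNode;

typedef struct {
    QNode *head;               /* dummy node, fixed in this sketch */
    _Atomic(QNode *) tail;
} Queue;

static void queue_init(Queue *q) {
    QNode *dummy = malloc(sizeof *dummy);
    assert(dummy != NULL);
    dummy->v = 0;
    atomic_init(&dummy->next, NULL);
    q->head = dummy;
    atomic_init(&q->tail, dummy);
}

static void enqueue(Queue *q, int input) {
    QNode *node = malloc(sizeof *node);
    assert(node != NULL);
    node->v = input;
    atomic_init(&node->next, NULL);

    for (;;) {                                  /* retry loop */
        QNode *old = atomic_load(&q->tail);
        QNode *next = atomic_load(&old->next);
        if (next != NULL) {
            /* Tail is lagging behind: help advance it, then retry. */
            atomic_compare_exchange_strong(&q->tail, &old, next);
            continue;
        }
        QNode *expected = NULL;
        /* First CAS: link the new node after the current last node. */
        if (atomic_compare_exchange_strong(&old->next, &expected, node)) {
            /* Second CAS: swing Tail; a failure means another thread helped. */
            atomic_compare_exchange_strong(&q->tail, &old, node);
            return;
        }
    }
}

/* Single-threaded check: encodes the traversal order as a decimal number. */
static int queue_demo(void) {
    Queue q;
    queue_init(&q);
    enqueue(&q, 1);
    enqueue(&q, 2);
    enqueue(&q, 3);
    int code = 0;
    for (QNode *n = atomic_load(&q.head->next); n; n = atomic_load(&n->next))
        code = code * 10 + n->v;
    return code;
}
```

Calling queue_demo() walks the list from the dummy node, so a result of 123 indicates that FIFO order was preserved. Dequeue and safe memory reclamation are left out on purpose.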
  • 8. Non-blocking Synchronization. Lock-Free Synchronization: avoids the problems that locks have; fast; starvation? (not in the context of HPC). Wait-Free Synchronization: always finishes in a finite number of its own steps; complex algorithms; memory consuming; less efficient on average than lock-free.
  • 9. Overview: Introduction (Synchronization, Non-blocking Synchronization). Is non-blocking synchronization beneficial to the performance of parallel scientific applications? NOBLE: A Non-blocking Synchronization Interface. How can we make non-blocking synchronization accessible to the parallel programmer? Conclusions, Future Work.
  • 10. Non-blocking Synchronisation. Synchronisation: an alternative approach to synchronisation, introduced 25 years ago; many theoretical results. Evaluation: micro-benchmarks show better performance than mutual exclusion in real or simulated multiprocessor systems.
  • 11. Practice. Non-blocking synchronization is still not used in practical applications. Non-blocking solutions are often complex, have non-standard or unclear interfaces, and are impractical.
  • 12. Practice. Question: ”How is the performance of parallel scientific applications affected by the use of non-blocking synchronisation rather than lock-based synchronisation?”
  • 13. Answers. How is the performance of parallel scientific applications affected by the use of non-blocking synchronisation rather than lock-based synchronisation? The identification of the basic locking operations that parallel programmers use in their applications. The efficient non-blocking implementation of these synchronisation operations. The architectural implications for the design of non-blocking synchronisation. Comparison of the lock-based and lock-free versions of the respective applications.
  • 14. Applications. Ocean: simulates eddy currents in an ocean basin. Radiosity: computes the equilibrium distribution of light in a scene using the radiosity method. Volrend: renders 3D volume data into an image using a ray-casting method. Water: evaluates forces and potentials that occur over time between water molecules. Spark98: a collection of sparse matrix kernels. Each kernel performs a sequence of sparse matrix-vector product operations using matrices that are derived from a family of three-dimensional finite element earthquake applications.
  • 15. Removing Locks in Applications. Many locks are “simple locks”. Many critical sections contain shared floating-point variables. Some critical sections are large. CAS, FAA and LL/SC can be used to implement non-blocking versions. Floating-point synchronization primitives are needed; a DoubleFetch-and-Add primitive was designed. Efficient non-blocking implementations of big ADTs are used.
  • 17. SPARK98
      Before:
        spark_setlock(lockid);
        w[col][0] += A[Anext][0][0]*v[i][0] + A[Anext][1][0]*v[i][1] + A[Anext][2][0]*v[i][2];
        w[col][1] += A[Anext][0][1]*v[i][0] + A[Anext][1][1]*v[i][1] + A[Anext][2][1]*v[i][2];
        w[col][2] += A[Anext][0][2]*v[i][0] + A[Anext][1][2]*v[i][1] + A[Anext][2][2]*v[i][2];
        spark_unsetlock(lockid);
      After:
        dfad(&w[col][0], A[Anext][0][0]*v[i][0] + A[Anext][1][0]*v[i][1] + A[Anext][2][0]*v[i][2]);
        dfad(&w[col][1], A[Anext][0][1]*v[i][0] + A[Anext][1][1]*v[i][1] + A[Anext][2][1]*v[i][2]);
        dfad(&w[col][2], A[Anext][0][2]*v[i][0] + A[Anext][1][2]*v[i][1] + A[Anext][2][2]*v[i][2]);
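The slides do not show how dfad is implemented. One plausible way to build a fetch-and-add on doubles, offered here only as an assumption, is a CAS retry loop over the 64-bit representation of the value. The names atomic_f64, dfad_sketch and dfad_demo are hypothetical, and unlike the slide's dfad this sketch takes a pointer to an atomic 64-bit word rather than to a plain double.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>
#include <string.h>

_Static_assert(sizeof(double) == sizeof(uint64_t), "sketch assumes 64-bit doubles");

typedef _Atomic uint64_t atomic_f64;   /* hypothetical: a double stored as raw bits */

static uint64_t f64_to_bits(double d) { uint64_t b; memcpy(&b, &d, sizeof b); return b; }
static double bits_to_f64(uint64_t b) { double d; memcpy(&d, &b, sizeof d); return d; }

/* Atomically performs *target += delta and returns the previous value. */
static double dfad_sketch(atomic_f64 *target, double delta) {
    uint64_t old_bits = atomic_load(target);
    for (;;) {
        double old = bits_to_f64(old_bits);
        uint64_t new_bits = f64_to_bits(old + delta);
        /* On failure, old_bits is refreshed with the current contents. */
        if (atomic_compare_exchange_weak(target, &old_bits, new_bits))
            return old;
    }
}

/* Single-threaded check of the arithmetic. */
static double dfad_demo(void) {
    atomic_f64 w;
    atomic_init(&w, f64_to_bits(1.5));
    double before = dfad_sketch(&w, 2.25);
    assert(before == 1.5);
    return bits_to_f64(atomic_load(&w));
}
```

With this shape, each of the three locked updates on the slide becomes one independent atomic addition, which matches the Before/After rewrite but gives up atomicity across the three components of w[col].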
  • 18. Overview: Introduction (Synchronization, Non-blocking Synchronization). Is non-blocking synchronization beneficial for parallel scientific applications? NOBLE: A Non-blocking Synchronization Interface. How can we make non-blocking synchronization accessible to the parallel programmer? Conclusions, Future Work.
  • 19. Practice. Non-blocking synchronization is still not used in practical applications. Non-blocking solutions are often complex, have non-standard or unclear interfaces, and are impractical.
  • 20. NOBLE: Brings Non-blocking closer to Practice. Create a non-blocking inter-process communication interface with the following properties: attractive functionality; programmer friendly; easy to adapt existing solutions; efficient; portable; adaptable to different programming languages.
  • 21. NOBLE Design: Portable. Exported definitions (identical for all platforms): Noble.h, containing #define NBL... entries. Platform independent: QueueLF.c, StackLF.c, ..., each with #include “Platform/Primitives.h”. Platform dependent: SunHardware.asm, IntelHardware.asm, ..., providing CAS, TAS, Spin-Locks, ...
  • 22. Using NOBLE
      • First create a global variable handling the shared data object, for example a stack:
          Globals:
            #include <noble.h>
            ...
            NBLStack* stack;
      • Create the stack with the appropriate implementation:
          Main:
            stack=NBLStackCreateLF(10000);
            ...
      • When some thread wants to do some operation:
          Threads:
            NBLStackPush(stack, item);
            or
            item=NBLStackPop(stack);
  • 23. Using NOBLE
      • When the data structure is not in use anymore:
          Globals:
            #include <noble.h>
            ...
            NBLStack* stack;
          Main:
            stack=NBLStackCreateLF(10000);
            ...
            NBLStackFree(stack);
  • 24. Using NOBLE
      • To change the synchronization mechanism, only one line of code has to be changed!
          Globals:
            #include <noble.h>
            ...
            NBLStack* stack;
          Main:
            stack=NBLStackCreateLB();
            ...
            NBLStackFree(stack);
          Threads:
            NBLStackPush(stack, item);
            or
            item=NBLStackPop(stack);
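Slides 22 to 24 show that only the create call differs between the lock-free (LF) and lock-based (LB) variants. The slides do not say how NOBLE achieves this internally; one way such an interface could be built, sketched here purely as an assumption, is a handle that carries function pointers chosen at creation time. Every name below (DemoStack, demo_stack_create, lf_push, lb_push, and so on) is hypothetical and is not part of NOBLE. The lock-free variant deliberately never frees popped nodes, since safe reclamation is a separate problem.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdlib.h>

typedef struct DemoNode { void *item; struct DemoNode *next; } DemoNode;

typedef struct DemoStack DemoStack;
struct DemoStack {
    void  (*push)(DemoStack *, void *);   /* selected once, at creation */
    void *(*pop)(DemoStack *);
    _Atomic(DemoNode *) top;
    atomic_flag lock;                     /* used only by the lock-based variant */
};

static DemoNode *new_node(void *item) {
    DemoNode *n = malloc(sizeof *n);
    assert(n != NULL);
    n->item = item;
    n->next = NULL;
    return n;
}

/* Lock-free variant: CAS retry loops on the top pointer. */
static void lf_push(DemoStack *s, void *item) {
    DemoNode *n = new_node(item);
    n->next = atomic_load(&s->top);
    while (!atomic_compare_exchange_weak(&s->top, &n->next, n)) { }
}
static void *lf_pop(DemoStack *s) {
    DemoNode *old = atomic_load(&s->top);
    while (old && !atomic_compare_exchange_weak(&s->top, &old, old->next)) { }
    return old ? old->item : NULL;        /* node is leaked on purpose */
}

/* Lock-based variant: a test-and-set spin lock around the same updates. */
static void lb_lock(DemoStack *s)   { while (atomic_flag_test_and_set(&s->lock)) { } }
static void lb_unlock(DemoStack *s) { atomic_flag_clear(&s->lock); }
static void lb_push(DemoStack *s, void *item) {
    DemoNode *n = new_node(item);
    lb_lock(s);
    n->next = atomic_load(&s->top);
    atomic_store(&s->top, n);
    lb_unlock(s);
}
static void *lb_pop(DemoStack *s) {
    lb_lock(s);
    DemoNode *old = atomic_load(&s->top);
    if (old) atomic_store(&s->top, old->next);
    lb_unlock(s);
    if (!old) return NULL;
    void *item = old->item;
    free(old);                            /* safe: no other thread can hold it */
    return item;
}

static DemoStack *demo_stack_create(int lock_free) {
    DemoStack *s = malloc(sizeof *s);
    assert(s != NULL);
    atomic_init(&s->top, NULL);
    atomic_flag_clear(&s->lock);
    s->push = lock_free ? lf_push : lb_push;
    s->pop  = lock_free ? lf_pop  : lb_pop;
    return s;
}

/* The calling code is identical for both variants. */
static int demo_stack_roundtrip(int lock_free) {
    static int a = 1, b = 2;
    DemoStack *s = demo_stack_create(lock_free);
    s->push(s, &a);
    s->push(s, &b);
    int ok = s->pop(s) == &b && s->pop(s) == &a && s->pop(s) == NULL;
    free(s);
    return ok;
}
```

The design choice illustrated is that client code depends only on the handle, so switching the synchronization mechanism touches one call site, as slide 24 claims for NOBLE.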
  • 25. Design: Attractive functionality. Data structures for multi-threaded usage: FIFO queues, priority queues, dictionaries, stacks, singly linked lists, snapshots, MWCAS, ... Clear specifications.
  • 26. Status. Multiprocessor support: Sun Solaris (Sparc), Win32 (Intel x86), SGI (Mips), Linux (Intel x86). Available for academic use: http://www.noble-library.org/
  • 27. Did our Work have any Impact? 1) Industry has initiated contact and uses a test version of NOBLE. 2) Freeware developers have shown interest. 3) Interest from research organisations. NOBLE is freely available for research and educational purposes.
  • 28. A Lock-Free Skip list. Presented as part of: H. Sundell, Ph. Tsigas, ”Fast and Lock-Free Concurrent Priority Queues for Multi-Thread Systems”, 17th IEEE/ACM International Parallel and Distributed Processing Symposium (IPDPS ´03), May 2003 (TR 2002). Best Paper Award. A very similar lock-free skip list algorithm will be presented this August at the ACM Symposium on Principles of Distributed Computing (PODC 2004): ”Lock-Free Linked Lists and Skip Lists”, Mikhail Fomitchev, Eric Ruppert.
  • 29. Randomized Algorithm: Skip Lists. William Pugh: ”Skip Lists: A Probabilistic Alternative to Balanced Trees”, 1990. Layers of ordered lists with different densities achieve a tree-like behavior. Time complexity: O(log₂ N), probabilistic! [Figure: skip list from Head to Tail over keys 1 to 7, with layer densities … 25%, 50%.]
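The layer densities on slide 29 (50%, 25%, …) come from choosing each node's height at random. A minimal sketch of that choice, assuming a promotion probability of 1/2 per level, is shown below. The cap SKIP_MAX_LEVEL, the xorshift generator, and the function names are illustrative choices, not taken from the slides.

```c
#include <assert.h>
#include <stdint.h>

#define SKIP_MAX_LEVEL 16   /* hypothetical cap on tower height */

/* Small xorshift generator so the sketch carries its own state. */
static uint32_t xorshift32(uint32_t *state) {
    uint32_t x = *state;
    x ^= x << 13;
    x ^= x >> 17;
    x ^= x << 5;
    return *state = x;
}

/* Every node is on level 1; each further level is kept with probability 1/2. */
static int random_level(uint32_t *state) {
    int level = 1;
    while (level < SKIP_MAX_LEVEL && (xorshift32(state) & 1u))
        level++;
    return level;
}

/* Returns 1 if all sampled levels fall within [1, SKIP_MAX_LEVEL]. */
static int levels_in_range(uint32_t seed, int samples) {
    assert(samples >= 0);
    uint32_t state = seed ? seed : 1u;   /* xorshift must not start at 0 */
    for (int i = 0; i < samples; i++) {
        int level = random_level(&state);
        if (level < 1 || level > SKIP_MAX_LEVEL)
            return 0;
    }
    return 1;
}
```

With this rule roughly half of the nodes reach level 2 and a quarter reach level 3, which is what gives a search its expected logarithmic cost.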
  • 30. Our Lock-Free Concurrent Skip List. Define node state to depend on the insertion status at the lowest level as well as a deletion flag. Insert from the lowest level going upwards. Set the deletion flag, then delete from the highest level going downwards. [Figure: skip list over keys 1 to 7; each node carries a deletion flag D.]
  • 31. Concurrent Insert vs. Delete operations. Problem: a Delete running concurrently with an Insert: both nodes are deleted! Solution (Harris et al): use bit 0 of the pointer to mark deletion status. [Figure: list nodes 1 to 4 with steps a), b), c); the marked node is shown with *.]
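The marking idea on slide 31 can be sketched as follows: the successor pointer and the deletion mark share one word, so a single CAS on that word fails if either has changed. This is an illustration under the assumption that nodes are at least 2-byte aligned, which leaves bit 0 free; the names LNode, mark_deleted, insert_after and mark_demo are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

typedef struct LNode {
    int key;
    _Atomic uintptr_t next;   /* successor address; bit 0 is the deletion mark */
} LNode;

static bool is_marked(uintptr_t w) { return (w & 1u) != 0; }
static LNode *ptr_of(uintptr_t w)  { return (LNode *)(w & ~(uintptr_t)1); }

/* Logical delete: set the mark on node->next. Returns false if already marked. */
static bool mark_deleted(LNode *node) {
    uintptr_t w = atomic_load(&node->next);
    while (!is_marked(w)) {
        if (atomic_compare_exchange_weak(&node->next, &w, w | 1u))
            return true;
    }
    return false;
}

/* Insert fresh after pred, only if pred is unmarked and still points to succ. */
static bool insert_after(LNode *pred, LNode *succ, LNode *fresh) {
    atomic_store(&fresh->next, (uintptr_t)succ);
    uintptr_t expected = (uintptr_t)succ;            /* unmarked word */
    return atomic_compare_exchange_strong(&pred->next, &expected, (uintptr_t)fresh);
}

/* Node 2 is deleted first, so a later insert of 3 after node 2 must fail. */
static int mark_demo(void) {
    LNode n2, n3, n4;
    n2.key = 2; n3.key = 3; n4.key = 4;
    atomic_init(&n2.next, (uintptr_t)&n4);
    atomic_init(&n3.next, (uintptr_t)0);
    atomic_init(&n4.next, (uintptr_t)0);

    bool deleted  = mark_deleted(&n2);
    bool inserted = insert_after(&n2, &n4, &n3);
    uintptr_t w = atomic_load(&n2.next);
    assert(ptr_of(w) == &n4);                        /* the address survives marking */
    return deleted && !inserted && is_marked(w);
}
```

Because the failed insert notices the mark, the inserting thread can retry from an earlier node instead of attaching its node to one that is about to disappear.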
  • 32. Dynamic Memory Management. Problem: system memory allocation functionality is blocking! Solution (lock-free), IBM freelists: pre-allocate a number of nodes, link them into a dynamic stack structure, and allocate/reclaim using CAS. [Figure: freelist with Head, Mem 1, Mem 2, …, Mem n, Allocate and Reclaim operations, and a node Used 1.] © Ph. Tsigas 2003-2004
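A freelist of the kind described on slide 32 can be sketched as a stack of slot indices over a pre-allocated pool, with allocate and reclaim both done by CAS on the head. This sketch packs a version tag next to the head index as a guard against slot reuse going unnoticed; that tag is an assumption of this example and is a different technique from the reference counting the slides use later. POOL_SIZE, FreeList and the function names are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define POOL_SIZE 4u
#define NIL_INDEX UINT32_MAX

typedef struct {
    _Atomic uint32_t next[POOL_SIZE];   /* next free slot, per slot */
    _Atomic uint64_t head;              /* low 32 bits: slot index, high 32 bits: tag */
} FreeList;

static uint64_t pack(uint32_t index, uint32_t tag) { return ((uint64_t)tag << 32) | index; }
static uint32_t index_of(uint64_t h) { return (uint32_t)h; }
static uint32_t tag_of(uint64_t h)   { return (uint32_t)(h >> 32); }

/* Pre-allocation step: link all slots into one stack. */
static void freelist_init(FreeList *fl) {
    for (uint32_t i = 0; i < POOL_SIZE; i++)
        atomic_init(&fl->next[i], i + 1 < POOL_SIZE ? i + 1 : NIL_INDEX);
    atomic_init(&fl->head, pack(0, 0));
}

/* Returns a free slot index, or NIL_INDEX if the pool is exhausted. */
static uint32_t freelist_alloc(FreeList *fl) {
    uint64_t old = atomic_load(&fl->head);
    for (;;) {
        uint32_t idx = index_of(old);
        if (idx == NIL_INDEX)
            return NIL_INDEX;
        uint64_t desired = pack(atomic_load(&fl->next[idx]), tag_of(old) + 1);
        if (atomic_compare_exchange_weak(&fl->head, &old, desired))
            return idx;
    }
}

/* Pushes slot idx back onto the freelist. */
static void freelist_reclaim(FreeList *fl, uint32_t idx) {
    uint64_t old = atomic_load(&fl->head);
    for (;;) {
        atomic_store(&fl->next[idx], index_of(old));
        if (atomic_compare_exchange_weak(&fl->head, &old, pack(idx, tag_of(old) + 1)))
            return;
    }
}

static int freelist_demo(void) {
    FreeList fl;
    freelist_init(&fl);
    for (uint32_t i = 0; i < POOL_SIZE; i++)
        if (freelist_alloc(&fl) != i) return 0;             /* slots come out in order */
    if (freelist_alloc(&fl) != NIL_INDEX) return 0;         /* pool is now empty */
    freelist_reclaim(&fl, 2);
    return freelist_alloc(&fl) == 2;                        /* reclaimed slot is reused */
}
```

In a real pool the slot index would select an element of a pre-allocated node array; only the index bookkeeping is shown here.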
  • 33. The ABA problem. Problem: because of concurrency (pre-emption in particular), the same pointer value does not always mean the same node (i.e. CAS succeeds even though the node has changed)! [Figure: Step 1 and Step 2 of a list whose node is reclaimed and reused.]
  • 34. The ABA problem. Solution (Valois et al): add reference counting to each node, in order to prevent nodes that are of interest to some thread from being reclaimed until all threads have left the node. [Figure: new Step 2, in which the CAS fails!]
  • 35. Helping Scheme. Threads need to traverse safely. Marked-to-be-deleted nodes need to be removed while traversing: help! The traversing thread finds the previous node, finishes the deletion and continues traversing from the previous node. [Figure: list 1, 2*, 4, where * marks the node being deleted.]
  • 36. Overlapping operations on shared data. Example: Insert operation: which of 2 or 3 gets inserted? Solution: Compare-And-Swap atomic primitive:
      CAS(p:pointer to word, old:word, new:word):boolean
        atomic do
          if *p = old then *p := new; return true;
          else return false;
      [Figure: Insert 2 and Insert 3 both targeting the position between nodes 1 and 4.]
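The CAS pseudocode on slide 36 maps closely onto C11's atomic_compare_exchange_strong. The sketch below replays the slide's scenario in a single thread: two inserters both observe that node 1 points to node 4, and only the first CAS succeeds. The names INode, cas_ptr, try_insert and insert_race_demo are hypothetical, and the interleaving is simulated rather than produced by real threads.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct INode {
    int key;
    _Atomic(struct INode *) next;
} INode;

/* C11 rendering of the slide's CAS(p, old, new). */
static bool cas_ptr(_Atomic(INode *) *p, INode *old, INode *new_value) {
    return atomic_compare_exchange_strong(p, &old, new_value);
}

/* Try once to link fresh between pred and the successor the caller observed. */
static bool try_insert(INode *pred, INode *observed_succ, INode *fresh) {
    atomic_store(&fresh->next, observed_succ);
    return cas_ptr(&pred->next, observed_succ, fresh);
}

/* Returns the final key order as a decimal number, or -1 on an unexpected outcome. */
static int insert_race_demo(void) {
    INode n1, n2, n3, n4;
    n1.key = 1; n2.key = 2; n3.key = 3; n4.key = 4;
    atomic_init(&n1.next, &n4);
    atomic_init(&n2.next, NULL);
    atomic_init(&n3.next, NULL);
    atomic_init(&n4.next, NULL);

    /* Both inserters read the same successor before either one writes. */
    INode *seen_by_a = atomic_load(&n1.next);
    INode *seen_by_b = atomic_load(&n1.next);

    bool a_ok = try_insert(&n1, seen_by_a, &n2);   /* wins: 1 -> 2 -> 4 */
    bool b_ok = try_insert(&n1, seen_by_b, &n3);   /* loses: n1.next is no longer n4 */

    /* The loser retries with a fresh view; key 3 now belongs after node 2. */
    bool b_retry = try_insert(&n2, atomic_load(&n2.next), &n3);
    if (!a_ok || b_ok || !b_retry)
        return -1;

    int code = 0;
    for (INode *n = &n1; n; n = atomic_load(&n->next))
        code = code * 10 + n->key;
    return code;
}
```

Neither insert is lost: the failed CAS tells the second inserter that its view was stale, and the retry places its node correctly.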
  • 37. Experiments. 1-30 threads on platforms with different levels of real concurrency. 10000 Insert vs. DeleteMin operations by each thread; 100 vs. 1000 initial inserts. Compared with other implementations: Lotan and Shavit, 2000; Hunt et al, “An Efficient Algorithm for Concurrent Priority Queue Heaps”, 1996.
  • 38. Full Concurrency
  • 39. Medium Pre-emption
  • 40. High Pre-emption
  • 41. Lessons Learned. The Non-Blocking Synchronization Paradigm can be suitable and beneficial to large scale parallel applications. Experimental, reproducible work: many results claimed by simulation are not consistent with what we observed. Applications gave us nice problems to look at and do theoretical work on (IPDPS 2003 Algorithmic Best Paper Award). NOBLE helped programmers to trust our implementations.
  • 42. Future Work. Extend NOBLE for loosely coupled systems. Extend the set of data structures supported by NOBLE based on the needs of the applications. Reactive-Synchronisation.
  • 43. Questions? Contact Information: Address: Philippas Tsigas, Computing Science, Chalmers University of Technology. Email: tsigas @ cs.chalmers.se. Web: http://www.cs.chalmers.se/~tsigas http://www.cs.chalmers.se/~dcs http://www.noble-library.org
  • 44. Pointers:
      • NOBLE: A Non-Blocking Inter-Process Communication Library. ACM Workshop on Languages, Compilers, and Run-time Systems for Scalable Computers (LCR ´02).
      • Evaluating The Performance of Non-Blocking Synchronization on Shared Memory Multiprocessors. ACM SIGMETRICS 2001/Performance2001 Joint International Conference on Measurement and Modeling of Computer Systems (SIGMETRICS 2001).
      • Integrating Non-blocking Synchronization in Parallel Applications: Performance Advantages and Methodologies. ACM Workshop on Software and Performance (WOSP ´01).
      • A Simple, Fast and Scalable Non-Blocking Concurrent FIFO queue for Shared Memory Multiprocessor Systems. ACM Symposium on Parallel Algorithms and Architectures (SPAA ´01).
      • Fast and Lock-Free Concurrent Priority Queues for Multi-Thread Systems. 17th IEEE/ACM International Parallel and Distributed Processing Symposium (IPDPS ´03).
      • Fast, Reactive and Lock-free Multi-word Compare-and-swap Algorithms. 12th IEEE/ACM International Conference on Parallel Architectures and Compilation Techniques (PACT ´03).
      • Scalable and Lock-free Concurrent Dictionaries. Proceedings of the 19th ACM Symposium on Applied Computing (SAC ’04).