JavaOne2013: Implement a High Level Parallel API - Richard Ning
1. © 2013 IBM Corporation
Richard Ning – Enterprise Developer
9/24/2013
Implement high-level parallel
API in JDK
1
2. © 2013 IBM Corporation
2
Important Disclaimers
– THE INFORMATION CONTAINED IN THIS PRESENTATION IS PROVIDED FOR INFORMATIONAL PURPOSES ONLY.
– WHILST EFFORTS WERE MADE TO VERIFY THE COMPLETENESS AND ACCURACY OF THE INFORMATION CONTAINED IN THIS
PRESENTATION, IT IS PROVIDED “AS IS”, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED.
– ALL PERFORMANCE DATA INCLUDED IN THIS PRESENTATION HAVE BEEN GATHERED IN A CONTROLLED ENVIRONMENT. YOUR
OWN TEST RESULTS MAY VARY BASED ON HARDWARE, SOFTWARE OR INFRASTRUCTURE DIFFERENCES.
– ALL DATA INCLUDED IN THIS PRESENTATION ARE MEANT TO BE USED ONLY AS A GUIDE.
– IN ADDITION, THE INFORMATION CONTAINED IN THIS PRESENTATION IS BASED ON IBM’S CURRENT PRODUCT PLANS AND
STRATEGY, WHICH ARE SUBJECT TO CHANGE BY IBM, WITHOUT NOTICE.
– IBM AND ITS AFFILIATED COMPANIES SHALL NOT BE RESPONSIBLE FOR ANY DAMAGES ARISING OUT OF THE USE OF, OR
OTHERWISE RELATED TO, THIS PRESENTATION OR ANY OTHER DOCUMENTATION.
– NOTHING CONTAINED IN THIS PRESENTATION IS INTENDED TO, OR SHALL HAVE THE EFFECT OF: CREATING ANY WARRANT OR
REPRESENTATION FROM IBM, ITS AFFILIATED COMPANIES OR ITS OR THEIR SUPPLIERS AND/OR LICENSORS.
3. © 2013 IBM Corporation
About me
Richard Ning
IBM JDK development
Developing enterprise application
software since 1999 (C++, Java)
My contact information:
Email: huaningnh@gmail.com
3
4. © 2013 IBM Corporation
What should you get from this talk?
■By the end of this session, you should be able to:
–Understand the implementation of a high-level parallel API in the JDK
–Understand how parallel computing works on multi-core machines
4
5. © 2013 IBM Corporation
Agenda
1. Introduction: multi-threading, multi-cores, parallel computing
2. Case study
3. Other high-level parallel API
4. Roadmap
5
6. © 2013 IBM Corporation
Introduction
Multi-Threading
Multi-core computer
Parallel computing
6
7. © 2013 IBM Corporation
Case study
■ Execute the same task for every element in a loop
■ Use multi-threading for the execution
7
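As a concrete illustration of this case study (a sketch, not the slides' actual code), the loop below splits an array across raw java.lang.Thread workers, each applying the same summing task to its own slice of elements:

```java
import java.util.concurrent.atomic.AtomicLong;

// Sketch: execute the same task (summing) for every element of a loop,
// using raw threads so each thread handles one slice of the data.
public class RawThreadLoop {
    static long parallelSum(int[] data, int nThreads) {
        AtomicLong total = new AtomicLong();
        Thread[] threads = new Thread[nThreads];
        int chunk = (data.length + nThreads - 1) / nThreads;
        for (int t = 0; t < nThreads; t++) {
            final int start = t * chunk;
            final int end = Math.min(start + chunk, data.length);
            threads[t] = new Thread(() -> {
                long local = 0;
                for (int i = start; i < end; i++) local += data[i]; // same task per element
                total.addAndGet(local);
            });
            threads[t].start();
        }
        for (Thread th : threads) {
            try { th.join(); } catch (InterruptedException e) { throw new RuntimeException(e); }
        }
        return total.get();
    }

    public static void main(String[] args) {
        int[] data = new int[1000];
        for (int i = 0; i < data.length; i++) data[i] = i + 1;
        System.out.println(parallelSum(data, 4)); // sum of 1..1000 = 500500
    }
}
```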
8. © 2013 IBM Corporation
■ Can it improve performance?
8
9. © 2013 IBM Corporation
[Timeline diagram: on a single-core CPU, threads t1 and t2 alternate on the one core over time]
■ Multi-threading on computer with one core
9
10. © 2013 IBM Corporation
■ 100% CPU usage with both a single thread and multi-threading
• Performance may even decrease because of the extra resources the threads consume
• Multi-threading can't improve performance here
• It is useless to use multi-threading (a parallel API) on a single core
10
11. © 2013 IBM Corporation
■ Multi-threading on computer with multi-core
11
12. © 2013 IBM Corporation
[Diagram: threads t1–t4 run in parallel over time, one on each of cores Core1–Core4]
Each thread runs separately on its own core
12
13. © 2013 IBM Corporation
■ Raw threads
Disadvantages
– Users need to create and manage the threads themselves
– Not flexible – the number of threads is hard to configure well:
  > more threads than cores: resources are consumed by thread context switching, which can even decrease performance
  < fewer threads than cores: some cores are wasted
– No balancing: the computation can't be allocated equally across the cores
Any improvement? Executor
13
14. © 2013 IBM Corporation
■ Separate the creation and the execution of threads
■ Use a thread pool to reuse threads
14
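A minimal sketch of these two points with the JDK's own java.util.concurrent API: an ExecutorService separates task creation (submitting a Callable) from execution, and its thread pool reuses a fixed set of worker threads across many tasks.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Sketch: submit many small tasks to a reusable pool of 4 worker threads.
public class PoolExample {
    public static int sumOfSquares(int n) {
        ExecutorService pool = Executors.newFixedThreadPool(4); // 4 reusable workers
        List<Future<Integer>> futures = new ArrayList<>();
        for (int i = 1; i <= n; i++) {
            final int x = i;
            futures.add(pool.submit(() -> x * x)); // creation: just submit the task
        }
        int total = 0;
        for (Future<Integer> f : futures) {       // execution results collected later
            try { total += f.get(); } catch (Exception e) { throw new RuntimeException(e); }
        }
        pool.shutdown();
        return total;
    }

    public static void main(String[] args) {
        System.out.println(sumOfSquares(10)); // 1 + 4 + ... + 100 = 385
    }
}
```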
15. © 2013 IBM Corporation
■ A high-level API: concurrent_for
15
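The slide's code isn't preserved here, so the following is a hedged sketch of what a concurrent_for-style API could look like underneath; the method name `concurrentFor` and the `IntTask` interface are illustrative assumptions, not the actual slide code.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Sketch: run the same operation for every index in [begin, end)
// on a pool of worker threads, hiding thread management from the caller.
public class ConcurrentFor {
    interface IntTask { void run(int index); }

    static void concurrentFor(int begin, int end, IntTask task) {
        ExecutorService pool =
            Executors.newFixedThreadPool(Runtime.getRuntime().availableProcessors());
        for (int i = begin; i < end; i++) {
            final int idx = i;
            pool.execute(() -> task.run(idx)); // one submitted task per index
        }
        pool.shutdown();
        try {
            pool.awaitTermination(1, TimeUnit.MINUTES); // wait for all indices to finish
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    static int[] squares(int n) {
        int[] out = new int[n];
        concurrentFor(0, n, i -> out[i] = i * i); // same operation on every element
        return out;
    }

    public static void main(String[] args) {
        System.out.println(java.util.Arrays.toString(squares(8)));
    }
}
```

Note how the caller only supplies the range and the task, exactly the usage shape the next slide critiques.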
17. © 2013 IBM Corporation
The API is easy to use: users only need to supply the task to execute and the data range, and don't need to care about how they are executed. However, it still has disadvantages:
1. The number of threads in the thread pool isn't aligned to the core count
2. A task executes only one entry at a time, which isn't efficient
3. A task is targeted to one thread, which isn't flexible
17
18. © 2013 IBM Corporation
[Diagram: m tasks submitted to a thread pool of n threads running on a 4-core CPU]
Core: 4
Thread: n
Task: m
Overloading: n >> 4
Not flexible: m > n
18
19. © 2013 IBM Corporation
[Diagram: a fixed thread pool of 4 threads, one per core of a 4-core CPU]
Thread number = core number
When the core count doesn't align with the thread count: use a fixed-size thread pool
19
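In JDK terms this alignment is a one-liner (a sketch of the idea, not slide code): size a fixed thread pool to the detected core count, so the thread number equals the core number.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Sketch: a fixed thread pool sized to the machine's core count,
// avoiding both oversubscription (too many threads) and idle cores (too few).
public class CoreSizedPool {
    public static ExecutorService newCoreSizedPool() {
        int cores = Runtime.getRuntime().availableProcessors(); // detected core count
        return Executors.newFixedThreadPool(cores);             // thread number = core number
    }

    public static void main(String[] args) {
        ExecutorService pool = newCoreSizedPool();
        System.out.println("pool sized to " + Runtime.getRuntime().availableProcessors() + " cores");
        pool.shutdown();
    }
}
```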
20. © 2013 IBM Corporation
Task division: ForkJoinPool offers another task-division strategy
[Fork/Join diagram: Task1 forks into subtasks (Task2–Task7), whose results are joined back into the final result]
Divide and conquer:
1. Divide a big task into small tasks recursively
2. Execute the same operation on every small task
3. Join the results of the small tasks
20
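The three steps above map directly onto JDK 7's java.util.concurrent.RecursiveTask; the summation example and the threshold value below are illustrative choices, not slide code.

```java
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;

// Sketch of divide and conquer with ForkJoinPool: a big summation task
// splits itself recursively (fork) and combines partial results (join).
public class SumTask extends RecursiveTask<Long> {
    static final int THRESHOLD = 100; // illustrative granularity cutoff
    final long[] data; final int lo, hi;

    SumTask(long[] data, int lo, int hi) { this.data = data; this.lo = lo; this.hi = hi; }

    @Override
    protected Long compute() {
        if (hi - lo <= THRESHOLD) {    // small enough: execute directly
            long s = 0;
            for (int i = lo; i < hi; i++) s += data[i];
            return s;
        }
        int mid = (lo + hi) >>> 1;
        SumTask left = new SumTask(data, lo, mid);
        SumTask right = new SumTask(data, mid, hi);
        left.fork();                   // step 1: divide, run left half asynchronously
        long r = right.compute();      // step 2: same operation on the right half
        return left.join() + r;        // step 3: join the partial results
    }

    public static void main(String[] args) {
        long[] data = new long[10_000];
        java.util.Arrays.fill(data, 1L);
        long sum = new ForkJoinPool().invoke(new SumTask(data, 0, data.length));
        System.out.println(sum); // 10000
    }
}
```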
23. © 2013 IBM Corporation
ForkJoinPool is best used for divide-and-conquer problems
Balancing: a work queue per thread, plus task stealing
Oversubscription and starvation: avoided by configuring the thread number
Remaining limitations:
– Task division is static instead of dynamic: the division granularity isn't adapted to the running conditions
– The task-division strategy comes from the programmers, who need to design it themselves for each implementation scenario
23
24. © 2013 IBM Corporation
New parallel API based on task scheduler
24
25. © 2013 IBM Corporation
[Diagram: a 4-core CPU with a thread pool of one thread per core; each thread owns a task queue holding five of the tasks 1–20]
Initial status:
– Tasks are allocated equally, one thread per core
– Every thread maintains its own task queue of affiliated tasks
25
26. © 2013 IBM Corporation
[Diagram: the same four task queues later in the run — the queues now hold different numbers of remaining tasks (e.g. tasks 2–5 in one queue, only task 10 in another, only task 15 in a third)]
Unbalanced loading
26
27. © 2013 IBM Corporation
[Diagram: the four task queues after rebalancing — tasks have moved between queues, and new tasks 21 and 22 have been added]
Loading is balanced by task stealing and by adding new tasks, which probably have different task granularity.
27
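The mechanism in these diagrams — a deque per worker thread, with idle workers stealing tasks from busy ones — is what the JDK's own ForkJoinPool implements. A small demo (the task shape and depth are illustrative) that spawns an unbalanced tree of tasks and reads the pool's steal counter:

```java
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveAction;
import java.util.concurrent.atomic.LongAdder;

// Sketch: spawn a binary tree of small tasks; forked subtasks go onto the
// spawning worker's own queue, and idle workers steal from other queues.
public class StealDemo {
    static class CountTask extends RecursiveAction {
        final LongAdder counter; final int depth;
        CountTask(LongAdder counter, int depth) { this.counter = counter; this.depth = depth; }

        @Override protected void compute() {
            counter.increment();
            if (depth > 0) {
                // fork two children; either may be stolen by another worker
                invokeAll(new CountTask(counter, depth - 1),
                          new CountTask(counter, depth - 1));
            }
        }
    }

    public static void main(String[] args) {
        ForkJoinPool pool = new ForkJoinPool(); // one worker per core by default
        LongAdder counter = new LongAdder();
        pool.invoke(new CountTask(counter, 12)); // 2^13 - 1 = 8191 tasks
        System.out.println("tasks run: " + counter.sum());
        System.out.println("steals: " + pool.getStealCount()); // usually > 0 on multi-core
    }
}
```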
28. © 2013 IBM Corporation
Parallel API with the new working mechanism - concurrent_for
Range: the range of the data set, [0, n)
Strategy: the strategy for dividing the range: automatic, or static with fixed granularity. In the automatic case, task granularity will probably differ between tasks
Task: the task that executes the same operation over the range
28
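For comparison (an assumption about intent, not slide code): JDK 8's parallel streams expose the same three-part shape — a range [0, n), an automatic range-division strategy, and a task applied to every index — on top of the common ForkJoinPool.

```java
import java.util.stream.IntStream;

// Sketch: the Range / Strategy / Task triple expressed as a JDK 8 parallel stream.
public class RangeParallel {
    public static long sumOfSquares(int n) {
        return IntStream.range(0, n)                  // Range: [0, n)
                        .parallel()                    // Strategy: automatic splitting
                        .mapToLong(i -> (long) i * i)  // Task: same operation per index
                        .sum();
    }

    public static void main(String[] args) {
        System.out.println(sumOfSquares(10)); // 0 + 1 + 4 + ... + 81 = 285
    }
}
```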
31. © 2013 IBM Corporation
Other high-level parallel API
concurrent_while: can add to the data set while it is being executed concurrently
concurrent_reduce: uses divide/join-based tasks to return a calculation result
concurrent_sort: sorts a data set concurrently
Math calculation: for example, multiplying one matrix by another
  int[5][10] matrix1, int[10][5] matrix2
  int[5][5] matrix3 = matrix1 * matrix2
  int[5][5] matrix3 = concurrent_multiply(matrix1, matrix2)
31
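A hedged sketch of the concurrent_multiply idea: the name comes from the slide, but this implementation (one parallel task per result row, via a parallel stream) is an illustrative assumption.

```java
import java.util.stream.IntStream;

// Sketch: multiply two matrices concurrently, computing each result row
// as an independent parallel task.
public class ConcurrentMultiply {
    static int[][] concurrentMultiply(int[][] a, int[][] b) {
        int n = a.length, m = b[0].length, k = b.length;
        int[][] c = new int[n][m];
        IntStream.range(0, n).parallel().forEach(i -> { // one task per result row
            for (int j = 0; j < m; j++) {
                int s = 0;
                for (int p = 0; p < k; p++) s += a[i][p] * b[p][j];
                c[i][j] = s;
            }
        });
        return c;
    }

    public static void main(String[] args) {
        int[][] a = {{1, 2}, {3, 4}};
        int[][] b = {{5, 6}, {7, 8}};
        System.out.println(java.util.Arrays.deepToString(concurrentMultiply(a, b)));
        // [[19, 22], [43, 50]]
    }
}
```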
32. © 2013 IBM Corporation
Either way, we can always achieve a performance improvement through
parallel computing on multi-core machines.
32
33. © 2013 IBM Corporation
Roadmap
■ Implement a high-level parallel API in the JDK based on the new task scheduler
Goals: Scalable, Correct, Portable, High performance
33
34. © 2013 IBM Corporation
Review of Objectives
■Now that you’ve completed this session, you are able to:
–Understand the design of the new task-based parallel API
–Understand what parallel computing is and what it is good for
34