Implementing 3D SPHARM Surfaces Registration on Cell B.E. Processor

Huian Li (huili@indiana.edu), Mi Yan (miyan@us.ibm.com),
Robert Henschel (rhensche@indiana.edu), Li Shen (shenli@iupui.edu)

July 29, 2009

Contents
• SPHARM registration
• Matlab implementation
• Cell implementation
• Performance Analysis
• Conclusion

SPHARM Surfaces
• Radial and stellar surfaces
• Simply connected, arbitrarily shaped
• Vision, graphics, imaging, bioinformatics

SPHARM Expansion

( )  (x y z)
(,)  (x,y,z)
( )
(,) (x,y,z)
( )
Area-preserving
mapping

SHREC

[Figure: (a) template, (b) object, (c) after ICP, (d) after registration of parameterization]

Calculation of coefficients
• After rotating the parameter net on the surface in
Euler angles (α, β, γ), new coefficients will be:
$$c_l^m(\alpha\beta\gamma) = \sum_{n=-l}^{l} D^l_{mn}(\alpha\beta\gamma)\, c_l^n$$

where

$$D^l_{mn}(\alpha\beta\gamma) = e^{-i(m\alpha + n\gamma)} \sum_{t=\max(0,\,n-m)}^{\min(l+n,\,l-m)} (-1)^t\, d^l_{mnt}(\beta)$$

and

$$d^l_{mnt}(\beta) = \frac{\sqrt{(l+n)!\,(l-n)!\,(l+m)!\,(l-m)!}}{(l+n-t)!\,(l-m-t)!\,(t+m-n)!\,t!} \left(\cos\frac{\beta}{2}\right)^{2l+n-m-2t} \left(\sin\frac{\beta}{2}\right)^{2t+m-n}$$
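The rotation formula can be sketched in a few lines of Python. This is a minimal illustration, not the authors' Matlab or Cell code. It assumes scalar complex coefficients stored as coeffs[l][m + l] (real SPHARM models carry one such coefficient per x, y, z coordinate), and it sums t over the range in which every factorial argument is non-negative. The function names are hypothetical.

```python
import cmath
import math


def wigner_d_term(l, m, n, t, beta):
    """One summand d^l_{mnt}(beta), without the (-1)^t sign."""
    num = math.sqrt(
        math.factorial(l + n) * math.factorial(l - n)
        * math.factorial(l + m) * math.factorial(l - m)
    )
    den = (
        math.factorial(l + n - t) * math.factorial(l - m - t)
        * math.factorial(t + m - n) * math.factorial(t)
    )
    c = math.cos(beta / 2) ** (2 * l + n - m - 2 * t)
    s = math.sin(beta / 2) ** (2 * t + m - n)
    return num / den * c * s


def wigner_D(l, m, n, alpha, beta, gamma):
    """D^l_{mn}: phase factor times the signed sum over t."""
    total = 0.0
    # Bounds keep every factorial argument non-negative.
    for t in range(max(0, n - m), min(l + n, l - m) + 1):
        total += (-1) ** t * wigner_d_term(l, m, n, t, beta)
    return cmath.exp(-1j * (m * alpha + n * gamma)) * total


def rotate_coefficients(coeffs, alpha, beta, gamma):
    """Return rotated coefficients in the same coeffs[l][m + l] layout."""
    rotated = []
    for l, row in enumerate(coeffs):
        new_row = []
        for m in range(-l, l + 1):
            new_row.append(sum(
                wigner_D(l, m, n, alpha, beta, gamma) * row[n + l]
                for n in range(-l, l + 1)
            ))
        rotated.append(new_row)
    return rotated
```

A zero rotation leaves the coefficients unchanged, and any rotation preserves the sum of squared magnitudes within each degree l, which gives two quick sanity checks for an implementation.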

RMSD
• RMSD (Root Mean Square Distance): distance
between two SPHARM models

$$\mathrm{RMSD} = \sqrt{\frac{1}{4\pi} \sum_{l=0}^{L_{max}} \sum_{m=-l}^{l} \left\| c_{1,l}^m - c_{2,l}^m \right\|^2}$$

where $c_{1,l}^m$ and $c_{2,l}^m$ are the coefficients of the two SPHARM models.
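As a rough sketch of the RMSD computation, assume each model is stored as coeffs[l][m + l] = (cx, cy, cz), a triple of complex coefficients for the three coordinates, and that both models share the same Lmax. The layout and the function name are illustrative assumptions, not taken from the slides.

```python
import math


def spharm_rmsd(coeffs1, coeffs2):
    """RMSD between two SPHARM models with identical coefficient layout."""
    total = 0.0
    for row1, row2 in zip(coeffs1, coeffs2):
        for c1, c2 in zip(row1, row2):
            # Squared Euclidean norm of the difference of the (x, y, z) triples.
            total += sum(abs(a - b) ** 2 for a, b in zip(c1, c2))
    return math.sqrt(total / (4 * math.pi))
```

Two identical models give an RMSD of zero, so registration amounts to searching for the rotation that drives this value as low as possible.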

Matlab implementation
• A straightforward implementation in Matlab:

    for l = 0, Lmax
      for m = -l, l
        for n = -l, l
          for t = max(0, n-m), min(l+n, l-m)
            ... performing calculations ...

• One rotation for Lmax = 50 took 823 seconds on a 2 GHz quad-core Intel Xeon E5335

Cell implementation
• Domain decomposition:
    for l = 0, Lmax
      for m = -l, l
        for n = -l, l
          for t = max(0, n-m), min(l+n, l-m)
            ... calculations ...

• Decomposition along l leads to workload imbalance among SPUs

• Decomposition along m creates unnecessary data communication
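The imbalance from splitting along l can be made concrete by counting innermost iterations per degree. The sketch below assumes the degrees 0..Lmax are handed out in contiguous blocks, one block per SPU; the slides do not say how the split was actually done, so this partitioning and the function names are assumptions made for illustration.

```python
def work_for_degree(l):
    """Number of innermost (m, n, t) iterations for a given degree l."""
    count = 0
    for m in range(-l, l + 1):
        for n in range(-l, l + 1):
            # Range of t in which all factorial arguments are non-negative.
            lo, hi = max(0, n - m), min(l + n, l - m)
            count += max(0, hi - lo + 1)
    return count


def block_loads_along_l(l_max, num_spus):
    """Total work per SPU when degrees 0..l_max are split into contiguous blocks."""
    degrees = list(range(l_max + 1))
    size = -(-len(degrees) // num_spus)  # ceiling division
    blocks = [degrees[i:i + size] for i in range(0, len(degrees), size)]
    return [sum(work_for_degree(l) for l in block) for block in blocks]


if __name__ == "__main__":
    for spu, load in enumerate(block_loads_along_l(50, 8)):
        print(f"SPU {spu}: {load} iterations")
```

Because the work per degree grows quickly with l, the SPUs that receive the low degrees finish long before those that receive the high degrees.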

Cell implementation
• Loop fusion:
    for l = 0, Lmax
      for m = -l, l
        for n = -l, l
          for t = max(0, n-m), min(l+n, l-m)
            ... calculations ...
• Unique index for the combined (l, m) loop:
    f(l, m) = l^2 + l + m
• Workload for each SPE:
    (Lmax + 1)^2 / (total # of SPEs)
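A small sketch of the fused index and its inverse, together with one possible way to cut the (Lmax + 1)^2 fused iterations into equal contiguous ranges. The helper names and the contiguous-range policy are assumptions for illustration; the slides give only the index formula and the per-SPE workload.

```python
import math


def fused_index(l, m):
    """Map (l, m) with -l <= m <= l to a unique index k = l^2 + l + m."""
    return l * l + l + m


def unfuse_index(k):
    """Recover (l, m) from the fused index k."""
    l = math.isqrt(k)  # degree l owns the indices l^2 .. (l + 1)^2 - 1
    return l, k - l * l - l


def spe_ranges(l_max, num_spes):
    """Split the fused index range into contiguous half-open (start, end) chunks."""
    total = (l_max + 1) ** 2
    chunk = -(-total // num_spes)  # ceiling division
    return [(start, min(start + chunk, total)) for start in range(0, total, chunk)]
```

For example, fused_index(2, -1) is 5 and unfuse_index(5) returns (2, -1). Each SPE would then loop over its own range of k and recover l and m locally, so no degree has to be assigned whole to a single SPE.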

Cell implementation
• Lookup table T for factorials, with T(k) = log(k!)
• Transform exponentials & multiplications into multiplications & additions, respectively:

$$d^l_{mnt}(\beta) = \frac{\sqrt{(l+n)!\,(l-n)!\,(l+m)!\,(l-m)!}}{(l+n-t)!\,(l-m-t)!\,(t+m-n)!\,t!} \left(\cos\frac{\beta}{2}\right)^{2l+n-m-2t} \left(\sin\frac{\beta}{2}\right)^{2t+m-n}$$

$$= \exp\Big( \tfrac{1}{2}\big(T(l+n) + T(l-n) + T(l+m) + T(l-m)\big) - T(l+n-t) - T(l-m-t) - T(t+m-n) - T(t) + (2l+n-m-2t)\,\log\cos\frac{\beta}{2} + (2t+m-n)\,\log\sin\frac{\beta}{2} \Big)$$
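The log-domain form can be sketched as follows. The table is assumed to hold T[k] = log(k!), built once by accumulating logarithms, and β is assumed to lie strictly between 0 and π so that both logarithms are defined. The function names are hypothetical.

```python
import math


def build_log_factorial_table(max_arg):
    """T[k] = log(k!) for k = 0..max_arg, built by accumulating log(k)."""
    table = [0.0] * (max_arg + 1)
    for k in range(1, max_arg + 1):
        table[k] = table[k - 1] + math.log(k)
    return table


def d_term_log(l, m, n, t, beta, T):
    """d^l_{mnt}(beta) via table lookups, additions and a single exp.

    Assumes 0 < beta < pi and a table T covering arguments up to 2 * l.
    """
    log_value = (
        0.5 * (T[l + n] + T[l - n] + T[l + m] + T[l - m])
        - T[l + n - t] - T[l - m - t] - T[t + m - n] - T[t]
        + (2 * l + n - m - 2 * t) * math.log(math.cos(beta / 2))
        + (2 * t + m - n) * math.log(math.sin(beta / 2))
    )
    return math.exp(log_value)
```

For Lmax = 50 the largest factorial argument is 2 * Lmax = 100, so a table of 101 entries would suffice, and the two logarithms depend only on β, so they can be computed once per rotation.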

Cell implementation
• Other optimizations specific to Cell:
    • Vectorization & data alignment
    • DMA data transfer between main memory & local store
    • SPU decrementer

Cell implementation
• Single precision vs. double precision: all data in single precision

Cell implementation
• Single precision: partial data in double precision

Cell implementation
• Single precision: all critical data in double precision

Performance analysis
Performance of one rotation on Cell BE

[Figure: time (seconds, axis from 0 to 1.8) versus number of SPEs (1, 2, 4, 8, 16)]

Performance analysis
Performance of finding the shortest
distance at Level 3 on Cell BE
[Figure: time (seconds, axis from 0 to 7000) versus number of SPEs (4, 8, 12, 16), for GNU gcc and IBM xlc]

Conclusion
• Performance increases dramatically on Cell due to
its unique architecture and algorithm optimization.
• Care must be taken with data placement because of the limited local store.
• Care must also be taken with data transfer between local store and main memory.
