Supercomputers and Cloud Games
- 1. Supercomputers & cloud gaming
Shinra Technologies, Inc.
Senior vice president
Tetsuji Iwasaki
11/12/2014 © 2014 Shinra Technologies, Inc. All Rights Reserved. 1
- 2. About me
Tetsuji Iwasaki
Hobby: Beer
Started working in the industry in 1990, joined Square-Enix in 1994
Some famous titles:
FFT/FFXI/Crysis
+17 game projects
Currently holding these positions:
2011 Square-Enix Holdings, technology planning specialist
2012 Eidos Montreal, development director
2014 Shinra Technologies, Inc., SVP (Technology)
- 3. What is cloud gaming?
Controller Input
Internet
Streaming Video
Data center
「Mini Ninjas」
© 2009 Eidos Interactive Ltd. Co-published by Eidos, Inc. and Warner Bros. Interactive Entertainment,
a division of Warner Bros. Home Entertainment Inc.
- 4. What is a supercomputer?
There is no clear definition…
- 5. What image of a supercomputer do you have in mind?
http://jp.fujitsu.com/about/tech/k/ Reproduced from the supercomputer "K" page, accessed 2014/9/17
- 6. Let's see the top 10
http://www.top500.org/
Rank | Name | Description | Vendor | Country
1 | Tianhe-2 (MilkyWay-2) | TH-IVB-FEP Cluster, Intel Xeon E5-2692 12C 2.200GHz, TH Express-2, Intel Xeon Phi 31S1P | NUDT | China
2 | Titan | Cray XK7, Opteron 6274 16C 2.200GHz, Cray Gemini interconnect, NVIDIA K20x | Cray Inc. | United States
3 | Sequoia | BlueGene/Q, Power BQC 16C 1.60GHz, Custom | IBM | United States
4 | K computer | SPARC64 VIIIfx 2.0GHz, Tofu interconnect | Fujitsu | Japan
5 | Mira | BlueGene/Q, Power BQC 16C 1.60GHz, Custom | IBM | United States
6 | Piz Daint | Cray XC30, Xeon E5-2670 8C 2.600GHz, Aries interconnect, NVIDIA K20x | Cray Inc. | Switzerland
7 | Stampede | PowerEdge C8220, Xeon E5-2680 8C 2.700GHz, Infiniband FDR, Intel Xeon Phi SE10P | Dell | United States
8 | JUQUEEN | BlueGene/Q, Power BQC 16C 1.600GHz, Custom Interconnect | IBM | Germany
9 | Vulcan | BlueGene/Q, Power BQC 16C 1.600GHz, Custom Interconnect | IBM | United States
10 | | Cray XC30, Intel Xeon E5-2697v2 12C 2.7GHz, Aries interconnect | Cray Inc. | United States
- 7. Intel® Xeon®
- 8. IBM® Power® BQC
- 9. Fujitsu® SPARC®64 VIIIfx
- 10. NVIDIA® Tesla® / Intel® Xeon Phi™
- 11. The trend
General purpose processor
85.4% of the TOP500 systems use Intel processors; the exact breakdown is unclear, but most of them are probably Xeon
Amazon EC2 is ranked 76th
Amazon EC2 C3 Instance cluster Intel Xeon E5-2680v2 10C 2.800GHz, 10G Ethernet
- 12. Supercomputers and GPUs
TESLA GPU ACCELERATORS FOR SERVERS http://www.nvidia.com/object/tesla-servers.html
accessed 2014-9-17
NVIDIA® Tesla®
Intel® Xeon Phi™ Coprocessor
Intel® Xeon Phi™ Coprocessor product specifications
http://www.intel.co.jp/content/www/jp/ja/processors/xeon/xeon-phi-detail.html accessed 2014-9-17
- 13. The impact of DEGIMA
*Tsuyoshi Hamada, Tetsu Narumi, Rio Yokota, Kenji Yasuoka and Keigo Nitadori. 42 TFlops Hierarchical N-body Simulations on GPUs with Applications in both Astrophysics and Turbulence.
SC '09 Proceedings of the Conference on High Performance Computing Networking, Storage and Analysis Article No. 62
- 14. *Introduction to the Nagasaki University GPU cluster DEGIMA (DEstination for Gpu Intensive MAchine), https://www.cps-jp.org/seminar/fy2010/2010-12-01/hamada/pub/20101201_hamada_02.pdf page 5
accessed 2014-9-17
- 15. Be careful, just in case
The value of a supercomputer can't be judged by Linpack benchmark performance alone
Maintenance, usability, and the purpose of the calculations are not considered by the TOP500 ranking
But maybe people should pay more attention to the cost…
- 16. How to check supercomputers
1 Tianhe-2 (MilkyWay-2) TH-IVB-FEP Cluster, Intel Xeon E5-2692 12C 2.200GHz, TH Express-2, Intel Xeon Phi 31S1P
TH-IVB-FEP Cluster -> system name
Intel Xeon E5-2692 12C 2.200GHz -> CPU name
TH Express-2 -> interconnect
Intel Xeon Phi 31S1P -> accelerator
The K computer's interconnect, "Tofu":
6-dimensional mesh/torus
"Supercomputer high-dimensional interconnect technology receives the Imperial Invention Prize"
http://pr.fujitsu.com/jp/news/2014/05/29.html accessed 2014-09-17
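The field breakdown of the Tianhe-2 entry can be sketched in code. This is only an illustration, assuming the description is a comma-separated string in the order system name, CPU, interconnect, accelerator. Not every entry in the list follows that order (the K computer entry, for example, has no separate system name), and `Top500Entry` and `parse_entry` are hypothetical names, not part of any TOP500 tooling.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Top500Entry:
    system: str
    cpu: str
    interconnect: str
    accelerator: Optional[str]  # None for systems without an accelerator


def parse_entry(description: str) -> Top500Entry:
    """Split a description into fields (assumed order, see the note above)."""
    parts = [part.strip() for part in description.split(",")]
    if len(parts) < 3:
        raise ValueError("expected at least system, CPU and interconnect")
    accelerator = parts[3] if len(parts) > 3 else None
    return Top500Entry(parts[0], parts[1], parts[2], accelerator)


entry = parse_entry(
    "TH-IVB-FEP Cluster, Intel Xeon E5-2692 12C 2.200GHz, "
    "TH Express-2, Intel Xeon Phi 31S1P"
)
print(entry)
```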
- 17. Questions so far?
- 18. Some technology components of the Shinra system
Remote rendering architecture
RDMA/TCP dual-protocol interconnect
Distribution models depending on game design
- 20. Remote rendering architecture
• Rendering happens on the GPU server
• DirectX 11 API calls are executed on my laptop
- 21. Remote rendering architecture
[Diagram: Game.exe (third-party) in its process environment, with fake dxgi.dll and d3d11.dll alongside dinput.dll and ws2_32.dll; Renderer.exe with ws2_32.dll, dxgi.dll, d3d11.dll, nvwgf2umx.dll, and nvlddmkm.sys; a network card on each side.]
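The fake dxgi.dll and d3d11.dll in the diagram suggest a proxy pattern: the game calls what looks like the graphics API, and the calls are forwarded to a separate renderer process. The sketch below illustrates that general idea only. It is not Shinra's implementation and not the DirectX API; `FakeDevice`, `Renderer`, the JSON encoding, and the `clear`/`draw`/`present` calls are all hypothetical stand-ins.

```python
import json


class FakeDevice:
    """Stands in for the real graphics API inside the game process."""

    def __init__(self):
        self.commands = []

    def __getattr__(self, name):
        # Any unknown method call is recorded instead of executed locally.
        def record(*args):
            self.commands.append({"call": name, "args": list(args)})
        return record

    def flush(self) -> bytes:
        """Serialize the recorded calls; this is what would go on the wire."""
        data = json.dumps(self.commands).encode("utf-8")
        self.commands = []
        return data


class Renderer:
    """Stands in for the renderer process that owns the real GPU."""

    def __init__(self):
        self.log = []

    def clear(self, r, g, b):
        self.log.append(f"clear({r}, {g}, {b})")

    def draw(self, vertex_count):
        self.log.append(f"draw({vertex_count})")

    def present(self):
        self.log.append("present()")

    def replay(self, data: bytes):
        # Decode the command stream and execute each call in order.
        for command in json.loads(data.decode("utf-8")):
            getattr(self, command["call"])(*command["args"])


device = FakeDevice()
device.clear(0, 0, 0)
device.draw(3)
device.present()

renderer = Renderer()
renderer.replay(device.flush())
print(renderer.log)
```

In a real system the byte string returned by `flush` would travel over the network between the two machines rather than being passed in-process.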
- 22. Remote rendering architecture
• Separate CPU & GPU Servers
• Many users per logical unit
• Flexible architecture allows efficient CPU/GPU usage
[Diagram: a logical unit of the game system shown against physical units, each built from CPU and GPU servers.]
- 23. CPU/GPU performance mismatch
[Figure labels: GPU, CPU]
- 24. The relationship between the cost and performance
[Chart: fitted trend line $y = 1037.3\,x^{-0.826}$, $R^2 = 0.9055$; horizontal axis from 0 to 160,000, vertical axis from 0 to 0.4.]
Twice the price doesn't mean double the performance.
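The fitted curve can be evaluated directly to see what it implies. The axis labels did not survive in this transcript, so the sketch below assumes that x is price and y is performance per unit of price; that reading is an assumption, not something the slide states. Under it, doubling the price multiplies y by $2^{-0.826}$, roughly 0.56, so total performance (x times y) grows by only about 13%.

```python
# Coefficients of the power-law trend line from the chart.
A = 1037.3
B = -0.826


def fit(x: float) -> float:
    """Evaluate the fitted curve y = A * x**B."""
    return A * x ** B


def ratio_when_doubling(x: float) -> float:
    """How y changes when x doubles; for a power law this is 2**B for any x."""
    return fit(2 * x) / fit(x)


ratio = ratio_when_doubling(20_000.0)

# ASSUMPTION: y is performance per unit price, so total performance is x * y.
total_gain = 2 * ratio

print(f"y changes by a factor of {ratio:.3f} when x doubles")
print(f"total performance changes by a factor of {total_gain:.3f} (assumed axes)")
```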
- 25. Rendering 60 games on a server
- 26. RDMA/TCP dual-protocol interconnect
- 27. The performance of a recent network card (TCP)
Comp01<->GPU01, effective bandwidth 8.8 Gbps; loopback (E5-1650 @ 3.2GHz), effective bandwidth 3.59 Gbps
Unit size | RTT (μsec) | Unit size | RTT (μsec)
4 | 42.09 | 4 | 15.080261
8 | 41.75 | 8 | 14.986181
16 | 42.18 | 16 | 15.00307
32 | 41.86 | 32 | 15.097176
64 | 42.69 | 64 | 15.081717
128 | 42.91 | 128 | 15.106041
256 | 43.35 | 256 | 15.17368
512 | 44.6 | 512 | 15.301775
1024 | 46.6 | 1024 | 15.67151
2048 | 64.19 | 2048 | 24.330402
4096 | 79.87 | 4096 | 30.921734
8192 | 140.06 | 8192 | 45.846207
16384 | 186.85 | 16384 | 79.473488
32768 | 291.19 | 32768 | 129.546127
65536 | 497.89 | 65536 | 227.030136
131072 | 909.93 | 131072 | 435.540619
262144 | 1800.49 | 262144 | 929.645325
524288 | 3483.36 | 524288 | 1904.819336
1048576 | 6841.73 | 1048576 | 4009.06958
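The slides do not say how these round-trip times were measured. A common way to get numbers of this kind is a ping-pong test: send a message of a given size, wait for the peer to echo it back, and average over many rounds. The sketch below does that over TCP on the local loopback interface. It assumes the unit size is in bytes, and `measure_rtt_us`, `echo_server`, and `recv_exact` are hypothetical helper names.

```python
import socket
import threading
import time


def recv_exact(sock: socket.socket, size: int) -> bytes:
    """Read exactly `size` bytes; TCP may deliver them in several chunks."""
    buf = bytearray()
    while len(buf) < size:
        chunk = sock.recv(size - len(buf))
        if not chunk:
            raise ConnectionError("peer closed the connection")
        buf.extend(chunk)
    return bytes(buf)


def echo_server(listener: socket.socket, size: int, rounds: int) -> None:
    """Accept one client and echo back `rounds` messages of `size` bytes."""
    conn, _ = listener.accept()
    with conn:
        conn.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
        for _ in range(rounds):
            conn.sendall(recv_exact(conn, size))


def measure_rtt_us(size: int, rounds: int = 50) -> float:
    """Average round-trip time in microseconds for one message size."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    listener.listen(1)
    port = listener.getsockname()[1]
    server = threading.Thread(
        target=echo_server, args=(listener, size, rounds), daemon=True
    )
    server.start()

    payload = b"\x00" * size
    with socket.create_connection(("127.0.0.1", port)) as client:
        client.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
        start = time.perf_counter()
        for _ in range(rounds):
            client.sendall(payload)
            recv_exact(client, size)
        elapsed = time.perf_counter() - start

    server.join()
    listener.close()
    return elapsed / rounds * 1e6


if __name__ == "__main__":
    for size in (4, 1024, 65536):
        print(f"{size:>8} bytes: {measure_rtt_us(size, rounds=20):.2f} us")
```

Absolute values from an interpreted language will differ from the figures in the table; the point is the measurement pattern, not the numbers.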
- 28. Mellanox ConnectX-3
- Can use RDMA in an Ethernet environment
- The interconnect of Tianhe-2 (MilkyWay-2) uses RDMA as well
- Can skip most of the OS/driver layers and move memory directly to remote machines
http://www.mellanox.com/page/products_dyn?product_family=127 accessed 2014-9-17
- 29. The interconnect of the Shinra system
[Diagram: Game.exe (third-party) with dinput.dll, fake dxgi.dll and d3d11.dll, and ws2_32.dll, sending a binary command stream to Renderer.exe, which has ws2_32.dll, dxgi.dll, d3d11.dll, nvwgf2umx.dll, and nvlddmkm.sys; the output goes to the video card.]
Compression (500μs / Ratio 1:8)
Transmission to the Renderer
• Using TCP over Gigabit Ethernet (500μs)
• Using RDMA over Converged Ethernet (50μs)
Decompression (200μs)
Delay ≈ 1.2ms
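Adding up the per-stage figures shows where the 1.2 ms comes from: it matches the sum for the TCP path. The sketch below just does that arithmetic for both transports. The 750 μs total for the RDMA path is derived from the slide's own numbers, not a figure stated on the slide, and the constant and function names are hypothetical.

```python
# Per-stage latencies in microseconds, as given on the slide.
COMPRESSION_US = 500
DECOMPRESSION_US = 200
TRANSMISSION_US = {
    "TCP over Gigabit Ethernet": 500,
    "RDMA over Converged Ethernet": 50,
}


def total_delay_us(transport: str) -> int:
    """Compression + transmission + decompression for one transport."""
    return COMPRESSION_US + TRANSMISSION_US[transport] + DECOMPRESSION_US


for transport in TRANSMISSION_US:
    print(f"{transport}: {total_delay_us(transport)} us")
```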
- 30. Distribution models depending on game design
Stand alone architecture
SS Architecture
MK Architecture
- 31. Stand alone architecture
[Diagram: on the Compute Server, Game.exe receives input over the internet and sends rendering commands to Rendering.exe on the Rendering Server, which uses the GPU and sends video back over the internet.]
- 32. SS Architecture
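[Diagram: the Compute Server runs four Game processes and a Server process, receives input over the internet, and sends rendering commands to the Remote Renderer on the Rendering Server, whose GPUs produce 4 x video streams sent back over the internet.]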
- 33. MK Architecture
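[Diagram: the Compute Server runs one Game process shared by four users (4 users in a single process…), receives input over the internet, and sends rendering commands to the Remote Renderer on the Rendering Server, whose GPUs produce 4 x video streams sent back over the internet.]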
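The three distribution models differ mainly in how users map onto game processes. The sketch below captures that difference as data, using the counts shown in the diagrams (four Game processes in SS, four users in a single process in MK). It assumes one user per process in the stand-alone case and one video stream per user; `DistributionModel` is a hypothetical type for illustration, not part of any Shinra SDK.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DistributionModel:
    name: str
    game_processes: int      # game processes on the compute server
    users_per_process: int   # users handled by each process

    @property
    def video_streams(self) -> int:
        # ASSUMPTION: every user gets exactly one video stream.
        return self.game_processes * self.users_per_process


MODELS = [
    DistributionModel("Stand alone", game_processes=1, users_per_process=1),
    DistributionModel("SS", game_processes=4, users_per_process=1),
    DistributionModel("MK", game_processes=1, users_per_process=4),
]

for model in MODELS:
    print(
        f"{model.name}: {model.game_processes} process(es), "
        f"{model.users_per_process} user(s) each, "
        f"{model.video_streams} video stream(s)"
    )
```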
- 34. We will provide a standardized SDK for these three architectures