ARM 64bit has come!
Tetsuyuki Kobayashi
2014.5.23 Japan Technical Jamboree
2014.5.25 Updated for カーネル/VM探検隊
 The latest version of these slides will be available here:
 http://www.slideshare.net/tetsu.koba/presentations
Who am I?
 20+ years involved in embedded systems
 10 years in real time OS, such as iTRON
 10 years in embedded Java Virtual Machine
 Now GCC, Linux, QEMU, Android, …
 Blogs
 http://d.hatena.ne.jp/embedded/ (Personal)
 http://blog.kmckk.com/ (Corporate)
 http://kobablog.wordpress.com/ (English)
 Twitter
 @tetsu_koba
Today's topics
 Introduction to 64-bit ARM
 Not a complete overview, only the parts I find interesting :)
 Try aarch64 using QEMU
ARMv8 terminology
 AArch64: the 64-bit execution state
 1 instruction set: A64
 A64: fixed-length 32-bit instructions
 AArch32: the 32-bit execution state
 Backward compatible with the ARMv7-A architecture
 2 instruction sets: A32, T32
 A32: ARM, fixed-length 32-bit instructions
 T32: Thumb-2, mixed 16-bit/32-bit instructions
"ARM64" is not an official name
 It is the name used in the Linux kernel source
 arch/arm64
Exception level
 4 levels
 Typical usage
 EL0: User application
 EL1: Kernel of OS
 EL2: Hypervisor
 EL3: Secure monitor
 The execution state (AArch64/AArch32) can change only when the exception level changes
 Cf. PL0–PL2 (privilege levels) in ARMv7
AArch64 execution model
 R0 – R30: 64-bit general-purpose registers
 Wn: lower 32 bits
 Xn: full 64 bits
 Register number 31 (the 32nd register) encodes either the zero register (XZR, WZR) or SP, depending on the instruction
 SP: Stack Pointer
 Must be 16-byte aligned when used to access memory
 WSP for the lower 32 bits
 PC: Program Counter
 Cannot be used as the destination of a calculation
AArch64 execution model (cont.)
 V0 – V31: 128-bit registers
 For floating point and SIMD
 AArch64 code can assume an FPU; there is no calling standard for soft-float
 Scalar views: Bn, Hn, Sn, Dn, Qn
 Vector arrangements: Vn.8B, Vn.16B, Vn.4H, Vn.8H, Vn.2S, Vn.4S, Vn.1D, Vn.2D
 FPCR: Floating Point Control Register
 FPSR: Floating Point Status Register
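As a rough illustration of the Vn.4S arrangement (four 32-bit lanes in one 128-bit register), here is a minimal C sketch using the NEON intrinsics from arm_neon.h. The function name add4 is hypothetical, and the scalar fallback is there only so the sketch also builds on non-AArch64 hosts.

```c
#include <stddef.h>
#if defined(__aarch64__)
#include <arm_neon.h>
#endif

/* Hypothetical helper: out[i] = a[i] + b[i] for exactly 4 floats. */
void add4(const float *a, const float *b, float *out)
{
#if defined(__aarch64__)
    /* float32x4_t corresponds to the Vn.4S arrangement. */
    float32x4_t va = vld1q_f32(a);
    float32x4_t vb = vld1q_f32(b);
    vst1q_f32(out, vaddq_f32(va, vb));
#else
    /* Scalar fallback for hosts without AArch64 NEON. */
    for (size_t i = 0; i < 4; i++)
        out[i] = a[i] + b[i];
#endif
}
```

On AArch64 a compiler would be expected to turn the intrinsic path into a single vector FADD on .4S operands, but the exact code depends on the compiler and options.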
AArch64 addressing model
 Without tag: 64-bit virtual address
 With tag: 8-bit tag + 56-bit virtual address
 The tag is ignored by loads, stores, and branches
 Good for implementing dynamically typed languages
 The effective virtual address length is 48 bits
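To make the 8-bit tag + 56-bit address layout concrete, here is a small C sketch of how a language runtime might pack a type tag into the top byte of a pointer. All names (tag_pointer, pointer_tag, strip_tag) are hypothetical. The sketch strips the tag in software before dereferencing, so it does not rely on the OS enabling tagged addresses, and it assumes a user-space pointer whose top byte is zero.

```c
#include <stdint.h>

#define TAG_SHIFT 56                                  /* tag lives in bits 63..56 */
#define ADDR_MASK ((UINT64_C(1) << TAG_SHIFT) - 1)    /* low 56 bits: the address */

/* Pack an 8-bit tag into the top byte of a pointer value. */
static inline uint64_t tag_pointer(const void *p, uint8_t tag)
{
    uint64_t bits = (uint64_t)(uintptr_t)p;
    return (bits & ADDR_MASK) | ((uint64_t)tag << TAG_SHIFT);
}

/* Read the tag back. */
static inline uint8_t pointer_tag(uint64_t tagged)
{
    return (uint8_t)(tagged >> TAG_SHIFT);
}

/* Recover a plain pointer (assumes the original top byte was zero). */
static inline void *strip_tag(uint64_t tagged)
{
    return (void *)(uintptr_t)(tagged & ADDR_MASK);
}
```

With hardware tag support enabled for the process, the strip_tag step before a load or store would not be needed, which is the point of the slide.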
Calling standard (AAPCS64)
 R30 = LR (Link Register)
 R29 = FP (Frame Pointer)
 Parameter passing
 R0 – R7 for integer and pointer
 V0 – V7 for floating point and SIMD
 Callee must preserve
 R19 – R29, SP
 V8 – V15 (only the lower 64 bits of each)
 No calling standard for soft-float
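The parameter-passing rules can be read off a C prototype. The sketch below uses two hypothetical functions, and the register assignments in the comments assume an AAPCS64-conforming compiler targeting AArch64; on other targets the code still runs but the registers differ.

```c
#include <stdint.h>

/* Under AAPCS64 (assumed): a..h arrive in X0..X7, the ninth integer
 * argument no longer fits in registers and is passed on the stack.
 * The result is returned in X0. */
int64_t sum9(int64_t a, int64_t b, int64_t c, int64_t d,
             int64_t e, int64_t f, int64_t g, int64_t h,
             int64_t on_stack)
{
    return a + b + c + d + e + f + g + h + on_stack;
}

/* Integer and floating-point arguments are numbered independently:
 * n arrives in X0 and factor in D0 (the low 64 bits of V0).
 * The double result is returned in D0. */
double scale(int64_t n, double factor)
{
    return (double)n * factor;
}
```

Compiling this with the cross compiler and disassembling it with objdump is a quick way to check the register usage against the list above.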
A64 instruction set
 Brand-new, clean design for the 64-bit architecture
 Most instructions are not conditional: only a very small set of "conditional data processing" instructions (such as CSEL) remains
 No equivalent of Thumb-2's IT instruction
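A simple select in C shows where the remaining conditional instructions fit. The function name clamp_low is hypothetical. A compiler targeting A64 will often lower this to a CMP followed by CSEL with no branch, but that is an expectation rather than a guarantee and depends on the compiler and optimization level.

```c
/* Return x, but never less than low.
 * Typical A64 lowering (assumption): cmp w0, w1 ; csel w0, w1, w0, lt */
int clamp_low(int x, int low)
{
    return x < low ? low : x;
}
```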
No multiple load/store
 No load/store-multiple instructions for general-purpose registers, such as LDM/STM or PUSH/POP
 Instead, there are load/store-pair instructions that transfer 2 registers: LDP/STP
YIELD instruction
 A NOP that hints the current thread is not doing anything important
 Use it in spin loops; it can trigger a thread switch on SMT (Simultaneous Multi-Threading) cores
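A spin loop using YIELD might look like the following sketch. It uses C11 atomics plus GCC/Clang-style inline assembly. The names cpu_relax, spin_t, spin_lock, spin_trylock and spin_unlock are hypothetical, and on non-AArch64 hosts cpu_relax is simply empty.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Tell the core that this thread is only waiting. */
static inline void cpu_relax(void)
{
#if defined(__aarch64__)
    __asm__ volatile("yield" ::: "memory");
#else
    /* No hint issued on other architectures in this sketch. */
#endif
}

typedef struct { atomic_flag locked; } spin_t;

/* Try once; returns true if the lock was taken. */
static inline bool spin_trylock(spin_t *l)
{
    return !atomic_flag_test_and_set_explicit(&l->locked, memory_order_acquire);
}

/* Spin until the lock is taken, yielding between attempts. */
static inline void spin_lock(spin_t *l)
{
    while (!spin_trylock(l))
        cpu_relax();
}

static inline void spin_unlock(spin_t *l)
{
    atomic_flag_clear_explicit(&l->locked, memory_order_release);
}
```

Whether YIELD has any effect is implementation defined; on a core without SMT it may behave as a plain NOP.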
Sample #1 source
#include <stdio.h>

int main(void)
{
    int i;

    for (i = 5; i >= 0; i--) {
        printf("count down: %d\n", i);
    }
    return 0;
}
Sample #1 Thumb2
000083f8 <main>:
83f8: b570 push {r4, r5, r6, lr}
83fa: 2405 movs r4, #5
83fc: f248 456c movw r5, #33900 ; 0x846c
8400: f2c0 0500 movt r5, #0
8404: 2601 movs r6, #1
8406: 4630 mov r0, r6
8408: 4629 mov r1, r5
840a: 4622 mov r2, r4
840c: f7ff ef7a blx 8304 <_init+0x38>
8410: 3c01 subs r4, #1
8412: f1b4 3fff cmp.w r4, #4294967295 ; 0xffffffff
8416: d1f6 bne.n 8406 <main+0xe>
8418: 2000 movs r0, #0
841a: bd70 pop {r4, r5, r6, pc}
Sample #1 A64
0000000000400440 <main>:
400440: a9be7bfd stp x29, x30, [sp,#-32]!
400444: 910003fd mov x29, sp
400448: a90153f3 stp x19, x20, [sp,#16]
40044c: 90000014 adrp x20, 400000 <_init-0x3c0>
400450: 528000b3 mov w19, #0x5 // #5
400454: 911a0294 add x20, x20, #0x680
400458: 2a1303e2 mov w2, w19
40045c: 52800020 mov w0, #0x1 // #1
400460: aa1403e1 mov x1, x20
400464: 97ffffeb bl 400410 <__printf_chk@plt>
400468: 51000673 sub w19, w19, #0x1
40046c: 3100067f cmn w19, #0x1
400470: 54ffff41 b.ne 400458 <main+0x18>
400474: 52800000 mov w0, #0x0 // #0
400478: a94153f3 ldp x19, x20, [sp,#16]
40047c: a8c27bfd ldp x29, x30, [sp],#32
400480: d65f03c0 ret
Sample #2 source
int iaload(int *base, int index)
{
    return base[index];
}

long long laload(long long *base, int index)
{
    return base[index];
}

char ibload(char *base, int index)
{
    return base[index];
}

short isload(short *base, int index)
{
    return base[index];
}
Sample #2 Thumb2
00000000 <iaload>:
0: f850 0021 ldr.w r0, [r0, r1, lsl #2]
4: 4770 bx lr
6: bf00 nop
00000008 <laload>:
8: eb00 01c1 add.w r1, r0, r1, lsl #3
c: e9d1 0100 ldrd r0, r1, [r1]
10: 4770 bx lr
12: bf00 nop
00000014 <ibload>:
14: 5c40 ldrb r0, [r0, r1]
16: 4770 bx lr
00000018 <isload>:
18: f930 0011 ldrsh.w r0, [r0, r1, lsl #1]
1c: 4770 bx lr
1e: bf00 nop
Sample #2 A64
0000000000000000 <iaload>:
0: b861d800 ldr w0, [x0,w1,sxtw #2]
4: d65f03c0 ret
0000000000000008 <laload>:
8: f861d800 ldr x0, [x0,w1,sxtw #3]
c: d65f03c0 ret
0000000000000010 <ibload>:
10: 3861c800 ldrb w0, [x0,w1,sxtw]
14: d65f03c0 ret
0000000000000018 <isload>:
18: 7861d800 ldrh w0, [x0,w1,sxtw #1]
1c: d65f03c0 ret
Sample #3 source
double range(double x, double min, double max)
{
    if (x < min)
        return min;
    else if (x > max)
        return max;
    else
        return x;
}
Sample #3 Thumb2
00000000 <range>:
0: eeb4 0bc1 vcmpe.f64 d0, d1
4: eef1 fa10 vmrs APSR_nzcv, fpscr
8: d407 bmi.n 1a <range+0x1a>
a: eeb4 0bc2 vcmpe.f64 d0, d2
e: eef1 fa10 vmrs APSR_nzcv, fpscr
12: bfc8 it gt
14: eeb0 0b42 vmovgt.f64 d0, d2
18: 4770 bx lr
1a: eeb0 0b41 vmov.f64 d0, d1
1e: 4770 bx lr
Sample #3 A64
0000000000000000 <range>:
0: 1e612010 fcmpe d0, d1
4: 540000a4 b.mi 18 <range+0x18>
8: 1e622010 fcmpe d0, d2
c: 1e604041 fmov d1, d2
10: 5400004c b.gt 18 <range+0x18>
14: 1e604001 fmov d1, d0
18: 1e604020 fmov d0, d1
1c: d65f03c0 ret
Cache control
 Application-level (EL0) cache maintenance instructions
 Data cache
 DC CVAU
 DC CVAC
 DC CIVAC
 Instruction cache
 IC IVAU
 No system call into the kernel is needed
 JIT friendly
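For a JIT, the practical consequence is that freshly written code can be made visible to instruction fetch from user space. The sketch below uses the GCC/Clang builtin __builtin___clear_cache instead of writing DC/IC instructions by hand. The helper name publish_code is hypothetical. On AArch64 the builtin is typically implemented with DC CVAU and IC IVAU plus barriers, but that lowering is an assumption about the toolchain.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical JIT helper: copy generated machine code into dst and
 * synchronize the data and instruction caches for that range.
 * dst must later be mapped executable by the caller (not shown). */
static void publish_code(void *dst, const void *code, size_t len)
{
    memcpy(dst, code, len);
    /* Make the new bytes visible to instruction fetch. */
    __builtin___clear_cache((char *)dst, (char *)dst + len);
}
```

On architectures with coherent instruction caches the builtin may compile to nothing, so the same source stays portable.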
Preloading cache
 PRFM <prfop>, addr|label
 <prfop> ::= <type><target><policy>
 <type> ::= PLD | PST | PLI
 <target> ::= L1 | L2 | L3
 <policy> ::= KEEP | STRM
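From C, prefetch hints are usually issued through the GCC/Clang builtin __builtin_prefetch(addr, rw, locality) instead of writing PRFM directly. In the sketch below, the function name sum_with_prefetch and the distance of 16 elements are assumptions chosen for illustration. With rw = 0 and locality = 3, an AArch64 compiler would typically emit something like PRFM PLDL1KEEP, though the exact mapping is up to the compiler.

```c
#include <stddef.h>

#define PREFETCH_DISTANCE 16   /* elements ahead; an untuned guess */

/* Sum an array while hinting that data a few iterations ahead
 * will be read soon and should stay in cache. */
long sum_with_prefetch(const int *data, size_t n)
{
    long total = 0;
    for (size_t i = 0; i < n; i++) {
        if (i + PREFETCH_DISTANCE < n)
            __builtin_prefetch(&data[i + PREFETCH_DISTANCE], 0, 3);
        total += data[i];
    }
    return total;
}
```

For a simple linear scan like this the hardware prefetcher usually does the job already; explicit hints tend to matter more for irregular access patterns.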
Non-temporal load/store
 LDNP/STNP
 Hint that the data is unlikely to be accessed again soon (as in streaming)
AArch32
 Backward compatible with ARMv7
 Adds the Cryptography Extension
 Adds some other new instructions to align with AArch64
 Removes Jazelle and ThumbEE
Let's try AArch64 using QEMU
 QEMU 2.0 supports AArch64 user-mode emulation
 Ubuntu 14.04 ships QEMU 2.0 and a cross compiler for AArch64
$ sudo apt-get install qemu-user-static
$ sudo apt-get install g++-aarch64-linux-gnu
Prepare gdb for aarch64
$ sudo apt-get build-dep gdb
$ wget http://ftp.gnu.org/gnu/gdb/gdb-7.7.1.tar.bz2
$ tar xf gdb-7.7.1.tar.bz2
$ mkdir obj
$ cd obj
$ ../gdb-7.7.1/configure --target=aarch64-linux-gnu
$ make
$ sudo make install
Run under QEMU and connect gdb
$ aarch64-linux-gnu-gcc -g a.c
$ export QEMU_LD_PREFIX=/usr/aarch64-linux-gnu/
$ qemu-aarch64-static -g 1234 ./a.out
$ aarch64-linux-gnu-gdb ./a.out
  ...
(gdb) target remote :1234
(gdb) b main
(gdb) c
(gdb) x/i $pc
=> 0x4005a0 <main>: stp x29, x30, [sp,#-48]!
(gdb)
DEMO
Any comment?
@tetsu_koba
Thank you for listening!
