REAL-TIME FACE TRACKING 
NOV 2014
LOOKSERY 
VIDEO SELFIES + FACE FILTERS + INTEGRATED CHAT
REAL-TIME FACE TRACKING DEMO 
TRACKING ALGORITHM
- The algorithm is based on the Active Appearance Model.
- Algorithm complexity is independent of image size.
- The balance between tracking quality and tracking speed is controlled
by only two constants.
- The algorithm is iterative: it solves a least-squares problem at each iteration.
- On average 5 iterations per frame (maximum 10, minimum 1).
- To run at 30 fps you have to perform about 150 iterations per second.
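The arithmetic behind the last bullet can be sketched as follows. The function names are hypothetical, and the per-iteration budget assumes the whole frame time is spent on tracking iterations, which the slides do not state.

```cpp
#include <cassert>

// Hypothetical helpers illustrating the iteration budget from the bullets above.
// Iterations per second needed to sustain a given frame rate.
constexpr int iterationsPerSecond(int fps, int iterationsPerFrame) {
    return fps * iterationsPerFrame;
}

// Time available for one iteration, in milliseconds, assuming (for
// illustration only) that tracking gets the entire frame time.
constexpr double iterationBudgetMs(int fps, int iterationsPerFrame) {
    return 1000.0 / static_cast<double>(fps * iterationsPerFrame);
}
```

With the average of 5 iterations per frame at 30 fps this gives 150 iterations per second, or roughly 6.7 ms per iteration; the worst case of 10 iterations per frame halves that budget.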
Optimisation flow
--: Asymptotic optimisation of the algorithm
3 FPS: First implementation
8 FPS: Memory preallocation
10 FPS: Tuning of algorithm parameters
13 FPS: Matrix storage optimisation and removal of OOP code
18 FPS: Bottleneck code rewritten in assembler
24 FPS: Asymptotic optimisation of matrix multiplication
27 FPS: Float operations replaced with int operations
30 FPS: Multithreading
From float to int
The matrix G is defined as
G[i][j] = (X[i][j] - Y[i][j]) / d[j];
We had to build the so-called pseudo-inverse of G, that is, (G^T * G)^-1 * G^T,
so we have to perform many multiplication operations. Multiplying two ints
is much faster than multiplying two floats. Let's create an int matrix V:
V[i][j] = X[i][j] - Y[i][j];
And a float matrix D:
D[i][j] = ( i== j ? d[i] : 0); // diagonal matrix
Then G = V * D^-1. From linear algebra, since D is diagonal and invertible:
(G^T * G)^-1 * G^T = D * (V^T * V)^-1 * V^T
so the large product V^T * V needs only int multiplications.
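A minimal sketch of the idea, assuming G[i][j] = V[i][j] / d[j] as defined on this slide: the inner products of G^T * G are accumulated in integers and the float scaling is applied once per output element. The function name `gramFromInt` and the use of `std::vector` are illustrative assumptions, not the authors' implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: returns G^T * G for G[i][j] = V[i][j] / d[j].
// Element (a, b) equals (V^T * V)[a][b] / (d[a] * d[b]), so the long
// inner loop uses only integer multiplications.
std::vector<std::vector<float>> gramFromInt(
        const std::vector<std::vector<int>>& V, const std::vector<float>& d) {
    const std::size_t cols = d.size();
    std::vector<std::vector<float>> gtg(cols, std::vector<float>(cols, 0.0f));
    for (std::size_t a = 0; a < cols; ++a) {
        for (std::size_t b = a; b < cols; ++b) {
            std::int64_t acc = 0;  // integer accumulator
            for (const std::vector<int>& rowOfV : V) {
                assert(rowOfV.size() == cols);
                acc += static_cast<std::int64_t>(rowOfV[a]) * rowOfV[b];
            }
            // One float division per output element instead of one per term.
            const float scaled = static_cast<float>(acc) / (d[a] * d[b]);
            gtg[a][b] = gtg[b][a] = scaled;  // the result is symmetric
        }
    }
    return gtg;
}
```

For V = {{1, 2}, {3, 4}} and d = {2, 4}, V^T * V is {{10, 14}, {14, 20}}, and the scaled result is {{2.5, 1.75}, {1.75, 1.25}}.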
CODE TIME
Demo benchmarks

    const int ITERATIONS = 2000000000;
    long long sum = 0;  // overflows for this many iterations; only the timing matters here
    for (int i = 0; i < ITERATIONS; i++)
        sum += i * (long long)i;
    cout << sum << endl;
Time: 0.00 sec (most likely the optimiser folded the whole loop into a closed-form expression)

    const int ITERATIONS = 2000000000;
    long long sum = 0;
    for (int i = 0; i < ITERATIONS; i++)
        sum += i * (long long)i / 3;
    cout << sum << endl;
Time: 2.10 sec

    const int ITERATIONS = 2000000000;
    float sum = 0;
    for (int i = 0; i < ITERATIONS; i++)
        sum += i * (float)i / 3;
    cout << sum << endl;
Time: 4.29 sec
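To repeat this kind of measurement, a small timing harness along the following lines could be used. It is only a sketch: `secondsTaken`, `intLoop` and `floatLoop` are hypothetical names, the integer loop accumulates in an unsigned type so that wrap-around is well defined, and the timings will depend on compiler, flags and hardware.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>

// Hypothetical helper: runs `body` once and returns the elapsed wall time in seconds.
template <typename Body>
double secondsTaken(Body&& body) {
    const auto start = std::chrono::steady_clock::now();
    body();
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double>(stop - start).count();
}

// Integer variant of the loop from the slide (unsigned, so overflow wraps).
std::uint64_t intLoop(std::uint32_t iterations) {
    std::uint64_t sum = 0;
    for (std::uint32_t i = 0; i < iterations; ++i)
        sum += static_cast<std::uint64_t>(i) * i / 3;
    return sum;
}

// Float variant of the same loop.
float floatLoop(std::uint32_t iterations) {
    float sum = 0.0f;
    for (std::uint32_t i = 0; i < iterations; ++i)
        sum += static_cast<float>(i) * static_cast<float>(i) / 3;
    return sum;
}
```

A call such as `secondsTaken([] { volatile std::uint64_t r = intLoop(2000000000u); (void)r; })` keeps the result observable, which makes it harder for the optimiser to discard the loop.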
Matrix multiplication optimisations
1) Don't create matrices whose size is a power of two. The cache uses a simple hash function to
select the cache line in which memory will be cached. This hash is just
some of the low bits (e.g. 16) of the memory address.
When the matrix size is a power of two, every row starts at an address with the same
low bits, so the rows evict each other and the cache holds only one row instead of nearly the whole
matrix.
2) Choose the order of matrix multiplication: multiplying two matrices of sizes n x m and m x s
takes n * m * s operations. To multiply the matrices
A(n x m) * B(m x s) * C(s x k), you can proceed in two ways with the same result:
(A * B) * C with n*m*s + n*s*k operations,
or
A * (B * C) with m*s*k + n*m*k operations.
In general n*m*s + n*s*k != m*s*k + n*m*k, so choose the cheaper one.
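Both points can be sketched in a few lines. The cost functions follow the operation counts given above; `paddedRowLength` and its default padding of 16 elements (one 64-byte cache line of 4-byte ints, an assumption about the target hardware) are hypothetical and only illustrate one way of avoiding power-of-two row lengths.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Multiplication count for (A * B) * C with A: n x m, B: m x s, C: s x k.
constexpr std::uint64_t costLeftFirst(std::uint64_t n, std::uint64_t m,
                                      std::uint64_t s, std::uint64_t k) {
    return n * m * s + n * s * k;
}

// Multiplication count for A * (B * C) with the same shapes.
constexpr std::uint64_t costRightFirst(std::uint64_t n, std::uint64_t m,
                                       std::uint64_t s, std::uint64_t k) {
    return m * s * k + n * m * k;
}

// True when (A * B) * C is at least as cheap as A * (B * C).
constexpr bool multiplyLeftFirst(std::uint64_t n, std::uint64_t m,
                                 std::uint64_t s, std::uint64_t k) {
    return costLeftFirst(n, m, s, k) <= costRightFirst(n, m, s, k);
}

constexpr bool isPowerOfTwo(std::size_t x) {
    return x != 0 && (x & (x - 1)) == 0;
}

// Hypothetical padding rule: if a row length is a power of two, allocate
// `pad` extra elements per row so consecutive rows no longer share low address bits.
constexpr std::size_t paddedRowLength(std::size_t cols, std::size_t pad = 16) {
    return isPowerOfTwo(cols) ? cols + pad : cols;
}
```

For example, with n = 10, m = 100, s = 5, k = 50 the left-first order costs 7500 multiplications and the right-first order 75000, a tenfold difference for the same result.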
Hello assembler
    int *row = GT[i];
    for (int j = i, pos = (int)(i * GT.columnCount()); j < GT.rowCount(); j++)
    {
        int curr = 0;  // dot product of rows i and j of GT
        for (int k = 0; k < GT.columnCount(); k++, pos++)
            curr += row[k] * GT.val[pos];
        GTG[i][j] = GTG[j][i] = curr;  // the result is symmetric
    }
It looks optimised enough. Is there anything we can improve?
Well, let's have a look at the ASM code...
0x149ac2: ldr.w lr, [r5, r9, lsl #2] 
0x149ac6: add.w r9, r9, #0x1 
0x149aca: cmp r9, r2 
0x149acc: ldr r8, [r12], #4 
0x149ad0: mla r11, lr, r8, r11 
0x149ad4: blo 0x149ac2 ;at AppearanceTracker.cpp:555 
No SIMD instructions there :(
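Let's add some SIMD
    int *row = GT[i];
    int *rowInit = row;
    int *rowPos = GT.val + i * GT.columnCount();
    // processedCnt is not defined on the slide; presumably the column count rounded down to a multiple of 8
    int *rowEnd = row + processedCnt;
    for (int j = i; j < GT.rowCount(); j++)
    {
        row = rowInit;
        int accum[8] = {0};
        __asm__ volatile
        (
            "vld1.32 {d8-d11}, [%[accum]] \n\t"   // q4, q5 = eight zeroed accumulators
            "L_mulStart%=: \n\t"
            "vld1.32 {d0-d3}, [%[row]]! \n\t"     // load 8 ints of row i
            "vld1.32 {d4-d7}, [%[val]]! \n\t"     // load 8 ints of row j
            "vmla.i32 q4, q2, q0 \n\t"
            "vmla.i32 q5, q3, q1 \n\t"
            "cmp %[row], %[rowEnd] \n\t"
            "blo L_mulStart%= \n\t"
            "vst1.32 {d8-d11}, [%[accum]] \n\t"
            : [row] "+r" (row), [val] "+r" (rowPos)
            : [rowEnd] "r" (rowEnd), [accum] "r" (accum)
            : "q0", "q1", "q2", "q3", "q4", "q5", "cc", "memory"  // clobber list, missing on the slide
        );
        // gather the 8 values from accum
        // process the remainder (column count mod 8)
    }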
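The two steps that the slide leaves as comments (summing the eight accumulators and handling the remainder) can be illustrated with a portable scalar sketch of the same eight-lane pattern. This is not NEON code and not the authors' implementation; `dotProduct8Lanes` is a hypothetical name, and `processedCnt` is assumed to be the length rounded down to a multiple of 8.

```cpp
#include <cassert>
#include <cstddef>

// Dot product using eight independent accumulators, mirroring the lanes
// of q4 and q5 in the assembler version, followed by a scalar tail.
int dotProduct8Lanes(const int* a, const int* b, std::size_t n) {
    assert(a != nullptr && b != nullptr);
    int accum[8] = {0};
    const std::size_t processedCnt = n - n % 8;  // largest multiple of 8 <= n

    // Main loop: eight multiply-accumulates per step, one per lane.
    for (std::size_t k = 0; k < processedCnt; k += 8)
        for (std::size_t lane = 0; lane < 8; ++lane)
            accum[lane] += a[k + lane] * b[k + lane];

    // Gather the 8 partial sums.
    int result = 0;
    for (int partial : accum)
        result += partial;

    // Process the remainder (n mod 8 elements).
    for (std::size_t k = processedCnt; k < n; ++k)
        result += a[k] * b[k];

    return result;
}
```

Written this way, the inner lane loop is also a natural candidate for compiler auto-vectorisation, which may be worth checking before resorting to hand-written assembler.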
Practical difference?
Let's profile it
Before / After: [profiler screenshots not included]
Approx. 2-2.5 times faster
Some hardware issues
Task: crop a square from a CMSampleBuffer (which contains a CVImageBufferRef)
and write it using AVAssetWriterInputPixelBufferAdaptor.
[Diagram: input buffer address, target image address]
BAD: create the CMSampleBuffer by
just moving the base address and
setting a new height.
O(1) operation.
GOOD: create the CMSampleBuffer by
creating a new CVPixelBufferRef
from CVTextureCache and copying the
image.
O(Height*Width) operation.
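The copying variant can be pictured with a platform-neutral sketch. It deliberately uses no CoreVideo or AVFoundation calls: `cropSquareCopy` and its parameters are hypothetical, and the buffer is assumed to be a plain row-major image with a fixed number of bytes per row.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical helper, not a CoreVideo API: copies a side x side square
// starting at pixel (x0, y0) out of a strided source image into a new,
// tightly packed buffer. The cost is O(side * side) bytes copied.
std::vector<std::uint8_t> cropSquareCopy(const std::uint8_t* src,
                                         std::size_t srcBytesPerRow,
                                         std::size_t x0, std::size_t y0,
                                         std::size_t side,
                                         std::size_t bytesPerPixel) {
    assert(src != nullptr);
    assert((x0 + side) * bytesPerPixel <= srcBytesPerRow);
    const std::size_t dstBytesPerRow = side * bytesPerPixel;
    std::vector<std::uint8_t> dst(side * dstBytesPerRow);
    for (std::size_t row = 0; row < side; ++row) {
        // Start of the cropped segment within source row (y0 + row).
        const std::uint8_t* from =
            src + (y0 + row) * srcBytesPerRow + x0 * bytesPerPixel;
        std::memcpy(dst.data() + row * dstBytesPerRow, from, dstBytesPerRow);
    }
    return dst;
}
```

The result owns its memory and has no padding between rows, so it does not depend on the layout or lifetime of the input buffer.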
iOS 8 strikes back
iPhone 5S, iOS 7.1: 30 FPS
iPhone 5S, iOS 8.0: 15 FPS O_o
Possible reasons:
1) Memory corruption in the C++ core code
2) iOS 8 QoS:
wrong queue priority, QOS_CLASS_BACKGROUND instead of QOS_CLASS_USER_INITIATED
3) Blinking of this guy [refers to a picture on the slide]
CONTACT INFORMATION 
FEDOR POLYAKOV 
Mobile: +38 097 59 0000 9 
E-Mail: fedor@looksery.com 
YURII MONASTYRSHYN 
Mobile: +38 067 482 60 97 
E-Mail: yurii@looksery.com 
VICTOR SHABUROV, FOUNDER 
Mobile: +1 650 575 9359 
Fax: +1 866 626 9582 
E-Mail: victor@looksery.com 
WEB 
looksery.com 
facebook.com/looksery 
twitter.com/looksery

Más contenido relacionado

La actualidad más candente

An Open Discussion of RISC-V BitManip, trends, and comparisons _ Claire
 An Open Discussion of RISC-V BitManip, trends, and comparisons _ Claire An Open Discussion of RISC-V BitManip, trends, and comparisons _ Claire
An Open Discussion of RISC-V BitManip, trends, and comparisons _ ClaireRISC-V International
 
WT-4065, Superconductor: GPU Web Programming for Big Data Visualization, by ...
WT-4065, Superconductor: GPU Web Programming for Big Data Visualization, by  ...WT-4065, Superconductor: GPU Web Programming for Big Data Visualization, by  ...
WT-4065, Superconductor: GPU Web Programming for Big Data Visualization, by ...AMD Developer Central
 
JCConf 2020 - New Java Features Released in 2020
JCConf 2020 - New Java Features Released in 2020JCConf 2020 - New Java Features Released in 2020
JCConf 2020 - New Java Features Released in 2020Joseph Kuo
 
C++ AMP 실천 및 적용 전략
C++ AMP 실천 및 적용 전략 C++ AMP 실천 및 적용 전략
C++ AMP 실천 및 적용 전략 명신 김
 
Compilation of COSMO for GPU using LLVM
Compilation of COSMO for GPU using LLVMCompilation of COSMO for GPU using LLVM
Compilation of COSMO for GPU using LLVMLinaro
 
Building Efficient and Highly Run-Time Adaptable Virtual Machines
Building Efficient and Highly Run-Time Adaptable Virtual MachinesBuilding Efficient and Highly Run-Time Adaptable Virtual Machines
Building Efficient and Highly Run-Time Adaptable Virtual MachinesGuido Chari
 
Intrinsics: Low-level engine development with Burst - Unite Copenhagen 2019
Intrinsics: Low-level engine development with Burst - Unite Copenhagen 2019 Intrinsics: Low-level engine development with Burst - Unite Copenhagen 2019
Intrinsics: Low-level engine development with Burst - Unite Copenhagen 2019 Unity Technologies
 
Compiler optimization
Compiler optimizationCompiler optimization
Compiler optimizationZongYing Lyu
 
Unit v memory &amp; programmable logic devices
Unit v   memory &amp; programmable logic devicesUnit v   memory &amp; programmable logic devices
Unit v memory &amp; programmable logic devicesKanmaniRajamanickam
 
Compiler presention
Compiler presentionCompiler presention
Compiler presentionFaria Priya
 
Model-counting Approaches For Nonlinear Numerical Constraints
Model-counting Approaches For Nonlinear Numerical ConstraintsModel-counting Approaches For Nonlinear Numerical Constraints
Model-counting Approaches For Nonlinear Numerical ConstraintsQuoc-Sang Phan
 
Pragmatic Optimization in Modern Programming - Demystifying the Compiler
Pragmatic Optimization in Modern Programming - Demystifying the CompilerPragmatic Optimization in Modern Programming - Demystifying the Compiler
Pragmatic Optimization in Modern Programming - Demystifying the CompilerMarina Kolpakova
 
Concurrent Programming OpenMP @ Distributed System Discussion
Concurrent Programming OpenMP @ Distributed System DiscussionConcurrent Programming OpenMP @ Distributed System Discussion
Concurrent Programming OpenMP @ Distributed System DiscussionCherryBerry2
 
TinyML - 4 speech recognition
TinyML - 4 speech recognition TinyML - 4 speech recognition
TinyML - 4 speech recognition 艾鍗科技
 
Post-processing SAR images on Xeon Phi - a porting exercise
Post-processing SAR images on Xeon Phi - a porting exercisePost-processing SAR images on Xeon Phi - a porting exercise
Post-processing SAR images on Xeon Phi - a porting exerciseIntel IT Center
 
Java lejos-multithreading
Java lejos-multithreadingJava lejos-multithreading
Java lejos-multithreadingMr. Chanuwan
 

La actualidad más candente (20)

An Open Discussion of RISC-V BitManip, trends, and comparisons _ Claire
 An Open Discussion of RISC-V BitManip, trends, and comparisons _ Claire An Open Discussion of RISC-V BitManip, trends, and comparisons _ Claire
An Open Discussion of RISC-V BitManip, trends, and comparisons _ Claire
 
WT-4065, Superconductor: GPU Web Programming for Big Data Visualization, by ...
WT-4065, Superconductor: GPU Web Programming for Big Data Visualization, by  ...WT-4065, Superconductor: GPU Web Programming for Big Data Visualization, by  ...
WT-4065, Superconductor: GPU Web Programming for Big Data Visualization, by ...
 
JCConf 2020 - New Java Features Released in 2020
JCConf 2020 - New Java Features Released in 2020JCConf 2020 - New Java Features Released in 2020
JCConf 2020 - New Java Features Released in 2020
 
OpenMP And C++
OpenMP And C++OpenMP And C++
OpenMP And C++
 
C++ AMP 실천 및 적용 전략
C++ AMP 실천 및 적용 전략 C++ AMP 실천 및 적용 전략
C++ AMP 실천 및 적용 전략
 
Compilation of COSMO for GPU using LLVM
Compilation of COSMO for GPU using LLVMCompilation of COSMO for GPU using LLVM
Compilation of COSMO for GPU using LLVM
 
Building Efficient and Highly Run-Time Adaptable Virtual Machines
Building Efficient and Highly Run-Time Adaptable Virtual MachinesBuilding Efficient and Highly Run-Time Adaptable Virtual Machines
Building Efficient and Highly Run-Time Adaptable Virtual Machines
 
Intrinsics: Low-level engine development with Burst - Unite Copenhagen 2019
Intrinsics: Low-level engine development with Burst - Unite Copenhagen 2019 Intrinsics: Low-level engine development with Burst - Unite Copenhagen 2019
Intrinsics: Low-level engine development with Burst - Unite Copenhagen 2019
 
Compiler optimization
Compiler optimizationCompiler optimization
Compiler optimization
 
Unit v memory &amp; programmable logic devices
Unit v   memory &amp; programmable logic devicesUnit v   memory &amp; programmable logic devices
Unit v memory &amp; programmable logic devices
 
Compiler presention
Compiler presentionCompiler presention
Compiler presention
 
Model-counting Approaches For Nonlinear Numerical Constraints
Model-counting Approaches For Nonlinear Numerical ConstraintsModel-counting Approaches For Nonlinear Numerical Constraints
Model-counting Approaches For Nonlinear Numerical Constraints
 
Pragmatic Optimization in Modern Programming - Demystifying the Compiler
Pragmatic Optimization in Modern Programming - Demystifying the CompilerPragmatic Optimization in Modern Programming - Demystifying the Compiler
Pragmatic Optimization in Modern Programming - Demystifying the Compiler
 
The low level awesomeness of Go
The low level awesomeness of GoThe low level awesomeness of Go
The low level awesomeness of Go
 
Concurrent Programming OpenMP @ Distributed System Discussion
Concurrent Programming OpenMP @ Distributed System DiscussionConcurrent Programming OpenMP @ Distributed System Discussion
Concurrent Programming OpenMP @ Distributed System Discussion
 
Introduction to OpenMP
Introduction to OpenMPIntroduction to OpenMP
Introduction to OpenMP
 
TinyML - 4 speech recognition
TinyML - 4 speech recognition TinyML - 4 speech recognition
TinyML - 4 speech recognition
 
openmp
openmpopenmp
openmp
 
Post-processing SAR images on Xeon Phi - a porting exercise
Post-processing SAR images on Xeon Phi - a porting exercisePost-processing SAR images on Xeon Phi - a porting exercise
Post-processing SAR images on Xeon Phi - a porting exercise
 
Java lejos-multithreading
Java lejos-multithreadingJava lejos-multithreading
Java lejos-multithreading
 

Destacado

Real-Time Face Detection, Tracking, and Attributes Recognition
Real-Time Face Detection, Tracking, and Attributes RecognitionReal-Time Face Detection, Tracking, and Attributes Recognition
Real-Time Face Detection, Tracking, and Attributes RecognitionJia-Bin Huang
 
TargetSummit Moscow Late 2016 | Looksery, Julie Krasnienko
TargetSummit Moscow Late 2016 | Looksery, Julie KrasnienkoTargetSummit Moscow Late 2016 | Looksery, Julie Krasnienko
TargetSummit Moscow Late 2016 | Looksery, Julie KrasnienkoTargetSummit
 
Go an Epic Selfie Adventure at Your Next Conference
Go an Epic Selfie Adventure at Your Next ConferenceGo an Epic Selfie Adventure at Your Next Conference
Go an Epic Selfie Adventure at Your Next ConferenceShelly Sanchez Terrell
 
User Interfaces and User Centered Design Techniques for Augmented Reality and...
User Interfaces and User Centered Design Techniques for Augmented Reality and...User Interfaces and User Centered Design Techniques for Augmented Reality and...
User Interfaces and User Centered Design Techniques for Augmented Reality and...Stuart Murphy
 

Destacado (6)

Real-Time Face Detection, Tracking, and Attributes Recognition
Real-Time Face Detection, Tracking, and Attributes RecognitionReal-Time Face Detection, Tracking, and Attributes Recognition
Real-Time Face Detection, Tracking, and Attributes Recognition
 
Resources optimisation for OpenGL — Lesya Voronova (Looksery, Tech Stage)
Resources optimisation for OpenGL — Lesya Voronova (Looksery, Tech Stage)Resources optimisation for OpenGL — Lesya Voronova (Looksery, Tech Stage)
Resources optimisation for OpenGL — Lesya Voronova (Looksery, Tech Stage)
 
TargetSummit Moscow Late 2016 | Looksery, Julie Krasnienko
TargetSummit Moscow Late 2016 | Looksery, Julie KrasnienkoTargetSummit Moscow Late 2016 | Looksery, Julie Krasnienko
TargetSummit Moscow Late 2016 | Looksery, Julie Krasnienko
 
Fedor Polyakov - Optimizing computer vision problems on mobile platforms
Fedor Polyakov - Optimizing computer vision problems on mobile platforms Fedor Polyakov - Optimizing computer vision problems on mobile platforms
Fedor Polyakov - Optimizing computer vision problems on mobile platforms
 
Go an Epic Selfie Adventure at Your Next Conference
Go an Epic Selfie Adventure at Your Next ConferenceGo an Epic Selfie Adventure at Your Next Conference
Go an Epic Selfie Adventure at Your Next Conference
 
User Interfaces and User Centered Design Techniques for Augmented Reality and...
User Interfaces and User Centered Design Techniques for Augmented Reality and...User Interfaces and User Centered Design Techniques for Augmented Reality and...
User Interfaces and User Centered Design Techniques for Augmented Reality and...
 

Similar a Федор Поляков (Looksery) “Face Tracking на мобильных устройствах в режиме реального времени.”

Static analysis of C++ source code
Static analysis of C++ source codeStatic analysis of C++ source code
Static analysis of C++ source codePVS-Studio
 
Write Python for Speed
Write Python for SpeedWrite Python for Speed
Write Python for SpeedYung-Yu Chen
 
Whats new in_csharp4
Whats new in_csharp4Whats new in_csharp4
Whats new in_csharp4Abed Bukhari
 
Compiler optimization techniques
Compiler optimization techniquesCompiler optimization techniques
Compiler optimization techniquesHardik Devani
 
L14-Caches-I.pptx
L14-Caches-I.pptxL14-Caches-I.pptx
L14-Caches-I.pptxshakeela33
 
What&rsquo;s new in Visual C++
What&rsquo;s new in Visual C++What&rsquo;s new in Visual C++
What&rsquo;s new in Visual C++Microsoft
 
Optimization in Programming languages
Optimization in Programming languagesOptimization in Programming languages
Optimization in Programming languagesAnkit Pandey
 
How to add an optimization for C# to RyuJIT
How to add an optimization for C# to RyuJITHow to add an optimization for C# to RyuJIT
How to add an optimization for C# to RyuJITEgor Bogatov
 
C++20 the small things - Timur Doumler
C++20 the small things - Timur DoumlerC++20 the small things - Timur Doumler
C++20 the small things - Timur Doumlercorehard_by
 
Accelerating microbiome research with OpenACC
Accelerating microbiome research with OpenACCAccelerating microbiome research with OpenACC
Accelerating microbiome research with OpenACCIgor Sfiligoi
 
Vectorization on x86: all you need to know
Vectorization on x86: all you need to knowVectorization on x86: all you need to know
Vectorization on x86: all you need to knowRoberto Agostino Vitillo
 
Java Jit. Compilation and optimization by Andrey Kovalenko
Java Jit. Compilation and optimization by Andrey KovalenkoJava Jit. Compilation and optimization by Andrey Kovalenko
Java Jit. Compilation and optimization by Andrey KovalenkoValeriia Maliarenko
 
ParallelProgrammingBasics_v2.pdf
ParallelProgrammingBasics_v2.pdfParallelProgrammingBasics_v2.pdf
ParallelProgrammingBasics_v2.pdfChen-Hung Hu
 
Профилирование и оптимизация производительности Ruby-кода
Профилирование и оптимизация производительности Ruby-кодаПрофилирование и оптимизация производительности Ruby-кода
Профилирование и оптимизация производительности Ruby-кодаsamsolutionsby
 
Notes for C++ Programming / Object Oriented C++ Programming for MCA, BCA and ...
Notes for C++ Programming / Object Oriented C++ Programming for MCA, BCA and ...Notes for C++ Programming / Object Oriented C++ Programming for MCA, BCA and ...
Notes for C++ Programming / Object Oriented C++ Programming for MCA, BCA and ...ssuserd6b1fd
 
Introduction to cpp (c++)
Introduction to cpp (c++)Introduction to cpp (c++)
Introduction to cpp (c++)Arun Umrao
 
How to tune a query - ODTUG 2012
How to tune a query - ODTUG 2012How to tune a query - ODTUG 2012
How to tune a query - ODTUG 2012Connor McDonald
 

Similar a Федор Поляков (Looksery) “Face Tracking на мобильных устройствах в режиме реального времени.” (20)

Static analysis of C++ source code
Static analysis of C++ source codeStatic analysis of C++ source code
Static analysis of C++ source code
 
Write Python for Speed
Write Python for SpeedWrite Python for Speed
Write Python for Speed
 
Whats new in_csharp4
Whats new in_csharp4Whats new in_csharp4
Whats new in_csharp4
 
Compiler optimization techniques
Compiler optimization techniquesCompiler optimization techniques
Compiler optimization techniques
 
L14-Caches-I.pptx
L14-Caches-I.pptxL14-Caches-I.pptx
L14-Caches-I.pptx
 
What&rsquo;s new in Visual C++
What&rsquo;s new in Visual C++What&rsquo;s new in Visual C++
What&rsquo;s new in Visual C++
 
Tiap
TiapTiap
Tiap
 
Optimization in Programming languages
Optimization in Programming languagesOptimization in Programming languages
Optimization in Programming languages
 
How to add an optimization for C# to RyuJIT
How to add an optimization for C# to RyuJITHow to add an optimization for C# to RyuJIT
How to add an optimization for C# to RyuJIT
 
C++20 the small things - Timur Doumler
C++20 the small things - Timur DoumlerC++20 the small things - Timur Doumler
C++20 the small things - Timur Doumler
 
Accelerating microbiome research with OpenACC
Accelerating microbiome research with OpenACCAccelerating microbiome research with OpenACC
Accelerating microbiome research with OpenACC
 
Vectorization on x86: all you need to know
Vectorization on x86: all you need to knowVectorization on x86: all you need to know
Vectorization on x86: all you need to know
 
Java Jit. Compilation and optimization by Andrey Kovalenko
Java Jit. Compilation and optimization by Andrey KovalenkoJava Jit. Compilation and optimization by Andrey Kovalenko
Java Jit. Compilation and optimization by Andrey Kovalenko
 
ParallelProgrammingBasics_v2.pdf
ParallelProgrammingBasics_v2.pdfParallelProgrammingBasics_v2.pdf
ParallelProgrammingBasics_v2.pdf
 
Профилирование и оптимизация производительности Ruby-кода
Профилирование и оптимизация производительности Ruby-кодаПрофилирование и оптимизация производительности Ruby-кода
Профилирование и оптимизация производительности Ruby-кода
 
Notes for C++ Programming / Object Oriented C++ Programming for MCA, BCA and ...
Notes for C++ Programming / Object Oriented C++ Programming for MCA, BCA and ...Notes for C++ Programming / Object Oriented C++ Programming for MCA, BCA and ...
Notes for C++ Programming / Object Oriented C++ Programming for MCA, BCA and ...
 
Introduction to cpp (c++)
Introduction to cpp (c++)Introduction to cpp (c++)
Introduction to cpp (c++)
 
Exploiting vectorization with ISPC
Exploiting vectorization with ISPCExploiting vectorization with ISPC
Exploiting vectorization with ISPC
 
How to tune a query - ODTUG 2012
How to tune a query - ODTUG 2012How to tune a query - ODTUG 2012
How to tune a query - ODTUG 2012
 
Xgboost
XgboostXgboost
Xgboost
 

Más de Provectus

Сергей Моренец: "Gradle. Write once, build everywhere"
Сергей Моренец: "Gradle. Write once, build everywhere"Сергей Моренец: "Gradle. Write once, build everywhere"
Сергей Моренец: "Gradle. Write once, build everywhere"Provectus
 
Василий Захарченко: "Взгляд на queryDsl-sql фреймворк как альтернатива Hiber...
Василий Захарченко: "Взгляд на  queryDsl-sql фреймворк как альтернатива Hiber...Василий Захарченко: "Взгляд на  queryDsl-sql фреймворк как альтернатива Hiber...
Василий Захарченко: "Взгляд на queryDsl-sql фреймворк как альтернатива Hiber...Provectus
 
Get to know provectus
Get to know provectusGet to know provectus
Get to know provectusProvectus
 
Why I want to Kazan
Why I want to KazanWhy I want to Kazan
Why I want to KazanProvectus
 
Артем Тритяк, Lead Front-End developer в Electric Cloud
 Артем Тритяк, Lead Front-End developer в Electric Cloud Артем Тритяк, Lead Front-End developer в Electric Cloud
Артем Тритяк, Lead Front-End developer в Electric CloudProvectus
 
Максим Мазурок “Material Design in Web Applications.”
Максим Мазурок “Material Design in Web Applications.”Максим Мазурок “Material Design in Web Applications.”
Максим Мазурок “Material Design in Web Applications.”Provectus
 
Дима Гадомский (Юскутум) “Можно ли позаимствовать дизайн и функционал так, чт...
Дима Гадомский (Юскутум) “Можно ли позаимствовать дизайн и функционал так, чт...Дима Гадомский (Юскутум) “Можно ли позаимствовать дизайн и функционал так, чт...
Дима Гадомский (Юскутум) “Можно ли позаимствовать дизайн и функционал так, чт...Provectus
 
Михаил Лебединский (Termopal) “Особенности разработки веб и мобильных приложе...
Михаил Лебединский (Termopal) “Особенности разработки веб и мобильных приложе...Михаил Лебединский (Termopal) “Особенности разработки веб и мобильных приложе...
Михаил Лебединский (Termopal) “Особенности разработки веб и мобильных приложе...Provectus
 
Виталий Чмыхун (Provectus) “Как мы автоматизировали мобайл деплоймент.”
Виталий Чмыхун (Provectus) “Как мы автоматизировали мобайл деплоймент.”Виталий Чмыхун (Provectus) “Как мы автоматизировали мобайл деплоймент.”
Виталий Чмыхун (Provectus) “Как мы автоматизировали мобайл деплоймент.”Provectus
 
Артем Крикун (AT Production) “Промо-видео для приложений.”
Артем Крикун (AT Production) “Промо-видео для приложений.”Артем Крикун (AT Production) “Промо-видео для приложений.”
Артем Крикун (AT Production) “Промо-видео для приложений.”Provectus
 
Роман Колос (ComboApp) “Методология использования инструментов аналитики в ма...
Роман Колос (ComboApp) “Методология использования инструментов аналитики в ма...Роман Колос (ComboApp) “Методология использования инструментов аналитики в ма...
Роман Колос (ComboApp) “Методология использования инструментов аналитики в ма...Provectus
 
Галина Дивакова (Clickky) “Вывод мобильных приложений в ТОП.”
Галина Дивакова (Clickky) “Вывод мобильных приложений в ТОП.”Галина Дивакова (Clickky) “Вывод мобильных приложений в ТОП.”
Галина Дивакова (Clickky) “Вывод мобильных приложений в ТОП.”Provectus
 
Евгений Плохой (CapableBits) “Продвижение приложений до и после выхода на рын...
Евгений Плохой (CapableBits) “Продвижение приложений до и после выхода на рын...Евгений Плохой (CapableBits) “Продвижение приложений до и после выхода на рын...
Евгений Плохой (CapableBits) “Продвижение приложений до и после выхода на рын...Provectus
 
Сергей Укустов (Provectus IT): "Несоциалочка на Рельсах"
Сергей Укустов (Provectus IT): "Несоциалочка на Рельсах"Сергей Укустов (Provectus IT): "Несоциалочка на Рельсах"
Сергей Укустов (Provectus IT): "Несоциалочка на Рельсах"Provectus
 
Василевский Илья (Fun-box): "автоматизация браузера при помощи PhantomJS"
Василевский Илья (Fun-box): "автоматизация браузера при помощи PhantomJS"Василевский Илья (Fun-box): "автоматизация браузера при помощи PhantomJS"
Василевский Илья (Fun-box): "автоматизация браузера при помощи PhantomJS"Provectus
 
Гатиятов Руслан, технический директор ООО “Дроид Лабс”: “Система управления п...
Гатиятов Руслан, технический директор ООО “Дроид Лабс”: “Система управления п...Гатиятов Руслан, технический директор ООО “Дроид Лабс”: “Система управления п...
Гатиятов Руслан, технический директор ООО “Дроид Лабс”: “Система управления п...Provectus
 
Логотип — Бизнес или творчество
Логотип — Бизнес или творчествоЛоготип — Бизнес или творчество
Логотип — Бизнес или творчествоProvectus
 
ЕСЛИ БЫ УОЛТ ДИСНЕЙ ДЕЛАЛ ИНТЕРФЕЙСЫ. MOTION DESIGN. ПРАКТИКА
ЕСЛИ БЫ УОЛТ ДИСНЕЙ ДЕЛАЛ ИНТЕРФЕЙСЫ. MOTION DESIGN. ПРАКТИКАЕСЛИ БЫ УОЛТ ДИСНЕЙ ДЕЛАЛ ИНТЕРФЕЙСЫ. MOTION DESIGN. ПРАКТИКА
ЕСЛИ БЫ УОЛТ ДИСНЕЙ ДЕЛАЛ ИНТЕРФЕЙСЫ. MOTION DESIGN. ПРАКТИКАProvectus
 
Требования к заказчику. Роль QA в процессе постановки тех. задания
Требования к заказчику. Роль QA в процессе постановки тех. заданияТребования к заказчику. Роль QA в процессе постановки тех. задания
Требования к заказчику. Роль QA в процессе постановки тех. заданияProvectus
 

Más de Provectus (20)

Сергей Моренец: "Gradle. Write once, build everywhere"
Сергей Моренец: "Gradle. Write once, build everywhere"Сергей Моренец: "Gradle. Write once, build everywhere"
Сергей Моренец: "Gradle. Write once, build everywhere"
 
Василий Захарченко: "Взгляд на queryDsl-sql фреймворк как альтернатива Hiber...
Василий Захарченко: "Взгляд на  queryDsl-sql фреймворк как альтернатива Hiber...Василий Захарченко: "Взгляд на  queryDsl-sql фреймворк как альтернатива Hiber...
Василий Захарченко: "Взгляд на queryDsl-sql фреймворк как альтернатива Hiber...
 
Get to know provectus
Get to know provectusGet to know provectus
Get to know provectus
 
Why I want to Kazan
Why I want to KazanWhy I want to Kazan
Why I want to Kazan
 
Артем Тритяк, Lead Front-End developer в Electric Cloud
 Артем Тритяк, Lead Front-End developer в Electric Cloud Артем Тритяк, Lead Front-End developer в Electric Cloud
Артем Тритяк, Lead Front-End developer в Electric Cloud
 
Hackathon
HackathonHackathon
Hackathon
 
Максим Мазурок “Material Design in Web Applications.”
Максим Мазурок “Material Design in Web Applications.”Максим Мазурок “Material Design in Web Applications.”
Максим Мазурок “Material Design in Web Applications.”
 
Дима Гадомский (Юскутум) “Можно ли позаимствовать дизайн и функционал так, чт...
Дима Гадомский (Юскутум) “Можно ли позаимствовать дизайн и функционал так, чт...Дима Гадомский (Юскутум) “Можно ли позаимствовать дизайн и функционал так, чт...
Дима Гадомский (Юскутум) “Можно ли позаимствовать дизайн и функционал так, чт...
 
Михаил Лебединский (Termopal) “Особенности разработки веб и мобильных приложе...
Михаил Лебединский (Termopal) “Особенности разработки веб и мобильных приложе...Михаил Лебединский (Termopal) “Особенности разработки веб и мобильных приложе...
Михаил Лебединский (Termopal) “Особенности разработки веб и мобильных приложе...
 
Виталий Чмыхун (Provectus) “Как мы автоматизировали мобайл деплоймент.”
Виталий Чмыхун (Provectus) “Как мы автоматизировали мобайл деплоймент.”Виталий Чмыхун (Provectus) “Как мы автоматизировали мобайл деплоймент.”
Виталий Чмыхун (Provectus) “Как мы автоматизировали мобайл деплоймент.”
 
Артем Крикун (AT Production) “Промо-видео для приложений.”
Артем Крикун (AT Production) “Промо-видео для приложений.”Артем Крикун (AT Production) “Промо-видео для приложений.”
Артем Крикун (AT Production) “Промо-видео для приложений.”
 
Роман Колос (ComboApp) “Методология использования инструментов аналитики в ма...
Роман Колос (ComboApp) “Методология использования инструментов аналитики в ма...Роман Колос (ComboApp) “Методология использования инструментов аналитики в ма...
Роман Колос (ComboApp) “Методология использования инструментов аналитики в ма...
 
Галина Дивакова (Clickky) “Вывод мобильных приложений в ТОП.”
Галина Дивакова (Clickky) “Вывод мобильных приложений в ТОП.”Галина Дивакова (Clickky) “Вывод мобильных приложений в ТОП.”
Галина Дивакова (Clickky) “Вывод мобильных приложений в ТОП.”
 
Евгений Плохой (CapableBits) “Продвижение приложений до и после выхода на рын...
Евгений Плохой (CapableBits) “Продвижение приложений до и после выхода на рын...Евгений Плохой (CapableBits) “Продвижение приложений до и после выхода на рын...
Евгений Плохой (CapableBits) “Продвижение приложений до и после выхода на рын...
 
Сергей Укустов (Provectus IT): "Несоциалочка на Рельсах"
Сергей Укустов (Provectus IT): "Несоциалочка на Рельсах"Сергей Укустов (Provectus IT): "Несоциалочка на Рельсах"
Сергей Укустов (Provectus IT): "Несоциалочка на Рельсах"
 
Василевский Илья (Fun-box): "автоматизация браузера при помощи PhantomJS"
Василевский Илья (Fun-box): "автоматизация браузера при помощи PhantomJS"Василевский Илья (Fun-box): "автоматизация браузера при помощи PhantomJS"
Василевский Илья (Fun-box): "автоматизация браузера при помощи PhantomJS"
 
Гатиятов Руслан, технический директор ООО “Дроид Лабс”: “Система управления п...
Гатиятов Руслан, технический директор ООО “Дроид Лабс”: “Система управления п...Гатиятов Руслан, технический директор ООО “Дроид Лабс”: “Система управления п...
Гатиятов Руслан, технический директор ООО “Дроид Лабс”: “Система управления п...
 
Логотип — Бизнес или творчество
Логотип — Бизнес или творчествоЛоготип — Бизнес или творчество
Логотип — Бизнес или творчество
 
ЕСЛИ БЫ УОЛТ ДИСНЕЙ ДЕЛАЛ ИНТЕРФЕЙСЫ. MOTION DESIGN. ПРАКТИКА
ЕСЛИ БЫ УОЛТ ДИСНЕЙ ДЕЛАЛ ИНТЕРФЕЙСЫ. MOTION DESIGN. ПРАКТИКАЕСЛИ БЫ УОЛТ ДИСНЕЙ ДЕЛАЛ ИНТЕРФЕЙСЫ. MOTION DESIGN. ПРАКТИКА
ЕСЛИ БЫ УОЛТ ДИСНЕЙ ДЕЛАЛ ИНТЕРФЕЙСЫ. MOTION DESIGN. ПРАКТИКА
 
Требования к заказчику. Роль QA в процессе постановки тех. задания
Требования к заказчику. Роль QA в процессе постановки тех. заданияТребования к заказчику. Роль QA в процессе постановки тех. задания
Требования к заказчику. Роль QA в процессе постановки тех. задания
 


Fedor Polyakov (Looksery) "Real-time face tracking on mobile devices."

  • 2. LOOKSERY: video selfies + face filters + integrated chat
  • 4. TRACKING ALGORITHM
    - The algorithm is based on the Active Appearance Model.
    - Its complexity is independent of image size.
    - Two constants control the balance between tracking quality and tracking speed.
    - The algorithm is iterative and solves a least-squares problem at each iteration.
    - It averages 5 iterations per frame (minimum 1, maximum 10).
    - Running at 30 FPS therefore requires about 150 iterations per second.
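    The iteration counts above translate directly into a time budget per iteration. A minimal sketch of that arithmetic follows; the function names are hypothetical and the only inputs are the figures quoted on the slide (30 FPS, 5 iterations on average, 10 at most).

```cpp
#include <cstdio>

// Time available for one solver iteration, in milliseconds, when every
// frame needs `iterationsPerFrame` iterations at the given frame rate.
constexpr double iterationBudgetMs(double fps, double iterationsPerFrame) {
    return 1000.0 / (fps * iterationsPerFrame);
}

// Hypothetical helper: prints the budget for the average and worst cases.
inline void printBudgets() {
    std::printf("average: %.2f ms per iteration\n", iterationBudgetMs(30.0, 5.0));
    std::printf("worst:   %.2f ms per iteration\n", iterationBudgetMs(30.0, 10.0));
}
```

    At 150 iterations per second the average budget is roughly 6.7 ms per iteration, and about half of that if a frame needs the maximum of 10 iterations. This ignores any per-frame work outside the solver.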
  • 5. Optimisation flow
    --: Asymptotic optimisation of the algorithm
    3 FPS: First implementation
    8 FPS: Memory preallocation
    10 FPS: Algorithm parameter optimisation
    13 FPS: Matrix storage optimisation and removal of OOP code
    18 FPS: Rewriting bottleneck code in assembler
    24 FPS: Asymptotic optimisation of matrix multiplication
    27 FPS: Replacing float operations with int operations
    30 FPS: Multithreading
  • 6. From float to int
        G[i][j] = (X[i][j] - Y[i][j]) / d[j];
    We had to build the so-called pseudo-inverse of G [formula image not captured], so we have to perform many multiplications. Multiplying two ints is much faster than multiplying two floats. Let's create an int matrix V:
        V[i][j] = X[i][j] - Y[i][j];
    and a float matrix D:
        D[i][j] = (i == j ? 1.0f / d[i] : 0); // diagonal matrix holding the reciprocals, so that V * D divides column j by d[j]
    Then G = V * D. From linear algebra: [formula image not captured]
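    The formulas on this slide were images and are not in the transcript, so the following is only a sketch of how the factorisation can be used. It assumes the standard least-squares form of the pseudo-inverse, (GᵀG)⁻¹Gᵀ, and relies on the identity GᵀG = D (VᵀV) D for a diagonal D. The expensive product VᵀV then runs entirely on integers and floats are needed only for the final scaling. The type aliases, the function name and the 64-bit accumulator are assumptions made for this example, not the talk's code.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

using IntMatrix = std::vector<std::vector<std::int32_t>>;
using FloatMatrix = std::vector<std::vector<float>>;

// Computes G^T * G for G[i][j] = V[i][j] / d[j] without ever forming G.
// The triple loop works on integers only; floats appear in the final
// O(m^2) scaling step, where entry (a, b) is divided by d[a] * d[b].
FloatMatrix gramFromIntegers(const IntMatrix& V, const std::vector<float>& d) {
    const std::size_t n = V.size();   // number of rows of V
    const std::size_t m = d.size();   // number of columns of V
    FloatMatrix result(m, std::vector<float>(m, 0.0f));
    for (std::size_t a = 0; a < m; ++a) {
        for (std::size_t b = a; b < m; ++b) {  // the result is symmetric
            std::int64_t acc = 0;  // wide accumulator (assumption) to limit overflow risk
            for (std::size_t i = 0; i < n; ++i)
                acc += static_cast<std::int64_t>(V[i][a]) * V[i][b];
            const float scaled = static_cast<float>(acc) / (d[a] * d[b]);
            result[a][b] = result[b][a] = scaled;
        }
    }
    return result;
}
```

    For V = {{1, 2}, {3, 4}} and d = {1, 2}, the integer product VᵀV is {{10, 14}, {14, 20}} and the scaled result is {{10, 7}, {7, 5}}.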
  • 7. Demo benchmarks (code and measured time)
        const int ITERATIONS = 2000000000;
        long long sum = 0;
        for (int i = 0; i < ITERATIONS; i++)
            sum += i * (long long)i;
        cout << sum << endl;
    Time: 0.00 sec
        const int ITERATIONS = 2000000000;
        long long sum = 0;
        for (int i = 0; i < ITERATIONS; i++)
            sum += i * (long long)i / 3;
        cout << sum << endl;
    Time: 2.10 sec
        const int ITERATIONS = 2000000000;
        float sum = 0;
        for (int i = 0; i < ITERATIONS; i++)
            sum += i * (float)i / 3;
        cout << sum << endl;
    Time: 4.29 sec
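    A 0.00 sec result for two billion iterations suggests the compiler may have simplified that loop rather than executed it, although the slide does not say so. The sketch below is one way to time such a loop so that the work cannot be folded away. The struct, the function name and the use of a volatile divisor are assumptions for illustration, not the harness used in the talk.

```cpp
#include <chrono>
#include <cstdint>

struct Timed {
    double seconds;          // wall-clock time of the loop
    std::uint64_t checksum;  // result, returned so the loop is not dead code
};

// Sums i * i / divisor for i in [0, iterations). The divisor is read
// through a volatile variable, so the compiler has to perform a real
// division on every iteration instead of precomputing the sum.
Timed timeIntegerLoop(std::uint32_t iterations, std::uint64_t divisorValue) {
    volatile std::uint64_t divisor = divisorValue;
    const auto start = std::chrono::steady_clock::now();
    std::uint64_t sum = 0;  // unsigned, so wrap-around is well defined
    for (std::uint32_t i = 0; i < iterations; ++i)
        sum += static_cast<std::uint64_t>(i) * i / divisor;
    const auto stop = std::chrono::steady_clock::now();
    return {std::chrono::duration<double>(stop - start).count(), sum};
}
```

    Calling timeIntegerLoop(2000000000u, 3) corresponds to the second benchmark. A float variant would follow the same pattern with a float accumulator.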
  • 8. Matrix multiplication optimisations
    1) Don't create a matrix whose size is a power of two. The cache uses a simple hash function to select the cache line in which memory will be cached. This hash is just some of the low bits (e.g. 16) of the memory address. When the matrix size is a power of two, every row has the same low bits, so the cache holds only one row instead of nearly the whole matrix.
    2) Change the order of matrix multiplication. Multiplying two matrices of sizes n x m and m x s takes n * m * s operations. To multiply A(n x m) * B(m x s) * C(s x k), there are two ways that give the same result:
       (A * B) * C with n*m*s + n*s*k operations, or
       A * (B * C) with m*s*k + n*m*k operations.
    In general n*m*s + n*s*k != m*s*k + n*m*k, so choose the smaller one.
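    The comparison in point 2 can be sketched as a small helper that evaluates both operation counts and picks the cheaper parenthesisation. The enum, struct and function names are hypothetical; the cost formulas are the ones given on the slide.

```cpp
#include <cstdint>

enum class ChainOrder { LeftFirst, RightFirst };

struct ChainPlan {
    ChainOrder order;               // which product to evaluate first
    std::uint64_t multiplications;  // scalar multiplications for that order
};

// A is n x m, B is m x s, C is s x k.
ChainPlan cheapestOrder(std::uint64_t n, std::uint64_t m,
                        std::uint64_t s, std::uint64_t k) {
    const std::uint64_t leftFirst = n * m * s + n * s * k;   // (A * B) * C
    const std::uint64_t rightFirst = m * s * k + n * m * k;  // A * (B * C)
    if (leftFirst <= rightFirst) return {ChainOrder::LeftFirst, leftFirst};
    return {ChainOrder::RightFirst, rightFirst};
}
```

    For example, with n = 10, m = 100, s = 5, k = 50 the left-first order costs 7,500 multiplications against 75,000 for the right-first order. With n = 100, m = 5, s = 100, k = 1 the right-first order wins with 1,000 against 60,000.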
  • 9. Hello assembler
        int *row = GT[i];
        for (int j = i, pos = (int)(i * GT.columnCount()); j < GT.rowCount(); j++) {
            int curr = 0;
            for (int k = 0; k < GT.columnCount(); k++, pos++)
                curr += row[k] * GT.val[pos];
            GTG[i][j] = GTG[j][i] = curr;
        }
    It looks optimised enough. Is there anything we can improve? Well, let's have a look at the ASM code:
        0x149ac2: ldr.w lr, [r5, r9, lsl #2]
        0x149ac6: add.w r9, r9, #0x1
        0x149aca: cmp r9, r2
        0x149acc: ldr r8, [r12], #4
        0x149ad0: mla r11, lr, r8, r11
        0x149ad4: blo 0x149ac2 ; at AppearanceTracker.cpp:555
    No SIMD instructions there :(
  • 10. Let's add some SIMD
        int *row = GT[i];
        int *rowInit = row;
        int *rowPos = GT.val + i * GT.columnCount();
        int *rowEnd = row + processedCnt;
        for (int j = i; j < GT.rowCount(); j++) {
            row = rowInit;
            int accum[8] = {0};
            __asm__ volatile (
                "vld1.32 {d8-d11}, [%[accum]] \n\t"   // q4, q5 = accumulators (zero)
                "L_mulStart%=:\n\t"
                "vld1.32 {d0-d3}, [%[row]]! \n\t"     // load 8 ints from row i
                "vld1.32 {d4-d7}, [%[val]]! \n\t"     // load 8 ints from row j
                "vmla.i32 q4, q2, q0 \n\t"            // multiply-accumulate, lanes 0-3
                "vmla.i32 q5, q3, q1 \n\t"            // multiply-accumulate, lanes 4-7
                "cmp %[row], %[rowEnd]\n\t"
                "blo L_mulStart%=\n\t"
                "vst1.32 {d8-d11}, [%[accum]]\n\t"    // store the 8 partial sums
                : [row] "+r" (row), [val] "+r" (rowPos)
                : [rowEnd] "r" (rowEnd), [accum] "r" (accum)
                : "q0", "q1", "q2", "q3", "q4", "q5", "cc", "memory" // registers and memory the asm modifies
            );
            // sum the 8 values in accum
            // process the remaining elements (count mod 8)
        }
    The scalar version it replaces:
        int *row = GT[i];
        for (int j = i, pos = (int)(i * GT.columnCount()); j < GT.rowCount(); j++) {
            int curr = 0;
            for (int k = 0; k < GT.columnCount(); k++, pos++)
                curr += row[k] * GT.val[pos];
            GTG[i][j] = GTG[j][i] = curr;
        }
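    The slide leaves two steps as comments: summing the 8 accumulator lanes and handling the remainder modulo 8. The sketch below shows both for a single row-by-row dot product. It uses NEON intrinsics from arm_neon.h rather than the inline assembly from the talk, which is a swapped-in technique chosen for readability. The function name is hypothetical, and the code falls back to a plain scalar loop when NEON is not available.

```cpp
#include <cstddef>
#include <cstdint>
#if defined(__ARM_NEON)
#include <arm_neon.h>
#endif

// Dot product of two int32 arrays, processing 8 elements per step when
// NEON is available and falling back to scalar code otherwise.
std::int32_t dotProduct(const std::int32_t* a, const std::int32_t* b,
                        std::size_t count) {
    std::int32_t total = 0;
    std::size_t k = 0;
#if defined(__ARM_NEON)
    int32x4_t acc0 = vdupq_n_s32(0);  // partial sums for lanes 0-3
    int32x4_t acc1 = vdupq_n_s32(0);  // partial sums for lanes 4-7
    for (; k + 8 <= count; k += 8) {
        acc0 = vmlaq_s32(acc0, vld1q_s32(a + k), vld1q_s32(b + k));
        acc1 = vmlaq_s32(acc1, vld1q_s32(a + k + 4), vld1q_s32(b + k + 4));
    }
    // Gather the 8 partial sums into one scalar.
    std::int32_t lanes[8];
    vst1q_s32(lanes, acc0);
    vst1q_s32(lanes + 4, acc1);
    for (std::int32_t lane : lanes) total += lane;
#endif
    // Remaining count mod 8 elements (or the whole array without NEON).
    for (; k < count; ++k) total += a[k] * b[k];
    return total;
}
```

    In the loop from the slide, each GTG[i][j] entry would be one such call on rows i and j of GT.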
  • 11. Practical difference?
    Let's profile it. Comparing the profiles before and after the change, the SIMD version is approximately 2-2.5 times faster.
  • 12. A hardware-related issue
    Task: crop a square from a CMSampleBuffer (which contains a CVImageBufferRef) and write it using AVAssetWriterInputPixelBufferAdaptor.
    [Diagram: input buffer address, target image address]
    BAD: create the CMSampleBuffer by just moving the base address and setting a new height. O(1) operation.
    GOOD: create the CMSampleBuffer by creating a new CVPixelBufferRef from CVTextureCache and copying the image. O(Height*Width) operation.
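    The difference between the two strategies can be illustrated in plain C++, independently of Core Video. The ImageView struct and both functions below are hypothetical stand-ins, not Apple APIs, and the sketch does not explain why the O(1) variant failed on the device, which the slide does not state either.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical stand-in for a pixel buffer; not a Core Video type.
struct ImageView {
    const std::uint8_t* base;   // first byte of the first visible row
    std::size_t width;          // visible width in pixels
    std::size_t height;         // visible height in rows
    std::size_t bytesPerRow;    // stride of the underlying allocation
    std::size_t bytesPerPixel;
};

// O(1): re-point into the same memory (the approach the slide marks BAD).
ImageView cropByOffset(const ImageView& src, std::size_t x, std::size_t y,
                       std::size_t side) {
    return {src.base + y * src.bytesPerRow + x * src.bytesPerPixel,
            side, side, src.bytesPerRow, src.bytesPerPixel};
}

// O(height * width): copy the square into a new, tightly packed buffer
// (the approach the slide marks GOOD).
std::vector<std::uint8_t> cropByCopy(const ImageView& src, std::size_t x,
                                     std::size_t y, std::size_t side) {
    const std::size_t rowBytes = side * src.bytesPerPixel;
    std::vector<std::uint8_t> out(side * rowBytes);
    for (std::size_t row = 0; row < side; ++row) {
        const std::uint8_t* from =
            src.base + (y + row) * src.bytesPerRow + x * src.bytesPerPixel;
        std::memcpy(out.data() + row * rowBytes, from, rowBytes);
    }
    return out;
}
```

    The offset version keeps the stride of the original allocation, so the cropped rows are not contiguous in memory, whereas the copy produces a self-contained buffer.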
  • 13. iOS 8 strikes back
    iPhone 5S, iOS 7.1: 30 FPS
    iPhone 5S, iOS 8.0: 15 FPS O_o
    Possible reasons:
    1) Memory corruption in the C++ core code
    2) iOS 8 QoS: wrong queue priority, QOS_CLASS_BACKGROUND instead of QOS_CLASS_USER_INITIATED
    3) Blinking of this guy
  • 14. CONTACT INFORMATION
    FEDOR POLYAKOV. Mobile: +38 097 59 0000 9. E-Mail: fedor@looksery.com
    YURII MONASTYRSHYN. Mobile: +38 067 482 60 97. E-Mail: yurii@looksery.com
    VICTOR SHABUROV, FOUNDER. Mobile: +1 650 575 9359. Fax: +1 866 626 9582. E-Mail: victor@looksery.com
    WEB: looksery.com, facebook.com/looksery, twitter.com/looksery