The document discusses parallel computing in modern C++. It introduces native threads, standard threads in C++11, thread pools, std::async, and examples of parallelizing real applications. It also covers potential issues such as data races and tools for detecting them, like Valgrind and ThreadSanitizer. Finally, it recommends std::async with std::future, boost::thread where more control over threads is needed, and OpenMP for ease of use.
Evgeny Krutko, Multithreaded computing: a modern approach.
1. C++ User Group Russia, Ekaterinburg, 25.11.2016
Multithreaded computing: a modern approach
Krutko E.S.
NRC «Kurchatov Institute»
e.s.krutko@gmail.com
2. Goal of the talk
The modern C++ standard makes multithreaded computing easy and convenient.
If an application or library spends less time doing its work, that is a good thing.
Test sources:
https://github.com/eskrut/multithread.git
5. Threadpool. C++11
#include <stdlib.h>
#include <assert.h>
//git submodule add https://github.com/progschj/ThreadPool.git
#include "ThreadPool/ThreadPool.h"
int someExample();
int someExampleParallel();
int main(int argc, char**argv) {
someExample();
someExampleParallel();
return 0;
}
6. Threadpool. C++11
int someExample() {
int value1;
//code to evaluate value1
//may require significant amount of time
value1 = 3;
int value2;
//code to evaluate value2
//may require significant amount of time
value2 = 3;
//Now use some fancy algorithm using the values
int result = value1 + value2;
assert(result == 6);
return result;
}
7. Threadpool. C++11
int someExampleParallel() {
ThreadPool pool(8);
auto futureValue1 = pool.enqueue([](){
int value1;
//code to evaluate value1
//may require significant amount of time
value1 = 3;
return value1;
});
auto futureValue2 = pool.enqueue([](){
int value2;
//code to evaluate value2
//may require significant amount of time
value2 = 3;
return value2;
});
//Now use some fancy algorithm using the values
int result = futureValue1.get() + futureValue2.get();
assert(result == 6);
return result;
}
8. std::async. C++11
int someExampleParallel() {
auto futureValue1 = std::async([](){
int value1;
//code to evaluate value1
//may require significant amount of time
value1 = 3;
return value1;
});
auto futureValue2 = std::async([](){
int value2;
//code to evaluate value2
//may require significant amount of time
value2 = 3;
return value2;
});
//Now use some fancy algorithm using the values
int result = futureValue1.get() + futureValue2.get();
assert(result == 6);
return result;
}
9. A real-world example
void PhotoSortModel::fill(const QString &path)
{
unsigned numRows = invisibleRootItem()->rowCount();
auto read = [this,path](int id, int start, int stop){
for(int row = start; row < stop; ++row) {
auto photo = photoItem(row);
readDown(photo, path);
QMetaObject::invokeMethod(this,
"partialDone",
Qt::DirectConnection,
Q_ARG(int, id),
Q_ARG(int, row-start));
}
return 0;
};
read(0, 0, numRows);
emit(loaded());
for(unsigned row = 0; row < numRows; ++row)
itemChanged(photoItem(row));
}
14. valgrind
valgrind --num-callers=1 --tool=helgrind ./datarace
==77960== Possible data race during read of size 4 at 0x10480810C by thread #3
==77960== Locks held: none
==77960== at 0x100002D2C: main::$_0::operator()(int volatile&, int) const
(datarace.cpp:15)
==77960==
==77960== This conflicts with a previous write of size 4 by thread #2
==77960== Locks held: none
==77960== at 0x100003634: main::$_1::operator()(int volatile&, int) const
(datarace.cpp:21)
==77960== Address 0x10480810c is on thread #1's stack
dataRaceTarget++;
22. The parallel [standard] library, using GCC as an example
Results of replacing std with std::__parallel (seconds):

              std        std::__parallel
max_element   0.015215   0.006585
sort          1.19395    0.264197
for_each      0.970016   0.356471
23. What if you need more flexibility? boost::thread
#include <stdlib.h>
#include "boost/thread.hpp"
#include <assert.h>
void someParallelTask() {
while(true) { //forever loop
boost::this_thread::disable_interruption di;
//allocate some resources to work with
//do not interrupt me here
boost::this_thread::restore_interruption ri(di);
//ok, check if I should die
boost::this_thread::interruption_point();
}
}
int main(int argc, char**argv) {
auto thread = boost::thread( someParallelTask );
//do some job
//and now I do not want to wait for the thread any longer
thread.interrupt();
thread.join();
return 0;
}
24. IMHO, the simplest parallelism: OpenMP
int someExample() {
int value1;
//code to evaluate value1
//may require significant amount of time
value1 = 3;
int value2;
//code to evaluate value2
//may require significant amount of time
value2 = 3;
//Now use some fancy algorithm using the values
int result = value1 + value2;
assert(result == 6);
return result;
}
#include <omp.h>
int someExampleParallel() {
int value1;
int value2;
#pragma omp parallel sections
{
#pragma omp section
{
//code to evaluate value1
//may require significant amount of time
value1 = 3;
}
#pragma omp section
{
//code to evaluate value2
//may require significant amount of time
value2 = 3;
}
}//here we wait for all sections
//Now use some fancy algorithm using the values
int result = value1 + value2;
assert(result == 6);
return result;
}
25. Summary
With a sufficient understanding of how the program's algorithm actually works, modern C++ makes it easy to introduce parallel data processing. And that is worth taking advantage of :)
I prefer:
std::async and std::list<std::future<T>>
boost::thread (when a thread manager is needed)
OpenMP