This document discusses the adaptation of static code analysis tools for developing parallel programs. Static code analysis was originally introduced in the 1970s-1980s as a complement to compilers but declined in popularity in the 1990s as compiler diagnostics improved. However, interest has increased again as modern static analyzers can detect more complex errors, such as unsafe data access from multiple threads in parallel programs. The document examines how static analysis tools can help simplify the process of creating parallel program solutions by detecting errors even in rarely executed code sections.
Ensuring Technical Readiness For Copilot in Microsoft 365
Adaptation of the technology of the static code analyzer for developing parallel programs
1. Adaptation of the technology of the static
code analyzer for developing parallel
programs
Authors: Andrey Karpov, Evgeniy Ryzhkov
Date: 16.02.2008
Abstract
In the article the question of use of the static code analyzers in modern parallel program development
processes is considered. Having appeared in 70-80s as an addition to compilers, the static analyzers
stopped to be popular with the developers in 90s. The reason was probably the increase of the errors
diagnostics quality by the compilers. But in 2000s the interest to the static code analyzers started to
increase again. It is explained by the fact that new static code analyzers were created, which started to
detect quite difficult errors in programs. If the static code analyzers of the past made it possible, for
example, to detect an uninitialized variable, modern static code analyzers tend to detect an unsafe
access to data from several threads. The modern tendency of static code analyzers development
became their use for diagnosing errors in parallel programs. In the work the situations are considered,
where the use of such tools makes it possible to considerably simplify the process of creating parallel
program solutions.
Introduction
One of the actively developing tendencies in the sphere of computer engineering is parallel computing.
Parallelism becomes the basis of the growth of efficiency, which is connected with the slowdown of rate
of growth of clock frequency of modern microprocessors. If during the latest 30 years the efficiency was
determined by the clock frequency, optimization of commands execution, cache increase, then in the
nearest years it will be determined by multi-core.
The processors producers shifted the accent from increasing the clock frequency to the realization of
the parallelism in the processors proper at the expense of using multi-core architecture. The essence of
the idea is to integrate more than one core into one processor. This approach makes it possible to avoid
many technological problems connected with increasing clock frequencies and to create at the same
time more efficient processors. The system including the processor with two core in fact does not differ
from two-processor computer and the system with a four-core processor does not differ from a four-
processor computer.
It is very difficult for the programmers who have begun to use multi-core systems to direct themselves
in all the details of their using during developing programs of classic applications. As the practice shows,
the difficulties begin when the claims of the efficiency and mobility are laid to the parallel software
under development. It is connected with the fact that universal tools that facilitate the work of the
programmer and provide a full access to debugging information are on the stage of development. The
problem consists in the absence of standards in the sphere of creating and debugging programs for
parallel systems for the reason of the computing sphere being young.
2. The development of multi-core computing machinery is closely connected with the development of the
technology of parallel programming, both universal and under a certain architecture of a specific
system. One of the directions in the sphere of improvement of the methodology of parallel
programming is the research work in the sphere of static program code analyzer with the aim of
revelation of errors typical of parallel application.
1. The perspectives of systems of development and testing parallel
programs
A joke is popular with the specialists who are occupied with parallel calculations: "Parallel computing is
the technology of the future, and this will be ever urgent". This joke has kept its urgency for several
decades already. Similar spirits were spread among the community of computers architecture
developers, concerned by the fact that the limit of clock frequency of processors will be reached soon,
though much slower than before. The fusion of specialists' optimism about parallel calculations and
systems architects' pessimism favored the appearance of revolutionary multi-core processors [1].
Parallel computing ceased to be the privilege of large hardware-software complexes meant for big tasks
and now come to be a means for solving everyday tasks linked with work, studies, entertainment,
computer games, medicine.
But a huge layer of tasks connected with software development that takes into account the aspects of
parallel programming is clearly observed at the introduction of new highly efficient parallel counting
systems. The development of parallel software is a difficult task, which is now taken up mainly by
research institutes of modeling, organizations connected with the data bulk and other big
establishments. But this is not practically taken up by ordinary programmers, who take part in
developing applied software for an ordinary user. Such programs are in majority and as a consequence
the majority of programmers have never run across the tasks of parallel data processing.
The crisis in the sphere of software development is outlined connected with the forthcoming of multi-
core systems at the mass market. And this crisis will be connected with the lack of specialists creating
and supporting parallel code or capable of modifying older developments taking into consideration the
technology of paralleling. Such specialists are quite few and are well placed and have a good well-paid
job.
Naturally many programmers will later master new technologies and will get the skill of creating parallel
code. But a lot of time will pass and not everyone will be likely to realize such transition successfully. The
thing is that the technology of parallel programming is much more difficult than other technologies
changes like, for example, migration from C++ to C# or from 32-bit systems to 64-bit. Parallel
programming requires from a person the knowledge of theory of parallel algorithms, synchronization,
means of construction of parallel code and many other things.
The necessity of tools facilitating the adaptation and testing the existing software for multi-core
counting systems will naturally occur.
Many steps are taken and a number of decisions and methodologies are offered in the direction of
automation and facilitating the creation of parallel program systems. The examples are: MPI, OpenMP,
Intel Threading Analysis Tools, or, for example, a Russian development OpenTS [2] by Program Systems
Institute of the Russian Academy of Sciences. But the existence of all these systems unfortunately does
3. not give the opportunity to speak of complex solution of problems, which a beginning or an experienced
programmer comes across.
If the questions of the technologies making it possible to create parallel programs can be considered
closed, then the questions of testing and verification of parallel program code demand great attention
and creating corresponding support systems.
2. Dynamic and static analysis as the most perspective methods for
testing parallel code
The main ways of program code verification are: the program proof, white box method, black box
method, code review, dynamic code analysis, static code analysis. Unfortunately, in case of parallel code
the greatest practical result is presented by two last methods only. We will try to give proof the this
statement.
The program proof. It does not raise any doubts that only this approach makes it possible to guarantee
an algorithm correctness [3]. But the research of parallel programs must not be carried out in terms of
relations between incoming and outcoming data of the program like it is usually for the
successive/consecutive programs. This show that the verification and proof of the correctness of work of
such programs demands the development of adequate means of formal specification. In particular, it is
necessary to have an opportunity to formally express the relation between the states of the system at
the moments of time when these or those take place during the work of the program system [4].
Adding extra complexity makes this approach even more tedious and grounded only at creating critical
program systems. If to take into account the fact that the proof of code correctness is not practiced
even for series code, then we can't consider this method as practical methodology of parallel programs
verification.
White box method. Let us understand under the white box method testing the execution of highest
available quality of different parts of the code with the use of debugging or other tools. The more
coverage of the code was achieved, the more completely the testing was carried out. The white box
method is sometimes also understood as a simple application debugging with the aim of finding a
certain error. A complete testing with the help of a white box method has become impossible long time
ago due to huge code volume of modern programs. In case of parallel programs this method may turn
out to be absolutely useless and inapplicable in practice. The reason consists in the fact that, like in the
physic of elementary particles, the effect is observed when a change of a certain parameter influences
on the state of the whole system. Thus, it is possible to make a conclusion that this method is of no
great practical interest.
The black box method established a much better reputation. Unit test can be also referred here. The
main idea of the method consists in writing a set of tests for separate modules and functions, which will
verify all the main modes of their work. A number of sources refer a unit test to a white box method for
it is based on the knowledge of program organization. The authors tick to the opinion that tested
functions and modules must be considered as black boxes because unit tests must not take into account
the inner organization of the function. The ground for it can be the methodology of development when
tests are developed before creating the function itself, which favors the increase in the control of their
functionality from the point of view of the specification.
4. The black box method is quite suitable for parallel programs but anyway it can't be the method
providing a sufficient level of reliability of the verification. At another system with another quantity of
computational nodes or at a heterogeneous system a parallel program can behave in a completely
different way. This unpleasant peculiarity sufficiently limits the attractiveness of the method in contrast
to using it for the verification of usual programs.
Code review. This is the oldest, the most proved and reliable approach to searching for any kinds of
defects. This methodology is based on the combination of code reading with a execution of a set of rules
and recommendations very well described, for example, in Steve McConnell's book "Code Complete"
[5]. Unfortunately this practice is inapplicable for a large-scale verification of modern program systems
because of their big volume. Though this way gives beat results, it is not always used under the
conditions of modern life cycles of the software development where a very important moment is the
period of development and the time of the product release to the market. That's why the code review
generally comes to rare meetings the aim of which is training new and not less experienced staff to
write a high quality code than for verification of work of a number of modules. This is a very good way of
increasing programmers' qualification but it can't be considered as a complete means for quality control
of a developed system.
The methodology of static code analysis, which will be described further, comes to assist the
programmers who are aware of the necessity of regular code review but do not have enough time. The
main task of the methodology is reducing the code volume demanding the man's attention and thus
reducing the time for its review.
The dynamic code analysis is a software analysis, which is carried out with the help of executing
programs at a real or virtual processor. A dynamic analysis is often understood as a program code study
with the aim of its optimization. But we will speak of the dynamic analysis as of the method of parallel
programs testing.
For dynamic race detection, the program is executed and a history of its memory accesses and
synchronization operations is recorded and analyzed. Because the history is in fact a feasible execution,
dynamic detectors typically suffer a lower false positive rate than static detectors. On the other hand,
since not all possible execution paths are considered, the dynamic technique is not sound and thus
cannot certify a program to be free of data races. The dynamic technique also imposes runtime
overhead which must be kept within limits while still providing reasonably accurate race detection. The
post-mortem approach records events during program execution and analyzes them later, or records
only critical events and then carefully replays the program. This approach is unsuitable for long-running
applications that have extensive interactions with their environment. The other approach is to record
and analyze information as efficiently as possible on-the-fly [6].
Dynamic parallel code analyzers are already represented by the commercial solutions and have taken a
deserved place in a row of other code testing systems (a vivid example is a set of tools Intel Threading
Tools [7]). In the scope of it we will not speak about this dynamic analysis methodology and hope that
this methodology and corresponding tools will be helpful in creating reliable parallel systems.
But like it was mentioned before, dynamic analysis does not make it possible to find all the errors for it is
often impossible to execute the whole tested program code or the succession of its execution will be
very different from the real system. In this case the static analysis will be helpful, and we will describe it
in more detail. It is connected with the fact that all researches and realizations in this direction are more
likely to have a theoretical character and need more research and practical work.
5. Static code analysis is a software analysis which is carried out without real execution of programs under
research. Depending on the tool which is used the depth of the analysis can vary from defining the
behavior of certain operators to the analysis including all the source code. Lately, the static analysis was
mostly used for verification of software properties, which is used in computer systems of high reliability.
In case of parallel programs the static analysis makes it possible to detect errors even in those sections
of the code, which get called only in very rare cases. Regardless of the stage of execution, static analysis
makes it possible to detect potential errors which may not show up during the work at the whole class
of apparatus or program systems. The static analysis can thus substantially complement dynamic
analysis and be effectively used in parallel software development life cycle.
3. The sphere of effective application of static parallel code analyzer
The obstacle on the way of wide introduction of static analysis method consists in the fact that difficulty
of creating such systems is much higher than dynamic ones. That's why in the network of this article we
will offer not an attempt to describe and create an ideal analyzer but to find a sphere of application for
which the creating of the analysis is an observed task and can give a good practical effect.
For this reason let us consider main levels of decomposition of the task in case of creating parallel
algorithms. Let us name these levels:
• Division of tasks into subtasks.
• Division of subtasks into many quasi-homogeneous procedures which are executed at the same
time with different source data. In the mathematical physics such type of parallelism is called
geometrical parallelism for paralleling here is carried out with the help of calculations
distribution in different points of calculation sphere into different processors.
• Paralleling different procedures.
• Paralleling certain processor commands. Optimizing compilers and processors proper cope with
it quite well already.
Division of tasks into subtasks. It is an ideal case of paralleling when the task is divided into several parts
each of which can be solved separately. In this case the system of interaction of parallel threads is quite
simple and there's no sense to talk about a static analysis.
Division of each subtask into many parallel procedures. On the one hand it is the most interesting level
for analysis. This is exactly the level of parallel procedures where the most part of mistakes is made,
which is connected with particular difficulty of such systems. This level is also interesting with the fact
that it helps to achieve good temporary indicators of paralleling. That's why many researchers start to
work with static analysis at this level and find it very difficult. They consider demonstration examples.
But when they get to real systems, moreover written on such difficult language a C++, then it becomes a
very difficult task to create a working static analyzer. As the result we observe practical absence of
commercial solutions based only on static analysis. Dynamic-static analysis is practically used when after
the stage of program execution static analysis is carried out of the received data and source code.
Thus, it is possible to make a conclusion that though static analysis is interesting for the second level,
but it is difficult to be realized in practice. But one more step can be made on getting lower to the third
level, which is often forgot, and to find out that this is the very level where static analysis can give the
best result.
6. This is the level of different systems organizing the execution of both separate functions and sectors of
function code. MPI, OpenMP, OpenTS can be referred here. Exactly for such systems the use of static
code analyzers is considered the most reasonable.
We will consider the possibility of application of static code analyzer on the example of OpenMP as the
most actively developing technology. The standard OpenMP was developed in the year 1997 as API,
aimed at writing sorted multithread applications. First it was based on Fortran language but later
included C/C++. The last version of OpenMP is 2.0; it is fully supported by Visual C++ 2005. The standard
OpenMP is supported by the platform Xbox 360.
4. Static code analysis on the example of the technology OpenMP
Let us consider two simple examples demonstrating the cases where static analyzers may be helpful, for
example, in the process of writing a new code. Let us imagine that a programmer written the following
code:
size_t i;
#pragma omp parallel sections num_threads(2)
{
#pragma omp section
{
for (i = 0; i != arraySize; ++i)
array[i].a = 1;
}
#pragma omp section
{
for (i = 0; i != arraySize; ++i)
array[i].b = 2;
}
}
Picture 1 - First example.
In this example the error consists in using one variable 'i' for the two simultaneously executed cycles
which leads to an emergency program completion. The troubleshooting of this error can be easily
realized in a static analyzer with the help of verification that the same variable is used for recording in
two parallel sections. Of course such mistake is likely to be detected at the testing stage, but the
integration of a static analyzer in the process of development can substantially save time at the expense
of preliminary analysis just before the beginning of the testing.
7. The next example is also very simple, but it can arouse the necessity to spend much time on searching
an error. This code leads to calculation of the wrong magnitude of the sum aroused by the absence of
OpenMP directive reduction(+: sum).
int sum = 0;
#pragma omp parallel num_threads(2)
{
#pragma omp for
for (int i = 0; i < arraySize; ++i)
sum += array[i];
}
_tprintf(_T("sum=%in"), sum);
Picture 1 - Second example.
Such constructions of recording in variable 'sum' from two threads is also easily diagnosed by static
analyzers and make it possible to substantially facilitate the work on program code verification.
Conclusion
As it was said before, no commercial static analyzers for parallel systems support are represented at the
market At the same time, there's no doubt in the necessity of creating such tool, for the existing means
of development of parallel programs do not make it possible to solve all the problems. Let us describe
one more time what such solution may look like. The developer prepares parallel code and then starts
the static analyzer, which informs whether the code is correct or not. In case if the developed code
contains errors, then the diagnostic message will make it possible to quickly find and correct it.
We hope that on appearance of the tool of static parallel code analyzer the amount of program errors in
the developed systems will reduce.
References
1. Kang Su Gatlin, Pete Icency. OpenMP and C++ // MSDN Magazine #10, 2005.
2. Sergey Abramov, Alexei Adamovich, Alexander Inyukhin, Alexander Moskovsky, Vladimir
Roganov, Elena Shevchuk, Yuri Shevchuk, and Alexander Vodomerov. 2005. OpenTS: An Outline
of Dynamic Parallelization Approach. Parallel Computing Technologies: 8th International
Conference, PaCT 2005, Krasnoyarsk, Russia, September 5-9, 2005. Proceedings. Editors: Victor
Malyshkin - Berlin etc. Springer, 2005. - Lecture Notes in Computer Science: Volume 3606, pp.
303-312
3. Daykstra E, The programming discipline. M.: Mir, 1978.
4. Valiev M.K. Application of temporary logics to programs specification. // Programming. 1998, 2,
p. 3-9.
5. Steve McConnell, "Code Complete, 2nd Edition" Microsoft Press, Paperback, 2nd edition,
Published June 2004, 914 pages, ISBN: 0-7356-1967-0.
8. 6. Yuan Yu, Tom Rodeheffer, Wei Chen. RaceTrack: Efficient Detection of Data Race Conditions via
Adaptive Tracking // SOSP05, October 23-26, 2005, Brighton, United Kingdom.
7. Clay R. Breshirs, The set of instruments Intel Threading Tools and OpenMP library.