The operation principles of PVS-Studio static code analyzer
Authors:
Candidate of Engineering Sciences
Evgeniy Ryzhkov, evg@viva64.com
Candidate of Physico-Mathematical Sciences
Andrey Karpov, karpov@viva64.com
OOO "Program Verification Systems"
(www.viva64.com)
• Development, marketing and sales of a software product.
• Office: Tula, 200 km away from Moscow.
• Staff: 24 people.
PVS-Studio
• More than 320 diagnostics for C, C++
• More than 120 diagnostics for C#
• Windows
• Linux
• Plugin for Visual Studio
• Quick Start (compilation monitoring)
• SonarQube
Our achievements
• To let the world know about our product, we check open-source projects. So far
we have checked about 270 projects.
• A “side” effect: we found more than 10 000 bugs in open source projects, without
setting it as a goal.
• On average, that is about 40 errors per project, which is not that many.
• It is important to emphasize once more that this was a “side” effect. We don’t have a
goal to find as many errors as possible. Quite often, we stop when we find enough
errors for an article.
• Conclusion: it’s rather easy to check even unfamiliar projects and find errors in them.
In the beginning: what we DO NOT USE
We do not use formal grammar for analysis
• The analyzer works on a higher level
• We analyze the derivation tree
• To build the tree we rely on existing components:
• External preprocessor
• OpenC++ library, which we kept extending as C++ evolved (almost nothing
is left of the original OpenC++)
• When working with C# code we take Roslyn as the basis
We do not use program proving methods
• PVS-Studio has nothing to do with the Prototype Verification System
(PVS) http://pvs.csl.sri.com/
• PVS-Studio is a contraction of "Program Verification Systems" (OOO
"Program Verification Systems")
We do not use substring search (string matching)
and regular expressions
• A dead end
• It is of no use even in the simplest situations
• Example: if (A+B == A+B) can also be written as
• A+B == B+A
• A+(B) == (A)+B
• ((A+B)) == A+B
• Worse still, string matching knows nothing about types, object sizes, inheritance, variable values and so on.
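The point about A+B == B+A can be sketched in code. The following is a minimal illustration, not the analyzer's actual implementation: the `Expr` node type and the helpers `leaf`, `add`, `canonical` and `sameExpression` are hypothetical names invented for this example. It assumes parentheses disappear when the tree is built and that only `+` is treated as commutative.

```cpp
#include <algorithm>
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical, heavily simplified expression node: a leaf holds an
// identifier, an inner node holds an operator and its operands.
struct Expr {
    std::string text;                         // identifier or operator
    std::vector<std::shared_ptr<Expr>> kids;  // empty for leaves
};

using ExprPtr = std::shared_ptr<Expr>;

ExprPtr leaf(std::string name) {
    assert(!name.empty());
    return std::make_shared<Expr>(Expr{std::move(name), {}});
}

ExprPtr add(ExprPtr lhs, ExprPtr rhs) {
    assert(lhs && rhs);
    return std::make_shared<Expr>(Expr{"+", {std::move(lhs), std::move(rhs)}});
}

// Canonical form of a subtree. Parentheses never reach the tree, and the
// operands of the commutative '+' are sorted, so A+B and (B)+A coincide.
std::string canonical(const Expr& e) {
    if (e.kids.empty()) return e.text;
    std::vector<std::string> parts;
    for (const auto& kid : e.kids) parts.push_back(canonical(*kid));
    if (e.text == "+") std::sort(parts.begin(), parts.end());
    std::string out = "(" + e.text;
    for (const auto& p : parts) out += " " + p;
    return out + ")";
}

// True when both operands of a comparison are the same expression.
bool sameExpression(const Expr& lhs, const Expr& rhs) {
    return canonical(lhs) == canonical(rhs);
}
```

A textual comparison of "A+B" and "B+A" reports a difference, while `sameExpression` applied to the two trees reports a match.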
What we USE
The details of C++ and C# analysis differ; we are not going to cover them
here
Pattern-based analysis
• Pattern matching based on the derivation tree
• It is used to search the source code for fragments that resemble
known erroneous code patterns
• The complexity of the diagnostics varies greatly
• In some cases these are empirical algorithms
A simple case: copy-paste
The GCC Project
if ((*path)[0]->e->dest->loop_father != path->last()->e->....)
{
delete_jump_thread_path (path);
e->aux = NULL;
ei_next (&ei);
}
else
{
delete_jump_thread_path (path);
e->aux = NULL;
ei_next (&ei);
}
V523 The 'then' statement is equivalent to the 'else' statement.
tree-ssa-threadupdate.c 2596
A more complicated case: check of a wrong
variable
public override Predicate JoinWith(Predicate other)
{
var right = other as PredicateNullness;
if (other != null)
{
if (this.value == right.value)
{
The CodeContracts Project
V3019 Possibly an incorrect variable is compared to null after type conversion
using 'as' keyword. Check variables 'other', 'right'. CallerInvariant.cs 189
Quite a complicated case: a badly written
macro
#define ICB2400_VPINFO_PORT_OFF(chan) \
(ICB2400_VPINFO_OFF + \
sizeof (isp_icb_2400_vpinfo_t) + \
(chan * ICB2400_VPOPT_WRITE_SIZE))
off += ICB2400_VPINFO_PORT_OFF(chan - 1);
V733 It is possible that macro expansion resulted in incorrect evaluation
order. Check expression: chan - 1 * 20. isp.c 2301
The FreeBSD Project
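The evaluation-order problem reported by V733 can be reproduced in isolation. The sketch below is an illustration only: `WRITE_SIZE`, `SCALE_BAD`, `SCALE_GOOD`, `offsetBad` and `offsetGood` are hypothetical names, and the value 20 is taken from the expanded expression quoted in the warning.

```cpp
#include <cassert>

// Hypothetical constant chosen only for this illustration.
constexpr int WRITE_SIZE = 20;

// Unparenthesized parameter: SCALE_BAD(chan - 1) expands to
// (chan - 1 * WRITE_SIZE), so only the literal 1 is multiplied.
#define SCALE_BAD(chan) (chan * WRITE_SIZE)

// Parenthesized parameter: SCALE_GOOD(chan - 1) expands to
// ((chan - 1) * WRITE_SIZE), which is what the author intended.
#define SCALE_GOOD(chan) ((chan) * WRITE_SIZE)

int offsetBad(int chan) {
    assert(chan >= 1);
    return SCALE_BAD(chan - 1);
}

int offsetGood(int chan) {
    assert(chan >= 1);
    return SCALE_GOOD(chan - 1);
}
```

For `chan` equal to 3 the first function yields 3 - 20, the second yields 2 * 20. Wrapping every macro parameter in parentheses is the usual defence.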
Type inference
• Type inference based on the semantic model of the program
gives the analyzer full information about all variables and
statements in the code.
• It is important for detecting errors
• It is important for exceptions to diagnostic rules
• The information about classes is especially important
Types are also important for bug detection
The Cocos2d-x project
WCHAR *gai_strerrorW(int ecode);
#define gai_strerror gai_strerrorW
fprintf(stderr, "net_listen error for %s: %s",
serv, gai_strerror(n));
V576 Incorrect format. Consider checking the fourth actual argument of
the 'fprintf' function. The pointer to string of char type symbols is
expected. ccconsole.cpp 341
Types are important for exceptions
// The variable is volatile and is assigned to itself
volatile int *ptr;
....
*ptr = *ptr; // No V570 warning is issued
The information about classes is especially
important: inheritance hierarchy, for instance
class sg_throwable : public std::exception { .... };
class sg_exception : public sg_throwable { .... };
if (!aInstall) {
sg_exception("missing argument to scheduleToUpdate");
}
V596 The object was created but it is not being used. The 'throw' keyword
could be missing: throw sg_exception(FOO); root.cxx 239
The FlightGear project
Symbolic execution
• Symbolic execution makes it possible to evaluate variable values that can
lead to errors and to perform range checking of values.
• It is one of the most important mechanisms for detecting:
• Overflows
• Memory Leaks
• Array index out of bounds
• Null pointers/references
• Meaningless conditions
• Division by zero
• and so on…
The values of variables: the size of the array,
indices
Handle<YieldTermStructure> md0Yts() {
double q6mh[] = {
0.0001,0.0001,0.0001,0.0003,0.00055,0.0009,0.0014,0.0019,
0.0025,0.0031,0.00325,0.00313,0.0031,0.00307,0.00309,
........................................................
0.02336,0.02407,0.0245 }; // 60 elements
....
for(int i=0;i<10+18+37;i++) { // i < 65
q6m.push_back(
boost::shared_ptr<Quote>(new SimpleQuote(q6mh[i])));
The QuantLib project
V557 Array overrun is possible. The value of 'i' index could reach 64.
markovfunctional.cpp 176
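The reasoning behind this warning can be sketched as a comparison between the range of the loop index and the array size. This is a simplified illustration under stated assumptions, not the analyzer's algorithm: `Range`, `loopIndexRange` and `mayOverrun` are hypothetical names, and the loop is assumed to have the plain form `for (int i = first; i < bound; i++)`.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical closed interval [lo, hi] of values a variable may take.
struct Range {
    long lo;
    long hi;
};

// Range of i inside the body of `for (int i = first; i < bound; i++)`.
Range loopIndexRange(long first, long bound) {
    assert(first < bound);  // the sketch assumes the body runs at least once
    return Range{first, bound - 1};
}

// True when some value in the range falls outside [0, size - 1].
bool mayOverrun(const Range& index, std::size_t size) {
    assert(index.lo <= index.hi);
    return index.lo < 0 || index.hi >= static_cast<long>(size);
}
```

With a bound of 10+18+37 the index range is [0, 64], which does not fit an array of 60 elements.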
The values of variables: using conditions to
determine the range
std::string rangeTypeLabel(int idx)
{
const char* rangeTypeLabels[] = {"Self", "Touch", "Target"};
if (idx >= 0 && idx <= 3)
return rangeTypeLabels[idx];
else
return "Invalid";
}
V557 Array overrun is possible. The value of 'idx' index could reach 3.
esmtool labels.cpp 502
The OpenMW project
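A condition can be modelled as narrowing the set of values a variable may hold on the branch where the condition is true. The sketch below assumes a condition of the form `min <= x && x <= max`; `Interval`, `anyInt`, `narrow` and `indexFits` are hypothetical names used only for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <climits>

// Hypothetical closed interval of possible values of an int variable.
struct Interval {
    int lo;
    int hi;
};

// Nothing is known about the variable yet.
Interval anyInt() {
    return Interval{INT_MIN, INT_MAX};
}

// Narrow a known range by a condition of the form `min <= x && x <= max`.
Interval narrow(const Interval& known, int min, int max) {
    assert(min <= max);
    return Interval{std::max(known.lo, min), std::min(known.hi, max)};
}

// True when every value in the range is a valid index.
bool indexFits(const Interval& idx, int elementCount) {
    assert(elementCount > 0);
    return idx.lo >= 0 && idx.hi < elementCount;
}
```

After `idx >= 0 && idx <= 3` the range is [0, 3], which does not fit a three-element array. Writing the check as `idx < 3` would give [0, 2], which does.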
The values of functions
static inline size_t UnboxedTypeSize(JSValueType type)
{
switch (type) {
.......
default: return 0;
}
}
Minstruction *loadUnboxedProperty(size_t offset, ....)
{
size_t index = offset / UnboxedTypeSize(unboxedType);
The Thunderbird project
V609 Divide by zero. Denominator range [0..8]. ionbuilder.cpp 10922
The values of variables: pointers/references
if (providerName == null)
{
ProviderNotFoundException e =
new ProviderNotFoundException(
providerName.ToString(),
SessionStateCategory.CmdletProvider,
"ProviderNotFound",
SessionStateStrings.ProviderNotFound);
throw e;
V3080 Possible null dereference. Consider inspecting 'providerName'.
System.Management.Automation SessionStateProviderAPIs.cs 1004
The PowerShell Project
Method annotations
• Method annotations provide more information about the methods in use
than can be obtained by analyzing only their signatures.
• C/C++. So far we have annotated 6570 functions (standard
C and C++ libraries, POSIX, MFC, Qt, ZLib and so on).
• C#. So far we have annotated 920 functions.
An example of annotating the memcmp
function
C_"int memcmp(const void *buf1, const void *buf2, size_t count);"
ADD(REENTERABLE | RET_USE | F_MEMCMP | STRCMP | HARD_TEST | INT_STATUS,
nullptr, nullptr, "memcmp", POINTER_1, POINTER_2, BYTE_COUNT);
• C_ - an auxiliary control mechanism for annotations (unit tests)
• REENTERABLE - repeated calls with the same arguments give the same result
• RET_USE - the result should be used
• F_MEMCMP - launches certain checks for buffer out-of-bounds access
• STRCMP - the function returns 0 in case of equality
• HARD_TEST - a special function. Some programmers define their own functions in
their own namespace, so the namespace is ignored.
• INT_STATUS - the result must not be explicitly compared with 1 or -1.
• POINTER_1, POINTER_2 - the pointers must be non-null and different.
• BYTE_COUNT - this parameter specifies the number of bytes and must be > 0.
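One way to picture such annotations is a table that maps a function name to a set of flags that diagnostics can query. The sketch below is hypothetical throughout: the `Flag` values, the `AnnotationTable` class and its methods are invented for illustration and borrow only three flag names from the slide. The presentation does not show how the real `ADD` mechanism is implemented.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <unordered_map>

// Hypothetical flag set modelled on names from the slide; the real
// analyzer's types and values are not shown in the presentation.
enum Flag : std::uint32_t {
    REENTERABLE = 1u << 0,  // same arguments give the same result
    RET_USE     = 1u << 1,  // the result should be used
    INT_STATUS  = 1u << 2,  // do not compare the result with 1 or -1
};

class AnnotationTable {
public:
    void add(const std::string& name, std::uint32_t flags) {
        assert(!name.empty());
        table_[name] = flags;
    }

    // Unknown functions have no flags: nothing beyond the signature is known.
    bool has(const std::string& name, Flag flag) const {
        auto it = table_.find(name);
        return it != table_.end() && (it->second & flag) != 0;
    }

private:
    std::unordered_map<std::string, std::uint32_t> table_;
};
```

A diagnostic for unused return values could then ask `has("memcmp", RET_USE)` instead of hard-coding a list of function names.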
Annotation of memcmp: checking the result
bool operator()(const GUID& _Key1, const GUID& _Key2) const
{
return memcmp(&_Key1, &_Key2, sizeof(GUID)) == -1;
}
The CoreCLR project
V698 Expression 'memcmp(....) == -1' is incorrect. This function can
return not only the value '-1', but any negative value. Consider using
'memcmp(....) < 0' instead. sos util.cpp 142
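A comparator that follows the advice in the warning might look like the sketch below. `Key`, `makeKey` and `keyLess` are hypothetical stand-ins for the GUID comparison. The only property relied on is that `memcmp` returns a negative, zero or positive value, with no guarantee about the magnitude.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical 16-byte key standing in for GUID.
struct Key {
    unsigned char bytes[16];
};

// Helper for the illustration: a key whose first byte is set, rest zero.
Key makeKey(int first) {
    assert(first >= 0 && first <= 255);
    Key k{};
    k.bytes[0] = static_cast<unsigned char>(first);
    return k;
}

// Ordering by raw bytes. Only the sign of memcmp's result is specified,
// so the test is `< 0` rather than `== -1`.
bool keyLess(const Key& a, const Key& b) {
    return std::memcmp(a.bytes, b.bytes, sizeof a.bytes) < 0;
}
```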
Annotation of memcmp: storing the result
The Firebird project
V642 Saving the 'memcmp' function result inside the 'short' type variable is
inappropriate. The significant bits could be lost breaking the program's logic.
texttype.cpp 3
SSHORT TextType::compare(ULONG len1, const UCHAR* str1,
ULONG len2, const UCHAR* str2)
{
....
SSHORT cmp = memcmp(str1, str2, MIN(len1, len2));
if (cmp == 0)
cmp = (len1 < len2 ? -1 : (len1 > len2 ? 1 : 0));
return cmp;
}
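One possible fix, sketched here under assumptions rather than taken from the Firebird sources, is to reduce the `int` result to -1, 0 or 1 before storing it in a narrower type. The helpers `sign` and `compareBytes` are hypothetical. The risk being avoided is that a nonzero `int` whose low 16 bits are all zero would typically turn into 0 when narrowed to a 16-bit `short`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Collapse any int to -1, 0 or 1 so that it fits every integer type.
short sign(int value) {
    return static_cast<short>((value > 0) - (value < 0));
}

// Byte-wise comparison whose result is safe to keep in a short.
short compareBytes(const unsigned char* a, const unsigned char* b,
                   std::size_t n) {
    assert(a != nullptr && b != nullptr);
    return sign(std::memcmp(a, b, n));
}
```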
Annotation of memcmp: wrong argument
The GLG3D project
V575 The 'memcmp' function processes '0' elements. Inspect
the 'third' argument. graphics3D matrix4.cpp 269
bool Matrix4::operator==(const Matrix4& other) const {
if (memcmp(this, &other, sizeof(Matrix4) == 0)) {
return true;
}
...
}
Annotation of memcmp: different arguments
The GDB Project
static int
psymbol_compare (const void *addr1, const void *addr2,
int length)
{
struct partial_symbol *sym1 = (struct partial_symbol *) addr1;
struct partial_symbol *sym2 = (struct partial_symbol *) addr2;
return (memcmp (&sym1->ginfo.value, &sym1->ginfo.value,
sizeof (sym1->ginfo.value)) == 0
&& .......
V549 The first argument of 'memcmp' function is equal to the second
argument. psymtab.c 1580
Annotation of memcmp: buffer underrun
The Haiku project
dst_s_read_private_key_file(....)
{
....
if (memcmp(in_buff, "Private-key-format: v", 20) != 0) // the literal has 21 characters
goto fail;
....
}
V512 A call of the 'memcmp' function will lead to underflow of the
buffer '"Private-key-format: v"'. dst_api.c 858
Annotation of memcmp: no status
The PHP project
V501 There are identical sub-expressions '!memcmp("auto", charset_hint,
4)' to the left and to the right of the '||' operator. html.c 396
if ((len == 4) /* sizeof (none|auto|pass) */ &&
(!memcmp("pass", charset_hint, 4) ||
!memcmp("auto", charset_hint, 4) ||
!memcmp("auto", charset_hint, 4)))
Annotation of custom functions
• Almost no support (except for certain cases, such as a custom
printf-like function)
• It makes no sense to develop this mechanism
• No one will spend months marking up a large project
• The analyzer must work immediately
Testing the analyzer
• Testing the analyzer is the most important part of the development
process
• The hardest part of static analysis is not to complain, that is, to avoid false positives
• A large test base:
• C++ Windows (Visual C++): 120 projects
• C++ Linux (GCC): 34 more projects
• C# Windows: 54 projects
We can send a more detailed version of the
presentation
• Write to us: support@viva64.com
• Follow on Twitter: @Code_Analysis
• Download PVS-Studio for Windows:
http://www.viva64.com/en/pvs-studio/
• Download PVS-Studio for Linux:
http://www.viva64.com/en/pvs-studio-download-linux/

More Related Content

What's hot

C++11 smart pointer
C++11 smart pointerC++11 smart pointer
C++11 smart pointer
Lei Yu
 
C++ idioms by example (Nov 2008)
C++ idioms by example (Nov 2008)C++ idioms by example (Nov 2008)
C++ idioms by example (Nov 2008)
Olve Maudal
 

What's hot (20)

Basic C++ 11/14 for Python Programmers
Basic C++ 11/14 for Python ProgrammersBasic C++ 11/14 for Python Programmers
Basic C++ 11/14 for Python Programmers
 
[C++] The Curiously Recurring Template Pattern: Static Polymorphsim and Expre...
[C++] The Curiously Recurring Template Pattern: Static Polymorphsim and Expre...[C++] The Curiously Recurring Template Pattern: Static Polymorphsim and Expre...
[C++] The Curiously Recurring Template Pattern: Static Polymorphsim and Expre...
 
C++ references
C++ referencesC++ references
C++ references
 
Intel IPP Samples for Windows - error correction
Intel IPP Samples for Windows - error correctionIntel IPP Samples for Windows - error correction
Intel IPP Samples for Windows - error correction
 
Intel IPP Samples for Windows - error correction
Intel IPP Samples for Windows - error correctionIntel IPP Samples for Windows - error correction
Intel IPP Samples for Windows - error correction
 
C++11 smart pointer
C++11 smart pointerC++11 smart pointer
C++11 smart pointer
 
С++ without new and delete
С++ without new and deleteС++ without new and delete
С++ without new and delete
 
C++ idioms by example (Nov 2008)
C++ idioms by example (Nov 2008)C++ idioms by example (Nov 2008)
C++ idioms by example (Nov 2008)
 
Kirk Shoop, Reactive programming in C++
Kirk Shoop, Reactive programming in C++Kirk Shoop, Reactive programming in C++
Kirk Shoop, Reactive programming in C++
 
C++11
C++11C++11
C++11
 
2018 cosup-delete unused python code safely - english
2018 cosup-delete unused python code safely - english2018 cosup-delete unused python code safely - english
2018 cosup-delete unused python code safely - english
 
200 Open Source Projects Later: Source Code Static Analysis Experience
200 Open Source Projects Later: Source Code Static Analysis Experience200 Open Source Projects Later: Source Code Static Analysis Experience
200 Open Source Projects Later: Source Code Static Analysis Experience
 
Smart Pointers in C++
Smart Pointers in C++Smart Pointers in C++
Smart Pointers in C++
 
Mathematicians: Trust, but Verify
Mathematicians: Trust, but VerifyMathematicians: Trust, but Verify
Mathematicians: Trust, but Verify
 
Handling Exceptions In C &amp; C++ [Part B] Ver 2
Handling Exceptions In C &amp; C++ [Part B] Ver 2Handling Exceptions In C &amp; C++ [Part B] Ver 2
Handling Exceptions In C &amp; C++ [Part B] Ver 2
 
Smart pointers
Smart pointersSmart pointers
Smart pointers
 
Checking the Source SDK Project
Checking the Source SDK ProjectChecking the Source SDK Project
Checking the Source SDK Project
 
C++11: Rvalue References, Move Semantics, Perfect Forwarding
C++11: Rvalue References, Move Semantics, Perfect ForwardingC++11: Rvalue References, Move Semantics, Perfect Forwarding
C++11: Rvalue References, Move Semantics, Perfect Forwarding
 
PVS-Studio vs Chromium
PVS-Studio vs ChromiumPVS-Studio vs Chromium
PVS-Studio vs Chromium
 
PVS-Studio vs Chromium
PVS-Studio vs ChromiumPVS-Studio vs Chromium
PVS-Studio vs Chromium
 

Viewers also liked

Prolonger ses prêts
Prolonger ses prêtsProlonger ses prêts
Prolonger ses prêts
Niconum
 

Viewers also liked (12)

PVS-Studio. Статический анализатор кода. Windows/Linux, C/C++/C#
PVS-Studio. Статический анализатор кода. Windows/Linux, C/C++/C#PVS-Studio. Статический анализатор кода. Windows/Linux, C/C++/C#
PVS-Studio. Статический анализатор кода. Windows/Linux, C/C++/C#
 
Wild-life conservation though "awareness programme and joint patrol in Melgh...
Wild-life conservation  though "awareness programme and joint patrol in Melgh...Wild-life conservation  though "awareness programme and joint patrol in Melgh...
Wild-life conservation though "awareness programme and joint patrol in Melgh...
 
Ae224maers
Ae224maersAe224maers
Ae224maers
 
Upload Form 16 and E-File 2016 Income Tax Return Instantly
Upload Form 16 and E-File 2016 Income Tax Return InstantlyUpload Form 16 and E-File 2016 Income Tax Return Instantly
Upload Form 16 and E-File 2016 Income Tax Return Instantly
 
Prolonger ses prêts
Prolonger ses prêtsProlonger ses prêts
Prolonger ses prêts
 
C.V
C.VC.V
C.V
 
SEO with RoboHelp
SEO with RoboHelpSEO with RoboHelp
SEO with RoboHelp
 
Props describing them
Props describing themProps describing them
Props describing them
 
PVS-Studio team experience: checking various open source projects, or mistake...
PVS-Studio team experience: checking various open source projects, or mistake...PVS-Studio team experience: checking various open source projects, or mistake...
PVS-Studio team experience: checking various open source projects, or mistake...
 
Redes sociales, familiares y escuela.
Redes sociales, familiares y escuela.Redes sociales, familiares y escuela.
Redes sociales, familiares y escuela.
 
Final Report
Final ReportFinal Report
Final Report
 
PVS-Studio. Static code analyzer. Windows/Linux, C/C++/C#. 2017
PVS-Studio. Static code analyzer. Windows/Linux, C/C++/C#. 2017PVS-Studio. Static code analyzer. Windows/Linux, C/C++/C#. 2017
PVS-Studio. Static code analyzer. Windows/Linux, C/C++/C#. 2017
 

Similar to The operation principles of PVS-Studio static code analyzer

Code Analysis-run time error prediction
Code Analysis-run time error predictionCode Analysis-run time error prediction
Code Analysis-run time error prediction
NIKHIL NAWATHE
 

Similar to The operation principles of PVS-Studio static code analyzer (20)

Static code analysis: what? how? why?
Static code analysis: what? how? why?Static code analysis: what? how? why?
Static code analysis: what? how? why?
 
100 bugs in Open Source C/C++ projects
100 bugs in Open Source C/C++ projects 100 bugs in Open Source C/C++ projects
100 bugs in Open Source C/C++ projects
 
Headache from using mathematical software
Headache from using mathematical softwareHeadache from using mathematical software
Headache from using mathematical software
 
Picking Mushrooms after Cppcheck
Picking Mushrooms after CppcheckPicking Mushrooms after Cppcheck
Picking Mushrooms after Cppcheck
 
Linux version of PVS-Studio couldn't help checking CodeLite
Linux version of PVS-Studio couldn't help checking CodeLiteLinux version of PVS-Studio couldn't help checking CodeLite
Linux version of PVS-Studio couldn't help checking CodeLite
 
ChakraCore: analysis of JavaScript-engine for Microsoft Edge
ChakraCore: analysis of JavaScript-engine for Microsoft EdgeChakraCore: analysis of JavaScript-engine for Microsoft Edge
ChakraCore: analysis of JavaScript-engine for Microsoft Edge
 
Rechecking TortoiseSVN with the PVS-Studio Code Analyzer
Rechecking TortoiseSVN with the PVS-Studio Code AnalyzerRechecking TortoiseSVN with the PVS-Studio Code Analyzer
Rechecking TortoiseSVN with the PVS-Studio Code Analyzer
 
The First C# Project Analyzed
The First C# Project AnalyzedThe First C# Project Analyzed
The First C# Project Analyzed
 
Checking 7-Zip with PVS-Studio analyzer
Checking 7-Zip with PVS-Studio analyzerChecking 7-Zip with PVS-Studio analyzer
Checking 7-Zip with PVS-Studio analyzer
 
Price of an Error
Price of an ErrorPrice of an Error
Price of an Error
 
Documenting Bugs in Doxygen
Documenting Bugs in DoxygenDocumenting Bugs in Doxygen
Documenting Bugs in Doxygen
 
Code Analysis-run time error prediction
Code Analysis-run time error predictionCode Analysis-run time error prediction
Code Analysis-run time error prediction
 
Tesseract. Recognizing Errors in Recognition Software
Tesseract. Recognizing Errors in Recognition SoftwareTesseract. Recognizing Errors in Recognition Software
Tesseract. Recognizing Errors in Recognition Software
 
Introduction to C#
Introduction to C#Introduction to C#
Introduction to C#
 
"Why is there no artificial intelligence yet?" Or, analysis of CNTK tool kit ...
"Why is there no artificial intelligence yet?" Or, analysis of CNTK tool kit ..."Why is there no artificial intelligence yet?" Or, analysis of CNTK tool kit ...
"Why is there no artificial intelligence yet?" Or, analysis of CNTK tool kit ...
 
An Experiment with Checking the glibc Library
An Experiment with Checking the glibc LibraryAn Experiment with Checking the glibc Library
An Experiment with Checking the glibc Library
 
Checking the Code of LDAP-Server ReOpenLDAP on Our Readers' Request
Checking the Code of LDAP-Server ReOpenLDAP on Our Readers' RequestChecking the Code of LDAP-Server ReOpenLDAP on Our Readers' Request
Checking the Code of LDAP-Server ReOpenLDAP on Our Readers' Request
 
SAST and Application Security: how to fight vulnerabilities in the code
SAST and Application Security: how to fight vulnerabilities in the codeSAST and Application Security: how to fight vulnerabilities in the code
SAST and Application Security: how to fight vulnerabilities in the code
 
PVS-Studio is ready to improve the code of Tizen operating system
PVS-Studio is ready to improve the code of Tizen operating systemPVS-Studio is ready to improve the code of Tizen operating system
PVS-Studio is ready to improve the code of Tizen operating system
 
Looking for Bugs in MonoDevelop
Looking for Bugs in MonoDevelopLooking for Bugs in MonoDevelop
Looking for Bugs in MonoDevelop
 

More from Andrey Karpov

More from Andrey Karpov (20)

60 антипаттернов для С++ программиста
60 антипаттернов для С++ программиста60 антипаттернов для С++ программиста
60 антипаттернов для С++ программиста
 
60 terrible tips for a C++ developer
60 terrible tips for a C++ developer60 terrible tips for a C++ developer
60 terrible tips for a C++ developer
 
Ошибки, которые сложно заметить на code review, но которые находятся статичес...
Ошибки, которые сложно заметить на code review, но которые находятся статичес...Ошибки, которые сложно заметить на code review, но которые находятся статичес...
Ошибки, которые сложно заметить на code review, но которые находятся статичес...
 
PVS-Studio in 2021 - Error Examples
PVS-Studio in 2021 - Error ExamplesPVS-Studio in 2021 - Error Examples
PVS-Studio in 2021 - Error Examples
 
PVS-Studio in 2021 - Feature Overview
PVS-Studio in 2021 - Feature OverviewPVS-Studio in 2021 - Feature Overview
PVS-Studio in 2021 - Feature Overview
 
PVS-Studio в 2021 - Примеры ошибок
PVS-Studio в 2021 - Примеры ошибокPVS-Studio в 2021 - Примеры ошибок
PVS-Studio в 2021 - Примеры ошибок
 
PVS-Studio в 2021
PVS-Studio в 2021PVS-Studio в 2021
PVS-Studio в 2021
 
Make Your and Other Programmer’s Life Easier with Static Analysis (Unreal Eng...
Make Your and Other Programmer’s Life Easier with Static Analysis (Unreal Eng...Make Your and Other Programmer’s Life Easier with Static Analysis (Unreal Eng...
Make Your and Other Programmer’s Life Easier with Static Analysis (Unreal Eng...
 
Does static analysis need machine learning?
Does static analysis need machine learning?Does static analysis need machine learning?
Does static analysis need machine learning?
 
Typical errors in code on the example of C++, C#, and Java
Typical errors in code on the example of C++, C#, and JavaTypical errors in code on the example of C++, C#, and Java
Typical errors in code on the example of C++, C#, and Java
 
How to Fix Hundreds of Bugs in Legacy Code and Not Die (Unreal Engine 4)
How to Fix Hundreds of Bugs in Legacy Code and Not Die (Unreal Engine 4)How to Fix Hundreds of Bugs in Legacy Code and Not Die (Unreal Engine 4)
How to Fix Hundreds of Bugs in Legacy Code and Not Die (Unreal Engine 4)
 
Game Engine Code Quality: Is Everything Really That Bad?
Game Engine Code Quality: Is Everything Really That Bad?Game Engine Code Quality: Is Everything Really That Bad?
Game Engine Code Quality: Is Everything Really That Bad?
 
The Use of Static Code Analysis When Teaching or Developing Open-Source Software
The Use of Static Code Analysis When Teaching or Developing Open-Source SoftwareThe Use of Static Code Analysis When Teaching or Developing Open-Source Software
The Use of Static Code Analysis When Teaching or Developing Open-Source Software
 
Static Code Analysis for Projects, Built on Unreal Engine
Static Code Analysis for Projects, Built on Unreal EngineStatic Code Analysis for Projects, Built on Unreal Engine
Static Code Analysis for Projects, Built on Unreal Engine
 
Safety on the Max: How to Write Reliable C/C++ Code for Embedded Systems
Safety on the Max: How to Write Reliable C/C++ Code for Embedded SystemsSafety on the Max: How to Write Reliable C/C++ Code for Embedded Systems
Safety on the Max: How to Write Reliable C/C++ Code for Embedded Systems
 
The Great and Mighty C++
The Great and Mighty C++The Great and Mighty C++
The Great and Mighty C++
 
Zero, one, two, Freddy's coming for you
Zero, one, two, Freddy's coming for youZero, one, two, Freddy's coming for you
Zero, one, two, Freddy's coming for you
 
PVS-Studio Is Now in Chocolatey: Checking Chocolatey under Azure DevOps
PVS-Studio Is Now in Chocolatey: Checking Chocolatey under Azure DevOpsPVS-Studio Is Now in Chocolatey: Checking Chocolatey under Azure DevOps
PVS-Studio Is Now in Chocolatey: Checking Chocolatey under Azure DevOps
 
PVS-Studio Static Analyzer as a Tool for Protection against Zero-Day Vulnerab...
PVS-Studio Static Analyzer as a Tool for Protection against Zero-Day Vulnerab...PVS-Studio Static Analyzer as a Tool for Protection against Zero-Day Vulnerab...
PVS-Studio Static Analyzer as a Tool for Protection against Zero-Day Vulnerab...
 
Analysis of commits and pull requests in Travis CI, Buddy and AppVeyor using ...
Analysis of commits and pull requests in Travis CI, Buddy and AppVeyor using ...Analysis of commits and pull requests in Travis CI, Buddy and AppVeyor using ...
Analysis of commits and pull requests in Travis CI, Buddy and AppVeyor using ...
 

Recently uploaded

The title is not connected to what is inside
The title is not connected to what is insideThe title is not connected to what is inside
The title is not connected to what is inside
shinachiaurasa2
 
%+27788225528 love spells in Atlanta Psychic Readings, Attraction spells,Brin...
%+27788225528 love spells in Atlanta Psychic Readings, Attraction spells,Brin...%+27788225528 love spells in Atlanta Psychic Readings, Attraction spells,Brin...
%+27788225528 love spells in Atlanta Psychic Readings, Attraction spells,Brin...
masabamasaba
 
AI Mastery 201: Elevating Your Workflow with Advanced LLM Techniques
AI Mastery 201: Elevating Your Workflow with Advanced LLM TechniquesAI Mastery 201: Elevating Your Workflow with Advanced LLM Techniques
AI Mastery 201: Elevating Your Workflow with Advanced LLM Techniques
VictorSzoltysek
 

Recently uploaded (20)

Right Money Management App For Your Financial Goals
Right Money Management App For Your Financial GoalsRight Money Management App For Your Financial Goals
Right Money Management App For Your Financial Goals
 
%in Bahrain+277-882-255-28 abortion pills for sale in Bahrain
%in Bahrain+277-882-255-28 abortion pills for sale in Bahrain%in Bahrain+277-882-255-28 abortion pills for sale in Bahrain
%in Bahrain+277-882-255-28 abortion pills for sale in Bahrain
 
The title is not connected to what is inside
The title is not connected to what is insideThe title is not connected to what is inside
The title is not connected to what is inside
 
%in Midrand+277-882-255-28 abortion pills for sale in midrand
%in Midrand+277-882-255-28 abortion pills for sale in midrand%in Midrand+277-882-255-28 abortion pills for sale in midrand
%in Midrand+277-882-255-28 abortion pills for sale in midrand
 
Announcing Codolex 2.0 from GDK Software
Announcing Codolex 2.0 from GDK SoftwareAnnouncing Codolex 2.0 from GDK Software
Announcing Codolex 2.0 from GDK Software
 
%in tembisa+277-882-255-28 abortion pills for sale in tembisa
%in tembisa+277-882-255-28 abortion pills for sale in tembisa%in tembisa+277-882-255-28 abortion pills for sale in tembisa
%in tembisa+277-882-255-28 abortion pills for sale in tembisa
 
The Top App Development Trends Shaping the Industry in 2024-25 .pdf
The Top App Development Trends Shaping the Industry in 2024-25 .pdfThe Top App Development Trends Shaping the Industry in 2024-25 .pdf
The Top App Development Trends Shaping the Industry in 2024-25 .pdf
 
%+27788225528 love spells in Atlanta Psychic Readings, Attraction spells,Brin...
%+27788225528 love spells in Atlanta Psychic Readings, Attraction spells,Brin...%+27788225528 love spells in Atlanta Psychic Readings, Attraction spells,Brin...
%+27788225528 love spells in Atlanta Psychic Readings, Attraction spells,Brin...
 
%in kaalfontein+277-882-255-28 abortion pills for sale in kaalfontein
%in kaalfontein+277-882-255-28 abortion pills for sale in kaalfontein%in kaalfontein+277-882-255-28 abortion pills for sale in kaalfontein
%in kaalfontein+277-882-255-28 abortion pills for sale in kaalfontein
 
SHRMPro HRMS Software Solutions Presentation
SHRMPro HRMS Software Solutions PresentationSHRMPro HRMS Software Solutions Presentation
SHRMPro HRMS Software Solutions Presentation
 
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...
 
AI Mastery 201: Elevating Your Workflow with Advanced LLM Techniques
AI Mastery 201: Elevating Your Workflow with Advanced LLM TechniquesAI Mastery 201: Elevating Your Workflow with Advanced LLM Techniques
AI Mastery 201: Elevating Your Workflow with Advanced LLM Techniques
 
Generic or specific? Making sensible software design decisions
Generic or specific? Making sensible software design decisionsGeneric or specific? Making sensible software design decisions
Generic or specific? Making sensible software design decisions
 
AI & Machine Learning Presentation Template
AI & Machine Learning Presentation TemplateAI & Machine Learning Presentation Template
AI & Machine Learning Presentation Template
 
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
 
%in ivory park+277-882-255-28 abortion pills for sale in ivory park
%in ivory park+277-882-255-28 abortion pills for sale in ivory park %in ivory park+277-882-255-28 abortion pills for sale in ivory park
%in ivory park+277-882-255-28 abortion pills for sale in ivory park
 
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...
 
%in Hazyview+277-882-255-28 abortion pills for sale in Hazyview
%in Hazyview+277-882-255-28 abortion pills for sale in Hazyview%in Hazyview+277-882-255-28 abortion pills for sale in Hazyview
%in Hazyview+277-882-255-28 abortion pills for sale in Hazyview
 
Direct Style Effect Systems - The Print[A] Example - A Comprehension Aid
Direct Style Effect Systems -The Print[A] Example- A Comprehension AidDirect Style Effect Systems -The Print[A] Example- A Comprehension Aid
Direct Style Effect Systems - The Print[A] Example - A Comprehension Aid
 
Unlocking the Future of AI Agents with Large Language Models
Unlocking the Future of AI Agents with Large Language ModelsUnlocking the Future of AI Agents with Large Language Models
Unlocking the Future of AI Agents with Large Language Models
 

The operation principles of PVS-Studio static code analyzer

  • 1. The operation principles of PVS- Studio static code analyzer Authors: Candidate of Engineering Sciences Evgeniy Ryzhkov, evg@viva64.com Candidate of Physico-Mathematical Sciences Andrey Karpov, karpov@viva64.com
  • 2. OOO "Program Verification Systems" (www.viva64.com) • Development, marketing and sales of a software product. • Office: Tula, 200 km away from Moscow. • Staff: 24 people.
  • 3. PVS-Studio • More than 320 diagnostics for C, C++ • More than 120 diagnostics for C# • Windows • Linux • Plugin for Visual Studio • Quick Start (compilation monitoring) • SonarQube
  • 4. Our achievements • To let the world know about our product, we check open-source projects. By the moment we have checked about 270 projects. • A “side” effect: we found more than 10 000 bugs in open source projects, without setting it as a goal. • On the average there are 40 errors in a project - not that much. • It is important to emphasize once more that this was a “side” effect. We don’t have a goal to find as many errors as possible. Quite often, we stop when we find enough errors for an article. • Conclusion: it’s rather easy to check even unfamiliar projects and find errors in them.
  • 5. In the beginning: what we DO NOT USE
  • 6. We do not use formal grammar for analysis • The analyzer works on a higher level • We analyze the derivation tree • To build the tree we rely on existing components: • External preprocessor • OpenC++ library, which we have extended as C++ evolved (actually, almost nothing of the original OpenC++ is left) • For C# code we take Roslyn as the basis
  • 7. We do not use program proof methods • PVS-Studio has nothing to do with the Prototype Verification System (PVS) http://pvs.csl.sri.com/ • The name PVS-Studio comes from "Program Verification Systems" (OOO "Program Verification Systems")
  • 8. We do not use substring search (string matching) or regular expressions • It is a dead end • It is of no use even in the simplest situations • Example: if (A+B == A+B) • A+B == B+A • A+(B) == (A)+B • ((A+B)) == A+B • Even worse: text matching knows nothing about types, object sizes, inheritance, variable values and so on.
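Why a tree succeeds where text matching fails can be sketched in a few lines. This is only an illustration, not PVS-Studio's implementation: the `Expr` type and the `leaf`, `binary` and `equivalent` helpers are hypothetical names. Once the expression is a tree, redundant parentheses no longer exist, and operand order for `+` is handled by trying both pairings (which assumes `+` is commutative for the operand types involved).

```cpp
#include <memory>
#include <string>
#include <utility>

// Hypothetical, simplified expression node: a leaf (identifier) or a binary operation.
struct Expr {
    std::string op;                  // empty for a leaf, otherwise the operator, e.g. "+"
    std::string name;                // identifier, used only by leaves
    std::shared_ptr<Expr> lhs, rhs;  // operands, used only by binary nodes
};

using ExprPtr = std::shared_ptr<Expr>;

ExprPtr leaf(const std::string& name) {
    return std::make_shared<Expr>(Expr{"", name, nullptr, nullptr});
}

ExprPtr binary(const std::string& op, ExprPtr l, ExprPtr r) {
    return std::make_shared<Expr>(Expr{op, "", std::move(l), std::move(r)});
}

// Structural equivalence that treats "+" as commutative.
bool equivalent(const ExprPtr& a, const ExprPtr& b) {
    if (a->op != b->op) return false;              // different operators, or leaf vs. node
    if (a->op.empty()) return a->name == b->name;  // both are leaves
    if (equivalent(a->lhs, b->lhs) && equivalent(a->rhs, b->rhs)) return true;
    return a->op == "+" &&
           equivalent(a->lhs, b->rhs) && equivalent(a->rhs, b->lhs);
}
```

With this sketch, the trees for `A+B` and `B+A` compare as equivalent, while `A-B` and `B-A` do not.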
  • 9. What we USE The details of C++ and C# analysis differ; we are not going to cover them here
  • 10. Pattern-based analysis • Pattern matching based on the derivation tree • It is used to search the source code for fragments that resemble known erroneous code patterns • The complexity of the diagnostics varies greatly • In some cases these are empirical algorithms
  • 11. A simple case: copy-paste. The GCC Project
        if ((*path)[0]->e->dest->loop_father != path->last()->e->....)
        {
          delete_jump_thread_path (path);
          e->aux = NULL;
          ei_next (&ei);
        }
        else
        {
          delete_jump_thread_path (path);
          e->aux = NULL;
          ei_next (&ei);
        }
      V523 The 'then' statement is equivalent to the 'else' statement. tree-ssa-threadupdate.c 2596
  • 12. A more complicated case: check of a wrong variable. The CodeContracts Project
        public override Predicate JoinWith(Predicate other)
        {
          var right = other as PredicateNullness;
          if (other != null)
          {
            if (this.value == right.value)
            {
      V3019 Possibly an incorrect variable is compared to null after type conversion using 'as' keyword. Check variables 'other', 'right'. CallerInvariant.cs 189
  • 13. Quite a complicated case: a badly written macro. The FreeBSD Project
        #define ICB2400_VPINFO_PORT_OFF(chan) \
          (ICB2400_VPINFO_OFF + \
           sizeof (isp_icb_2400_vpinfo_t) + \
           (chan * ICB2400_VPOPT_WRITE_SIZE))

        off += ICB2400_VPINFO_PORT_OFF(chan - 1);
      V733 It is possible that macro expansion resulted in incorrect evaluation order. Check expression: chan - 1 * 20. isp.c 2301
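The effect of the missing parentheses can be reproduced in isolation. The sketch below uses hypothetical names and a made-up base offset of 100; only the entry size of 20 is taken from the warning text. It contrasts the macro shape from the slide with a version that parenthesizes its argument.

```cpp
// Hypothetical constants: BASE_OFF is invented, ENTRY_SIZE follows the "1 * 20" in V733.
#define BASE_OFF   100
#define ENTRY_SIZE 20

// Argument used without parentheses, as in the macro from the slide.
#define PORT_OFF_BAD(chan)  (BASE_OFF + (chan * ENTRY_SIZE))
// Argument wrapped in parentheses.
#define PORT_OFF_GOOD(chan) (BASE_OFF + ((chan) * ENTRY_SIZE))

// Expands to 100 + (chan - 1 * 20): only the literal 1 is multiplied.
int bad_offset(int chan)  { return PORT_OFF_BAD(chan - 1); }

// Expands to 100 + ((chan - 1) * 20): the whole argument is multiplied.
int good_offset(int chan) { return PORT_OFF_GOOD(chan - 1); }
```

For `chan == 3` the first function yields 83 and the second 140. In C++ an inline function would avoid the problem entirely, since its argument is evaluated before the multiplication.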
  • 14. Type inference • Type inference based on the semantic model of the program gives the analyzer full information about all variables and statements in the code. • Types are important for detecting errors • Types are important for exceptions (cases where a warning must not be issued) • The information about classes is especially important
  • 15. Types are also important for bug detection. The Cocos2d-x project
        WCHAR *gai_strerrorW(int ecode);
        #define gai_strerror gai_strerrorW

        fprintf(stderr, "net_listen error for %s: %s", serv, gai_strerror(n));
      V576 Incorrect format. Consider checking the fourth actual argument of the 'fprintf' function. The pointer to string of char type symbols is expected. ccconsole.cpp 341
  • 16. Types are important for exceptions
        // The variable is assigned to itself, but it is volatile
        volatile int *ptr;
        ....
        *ptr = *ptr; // No V570 warning is issued
  • 17. The information about classes is especially important: the inheritance hierarchy, for instance. The FlightGear project
        class sg_throwable : public std::exception { .... };
        class sg_exception : public sg_throwable { .... };

        if (!aInstall) {
          sg_exception("missing argument to scheduleToUpdate");
        }
      V596 The object was created but it is not being used. The 'throw' keyword could be missing: throw sg_exception(FOO); root.cxx 239
  • 18. Symbolic execution • Symbolic execution makes it possible to evaluate the variable values that can lead to errors and to perform range checking of values. • It is one of the most important mechanisms for detecting: • Overflows • Memory leaks • Array index out of bounds • Null pointers/references • Meaningless conditions • Division by zero • and so on…
  • 19. The values of variables: the size of the array, indices. The QuantLib project
        Handle<YieldTermStructure> md0Yts() {
          double q6mh[] = {
            0.0001,0.0001,0.0001,0.0003,0.00055,0.0009,0.0014,0.0019,
            0.0025,0.0031,0.00325,0.00313,0.0031,0.00307,0.00309,
            ........................................................
            0.02336,0.02407,0.0245 };     // 60 elements
          ....
          for(int i=0;i<10+18+37;i++) {   // i < 65
            q6m.push_back(
              boost::shared_ptr<Quote>(new SimpleQuote(q6mh[i])));
      V557 Array overrun is possible. The value of 'i' index could reach 64. markovfunctional.cpp 176
  • 20. The values of variables: using conditions to determine the range. The OpenMW project
        std::string rangeTypeLabel(int idx)
        {
          const char* rangeTypeLabels[] = {"Self", "Touch", "Target"};
          if (idx >= 0 && idx <= 3)
            return rangeTypeLabels[idx];
          else
            return "Invalid";
        }
      V557 Array overrun is possible. The value of 'idx' index could reach 3. esmtool labels.cpp 502
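The reasoning behind this warning can be approximated with plain interval arithmetic. The following is a minimal sketch under that assumption; `Range`, `narrow` and `may_overrun` are hypothetical names and say nothing about how PVS-Studio represents values internally. A condition such as `idx >= 0 && idx <= 3` narrows the interval of `idx`, and the narrowed interval is then checked against the array size.

```cpp
#include <algorithm>
#include <cstddef>

// Hypothetical closed interval [lo, hi] of the values a variable may hold.
struct Range {
    long lo;
    long hi;
};

// Narrow a range by the condition "lo <= x && x <= hi".
Range narrow(Range r, long lo, long hi) {
    return Range{std::max(r.lo, lo), std::min(r.hi, hi)};
}

// True when some value in the range is not a valid index for an array of `size` elements.
bool may_overrun(Range index, std::size_t size) {
    return index.lo < 0 || index.hi >= static_cast<long>(size);
}
```

For the slide's code, narrowing an unconstrained `idx` by `0..3` gives `[0, 3]`, and `may_overrun` reports a problem for an array of 3 elements. Had the condition been `idx <= 2`, the check would pass.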
  • 21. The values of functions. The Thunderbird project
        static inline size_t UnboxedTypeSize(JSValueType type)
        {
          switch (type) {
            .......
            default: return 0;
          }
        }

        Minstruction *loadUnboxedProperty(size_t offset, ....)
        {
          size_t index = offset / UnboxedTypeSize(unboxedType);
      V609 Divide by zero. Denominator range [0..8]. ionbuilder.cpp 10922
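One way to picture how a range such as [0..8] arises is to summarize a function by the interval that covers all of its return values. The sketch below is an assumption-laden illustration: `ReturnRange`, `summarize` and `may_be_zero` are hypothetical names, and the concrete return values passed in are invented, because the slide elides the `switch` cases.

```cpp
#include <algorithm>
#include <initializer_list>

// Hypothetical summary of a function: the closed interval covering everything it may return.
struct ReturnRange {
    long lo;
    long hi;
};

// Build the summary from the constants found in the function's return statements.
// Assumes at least one value is supplied.
ReturnRange summarize(std::initializer_list<long> returned) {
    ReturnRange r{*returned.begin(), *returned.begin()};
    for (long v : returned) {
        r.lo = std::min(r.lo, v);
        r.hi = std::max(r.hi, v);
    }
    return r;
}

// A divisor is suspicious when zero lies inside its range.
bool may_be_zero(ReturnRange r) {
    return r.lo <= 0 && 0 <= r.hi;
}
```

An interval is an over-approximation: a function returning only -1 and 1 would also be summarized as containing zero, so a real analyzer needs a more precise representation to avoid false positives.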
  • 22. The values of variables: pointers/references. The PowerShell Project
        if (providerName == null)
        {
          ProviderNotFoundException e =
            new ProviderNotFoundException(
              providerName.ToString(),
              SessionStateCategory.CmdletProvider,
              "ProviderNotFound",
              SessionStateStrings.ProviderNotFound);
          throw e;
      V3080 Possible null dereference. Consider inspecting 'providerName'. System.Management.Automation SessionStateProviderAPIs.cs 1004
  • 23. Method annotations • Method annotations provide more information about the methods being used than can be obtained by analyzing their signatures alone. • C/C++. So far we have annotated 6570 functions (standard C and C++ libraries, POSIX, MFC, Qt, ZLib and so on). • C#. So far we have annotated 920 functions.
  • 24. An example of annotating the memcmp function
        C_"int memcmp(const void *buf1, const void *buf2, size_t count);"
        ADD(REENTERABLE | RET_USE | F_MEMCMP | STRCMP | HARD_TEST | INT_STATUS,
            nullptr, nullptr, "memcmp", POINTER_1, POINTER_2, BYTE_COUNT);
      • C_ - an auxiliary control mechanism for annotations (unit tests)
      • REENTERABLE - a repeated call with the same arguments gives the same result
      • RET_USE - the result should be used
      • F_MEMCMP - launches certain checks for buffer overruns
      • STR_CMP - the function returns 0 in case of equality
      • HARD_TEST - a special function. Some programmers define their own functions with this name in their own namespace, so the namespace is ignored.
      • INT_STATUS - the result must not be explicitly compared with 1 or -1
      • POINTER_1, POINTER_2 - the pointers must be non-null and different
      • BYTE_COUNT - this parameter specifies the number of bytes and must be > 0
  • 25. Annotation of memcmp: checking the result. The CoreCLR project
        bool operator()(const GUID& _Key1, const GUID& _Key2) const
        {
          return memcmp(&_Key1, &_Key2, sizeof(GUID)) == -1;
        }
      V698 Expression 'memcmp(....) == -1' is incorrect. This function can return not only the value '-1', but any negative value. Consider using 'memcmp(....) < 0' instead. sos util.cpp 142
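A comparator along the lines the warning suggests might look as follows. This is a sketch, not the CoreCLR fix: `Key` is a hypothetical 16-byte stand-in for `GUID`, and `KeyLess` is an invented name. The only point is that the sign of the `memcmp` result is tested rather than one particular negative value.

```cpp
#include <cstring>

// Hypothetical 16-byte key standing in for GUID.
struct Key {
    unsigned char bytes[16];
};

// Ordering predicate: any negative memcmp result means "less than".
struct KeyLess {
    bool operator()(const Key& a, const Key& b) const {
        return std::memcmp(a.bytes, b.bytes, sizeof(a.bytes)) < 0;
    }
};
```

The C standard only promises a value less than, equal to, or greater than zero, so a test against `-1` may work with one library and fail with another.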
  • 26. Annotation of memcmp: storing the result. The Firebird project
        SSHORT TextType::compare(ULONG len1, const UCHAR* str1,
                                 ULONG len2, const UCHAR* str2)
        {
          ....
          SSHORT cmp = memcmp(str1, str2, MIN(len1, len2));
          if (cmp == 0)
            cmp = (len1 < len2 ? -1 : (len1 > len2 ? 1 : 0));
          return cmp;
        }
      V642 Saving the 'memcmp' function result inside the 'short' type variable is inappropriate. The significant bits could be lost breaking the program's logic. texttype.cpp 3
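The danger is that `memcmp` returns an `int`: a nonzero result whose low 16 bits happen to be zero (65536, for example) typically becomes 0 when narrowed to a 16-bit `short`, so different buffers look equal. One possible remedy, sketched below with hypothetical names (`sign_of`, `compare_bytes`) and not taken from the Firebird sources, is to keep the full `int` and reduce it to -1, 0 or 1 before narrowing.

```cpp
#include <cstring>

// Collapse any int comparison result to -1, 0 or 1 so that it fits any signed type.
inline short sign_of(int value) {
    return static_cast<short>((value > 0) - (value < 0));
}

// Hypothetical stand-in for the slide's compare(): memcmp on the common prefix, then length.
short compare_bytes(const unsigned char* s1, unsigned long len1,
                    const unsigned char* s2, unsigned long len2) {
    const unsigned long common = len1 < len2 ? len1 : len2;
    const int cmp = std::memcmp(s1, s2, common);  // keep the full int result
    if (cmp != 0) return sign_of(cmp);
    return sign_of((len1 > len2) - (len1 < len2));
}
```

Callers that only look at the sign of the result are unaffected by the normalization.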
  • 27. Annotation of memcmp: wrong argument. The GLG3D project
        bool Matrix4::operator==(const Matrix4& other) const {
          if (memcmp(this, &other, sizeof(Matrix4) == 0)) {
            return true;
          }
          ...
        }
      V575 The 'memcmp' function processes '0' elements. Inspect the 'third' argument. graphics3D matrix4.cpp 269
  • 28. Annotation of memcmp: different arguments. The GDB Project
        static int
        psymbol_compare (const void *addr1, const void *addr2, int length)
        {
          struct partial_symbol *sym1 = (struct partial_symbol *) addr1;
          struct partial_symbol *sym2 = (struct partial_symbol *) addr2;

          return (memcmp (&sym1->ginfo.value, &sym1->ginfo.value,
                          sizeof (sym1->ginfo.value)) == 0
                  && .......
      V549 The first argument of 'memcmp' function is equal to the second argument. psymtab.c 1580
  • 29. Annotation of memcmp: buffer underrun. The Haiku project
        dst_s_read_private_key_file(....)
        {
          ....
          if (memcmp(in_buff, "Private-key-format: v", 20) != 0)  // the literal has 21 characters
            goto fail;
          ....
        }
      V512 A call of the 'memcmp' function will lead to underflow of the buffer '"Private-key-format: v"'. dst_api.c 858
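Hand-counted lengths are what makes this mistake possible: with 20 instead of 21, the final `v` is never compared. A common way to avoid it is to let the compiler derive the length from the literal. The helper below is a hypothetical sketch (`starts_with` is not a function from the Haiku sources) that does so through an array-reference template parameter.

```cpp
#include <cstddef>
#include <cstring>

// Compare the start of `buf` with a string literal, taking the length from the literal.
// N counts the terminating '\0', so N - 1 characters are compared.
template <std::size_t N>
bool starts_with(const char* buf, std::size_t buf_len, const char (&prefix)[N]) {
    return buf_len >= N - 1 && std::memcmp(buf, prefix, N - 1) == 0;
}
```

The explicit `buf_len` check also keeps the comparison from reading past the end of a short input buffer.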
  • 30. Annotation of memcmp: no status. The PHP project
        if ((len == 4) /* sizeof (none|auto|pass) */ &&
            (!memcmp("pass", charset_hint, 4) ||
             !memcmp("auto", charset_hint, 4) ||
             !memcmp("auto", charset_hint, 4)))
      V501 There are identical sub-expressions '!memcmp("auto", charset_hint, 4)' to the left and to the right of the '||' operator. html.c 396
  • 31. Annotation of custom functions • Almost no support (except for certain elements, for example user-defined printf-like functions) • There is little point in developing this mechanism • No one will spend months marking up a large project • The analyzer must work immediately
  • 32. Testing the analyzer • Testing the analyzer is the most important part of the development process • The hardest part of static analysis is knowing when not to complain • A large test base: • C++ Windows (Visual C++): 120 projects • C++ Linux (GCC): 34 more projects • C# Windows: 54 projects
  • 33. We can send a more detailed version of the presentation • Write to us: support@viva64.com • Follow on Twitter: @Code_Analysis • Download PVS-Studio for Windows: http://www.viva64.com/en/pvs-studio/ • Download PVS-Studio for Linux: http://www.viva64.com/en/pvs-studio-download-linux/