PVS-Studio vs Chromium
Author: Andrey Karpov

Date: 23.05.2011


Abstract
Good has won this time. To be more exact, the source code of the Chromium project has won. Chromium
is one of the best projects we have checked with PVS-Studio.




Chromium is an open-source web browser developed by Google and intended to provide users with fast
and safe Internet access. Chromium serves as the base for the Google Chrome browser. Moreover,
Chromium is a preliminary version of Google Chrome as well as some other alternative web browsers.

From the programming viewpoint, Chromium is a solution consisting of 473 projects. The total size of
the C/C++ source code is about 460 Mbytes, and the number of lines is difficult to count.

These 460 Mbytes include a lot of various libraries. If you exclude them, you will have about 155
Mbytes. That is much less but still a lot of lines. Moreover, everything is relative, you know. Many of these
libraries were created by the Chromium developers as part of creating Chromium itself.
Although such libraries live their own lives, we may still attribute them to the browser.

Chromium has become the largest and highest-quality project I have studied while testing PVS-Studio.
While working with the Chromium project, it was not actually clear to us what was checking what: we
found and fixed several errors in PVS-Studio related to C++ file analysis and support of a specific
project structure.

Many aspects and methods used in Chromium show the quality of its source code. For instance, most
programmers determine the number of items in an array using the following construct:

int XX[] = { 1, 2, 3, 4 };

size_t N = sizeof(XX) / sizeof(XX[0]);

Usually it is wrapped in a macro of this kind:

#define count_of(arg) (sizeof(arg) / sizeof(arg[0]))

This is quite an efficient and useful macro. To be honest, I have always used this very macro myself.
However, it might lead to an error because you may accidentally pass a plain pointer to it and it will
not complain. Let me explain this with the following example:

void Test(int C[3])

{

    int A[3];

    int *B = Foo();

    size_t x = count_of(A); // Ok

    x = count_of(B); // Error

    x = count_of(C); // Error

}

The count_of(A) construct works correctly and returns the number of items in the A array which is equal
to three here.

But if you accidentally apply count_of() to a pointer, the result will be a meaningless value. The issue is
that the macro will not produce any warning for the programmer about a strange construct of the
count_of(B) sort. This situation seems far-fetched and artificial, but I have encountered it in various
applications. For example, consider this code from the Miranda IM project:

#define SIZEOF(X) (sizeof(X)/sizeof(X[0]))

int Cache_GetLineText(..., LPTSTR text, int text_size, ...)

{

    ...

    tmi.printDateTime(pdnce->hTimeZone, _T("t"), text, SIZEOF(text), 0);

    ...

}

So, such errors may well exist in your code and you'd better have something to protect yourself against
them. It is even easier to make a mistake when trying to calculate the size of an array passed as an
argument:

void Test(int C[3])

{

    x = count_of(C); // Error

}
According to the C++ standard, the 'C' variable is a simple pointer, not an array. As a result, you may
often see in programs that only a part of the array passed is processed.
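
The adjustment of an array parameter to a pointer can be checked at compile time. The following is only a minimal sketch: the names TakesArraySyntax and ParameterIsPointer are hypothetical and do not come from any of the projects discussed here.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical function: the parameter is written with array syntax...
void TakesArraySyntax(int C[3]);

// ...but the compiler adjusts it to a plain pointer, so the function type
// is exactly the same as if the parameter had been spelled int*.
static_assert(std::is_same<decltype(TakesArraySyntax), void(int*)>::value,
              "int C[3] in a parameter list means int*");

// Hypothetical helper: inside the body, the declared type of C is int*,
// which is why sizeof(C) there yields the size of a pointer.
inline bool ParameterIsPointer(int C[3]) {
  return std::is_same<decltype(C), int*>::value;
}
```

For instance, given int a[3] = {1, 2, 3}, the call ParameterIsPointer(a) returns true, so assert(ParameterIsPointer(a)) passes.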

Since we have started speaking of such errors, let me tell you about a method that will help you find the
size of the array passed. You should pass it by reference:

void Test(int (&C)[3])

{

    x = count_of(C); // Ok

}

Now the result of the count_of(C) expression is value 3.
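
The same idea extends to arrays of any length if the size becomes a template parameter. This is a minimal sketch, assuming a hypothetical helper named SumAll that is not part of the article's code:

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: N is deduced from the argument, so the function
// accepts an int array of any length and always knows how many items it has.
template <std::size_t N>
int SumAll(const int (&values)[N]) {
  int total = 0;
  for (std::size_t i = 0; i < N; ++i) {
    total += values[i];
  }
  return total;
}
```

With int a[3] = {1, 2, 3}, assert(SumAll(a) == 6) holds. Passing an int* instead of an array does not compile, because a pointer cannot bind to a reference to an array.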

Let's return to Chromium. It uses a macro that allows you to avoid the above described errors. This is
how it is implemented:

template <typename T, size_t N>

char (&ArraySizeHelper(T (&array)[N]))[N];

#define arraysize(array) (sizeof(ArraySizeHelper(array)))

The idea of this magic spell is the following: the template function ArraySizeHelper receives an array of an
arbitrary type with the length N. The function returns a reference to an array of the length N
consisting of 'char' items. There is no implementation for this function because we do not need it: the
sizeof() operator does not evaluate its operand, so a declaration of ArraySizeHelper is enough. The 'arraysize' macro
calculates the size of the array of bytes returned by the ArraySizeHelper function. This size is the number
of items in the array whose length we want to calculate.
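
To see that the result really is a compile-time constant, here is a minimal sketch built around the same declaration. The array kPrimes and the function ArraySize are assumptions added for illustration; the constexpr variant is not taken from Chromium, it is simply a macro-free alternative available since C++11 (C++17 provides the same thing as std::size).

```cpp
#include <cassert>
#include <cstddef>

// Declaration only: sizeof never evaluates its operand, so no body is needed.
template <typename T, std::size_t N>
char (&ArraySizeHelper(T (&array)[N]))[N];

#define arraysize(array) (sizeof(ArraySizeHelper(array)))

// Hypothetical sample data.
const int kPrimes[] = {2, 3, 5, 7, 11};

// sizeof(char[5]) is 5 regardless of sizeof(int).
static_assert(arraysize(kPrimes) == 5, "arraysize counts items, not bytes");

// Assumption, not from the article: the same check without a macro.
template <typename T, std::size_t N>
constexpr std::size_t ArraySize(T (&)[N]) {
  return N;
}

static_assert(ArraySize(kPrimes) == arraysize(kPrimes), "both agree");
```

Both forms reject a plain pointer at compile time, because template argument deduction for T (&)[N] fails when the argument is not an array.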

If you have gone crazy because of all this, just take my word for it: it works. And it works much better
than the 'count_of()' macro we have discussed above. Since the ArraySizeHelper function takes an array
by reference, you cannot pass a plain pointer to it. Let's write some test code:

template <typename T, size_t N>

char (&ArraySizeHelper(T (&array)[N]))[N];

#define arraysize(array) (sizeof(ArraySizeHelper(array)))



void Test(int C[3])

{

    int A[3];

    int *B = Foo();

    size_t x = arraysize(A); // Ok

    x = arraysize(B); // Compilation error
    x = arraysize(C); // Compilation error

}

The incorrect code simply will not be compiled. I think it's cool when you can prevent a potential error
already at the compilation stage. This is a nice sample reflecting the quality of this programming
approach. My respect goes to Google developers.

Let me give you one more sample which is of a different sort yet it shows the quality of the code as well.

if (!file_util::Delete(db_name, false) &&

      !file_util::Delete(db_name, false)) {

    // Try to delete twice. If we can't, fail.

    LOG(ERROR) << "unable to delete old TopSites file";

    return false;

}

Many programmers might find this code strange. What is the sense in trying to remove a file twice?
There is a sense. The one who wrote it has reached Enlightenment and comprehended the essence of
software existence. Only in textbooks and in some abstract world can a file either be removed for sure or
not be removed at all. In a real system it often happens that a file cannot be removed right now but
can be removed an instant later. There may be many reasons for that: antivirus software, viruses,
version control systems and whatever else. Programmers often do not think of such cases. They believe that
when you cannot remove a file, you cannot remove it at all. But if you want to do everything well and
avoid littering directories, you should take these extraneous factors into account. I encountered quite
the same situation when a file would fail to get removed once in 1000 runs. The solution was also the
same, only I placed Sleep(0) in the middle just in case.
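
The retry idea can be sketched in portable C++17. This is not Chromium's file_util::Delete; RemoveWithRetry is a hypothetical helper, and both the number of attempts and the 10 ms pause are arbitrary assumptions standing in for the Sleep(0) mentioned above.

```cpp
#include <cassert>
#include <chrono>
#include <filesystem>
#include <system_error>
#include <thread>

// Hypothetical helper: tries to remove `file` up to `attempts` times.
// Returns true once the path is gone, false if every attempt failed.
inline bool RemoveWithRetry(const std::filesystem::path& file,
                            int attempts = 2) {
  for (int i = 0; i < attempts; ++i) {
    std::error_code ec;
    std::filesystem::remove(file, ec);  // non-throwing overload
    if (!ec) {
      return true;  // removed, or it did not exist in the first place
    }
    if (i + 1 < attempts) {
      // Give whoever holds the file (antivirus, indexer, ...) time to let go.
      std::this_thread::sleep_for(std::chrono::milliseconds(10));
    }
  }
  return false;
}
```

A caller would log an error and give up only when RemoveWithRetry returns false, which mirrors the structure of the Chromium fragment.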

Well, and what about the check by PVS-Studio? Chromium's code is perhaps the highest-quality code I've
ever seen. This is confirmed by the low density of errors we've managed to find. In absolute terms,
there are certainly plenty of them. But if you divide the number of errors by the
amount of code, it turns out that there are almost no errors. What are these errors? They are the most
ordinary ones. Here are several samples:

V512 A call of the 'memset' function will lead to underflow of the buffer '(exploded)'.
platform time_win.cc 116



void NaCl::Time::Explode(bool is_local, Exploded* exploded) const {

    ...

    ZeroMemory(exploded, sizeof(exploded));

    ...

}
Everybody makes misprints. In this case, an asterisk is missing. It must be sizeof(*exploded).
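
The difference between the two sizeof expressions is easy to demonstrate. In this minimal sketch the Exploded structure is a hypothetical stand-in with eight int fields (the real one may differ), and Clear is a hypothetical helper; the static_assert assumes a typical platform where a pointer is smaller than eight ints.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical stand-in for the Exploded structure in the fragment above.
struct Exploded {
  int year, month, day_of_week, day_of_month;
  int hour, minute, second, millisecond;
};

// Assumption: pointers are smaller than the structure, as on common platforms.
static_assert(sizeof(Exploded*) < sizeof(Exploded),
              "sizeof(exploded) would cover only part of the object");

// sizeof(*exploded) is the size of the object the pointer refers to,
// so the whole structure is zeroed.
inline void Clear(Exploded* exploded) {
  std::memset(exploded, 0, sizeof(*exploded));
}
```

With sizeof(exploded) instead, only the first pointer-sized chunk would be cleared, and trailing fields such as millisecond would keep their old values.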



V502 Perhaps the '?:' operator works in a different way than it was expected. The '?:' operator has a lower priority than the '-' operator.
views custom_frame_view.cc 400



static const int kClientEdgeThickness;

int height() const;

bool ShouldShowClientEdge() const;



void CustomFrameView::PaintMaximizedFrameBorder(gfx::Canvas* canvas) {

    ...

    int edge_height = titlebar_bottom->height() -

                              ShouldShowClientEdge() ? kClientEdgeThickness : 0;

    ...

}

The insidious operator "?:" has a lower priority than subtraction. There must be additional parentheses
here:

int edge_height = titlebar_bottom->height() -

                           (ShouldShowClientEdge() ? kClientEdgeThickness : 0);
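
The effect of the missing parentheses can be shown with concrete numbers. This is a minimal sketch: the functions and the value 1 for kClientEdgeThickness are hypothetical and chosen only to make the arithmetic visible. Some compilers warn about the first function, which is exactly the point.

```cpp
#include <cassert>

// Hypothetical value, for illustration only.
constexpr int kClientEdgeThickness = 1;

// Parsed as (height - show_edge) ? kClientEdgeThickness : 0.
constexpr int EdgeHeightAsWritten(int height, bool show_edge) {
  return height - show_edge ? kClientEdgeThickness : 0;
}

// Parentheses make the ternary the right-hand operand of the subtraction.
constexpr int EdgeHeightFixed(int height, bool show_edge) {
  return height - (show_edge ? kClientEdgeThickness : 0);
}

static_assert(EdgeHeightAsWritten(30, true) == 1, "(30 - 1) is non-zero");
static_assert(EdgeHeightFixed(30, true) == 29, "30 - 1");
static_assert(EdgeHeightFixed(30, false) == 30, "30 - 0");
```

In the version without parentheses, the result is kClientEdgeThickness for almost any height, which has nothing to do with the intended subtraction.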



A meaningless check.

V547 Expression 'count < 0' is always false. Unsigned type value is never < 0.
ncdecode_tablegen ncdecode_tablegen.c 197



static void CharAdvance(char** buffer, size_t* buffer_size,

size_t count) {

    if (count < 0) {

      NaClFatal("Unable to advance buffer by count!");

    } else {
      ...

}

The "count < 0" condition is always false. The protection does not work and some buffer might
overflow. By the way, this is an example of how static analyzers might be used to search for
vulnerabilities. An intruder can quickly find code fragments that contain errors for further thorough
investigation. Here is another code sample related to security:

V511 The sizeof() operator returns size of the pointer, and not of the array, in 'sizeof (salt)' expression.
common visitedlink_common.cc 84



void MD5Update(MD5Context* context, const void* buf, size_t len);



VisitedLinkCommon::Fingerprint
VisitedLinkCommon::ComputeURLFingerprint(

    ...

 const uint8 salt[LINK_SALT_LENGTH])

{

    ...

    MD5Update(&ctx, salt, sizeof(salt));

    ...

}

The MD5Update() function will process as many bytes as the pointer occupies. This is a potential
loophole in the data encryption system, isn't it? I do not know whether it implies any danger; however,
from the viewpoint of intruders, this is a fragment for thorough analysis.

The correct code should look this way:

MD5Update(&ctx, salt, sizeof(salt[0]) * LINK_SALT_LENGTH);

Or this way:

VisitedLinkCommon::Fingerprint
VisitedLinkCommon::ComputeURLFingerprint(

    ...

 const uint8 (&salt)[LINK_SALT_LENGTH])

{
    ...

    MD5Update(&ctx, salt, sizeof(salt));

    ...

}



One more sample with a misprint:

V501 There are identical sub-expressions 'host != buzz::XmlConstants::str_empty ()' to the left and to the right of the '&&' operator.
chromoting_jingle_glue iq_request.cc 248



void JingleInfoRequest::OnResponse(const buzz::XmlElement* stanza) {

    ...

    std::string host = server->Attr(buzz::QN_JINGLE_INFO_HOST);

    std::string port_str = server->Attr(buzz::QN_JINGLE_INFO_UDP);

    if (host != buzz::STR_EMPTY && host != buzz::STR_EMPTY) {

    ...

}

The port_str variable must be actually checked as well:

if (host != buzz::STR_EMPTY && port_str != buzz::STR_EMPTY) {



A bit of classics:

V530 The return value of function 'empty' is required to be utilized.
chrome_frame_npapi np_proxy_service.cc 293



bool NpProxyService::GetProxyValueJSONString(std::string* output) {

    DCHECK(output);

    output->empty();

    ...

}

It must be: output->clear();
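
The difference between the two member functions is small enough to check directly. A minimal sketch, with ResetOutput as a hypothetical helper name:

```cpp
#include <cassert>
#include <string>

// Hypothetical helper showing the intended behaviour: the caller gets an
// empty string to fill in. clear() removes the contents, whereas empty()
// only reports whether there are any.
inline void ResetOutput(std::string* output) {
  assert(output != nullptr);
  output->clear();
}
```

Calling empty() on a string holding "stale" leaves the text in place; after ResetOutput the string has length zero.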
And here is even a potential null pointer dereference:

V522 Dereferencing of the null pointer 'plugin_instance' might take place. Check the logical condition.
chrome_frame_npapi chrome_frame_npapi.cc 517



bool ChromeFrameNPAPI::Invoke(...)

{

    ChromeFrameNPAPI* plugin_instance =

      ChromeFrameInstanceFromNPObject(header);

    if (!plugin_instance && (plugin_instance->automation_client_.get()))

      return false;

    ...

}
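
With '&&', the right-hand side is evaluated precisely when plugin_instance is null, which is the opposite of what a guard should do. The sketch below shows one plausible reading of the intent, namely "return false if either pointer is missing". That reading is an assumption, and PluginInstance, AutomationClient and CanInvoke are hypothetical stand-ins rather than the real Chromium classes.

```cpp
#include <cassert>

// Hypothetical stand-ins; the real classes are far richer.
struct AutomationClient {};

struct PluginInstance {
  AutomationClient* automation_client = nullptr;
};

// Assumed intent: bail out when either pointer is missing. With ||, the
// right-hand side is evaluated only if plugin_instance is non-null.
inline bool CanInvoke(const PluginInstance* plugin_instance) {
  if (!plugin_instance || !plugin_instance->automation_client) {
    return false;
  }
  return true;
}
```

Short-circuit evaluation of '||' guarantees that the member access never happens through a null pointer.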



One more example of a check that will never work:

V547 Expression 'current_idle_time < 0' is always false. Unsigned type value is never < 0.
browser idle_win.cc 23



IdleState CalculateIdleState(unsigned int idle_threshold) {

    ...

    DWORD current_idle_time = 0;

    ...

    // Will go -ve if we have been idle for a long time (2gb seconds).

    if (current_idle_time < 0)

      current_idle_time = INT_MAX;

    ...

}
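
Judging by the comment in the fragment, the intent seems to be to clamp very large idle times. If that reading is right (it is only an assumption), an unsigned value has to be compared against the upper bound instead of zero. In this minimal sketch ClampIdleTime is a hypothetical helper and std::uint32_t stands in for the Windows DWORD type.

```cpp
#include <cassert>
#include <climits>
#include <cstdint>

// Hypothetical helper: an unsigned value never compares below zero, so the
// "wrapped around" case is detected by testing against the upper bound.
constexpr std::uint32_t ClampIdleTime(std::uint32_t idle_time) {
  return idle_time > static_cast<std::uint32_t>(INT_MAX)
             ? static_cast<std::uint32_t>(INT_MAX)
             : idle_time;
}

static_assert(ClampIdleTime(5) == 5, "small values pass through");
static_assert(ClampIdleTime(0xFFFFFFFFu) ==
                  static_cast<std::uint32_t>(INT_MAX),
              "huge values are clamped");
```
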
Well, we should stop here. I could continue, but it is starting to get boring. Remember that all this
concerns only Chromium itself. But there are also tests with errors like this:

V554 Incorrect use of auto_ptr. The memory allocated with 'new []' will be cleaned using 'delete'.
interactive_ui_tests accessibility_win_browsertest.cc 306



void AccessibleChecker::CheckAccessibleChildren(IAccessible* parent) {

    ...

    auto_ptr<VARIANT> child_array(new VARIANT[child_count]);

    ...

}
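
In current C++ the usual way to avoid this mismatch is an owner that knows it holds an array. The following is a minimal sketch rather than the fix used in Chromium: Variant is a hypothetical stand-in for the Windows VARIANT structure, and both helper functions are invented for illustration. Note that auto_ptr itself was removed from the language in C++17.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical element type standing in for VARIANT.
struct Variant {
  int value = 0;
};

// The array form of unique_ptr releases the memory with delete[].
inline std::unique_ptr<Variant[]> MakeChildArray(std::size_t child_count) {
  return std::unique_ptr<Variant[]>(new Variant[child_count]);
}

// A vector owns its storage too and also remembers the element count.
inline std::vector<Variant> MakeChildVector(std::size_t child_count) {
  return std::vector<Variant>(child_count);
}
```

Either option frees the whole array correctly; the vector is usually the simpler choice when the element count is needed later.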



There are also plenty of libraries that Chromium is actually based on, and their total size is much
larger than that of Chromium itself. They also contain a lot of interesting fragments. Of course, code
containing errors might not be used anywhere, but they are errors nonetheless. Consider one
example (the ICU library):

V547 Expression '* string != 0 || * string != '_'' is always true. Probably the '&&' operator should be used here.
icui18n ucol_sit.cpp 242



U_CDECL_BEGIN static const char* U_CALLCONV

_processVariableTop(...)

{

    ...

    if(i == locElementCapacity && (*string != 0 || *string != '_')) {

        *status = U_BUFFER_OVERFLOW_ERROR;

    }

    ...

}

The "(*string != 0 || *string != '_')" expression is always true. Perhaps it must be: (*string == 0 || *string
== '_').
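
That the original condition holds for every possible character can be verified exhaustively: no character equals both 0 and '_', so at least one of the two inequalities is always true. The sketch below uses hypothetical predicate names and also spells out the two candidate corrections, the one from the analyzer message and the one suggested above. Which of them matches the intent of the ICU code is not established here.

```cpp
#include <cassert>

// The condition exactly as it appears in the fragment.
constexpr bool AsWritten(char c) { return c != 0 || c != '_'; }

// Candidate corrections (hypothetical names).
constexpr bool AnalyzerSuggestion(char c) { return c != 0 && c != '_'; }
constexpr bool AuthorSuggestion(char c) { return c == 0 || c == '_'; }

// Checks every value a char can take.
inline bool AsWrittenHoldsForEveryChar() {
  for (int v = 0; v <= 255; ++v) {
    const char c = static_cast<char>(static_cast<unsigned char>(v));
    if (!AsWritten(c)) {
      return false;
    }
  }
  return true;
}
```

The two corrections are exact opposites of each other, so choosing between them requires knowing what the surrounding code expects at that position in the string.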
Conclusion
PVS-Studio was defeated. Chromium's source code is one of the best we have ever analyzed. We have
found almost nothing in Chromium. To be more exact, we have found a lot of errors and this article
demonstrates only a few of them. But if we keep in mind that all these errors are spread throughout the
source code with the size of 460 Mbytes, it turns out that there are almost no errors at all.



P.S.

To answer the question of whether we will inform the Chromium developers of the errors we've found: no,
we won't. It is a very large amount of work and we cannot afford to do it for free. Checking Chromium is
far from checking Miranda IM or Ultimate Toolbox. This is hard work: we have to study all of
the messages and decide whether there is an error in every particular case. To do that, we
must be knowledgeable about the project. We will send this article to the Chromium developers, and should
they find it interesting, they will be able to analyze the project themselves and study all the diagnostic
messages. Yes, they will have to purchase PVS-Studio for this purpose. But any Google department can
easily afford this.

Más contenido relacionado

La actualidad más candente

La actualidad más candente (20)

CppCat Static Analyzer Review
CppCat Static Analyzer ReviewCppCat Static Analyzer Review
CppCat Static Analyzer Review
 
Антон Бикинеев, Writing good std::future&lt; C++ >
Антон Бикинеев, Writing good std::future&lt; C++ >Антон Бикинеев, Writing good std::future&lt; C++ >
Антон Бикинеев, Writing good std::future&lt; C++ >
 
Best Bugs from Games: Fellow Programmers' Mistakes
Best Bugs from Games: Fellow Programmers' MistakesBest Bugs from Games: Fellow Programmers' Mistakes
Best Bugs from Games: Fellow Programmers' Mistakes
 
Linux version of PVS-Studio couldn't help checking CodeLite
Linux version of PVS-Studio couldn't help checking CodeLiteLinux version of PVS-Studio couldn't help checking CodeLite
Linux version of PVS-Studio couldn't help checking CodeLite
 
The CppCat Analyzer Checks TortoiseGit
The CppCat Analyzer Checks TortoiseGitThe CppCat Analyzer Checks TortoiseGit
The CppCat Analyzer Checks TortoiseGit
 
Top 10 bugs in C++ open source projects, checked in 2016
Top 10 bugs in C++ open source projects, checked in 2016Top 10 bugs in C++ open source projects, checked in 2016
Top 10 bugs in C++ open source projects, checked in 2016
 
Consequences of using the Copy-Paste method in C++ programming and how to dea...
Consequences of using the Copy-Paste method in C++ programming and how to dea...Consequences of using the Copy-Paste method in C++ programming and how to dea...
Consequences of using the Copy-Paste method in C++ programming and how to dea...
 
Analysis of Microsoft Code Contracts
Analysis of Microsoft Code ContractsAnalysis of Microsoft Code Contracts
Analysis of Microsoft Code Contracts
 
"Why is there no artificial intelligence yet?" Or, analysis of CNTK tool kit ...
"Why is there no artificial intelligence yet?" Or, analysis of CNTK tool kit ..."Why is there no artificial intelligence yet?" Or, analysis of CNTK tool kit ...
"Why is there no artificial intelligence yet?" Or, analysis of CNTK tool kit ...
 
Reanalyzing the Notepad++ project
Reanalyzing the Notepad++ projectReanalyzing the Notepad++ project
Reanalyzing the Notepad++ project
 
Of complicacy of programming, or won't C# save us?
Of complicacy of programming, or won't C# save us?Of complicacy of programming, or won't C# save us?
Of complicacy of programming, or won't C# save us?
 
Top 10 C# projects errors found in 2016
Top 10 C# projects errors found in 2016Top 10 C# projects errors found in 2016
Top 10 C# projects errors found in 2016
 
A few words about OpenSSL
A few words about OpenSSLA few words about OpenSSL
A few words about OpenSSL
 
Comparing the general static analysis in Visual Studio 2010 and PVS-Studio by...
Comparing the general static analysis in Visual Studio 2010 and PVS-Studio by...Comparing the general static analysis in Visual Studio 2010 and PVS-Studio by...
Comparing the general static analysis in Visual Studio 2010 and PVS-Studio by...
 
Comparing the general static analysis in Visual Studio 2010 and PVS-Studio by...
Comparing the general static analysis in Visual Studio 2010 and PVS-Studio by...Comparing the general static analysis in Visual Studio 2010 and PVS-Studio by...
Comparing the general static analysis in Visual Studio 2010 and PVS-Studio by...
 
Checking Notepad++: five years later
Checking Notepad++: five years laterChecking Notepad++: five years later
Checking Notepad++: five years later
 
C++ Code as Seen by a Hypercritical Reviewer
C++ Code as Seen by a Hypercritical ReviewerC++ Code as Seen by a Hypercritical Reviewer
C++ Code as Seen by a Hypercritical Reviewer
 
Checking the Source Code of FlashDevelop with PVS-Studio
Checking the Source Code of FlashDevelop with PVS-StudioChecking the Source Code of FlashDevelop with PVS-Studio
Checking the Source Code of FlashDevelop with PVS-Studio
 
The First C# Project Analyzed
The First C# Project AnalyzedThe First C# Project Analyzed
The First C# Project Analyzed
 
PVS-Studio for Linux (CoreHard presentation)
PVS-Studio for Linux (CoreHard presentation)PVS-Studio for Linux (CoreHard presentation)
PVS-Studio for Linux (CoreHard presentation)
 

Destacado (7)

Unit 20 workbook-1
Unit 20 workbook-1Unit 20 workbook-1
Unit 20 workbook-1
 
Gated Internet Community
Gated Internet CommunityGated Internet Community
Gated Internet Community
 
Mark Szulc
Mark SzulcMark Szulc
Mark Szulc
 
Online reputation management solution
Online reputation management solutionOnline reputation management solution
Online reputation management solution
 
Getting a Grip on Social Media
Getting a Grip on Social MediaGetting a Grip on Social Media
Getting a Grip on Social Media
 
Montage Effects
Montage EffectsMontage Effects
Montage Effects
 
AUTOMATIC CAR PARKING SYSTEM
AUTOMATIC CAR PARKING SYSTEMAUTOMATIC CAR PARKING SYSTEM
AUTOMATIC CAR PARKING SYSTEM
 

Similar a PVS-Studio vs Chromium

Similar a PVS-Studio vs Chromium (20)

How to avoid bugs using modern C++
How to avoid bugs using modern C++How to avoid bugs using modern C++
How to avoid bugs using modern C++
 
100 bugs in Open Source C/C++ projects
100 bugs in Open Source C/C++ projects100 bugs in Open Source C/C++ projects
100 bugs in Open Source C/C++ projects
 
100 bugs in Open Source C/C++ projects
100 bugs in Open Source C/C++ projects 100 bugs in Open Source C/C++ projects
100 bugs in Open Source C/C++ projects
 
PVS-Studio vs Chromium. 3-rd Check
PVS-Studio vs Chromium. 3-rd CheckPVS-Studio vs Chromium. 3-rd Check
PVS-Studio vs Chromium. 3-rd Check
 
PVS-Studio: analyzing ReactOS's code
PVS-Studio: analyzing ReactOS's codePVS-Studio: analyzing ReactOS's code
PVS-Studio: analyzing ReactOS's code
 
Price of an Error
Price of an ErrorPrice of an Error
Price of an Error
 
Why Windows 8 drivers are buggy
Why Windows 8 drivers are buggyWhy Windows 8 drivers are buggy
Why Windows 8 drivers are buggy
 
PVS-Studio. Static code analyzer. Windows/Linux, C/C++/C#. 2017
PVS-Studio. Static code analyzer. Windows/Linux, C/C++/C#. 2017PVS-Studio. Static code analyzer. Windows/Linux, C/C++/C#. 2017
PVS-Studio. Static code analyzer. Windows/Linux, C/C++/C#. 2017
 
Cppcheck and PVS-Studio compared
Cppcheck and PVS-Studio comparedCppcheck and PVS-Studio compared
Cppcheck and PVS-Studio compared
 
Static code analysis and the new language standard C++0x
Static code analysis and the new language standard C++0xStatic code analysis and the new language standard C++0x
Static code analysis and the new language standard C++0x
 
Static code analysis and the new language standard C++0x
Static code analysis and the new language standard C++0xStatic code analysis and the new language standard C++0x
Static code analysis and the new language standard C++0x
 
The Little Unicorn That Could
The Little Unicorn That CouldThe Little Unicorn That Could
The Little Unicorn That Could
 
200 Open Source Projects Later: Source Code Static Analysis Experience
200 Open Source Projects Later: Source Code Static Analysis Experience200 Open Source Projects Later: Source Code Static Analysis Experience
200 Open Source Projects Later: Source Code Static Analysis Experience
 
LibRaw, Coverity SCAN, PVS-Studio
LibRaw, Coverity SCAN, PVS-StudioLibRaw, Coverity SCAN, PVS-Studio
LibRaw, Coverity SCAN, PVS-Studio
 
PVS-Studio: analyzing ReactOS's code
PVS-Studio: analyzing ReactOS's codePVS-Studio: analyzing ReactOS's code
PVS-Studio: analyzing ReactOS's code
 
Headache from using mathematical software
Headache from using mathematical softwareHeadache from using mathematical software
Headache from using mathematical software
 
Analyzing the Dolphin-emu project
Analyzing the Dolphin-emu projectAnalyzing the Dolphin-emu project
Analyzing the Dolphin-emu project
 
A Check of the Open-Source Project WinSCP Developed in Embarcadero C++ Builder
A Check of the Open-Source Project WinSCP Developed in Embarcadero C++ BuilderA Check of the Open-Source Project WinSCP Developed in Embarcadero C++ Builder
A Check of the Open-Source Project WinSCP Developed in Embarcadero C++ Builder
 
Exploring Microoptimizations Using Tizen Code as an Example
Exploring Microoptimizations Using Tizen Code as an ExampleExploring Microoptimizations Using Tizen Code as an Example
Exploring Microoptimizations Using Tizen Code as an Example
 
How to make fewer errors at the stage of code writing. Part N1.
How to make fewer errors at the stage of code writing. Part N1.How to make fewer errors at the stage of code writing. Part N1.
How to make fewer errors at the stage of code writing. Part N1.
 

Más de Andrey Karpov

Más de Andrey Karpov (20)

60 антипаттернов для С++ программиста
60 антипаттернов для С++ программиста60 антипаттернов для С++ программиста
60 антипаттернов для С++ программиста
 
60 terrible tips for a C++ developer
60 terrible tips for a C++ developer60 terrible tips for a C++ developer
60 terrible tips for a C++ developer
 
Ошибки, которые сложно заметить на code review, но которые находятся статичес...
Ошибки, которые сложно заметить на code review, но которые находятся статичес...Ошибки, которые сложно заметить на code review, но которые находятся статичес...
Ошибки, которые сложно заметить на code review, но которые находятся статичес...
 
PVS-Studio in 2021 - Error Examples
PVS-Studio in 2021 - Error ExamplesPVS-Studio in 2021 - Error Examples
PVS-Studio in 2021 - Error Examples
 
PVS-Studio in 2021 - Feature Overview
PVS-Studio in 2021 - Feature OverviewPVS-Studio in 2021 - Feature Overview
PVS-Studio in 2021 - Feature Overview
 
PVS-Studio в 2021 - Примеры ошибок
PVS-Studio в 2021 - Примеры ошибокPVS-Studio в 2021 - Примеры ошибок
PVS-Studio в 2021 - Примеры ошибок
 
PVS-Studio в 2021
PVS-Studio в 2021PVS-Studio в 2021
PVS-Studio в 2021
 
Make Your and Other Programmer’s Life Easier with Static Analysis (Unreal Eng...
Make Your and Other Programmer’s Life Easier with Static Analysis (Unreal Eng...Make Your and Other Programmer’s Life Easier with Static Analysis (Unreal Eng...
Make Your and Other Programmer’s Life Easier with Static Analysis (Unreal Eng...
 
Does static analysis need machine learning?
Does static analysis need machine learning?Does static analysis need machine learning?
Does static analysis need machine learning?
 
Typical errors in code on the example of C++, C#, and Java
Typical errors in code on the example of C++, C#, and JavaTypical errors in code on the example of C++, C#, and Java
Typical errors in code on the example of C++, C#, and Java
 
How to Fix Hundreds of Bugs in Legacy Code and Not Die (Unreal Engine 4)
How to Fix Hundreds of Bugs in Legacy Code and Not Die (Unreal Engine 4)How to Fix Hundreds of Bugs in Legacy Code and Not Die (Unreal Engine 4)
How to Fix Hundreds of Bugs in Legacy Code and Not Die (Unreal Engine 4)
 
Game Engine Code Quality: Is Everything Really That Bad?
Game Engine Code Quality: Is Everything Really That Bad?Game Engine Code Quality: Is Everything Really That Bad?
Game Engine Code Quality: Is Everything Really That Bad?
 
The Use of Static Code Analysis When Teaching or Developing Open-Source Software
The Use of Static Code Analysis When Teaching or Developing Open-Source SoftwareThe Use of Static Code Analysis When Teaching or Developing Open-Source Software
The Use of Static Code Analysis When Teaching or Developing Open-Source Software
 
Static Code Analysis for Projects, Built on Unreal Engine
Static Code Analysis for Projects, Built on Unreal EngineStatic Code Analysis for Projects, Built on Unreal Engine
Static Code Analysis for Projects, Built on Unreal Engine
 
Safety on the Max: How to Write Reliable C/C++ Code for Embedded Systems
Safety on the Max: How to Write Reliable C/C++ Code for Embedded SystemsSafety on the Max: How to Write Reliable C/C++ Code for Embedded Systems
Safety on the Max: How to Write Reliable C/C++ Code for Embedded Systems
 
The Great and Mighty C++
The Great and Mighty C++The Great and Mighty C++
The Great and Mighty C++
 
Static code analysis: what? how? why?
Static code analysis: what? how? why?Static code analysis: what? how? why?
Static code analysis: what? how? why?
 
Zero, one, two, Freddy's coming for you
Zero, one, two, Freddy's coming for youZero, one, two, Freddy's coming for you
Zero, one, two, Freddy's coming for you
 
PVS-Studio Is Now in Chocolatey: Checking Chocolatey under Azure DevOps
PVS-Studio Is Now in Chocolatey: Checking Chocolatey under Azure DevOpsPVS-Studio Is Now in Chocolatey: Checking Chocolatey under Azure DevOps
PVS-Studio Is Now in Chocolatey: Checking Chocolatey under Azure DevOps
 
PVS-Studio Static Analyzer as a Tool for Protection against Zero-Day Vulnerab...
PVS-Studio Static Analyzer as a Tool for Protection against Zero-Day Vulnerab...PVS-Studio Static Analyzer as a Tool for Protection against Zero-Day Vulnerab...
PVS-Studio Static Analyzer as a Tool for Protection against Zero-Day Vulnerab...
 

Último

Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Safe Software
 
Architecting Cloud Native Applications
Architecting Cloud Native ApplicationsArchitecting Cloud Native Applications
Architecting Cloud Native Applications
WSO2
 

Último (20)

A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?
 
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingRepurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a Fresher
 
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWEREMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
Corporate and higher education May webinar.pptx

PVS-Studio vs Chromium

Author: Andrey Karpov

Date: 23.05.2011

Abstract

Good has won this time. To be more exact, the source code of the Chromium project has won. Chromium is one of the best projects we have checked with PVS-Studio.

Chromium is an open-source web browser developed by Google and intended to provide users with fast and safe Internet access. Chromium serves as the base for the Google Chrome browser. Moreover, Chromium is a preliminary version of Google Chrome as well as of some other alternative web browsers.

From the programming viewpoint, Chromium is a solution consisting of 473 projects. The total size of the C/C++ source code is about 460 Mbytes, and the number of lines is difficult to count.

These 460 Mbytes include a lot of various libraries. If you exclude them, about 155 Mbytes remain. That is much less but still a lot of lines. Moreover, everything is relative, you know. Many of these libraries were created by the Chromium developers as part of creating Chromium itself. Although such libraries live on their own, we may still count them as part of the browser.

Chromium has become the highest-quality and largest project I have studied while testing PVS-Studio. While handling the Chromium project it was not actually clear to us what was checking what: we found and fixed several errors in PVS-Studio related to C++ file analysis and support of a specific project structure.

Many aspects and methods used in Chromium show the quality of its source code. For instance, most programmers determine the number of items in an array using the following construct:

    int XX[] = { 1, 2, 3, 4 };
    size_t N = sizeof(XX) / sizeof(XX[0]);

Usually it is arranged as a macro of this kind:

    #define count_of(arg) (sizeof(arg) / sizeof(arg[0]))

This is quite an efficient and useful macro. To be honest, I have always used this very macro myself. However, it might lead to an error, because you may accidentally pass a plain pointer to it and it will not mind. Let me explain this with the following example:

    void Test(int C[3])
    {
      int A[3];
      int *B = Foo();
      size_t x = count_of(A); // Ok
      x = count_of(B); // Error
      x = count_of(C); // Error
    }

The count_of(A) construct works correctly and returns the number of items in the A array, which is three here.

But if you accidentally apply count_of() to a pointer, the result is a meaningless value. The issue is that the macro gives the programmer no warning about a strange construct of the count_of(B) sort. This situation seems far-fetched and artificial, but I have encountered it in various applications. For example, consider this code from the Miranda IM project:

    #define SIZEOF(X) (sizeof(X)/sizeof(X[0]))

    int Cache_GetLineText(..., LPTSTR text, int text_size, ...)
    {
      ...
      tmi.printDateTime(pdnce->hTimeZone, _T("t"), text, SIZEOF(text), 0);
      ...
    }

So such errors may well exist in your code, and you had better have something to protect yourself against them. It is even easier to make a mistake when trying to calculate the size of an array passed as an argument:

    void Test(int C[3])
    {
      x = count_of(C); // Error
    }
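To make the effect of count_of(C) concrete, here is a minimal sketch. The helper names CountViaParameter, CountLocalArray and Demo are hypothetical and exist only for this illustration; the macro is the one shown above. The exact value returned for the parameter depends on the platform's pointer and int sizes, so the sketch compares against that ratio instead of a fixed number.

```cpp
#include <cassert>
#include <cstddef>

#define count_of(arg) (sizeof(arg) / sizeof(arg[0]))

// Hypothetical helper. Despite the "[3]", the parameter is treated as
// 'int*', so the macro divides the pointer size by sizeof(int).
// Many compilers emit a warning for this line.
std::size_t CountViaParameter(int C[3]) {
  return count_of(C);
}

// Hypothetical helper. Here A is a real array, so the macro works.
std::size_t CountLocalArray() {
  int A[3] = {1, 2, 3};
  return count_of(A);
}

void Demo() {
  int data[3] = {1, 2, 3};
  assert(CountLocalArray() == 3);
  // Platform-dependent result: the ratio of pointer size to int size,
  // not the number of items in 'data'.
  assert(CountViaParameter(data) == sizeof(int*) / sizeof(int));
}
```

On a typical 64-bit build the second call yields 2 rather than 3, which is exactly the kind of silent truncation described above.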
According to the C++ standard, the 'C' variable is a simple pointer, not an array. As a result, you may often see programs that process only part of the array passed to them.

Since we have started speaking of such errors, let me tell you about a method that will help you find the size of the array passed. You should pass it by reference:

    void Test(int (&C)[3])
    {
      x = count_of(C); // Ok
    }

Now the result of the count_of(C) expression is 3.

Let's return to Chromium. It uses a macro that allows you to avoid the errors described above. This is how it is implemented:

    template <typename T, size_t N>
    char (&ArraySizeHelper(T (&array)[N]))[N];
    #define arraysize(array) (sizeof(ArraySizeHelper(array)))

The idea of this magic spell is the following: the template function ArraySizeHelper receives an array of an arbitrary type with length N. The function returns a reference to an array of N 'char' items. There is no implementation for this function because we do not need one. For the sizeof() operator it is quite enough just to declare the ArraySizeHelper function. The 'arraysize' macro calculates the size in bytes of the array returned by the ArraySizeHelper function. Since each 'char' item occupies one byte, this size is the number of items in the array whose length we want to calculate.

If you have gone crazy because of all this, just take my word for it - it works. And it works much better than the 'count_of()' macro we have discussed above. Since the ArraySizeHelper function takes an array by reference, you cannot pass a simple pointer to it. Let's write some test code:

    template <typename T, size_t N>
    char (&ArraySizeHelper(T (&array)[N]))[N];
    #define arraysize(array) (sizeof(ArraySizeHelper(array)))

    void Test(int C[3])
    {
      int A[3];
      int *B = Foo();
      size_t x = arraysize(A); // Ok
      x = arraysize(B); // Compilation error
      x = arraysize(C); // Compilation error
    }

The incorrect code simply will not compile. I think it's cool when you can prevent a potential error at the compilation stage. This is a nice sample reflecting the quality of this programming approach. My respect goes to the Google developers.

Let me give you one more sample, which is of a different sort yet shows the quality of the code as well.

    if (!file_util::Delete(db_name, false) &&
        !file_util::Delete(db_name, false)) {
      // Try to delete twice. If we can't, fail.
      LOG(ERROR) << "unable to delete old TopSites file";
      return false;
    }

Many programmers might find this code strange. What is the sense in trying to remove a file twice? There is a sense. The one who wrote it has reached Enlightenment and comprehended the essence of software existence. Only in textbooks and in some abstract world can a file either definitely be removed or not be removed at all. In a real system it often happens that a file cannot be removed right now but can be removed an instant later. There may be many reasons for that: antivirus software, viruses, version control systems and whatever. Programmers often do not think of such cases. They believe that when you cannot remove a file, you cannot remove it at all. But if you want to do everything well and avoid littering directories, you should take these extraneous factors into account. I encountered quite the same situation when a file would fail to get removed once in 1000 runs. The solution was also the same - I only placed Sleep(0) in the middle just in case.

Well, and what about the check by PVS-Studio? Chromium's code is perhaps the highest-quality code I've ever seen. This is confirmed by the low density of errors we've managed to find. If you take their quantity in general, there are certainly plenty of them. But if you divide the number of errors by the amount of code, it turns out that there are almost no errors. What are these errors? They are the most ordinary ones. Here are several samples:

V512 A call of the 'memset' function will lead to underflow of the buffer '(exploded)'. platform time_win.cc 116

    void NaCl::Time::Explode(bool is_local, Exploded* exploded) const {
      ...
      ZeroMemory(exploded, sizeof(exploded));
      ...
    }
Everybody makes misprints. In this case, an asterisk is missing. It must be sizeof(*exploded).

V502 Perhaps the '?:' operator works in a different way than it was expected. The '?:' operator has a lower priority than the '-' operator. views custom_frame_view.cc 400

    static const int kClientEdgeThickness;
    int height() const;
    bool ShouldShowClientEdge() const;

    void CustomFrameView::PaintMaximizedFrameBorder(gfx::Canvas* canvas) {
      ...
      int edge_height = titlebar_bottom->height() -
          ShouldShowClientEdge() ? kClientEdgeThickness : 0;
      ...
    }

The insidious "?:" operator has a lower priority than subtraction, so the difference is evaluated first and then used as the condition. There must be additional parentheses here:

    int edge_height = titlebar_bottom->height() -
        (ShouldShowClientEdge() ? kClientEdgeThickness : 0);

A meaningless check.

V547 Expression 'count < 0' is always false. Unsigned type value is never < 0. ncdecode_tablegen ncdecode_tablegen.c 197

    static void CharAdvance(char** buffer, size_t* buffer_size, size_t count) {
      if (count < 0) {
        NaClFatal("Unable to advance buffer by count!");
      } else {
        ...
      }

The "count < 0" condition is always false. The protection does not work, and some buffer might overflow. By the way, this is an example of how static analyzers might be used to search for vulnerabilities. An intruder can quickly find code fragments that contain errors for further thorough investigation. Here is another code sample related to safety:

V511 The sizeof() operator returns size of the pointer, and not of the array, in 'sizeof (salt)' expression. common visitedlink_common.cc 84

    void MD5Update(MD5Context* context, const void* buf, size_t len);

    VisitedLinkCommon::Fingerprint VisitedLinkCommon::ComputeURLFingerprint(
        ...
        const uint8 salt[LINK_SALT_LENGTH])
    {
      ...
      MD5Update(&ctx, salt, sizeof(salt));
      ...
    }

The MD5Update() function will process only as many bytes as the pointer occupies. This is a potential loophole in the data encryption system, isn't it? I do not know whether it implies any danger; however, from the viewpoint of intruders, this is a fragment for thorough analysis.

The correct code should look this way:

    MD5Update(&ctx, salt, sizeof(salt[0]) * LINK_SALT_LENGTH);

Or this way:

    VisitedLinkCommon::Fingerprint VisitedLinkCommon::ComputeURLFingerprint(
        ...
        const uint8 (&salt)[LINK_SALT_LENGTH])
    {
      ...
      MD5Update(&ctx, salt, sizeof(salt));
      ...
    }

One more sample with a misprint:

V501 There are identical sub-expressions 'host != buzz::XmlConstants::str_empty ()' to the left and to the right of the '&&' operator. chromoting_jingle_glue iq_request.cc 248

    void JingleInfoRequest::OnResponse(const buzz::XmlElement* stanza) {
      ...
      std::string host = server->Attr(buzz::QN_JINGLE_INFO_HOST);
      std::string port_str = server->Attr(buzz::QN_JINGLE_INFO_UDP);
      if (host != buzz::STR_EMPTY && host != buzz::STR_EMPTY) {
      ...
    }

The port_str variable should actually be checked as well:

    if (host != buzz::STR_EMPTY && port_str != buzz::STR_EMPTY) {

A bit of classics:

V530 The return value of function 'empty' is required to be utilized. chrome_frame_npapi np_proxy_service.cc 293

    bool NpProxyService::GetProxyValueJSONString(std::string* output) {
      DCHECK(output);
      output->empty();
      ...
    }

It must be:

    output->clear();
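The difference between empty() and clear() in the last sample is easy to reproduce with a plain std::string. The following is only a sketch: ResetWrong, ResetRight and Demo are hypothetical names introduced for the illustration and do not come from Chromium.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper reproducing the misprint: empty() only reports
// whether the string is empty. The result is discarded, so nothing
// changes. The cast to void silences "unused result" warnings that
// newer standard libraries may produce for this call.
void ResetWrong(std::string* output) {
  (void)output->empty();
}

// Hypothetical helper with the intended behaviour: clear() removes
// the contents of the string.
void ResetRight(std::string* output) {
  output->clear();
}

void Demo() {
  std::string a = "stale";
  std::string b = "stale";
  ResetWrong(&a);
  ResetRight(&b);
  assert(a == "stale");  // still holds the old contents
  assert(b.empty());     // actually cleared
}
```

The misprint compiles cleanly on older toolchains, which is why a diagnostic that insists on using the return value of 'empty' is useful here.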
And here is even the handling of a null pointer:

V522 Dereferencing of the null pointer 'plugin_instance' might take place. Check the logical condition. chrome_frame_npapi chrome_frame_npapi.cc 517

    bool ChromeFrameNPAPI::Invoke(...)
    {
      ChromeFrameNPAPI* plugin_instance =
          ChromeFrameInstanceFromNPObject(header);
      if (!plugin_instance && (plugin_instance->automation_client_.get()))
        return false;
      ...
    }

One more example of a check that will never work:

V547 Expression 'current_idle_time < 0' is always false. Unsigned type value is never < 0. browser idle_win.cc 23

    IdleState CalculateIdleState(unsigned int idle_threshold) {
      ...
      DWORD current_idle_time = 0;
      ...
      // Will go -ve if we have been idle for a long time (2gb seconds).
      if (current_idle_time < 0)
        current_idle_time = INT_MAX;
      ...
    }
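The comment in the last fragment suggests that the authors wanted to catch values that would be negative if read as a signed 32-bit integer. One possible way to express that with an unsigned type is sketched below. This is an assumption about the intent, not the fix applied in Chromium: ClampIdleTime and Demo are hypothetical names, std::uint32_t stands in for the Windows DWORD type, and int is assumed to be 32 bits wide.

```cpp
#include <cassert>
#include <climits>
#include <cstdint>

// Hypothetical helper. Instead of testing "value < 0", which is never
// true for an unsigned type, compare against the largest value that
// still fits into a signed int and clamp anything above it.
std::uint32_t ClampIdleTime(std::uint32_t current_idle_time) {
  const std::uint32_t kMax = static_cast<std::uint32_t>(INT_MAX);
  return current_idle_time > kMax ? kMax : current_idle_time;
}

void Demo() {
  const std::uint32_t kMax = static_cast<std::uint32_t>(INT_MAX);
  assert(ClampIdleTime(0u) == 0u);
  assert(ClampIdleTime(5000u) == 5000u);
  assert(ClampIdleTime(kMax) == kMax);
  // Values with the top bit set would be negative as a signed int.
  assert(ClampIdleTime(kMax + 1u) == kMax);
  assert(ClampIdleTime(UINT32_MAX) == kMax);
}
```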
Well, we should stop here. I could continue, but it's starting to get boring. Remember that all this concerns only Chromium itself. But there are also tests with errors like this:

V554 Incorrect use of auto_ptr. The memory allocated with 'new []' will be cleaned using 'delete'. interactive_ui_tests accessibility_win_browsertest.cc 306

    void AccessibleChecker::CheckAccessibleChildren(IAccessible* parent) {
      ...
      auto_ptr<VARIANT> child_array(new VARIANT[child_count]);
      ...
    }

There are also plenty of libraries Chromium is actually based on, the total size of the libraries being much larger than that of Chromium itself. They also have a lot of interesting fragments. Clearly, the code containing these errors might not be used anywhere, but they are errors nonetheless. Consider one of the examples (the ICU library):

V547 Expression '* string != 0 || * string != '_'' is always true. Probably the '&&' operator should be used here. icui18n ucol_sit.cpp 242

    U_CDECL_BEGIN static const char* U_CALLCONV
    _processVariableTop(...)
    {
      ...
      if(i == locElementCapacity && (*string != 0 || *string != '_')) {
        *status = U_BUFFER_OVERFLOW_ERROR;
      }
      ...
    }

The "(*string != 0 || *string != '_')" expression is always true: no character can be equal to both 0 and '_' at once, so at least one of the two inequalities always holds. Perhaps it must be: (*string == 0 || *string == '_').
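That the original condition holds for every character can be checked by brute force. The sketch below does so and also shows how the two candidate rewrites behave. All function names are hypothetical, and which rewrite matches the ICU authors' intent is not something this example can decide.

```cpp
#include <cassert>
#include <climits>

// The condition as written in the fragment.
bool AsWritten(char c) { return c != 0 || c != '_'; }

// The variant hinted at by the diagnostic message: '&&' instead of '||'.
bool WithAnd(char c) { return c != 0 && c != '_'; }

// The variant proposed in the text above.
bool WithEquals(char c) { return c == 0 || c == '_'; }

void Demo() {
  // Every possible char value satisfies the original condition.
  for (int v = CHAR_MIN; v <= CHAR_MAX; ++v) {
    assert(AsWritten(static_cast<char>(v)));
  }
  // The two rewrites are exact opposites of each other.
  assert(!WithAnd('\0') && !WithAnd('_') && WithAnd('a'));
  assert(WithEquals('\0') && WithEquals('_') && !WithEquals('a'));
}
```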
Conclusion

PVS-Studio was defeated. Chromium's source code is one of the best we have ever analyzed. We have found almost nothing in Chromium. To be more exact, we have found a lot of errors, and this article demonstrates only a few of them. But if we keep in mind that all these errors are spread throughout 460 Mbytes of source code, it turns out that there are almost no errors at all.

P.S.

To answer the question of whether we will inform the Chromium developers of the errors we've found: no, we won't. It is a very large amount of work and we cannot afford to do it for free. Checking Chromium is far from checking Miranda IM or checking Ultimate Toolbox. This is hard work: we have to study all of the messages and decide whether there is an error in every particular case. To do that, we must be knowledgeable about the project. We will send this article to the Chromium developers, and should they find it interesting, they will be able to analyze the project themselves and study all the diagnostic messages. Yes, they will have to purchase PVS-Studio for this purpose. But any Google department can easily afford this.