PVS-Studio vs Chromium

Author: Andrey Karpov

Date: 23.05.2011

Abstract
Good has won this time. To be more exact, the source code of the Chromium project has won. Chromium
is one of the best projects we have checked with PVS-Studio.

Chromium is an open-source web browser developed by Google and intended to provide users with fast
and safe Internet access. Chromium serves as the base for the Google Chrome browser. Moreover,
Chromium is a preliminary version of Google Chrome as well as of some alternative web browsers.

From the programming viewpoint, Chromium is a solution consisting of 473 projects. The total size of
the C/C++ source code is about 460 Mbytes, and the number of lines is difficult to count.

These 460 Mbytes include a lot of various libraries. If you exclude them, you will have about 155
Mbytes. It is much less but still a lot of lines. Moreover, everything is relative, you know. Many of these
libraries were created by the Chromium developers as part of creating Chromium itself.
Although such libraries live on their own, we may still count them as part of the browser.

Chromium has become the largest and highest-quality project I have studied while testing PVS-Studio.
While handling the Chromium project, it was not actually clear to us what was checking what: we
found and fixed several errors in PVS-Studio related to C++ file analysis and to support of a specific
project structure.

Many aspects and methods used in Chromium show the quality of its source code. For instance, most
programmers determine the number of items in an array using the following construct:

int XX[] = { 1, 2, 3, 4 };

size_t N = sizeof(XX) / sizeof(XX[0]);

Usually it is arranged as a macro of this kind:

#define count_of(arg) (sizeof(arg) / sizeof(arg[0]))

This is quite an efficient and useful macro. To be honest, I have always used this very macro myself.
However, it might lead to an error because you may accidentally pass a plain pointer to it and it will
not mind. Let me explain this by the following example:

void Test(int C[3])

{

int A[3];

int *B = Foo();

size_t x = count_of(A); // Ok

x = count_of(B); // Error

x = count_of(C); // Error

}

The count_of(A) construct works correctly and returns the number of items in the A array which is equal
to three here.

But if you accidentally apply count_of() to a pointer, the result will be a meaningless value. The trouble is
that the macro does not warn the programmer about a strange construct such as
count_of(B). This situation seems far-fetched and artificial, but I have encountered it in various
applications. For example, consider this code from the Miranda IM project:

#define SIZEOF(X) (sizeof(X)/sizeof(X[0]))

int Cache_GetLineText(..., LPTSTR text, int text_size, ...)

{

...

tmi.printDateTime(pdnce->hTimeZone, _T("t"), text, SIZEOF(text), 0);

...

}

So, such errors may well exist in your code and you'd better have something to protect yourself against
them. It is even easier to make a mistake when trying to calculate the size of an array passed as an
argument:

void Test(int C[3])

{

size_t x = count_of(C); // Error

}

According to the C++ standard, the 'C' variable is a simple pointer, not an array. As a result, you may
often see in programs that only a part of the array passed is processed.

Since we have started speaking of such errors, let me tell you about a method that will help you find the
size of the array passed. You should pass it by reference:

void Test(int (&C)[3])

{

size_t x = count_of(C); // Ok

}

Now the result of the count_of(C) expression is 3.
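The same idea extends to arrays of any length if the size becomes a template parameter. The following is only a sketch of that technique: the names Sum and SumDemo are hypothetical and do not come from Chromium or from the examples above.

```cpp
#include <cassert>
#include <cstddef>

// N is deduced from the argument, so the function accepts an array of any
// length and still knows that length. A plain pointer does not match.
template <std::size_t N>
int Sum(const int (&values)[N]) {
  int total = 0;
  for (std::size_t i = 0; i < N; ++i) {
    total += values[i];
  }
  return total;
}

// Usage: both calls compile, each instantiating Sum with its own N.
inline void SumDemo() {
  const int three[3] = {1, 2, 3};
  const int five[5] = {1, 2, 3, 4, 5};
  assert(Sum(three) == 6);
  assert(Sum(five) == 15);
}
```

Passing an int* to Sum() is rejected by the compiler, because a pointer cannot bind to a reference to an array.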

Let's return to Chromium. It uses a macro that allows you to avoid the errors described above. This is
how it is implemented:

template <typename T, size_t N>

char (&ArraySizeHelper(T (&array)[N]))[N];

#define arraysize(array) (sizeof(ArraySizeHelper(array)))

The idea of this magic spell is the following: the template function ArraySizeHelper receives an array of
an arbitrary type with the length N. The function returns a reference to an array of length N
consisting of 'char' items. There is no implementation for this function because we do not need one. For
the sizeof() operator it is quite enough just to declare the ArraySizeHelper function. The 'arraysize' macro
calculates the size in bytes of the array returned by the ArraySizeHelper function. Since sizeof(char) is 1,
this size is the number of items in the array whose length we want to calculate.

If you have gone crazy because of all this, just take my word for it - it works. And it works much better
than the 'count_of()' macro we have discussed above. Since the ArraySizeHelper function takes an array
by reference, you cannot pass a plain pointer to it. Let's write some test code:

template <typename T, size_t N>

char (&ArraySizeHelper(T (&array)[N]))[N];

#define arraysize(array) (sizeof(ArraySizeHelper(array)))

void Test(int C[3])

{

int A[3];

int *B = Foo();

size_t x = arraysize(A); // Ok

x = arraysize(B); // Compilation error

x = arraysize(C); // Compilation error

}

The incorrect code simply will not compile. I think it's cool when you can prevent a potential error
already at the compilation stage. This is a nice sample reflecting the quality of this programming
approach. My respect goes to Google developers.
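Because sizeof never evaluates its operand, the value produced by this macro is a compile-time constant. The sketch below illustrates that property under a few assumptions: Point, kPoints, kText and CopyCount are hypothetical names introduced only for this example.

```cpp
#include <cassert>
#include <cstddef>

template <typename T, std::size_t N>
char (&ArraySizeHelper(T (&array)[N]))[N];  // declared only, never defined

#define arraysize(array) (sizeof(ArraySizeHelper(array)))

struct Point { double x; double y; };  // hypothetical element type

const Point kPoints[5] = {};
const char kText[] = "abc";  // three characters plus the terminating '\0'

// The checks run at compile time, and the linker never looks for a body
// of ArraySizeHelper because the call is never actually executed.
static_assert(arraysize(kPoints) == 5, "element count, not byte count");
static_assert(arraysize(kText) == 4, "the literal includes the terminator");

// The result can be used wherever a constant expression is required,
// for example as the bound of another array.
inline std::size_t CopyCount() {
  int copy[arraysize(kPoints)] = {};
  assert(arraysize(copy) == 5);
  return arraysize(copy);
}
```

Since C++17 the standard library offers std::size() in <iterator>, which likewise accepts arrays but rejects plain pointers.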

Let me give you one more sample which is of a different sort yet it shows the quality of the code as well.

if (!file_util::Delete(db_name, false) &&

!file_util::Delete(db_name, false)) {

// Try to delete twice. If we can't, fail.

LOG(ERROR) << "unable to delete old TopSites file";

return false;

}

Many programmers might find this code strange. What is the sense in trying to remove a file twice?
There is some. The one who wrote it has reached Enlightenment and comprehended the essence of
software existence. Only in textbooks and in some abstract world can a file either be definitely removed
or not be removed at all. In a real system it often happens that a file cannot be removed right now but
can be removed an instant later. There may be many reasons for that: antivirus software, viruses,
version control systems and whatever. Programmers often do not think of such cases. They believe that
when you cannot remove a file, you cannot remove it at all. But if you want to do everything well and
avoid littering directories, you should take these extraneous factors into account. I encountered quite
the same situation when a file would fail to get removed once in 1000 runs. The solution was also the
same - I only placed Sleep(0) in the middle just in case.
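The "try again a moment later" idea can be written as a small helper. This is only a sketch of the general technique and not Chromium's code: RetryWithPause and RetryDemo are hypothetical names, and the standard sleep_for call stands in for the Windows Sleep() mentioned above.

```cpp
#include <cassert>
#include <chrono>
#include <thread>

// Calls `operation` up to `attempts` times and pauses between failed tries.
// Returns true as soon as one call succeeds, false if every call fails.
template <typename Operation>
bool RetryWithPause(Operation operation, int attempts,
                    std::chrono::milliseconds pause) {
  for (int i = 0; i < attempts; ++i) {
    if (operation()) {
      return true;
    }
    if (i + 1 < attempts) {
      std::this_thread::sleep_for(pause);  // give the other process time
    }
  }
  return false;
}

// Usage: an operation that fails once and then succeeds, like a file that
// is briefly locked by another program.
inline void RetryDemo() {
  int calls = 0;
  bool ok = RetryWithPause([&calls] { return ++calls == 2; }, 3,
                           std::chrono::milliseconds(1));
  assert(ok);
  assert(calls == 2);
}
```

In real code the lambda would wrap the actual delete call, and the number of attempts and the pause would be tuned to the environment.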

Well, and what about the check by PVS-Studio? Chromium's code is perhaps the highest-quality code I've
ever seen. This is confirmed by the low density of errors we've managed to find. If you take their
total number, there are certainly plenty of them. But if you divide the number of errors by the
amount of code, it turns out that there are almost no errors. What are these errors? They are the most
ordinary ones. Here are several samples:

V512 A call of the 'memset' function will lead to underflow

of the buffer '(exploded)'. platform time_win.cc 116

void NaCl::Time::Explode(bool is_local, Exploded* exploded) const {

...

ZeroMemory(exploded, sizeof(exploded));

...

}

Everybody makes typos. In this case, an asterisk is missing. It must be sizeof(*exploded).
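The difference between the two sizeof expressions is easy to see on a small structure. This is a minimal sketch: the Exploded structure below is a hypothetical stand-in with made-up fields, and std::memset plays the role of ZeroMemory.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical stand-in for the structure from the fragment above.
struct Exploded {
  int year, month, day, hour, minute, second;
};

// The structure is larger than a pointer to it, so sizeof(exploded)
// would clear only the first few bytes.
static_assert(sizeof(Exploded) > sizeof(Exploded*),
              "clearing sizeof(pointer) bytes would leave fields untouched");

inline void ClearExploded(Exploded* exploded) {
  // sizeof(*exploded) is the size of the object the pointer refers to.
  std::memset(exploded, 0, sizeof(*exploded));
}

inline void ClearDemo() {
  Exploded e = {2011, 5, 23, 12, 30, 59};
  ClearExploded(&e);
  assert(e.year == 0 && e.second == 0);
}
```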

V502 Perhaps the '?:' operator works in a different way than it

was expected. The '?:' operator has a lower priority than the '-'

operator. views custom_frame_view.cc 400

static const int kClientEdgeThickness;

int height() const;

bool ShouldShowClientEdge() const;

void CustomFrameView::PaintMaximizedFrameBorder(gfx::Canvas* canvas) {

...

int edge_height = titlebar_bottom->height() -

ShouldShowClientEdge() ? kClientEdgeThickness : 0;

...

}

The insidious operator "?:" has a lower priority than subtraction. There must be additional parentheses
here:

int edge_height = titlebar_bottom->height() -

(ShouldShowClientEdge() ? kClientEdgeThickness : 0);
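To make the difference visible, the two groupings can be written out as separate functions. This is a sketch with hypothetical function names and made-up numbers, not code from Chromium.

```cpp
#include <cassert>

// How the compiler groups `height - show_edge ? thickness : 0`:
// the subtraction happens first and its result becomes the condition.
inline int EdgeHeightAsParsed(int height, bool show_edge, int thickness) {
  return (height - show_edge) ? thickness : 0;
}

// What the author of the code most likely meant.
inline int EdgeHeightIntended(int height, bool show_edge, int thickness) {
  return height - (show_edge ? thickness : 0);
}

inline void PrecedenceDemo() {
  // 30 - 1 and 30 - 0 are both non-zero, so the result is the thickness.
  assert(EdgeHeightAsParsed(30, true, 1) == 1);
  assert(EdgeHeightAsParsed(30, false, 1) == 1);
  // With parentheses the height is reduced only when the edge is shown.
  assert(EdgeHeightIntended(30, true, 1) == 29);
  assert(EdgeHeightIntended(30, false, 1) == 30);
}
```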

Here is a meaningless check:

V547 Expression 'count < 0' is always false. Unsigned type value

is never < 0. ncdecode_tablegen ncdecode_tablegen.c 197

static void CharAdvance(char** buffer, size_t* buffer_size,

size_t count) {

if (count < 0) {

NaClFatal("Unable to advance buffer by count!");

} else {

...

}

The "count < 0" condition is always false. The protection does not work and some buffer might get
overflowed. By the way, this is an example of how static analyzers might be used to search for
vulnerabilities. An intruder can quickly find code fragments that contain errors for further thorough
investigation. Here is another code sample related to security:

V511 The sizeof() operator returns size of the pointer,

and not of the array, in 'sizeof (salt)' expression. common

visitedlink_common.cc 84

void MD5Update(MD5Context* context, const void* buf, size_t len);

VisitedLinkCommon::Fingerprint
VisitedLinkCommon::ComputeURLFingerprint(

...

const uint8 salt[LINK_SALT_LENGTH])

{

...

MD5Update(&ctx, salt, sizeof(salt));

...

}

The MD5Update() function will process as many bytes as the pointer occupies. This is a potential
loophole in the data encryption system, isn't it? I do not know whether it implies any danger; however,
from the viewpoint of intruders, this is a fragment for thorough analysis.

The correct code should look this way:

MD5Update(&ctx, salt, sizeof(salt[0]) * LINK_SALT_LENGTH);

Or this way:

VisitedLinkCommon::Fingerprint
VisitedLinkCommon::ComputeURLFingerprint(

...

const uint8 (&salt)[LINK_SALT_LENGTH])

{

...

MD5Update(&ctx, salt, sizeof(salt));

...

}

One more sample with a typo:

V501 There are identical sub-expressions 'host !=

buzz::XmlConstants::str_empty ()' to the left and to the right

of the '&&' operator. chromoting_jingle_glue iq_request.cc 248

void JingleInfoRequest::OnResponse(const buzz::XmlElement* stanza) {

...

std::string host = server->Attr(buzz::QN_JINGLE_INFO_HOST);

std::string port_str = server->Attr(buzz::QN_JINGLE_INFO_UDP);

if (host != buzz::STR_EMPTY && host != buzz::STR_EMPTY) {

...

}

The port_str variable should actually be checked as well:

if (host != buzz::STR_EMPTY && port_str != buzz::STR_EMPTY) {

A bit of classics:

V530 The return value of function 'empty' is required to be utilized.

chrome_frame_npapi np_proxy_service.cc 293

bool NpProxyService::GetProxyValueJSONString(std::string* output) {

DCHECK(output);

output->empty();

...

}

It must be: output->clear();
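The two member functions are easy to confuse because of their names: empty() only asks a question, while clear() changes the string. A minimal sketch with hypothetical helper names:

```cpp
#include <cassert>
#include <string>

// empty() only reports whether the string has no characters.
inline std::string AfterEmpty(std::string s) {
  bool was_empty = s.empty();  // the string itself is not modified
  (void)was_empty;
  return s;
}

// clear() actually removes the contents.
inline std::string AfterClear(std::string s) {
  s.clear();
  return s;
}

inline void EmptyVersusClearDemo() {
  assert(AfterEmpty("proxy") == "proxy");
  assert(AfterClear("proxy").empty());
}
```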

And here is even a dereference of a null pointer:

V522 Dereferencing of the null pointer 'plugin_instance' might take

place. Check the logical condition. chrome_frame_npapi

chrome_frame_npapi.cc 517

bool ChromeFrameNPAPI::Invoke(...)

{

ChromeFrameNPAPI* plugin_instance =

ChromeFrameInstanceFromNPObject(header);

if (!plugin_instance && (plugin_instance->automation_client_.get()))

return false;

...

}
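With the && operator, the right-hand operand is evaluated only when !plugin_instance is true, that is, exactly when the pointer is null. The sketch below shows one plausible form of the check, assuming the intent was to bail out when either the plugin or its client is missing. Plugin, Client and IsUsable are hypothetical names, not the real Chromium types.

```cpp
#include <cassert>

struct Client {};                   // hypothetical
struct Plugin { Client* client; };  // hypothetical

// The right-hand operand is evaluated only when the left-hand one is true,
// so plugin->client is read only for a non-null plugin.
inline bool IsUsable(const Plugin* plugin) {
  return plugin != nullptr && plugin->client != nullptr;
}

inline void NullCheckDemo() {
  Client client;
  Plugin with_client = {&client};
  Plugin without_client = {nullptr};
  assert(!IsUsable(nullptr));
  assert(!IsUsable(&without_client));
  assert(IsUsable(&with_client));
}
```

In terms of the original fragment, this assumption corresponds to the condition (!plugin_instance || !plugin_instance->automation_client_.get()).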

One more example of a check that will never work:

V547 Expression 'current_idle_time < 0' is always false. Unsigned

type value is never < 0. browser idle_win.cc 23

IdleState CalculateIdleState(unsigned int idle_threshold) {

...

DWORD current_idle_time = 0;

...

// Will go -ve if we have been idle for a long time (2gb seconds).

if (current_idle_time < 0)

current_idle_time = INT_MAX;

...

}
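DWORD is an unsigned 32-bit type, so a value that "goes negative" actually appears as a very large positive number. The sketch below shows one way to express the check, assuming the intent was to cap the idle time at INT_MAX. ClampIdleTime is a hypothetical name, and std::uint32_t stands in for DWORD.

```cpp
#include <cassert>
#include <climits>
#include <cstdint>

// Compare against the upper bound instead of against zero.
inline std::uint32_t ClampIdleTime(std::uint32_t idle) {
  const std::uint32_t limit = static_cast<std::uint32_t>(INT_MAX);
  return idle > limit ? limit : idle;
}

inline void ClampDemo() {
  std::uint32_t wrapped = 0;
  --wrapped;  // unsigned arithmetic wraps around to the maximum value
  assert(wrapped == UINT32_MAX);
  assert(ClampIdleTime(wrapped) == static_cast<std::uint32_t>(INT_MAX));
  assert(ClampIdleTime(5) == 5);
}
```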

Well, we should stop here. I could continue, but it's starting to get boring. Remember that all this
concerns only Chromium itself. But there are also tests with errors like this:

V554 Incorrect use of auto_ptr. The memory allocated with 'new []'

will be cleaned using 'delete'. interactive_ui_tests

accessibility_win_browsertest.cc 306

void AccessibleChecker::CheckAccessibleChildren(IAccessible* parent) {

...

auto_ptr<VARIANT> child_array(new VARIANT[child_count]);

...

}
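std::auto_ptr always releases memory with plain delete, which does not match new[]. Since C++11 there are two standard ways to own a dynamic array correctly. The sketch below shows both, with Item as a hypothetical stand-in for VARIANT and hypothetical function names.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

struct Item { int value; };  // hypothetical stand-in for VARIANT

// std::unique_ptr<T[]> calls delete[] on destruction, matching new[].
inline int SumWithUniquePtr(std::size_t count) {
  std::unique_ptr<Item[]> items(new Item[count]());
  int total = 0;
  for (std::size_t i = 0; i < count; ++i) {
    items[i].value = static_cast<int>(i) + 1;
    total += items[i].value;
  }
  return total;
}

// std::vector owns its buffer and also remembers the element count.
inline int SumWithVector(std::size_t count) {
  std::vector<Item> items(count);
  int total = 0;
  for (std::size_t i = 0; i < items.size(); ++i) {
    items[i].value = static_cast<int>(i) + 1;
    total += items[i].value;
  }
  return total;
}
```

For a count of 4, both functions fill the array with 1, 2, 3, 4 and return 10.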

There are also plenty of libraries Chromium is actually based on, the total size of libraries being much
larger than that of Chromium itself. They also have a lot of interesting fragments. Of course, the code
containing these errors might not be used anywhere, but they are errors nonetheless. Consider one
example (the ICU library):

V547 Expression '* string != 0 || * string != '_'' is always true.

Probably the '&&' operator should be used here. icui18n ucol_sit.cpp

242

U_CDECL_BEGIN static const char* U_CALLCONV

_processVariableTop(...)

{

...

if(i == locElementCapacity && (*string != 0 || *string != '_')) {

*status = U_BUFFER_OVERFLOW_ERROR;

}

...

}

The "(*string != 0 || *string != '_')" expression is always true. Perhaps it must be:
(*string == 0 || *string == '_').
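The reason the original condition is always true: a character cannot be equal to 0 and to '_' at the same time, so at least one of the two inequalities always holds. A brute-force sketch that checks every possible char value (the function names are hypothetical):

```cpp
#include <cassert>
#include <climits>

// The condition exactly as it appears in the fragment above.
inline bool OriginalCondition(char c) {
  return c != 0 || c != '_';
}

// Tries every value a char can hold and reports whether the condition
// was true for all of them.
inline bool IsAlwaysTrue() {
  for (int v = CHAR_MIN; v <= CHAR_MAX; ++v) {
    if (!OriginalCondition(static_cast<char>(v))) {
      return false;
    }
  }
  return true;
}

inline void TautologyDemo() {
  assert(IsAlwaysTrue());
}
```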

Conclusion
PVS-Studio was defeated. Chromium's source code is one of the best we have ever analyzed. We have
found almost nothing in Chromium. To be more exact, we have found a lot of errors and this article
demonstrates only a few of them. But if we keep in mind that all these errors are spread throughout the
source code with the size of 460 Mbytes, it turns out that there are almost no errors at all.

P.S.

To answer the question of whether we will inform the Chromium developers of the errors we've found: no,
we won't. It is a very large amount of work and we cannot afford to do it for free. Checking Chromium is
far from checking Miranda IM or Ultimate Toolbox. This is hard work: we have to study all of
the messages and decide whether there is an error in every particular case. To do that, we
must be knowledgeable about the project. We will send this article to the Chromium developers, and should
they find it interesting, they will be able to analyze the project themselves and study all the diagnostic
messages. Yes, they will have to purchase PVS-Studio for this purpose. But any Google department can
easily afford this.
