Amusing C#
Author: Vitaliy Alferov
Date: 15.06.2016
To assess the quality of PVS-Studio C# diagnostics, we test it on a large number of software projects.
Since projects are written by different programmers from different teams and companies, we have to
deal with different coding styles, shorthand notations, and simply different language features. In this
article, I will give an overview of some of the features offered by the wonderful C# language, as well as
the issues that one may run into when writing in this language.
Properties and how they can be used
As we all know, a property is a pair of functions - accessor and mutator - designed for writing or reading
the value of a field. At least, things used to be that way before the release of C# version 3.0. In its
traditional form, a property used to look like this:
class A
{
  int index;
  public int Index
  {
    get { return index; }
    set { index = value; }
  }
}
Years went by, and both the language standards and properties have acquired a number of new
mechanisms.
So, here we go. The C# 3.0 standard brought us the well-known feature that allowed you to omit the
field; that is, to declare a property in the following way:
class A
{
  public int Index { get; set; }
}
The idea was pushed even further in C# 6.0 by allowing programmers to omit "set" as well:
class A
{
  public int Index { get; }
}
Before C# 6.0, the compiler rejected an auto-implemented property without a set accessor, so a
read-only value needed either a private setter or an explicit backing field. Now such a property is in
effect an equivalent of a readonly field: its value can be assigned only in the constructor or in an
initializer.
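As a minimal sketch of the get-only form (the Point class and its members are made up for illustration and require a C# 6.0 or later compiler), the value can be set in the constructor or by an initializer, and nowhere else:

```csharp
using System;

class Point
{
    // Get-only auto-properties: no setter is generated.
    public int X { get; }
    public int Y { get; } = 7;   // assigned by an initializer

    public Point(int x)
    {
        X = x;                   // allowed: we are inside the constructor
    }

    // public void Move() { X = 1; }  // would not compile: X is read-only
}

static class GetOnlyDemo
{
    static void Main()
    {
        var p = new Point(3);
        Console.WriteLine("{0}, {1}", p.X, p.Y);  // prints "3, 7"
    }
}
```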
Properties and fields can be initialized in different ways. For example, like this:
class A
{
  public List<int> Numbers { get; } = new List<int>();
}
Or like this:
class A
{
  public List<int> Numbers = new List<int>();
}
One more version:
class A
{
  public List<int> Numbers => new List<int>();
}
In the last case, though, you will be unpleasantly surprised. You see, what we have actually created
there is the following property:
class A
{
  public List<int> Numbers { get { return new List<int>(); } }
}
That is, an attempt to fill Numbers with values will inevitably fail; you'll be getting a new list every time.
A a = new A();
a.Numbers.Add(10);
a.Numbers.Add(20);
a.Numbers.Add(30);
So be careful with shorthand notations: a single misplaced character can cost you a long bug hunt.
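To make the difference visible, here is a small sketch that puts both notations side by side. The class name Lists and the property names Stored and Fresh are invented for this illustration:

```csharp
using System;
using System.Collections.Generic;

class Lists
{
    // Initialized once; every read returns the same list.
    public List<int> Stored { get; } = new List<int>();

    // Expression-bodied: the expression runs again on every read.
    public List<int> Fresh => new List<int>();
}

static class ListsDemo
{
    static void Main()
    {
        var lists = new Lists();
        lists.Stored.Add(10);
        lists.Stored.Add(20);
        lists.Fresh.Add(10);   // added to a temporary list that is then discarded
        lists.Fresh.Add(20);

        Console.WriteLine(lists.Stored.Count);  // prints 2
        Console.WriteLine(lists.Fresh.Count);   // prints 0
    }
}
```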
These are not all the interesting features of properties. As I have already said, a property is a pair of
functions, and in C# nothing prevents you from changing the parameters of functions.
For example, the following code compiles successfully and even executes:
class A
{
  int index;
  public int Index
  {
    get { return index; }
    set
    {
      value = 20;
      index = value;
    }
  }
}
static void Main(string[] args)
{
  A a = new A();
  a.Index = 10;
  Console.WriteLine(a.Index);
}
However, the program will always output "20", never "10".
You may wonder why anyone would assign 20 to value. There is a point to this example, but to explain it
we'll have to set our discussion of properties aside for a while and talk about the @ prefix. This prefix
allows you to declare variables whose names are spelled like keywords. At the same time, you are not
prohibited from inserting this character wherever you please, for example:
class A
{
  public int index;
  public void CopyIndex(A @this)
  {
    this.@index = @this.index;
  }
}
static void Main(string[] args)
{
  A a = new A();
  @a.@index = 10;
  a.@CopyIndex(new A() { @index = 20 });
  Console.WriteLine(a.index);
}
The output, as everywhere in this article, is the number "20", but never "10".
The @ prefix is actually required for one identifier only: the parameter named @this in the CopyIndex
function, both where it is declared and where it is read. Everywhere else it is just redundant code that
hurts clarity.
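A short sketch of both cases, with variable names chosen only for this example: the prefix is needed when a name coincides with a keyword, and it is not part of the name itself.

```csharp
using System;

static class AtPrefixDemo
{
    public static int Sum()
    {
        int @class = 1;     // "class" is a keyword, so the @ prefix is required
        int @counter = 2;   // "counter" is not a keyword, so @ is redundant
        counter += @class;  // "@counter" and "counter" name the same variable
        return counter;
    }

    static void Main()
    {
        Console.WriteLine(Sum());  // prints 3
    }
}
```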
Now that we know all that, let's get back to properties and take a look at the following class:
class A
{
  int value;
  public int Value
  {
    get { return @value; }
    set { @value = value; }
  }
  public A()
  {
    value = 5;
  }
}
You may think that the value field of class A will change in the Value property, but it won't, and the
following code will output 5, not 10.
static void Main(string[] args)
{
  A a = new A();
  a.Value = 10;
  Console.WriteLine(a.Value);
}
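The reason is that inside a setter both value and @value name the implicit parameter, so the assignment copies the parameter onto itself. One possible fix, sketched here with a hypothetical class B, is to qualify the field with this:

```csharp
using System;

class B
{
    int value;

    public int Value
    {
        get { return this.value; }
        // "this.value" is the field; plain "value" is the setter's parameter.
        set { this.value = value; }
    }

    public B()
    {
        value = 5;  // no parameter named "value" here, so this is the field
    }
}

static class ValueDemo
{
    static void Main()
    {
        B b = new B();
        b.Value = 10;
        Console.WriteLine(b.Value);  // prints 10
    }
}
```

Renaming the field so that it does not collide with the contextual keyword would avoid the ambiguity altogether.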
Dictionary initialization
Let's first recall how arrays can be initialized:
string[] test1 = new string[] { "1", "2", "3" };
string[] test2 = new[] { "1", "2", "3" };
string[] test3 = { "1", "2", "3" };
string[,] test4 = { { "11", "12" },
                    { "21", "22" },
                    { "31", "32" } };
Lists are simpler:
List<string> test5 = new List<string>() { "1", "2", "3" };
Now, what about dictionaries? There are actually two versions of shorthand initialization. The first:
Dictionary<string, int> test =
  new Dictionary<string, int>() { { "a-a", 1 },
                                  { "b-b", 2 },
                                  { "c-c", 3 } };
The second:
Dictionary<string, int> test =
  new Dictionary<string, int>() {
    ["a-a"] = 1,
    ["b-b"] = 2,
    ["c-c"] = 3
  };
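The two notations are not fully interchangeable: the first is translated into calls to Add, the second into indexer assignments, so they behave differently when a key is repeated. A small sketch of that difference follows; the class and method names are invented, and the index form needs C# 6.0 or later.

```csharp
using System;
using System.Collections.Generic;

static class DictionaryInitDemo
{
    // Index initializer: each entry is an indexer assignment.
    public static int WithIndexers()
    {
        var d = new Dictionary<string, int> { ["a-a"] = 1, ["a-a"] = 2 };
        return d["a-a"];  // the later assignment wins
    }

    // Collection initializer: each entry is a call to Add.
    public static bool WithAddThrows()
    {
        try
        {
            var d = new Dictionary<string, int> { { "a-a", 1 }, { "a-a", 2 } };
            return false;  // not reached: the second Add throws
        }
        catch (ArgumentException)
        {
            return true;   // Add rejects a key that is already present
        }
    }

    static void Main()
    {
        Console.WriteLine(WithIndexers());   // prints 2
        Console.WriteLine(WithAddThrows());  // prints True
    }
}
```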
A few words about LINQ queries
LINQ queries are in themselves a convenient feature: you chain the selections you need and get the
required information at the output. Let's first discuss a couple of nice tricks that may not occur to you
until you see them. Let's start with a basic example:
void Foo(List<int> numbers1, List<int> numbers2) {
  var selection1 = numbers1.Where(index => index > 10);
  var selection2 = numbers2.Where(index => index > 10);
}
As you can easily see, the code above contains several identical checks, so it would be better to enclose
them in a separate "function":
void Foo(List<int> numbers1, List<int> numbers2) {
  Func<int, bool> whereFunc = index => index > 10;
  var selection1 = numbers1.Where(index => whereFunc(index));
  var selection2 = numbers2.Where(index => whereFunc(index));
}
It looks better now; if functions are large, it's better still. The whereFunc call, however, looks somewhat
untidy. Well, it's not a problem either:
void Foo(List<int> numbers1, List<int> numbers2) {
  Func<int, bool> whereFunc = index => index > 10;
  var selection1 = numbers1.Where(whereFunc);
  var selection2 = numbers2.Where(whereFunc);
}
Now the code does look compact and neat.
Now let's talk about the specifics of LINQ query execution. For example, the following line of code
does not immediately select anything from the numbers1 collection.
IEnumerable<int> selection = numbers1.Where(whereFunc);
The selection is performed only when the sequence is enumerated, for example when it is converted
into a List<int> collection:
List<int> listNumbers = selection.ToList();
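Deferred execution can be observed directly by counting how often the predicate runs. The following is only a sketch; the calls counter and the sample numbers are made up for the demonstration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class DeferredDemo
{
    public static int[] Run()
    {
        var numbers = new List<int> { 5, 15 };
        int calls = 0;

        // Nothing is filtered here; the query is only described.
        IEnumerable<int> selection =
            numbers.Where(n => { calls++; return n > 10; });
        int callsBefore = calls;   // still 0: the predicate has not run yet

        numbers.Add(25);           // the source changes before enumeration

        List<int> result = selection.ToList();  // the predicate runs now
        return new[] { callsBefore, calls, result.Count };
    }

    static void Main()
    {
        Console.WriteLine(string.Join(", ", Run()));  // prints "0, 3, 2"
    }
}
```

The element added after the query was built is still included in the result, because the source is read only during enumeration.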
This nuance may cause a captured variable to be used after its value has changed. Here's a simple
example. Suppose we need function Foo to return only those elements of the "{ 1, 2, 3, 4, 5 }" array
whose numerical values are less than the current element's index. In other words, we need it to output
the following:
0 :
1 :
2 : 1
3 : 1, 2
4 : 1, 2, 3
Our function will have the following signature:
static Dictionary<int, IEnumerable<int>> Foo(int[] numbers)
{ .... }
And this is how we will call it:
foreach (KeyValuePair<int, IEnumerable<int>> subArray in
           Foo(new[] { 1, 2, 3, 4, 5 }))
  Console.WriteLine(string.Format("{0} : {1}",
    subArray.Key,
    string.Join(", ", subArray.Value)));
It doesn't seem to be difficult. Now let's write the LINQ-based implementation itself. This is what it will
look like:
static Dictionary<int, IEnumerable<int>> Foo(int[] numbers)
{
  var result = new Dictionary<int, IEnumerable<int>>();
  for (int i = 0; i < numbers.Length; i++)
    result[i] = numbers.Where(index => index < i);
  return result;
}
Very easy, isn't it? We just "make" samples from the numbers array one by one.
However, what the program will output in the console is the following:
0 : 1, 2, 3, 4
1 : 1, 2, 3, 4
2 : 1, 2, 3, 4
3 : 1, 2, 3, 4
4 : 1, 2, 3, 4
The problem with our code has to do with the closure in the lambda expression index => index < i. The
lambda captured the i variable itself, not its value at that moment. Because the lambda was not called
until string.Join(", ", subArray.Value) enumerated the sequence, the value of i by then was not the
same as when the LINQ query had been formed. By the time the data was retrieved, the loop had
finished and i was equal to 5, which resulted in incorrect output.
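One common way to get the intended result is to copy the loop counter into a variable declared inside the loop body, so that each lambda captures its own copy. This is a sketch of that approach rather than the only fix, and the names limit and ClosureFixDemo are chosen for the example:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class ClosureFixDemo
{
    public static Dictionary<int, IEnumerable<int>> Foo(int[] numbers)
    {
        var result = new Dictionary<int, IEnumerable<int>>();
        for (int i = 0; i < numbers.Length; i++)
        {
            int limit = i;  // a fresh variable per iteration; the lambda captures it
            result[i] = numbers.Where(n => n < limit);
        }
        return result;
    }

    static void Main()
    {
        foreach (var pair in Foo(new[] { 1, 2, 3, 4, 5 }))
            Console.WriteLine("{0} : {1}", pair.Key, string.Join(", ", pair.Value));
    }
}
```

Another option is to force evaluation inside the loop by appending ToList() to the query, at the cost of materializing every selection right away.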
Undocumented kludges in C#
The C++ language is famous for its hacks, workarounds, and other kludges - the family of XXX_cast
operators alone counts for a lot. It is commonly believed that C# doesn't have any such things. Well,
that's not quite true...
Here are a few keywords, for a start:
__makeref
__reftype
__refvalue
These words are unknown to IntelliSense, nor will you find any official MSDN entries on them.
So what are these wonder words?
__makeref takes an object and returns some "reference" to it as an object of type TypedReference. And
as for the words __reftype and __refvalue, they are used, respectively, to find out the type and the value
of the object referred to by this "reference".
Consider the following example:
struct A { public int Index { get; set; } }
static void Main(string[] args)
{
  A a = new A();
  a.Index = 10;
  TypedReference reference = __makeref(a);
  Type typeRef = __reftype(reference);
  Console.WriteLine(typeRef); //=> ConsoleApplication23.Program+A
  A valueRef = __refvalue(reference, A);
  Console.WriteLine(valueRef.Index); //=> 10
}
Well, we could do this "stunt" using more common syntax:
static void Main(string[] args)
{
  A a = new A();
  a.Index = 10;
  dynamic dynam = a;
  Console.WriteLine(dynam.GetType());
  A valuDynam = (A)dynam;
  Console.WriteLine(valuDynam.Index);
}
The dynamic keyword allows us to both use fewer lines and avoid questions like "What's that?" and
"How does it work?" that programmers not familiar with those words may ask. That's fine, but here's a
somewhat different scenario where dynamic doesn't look that great compared to TypedReference.
static void Main(string[] args)
{
  A a = new A();
  TypedReference reference = __makeref(a);
  SetVal(reference);
  Console.WriteLine(__refvalue(reference, A).Index);
}
static void SetVal(TypedReference reference)
{
  __refvalue(reference, A) = new A() { Index = 20 };
}
The result of executing this code is outputting the number "20" in the console. Sure, we could pass
dynamic into the function using ref, and it would work just as well.
static void Main(string[] args)
{
  A a = new A();
  dynamic dynam = a;
  SetVal(ref dynam);
  Console.WriteLine(((A)dynam).Index);
}
static void SetVal(ref dynamic dynam)
{
  dynam = new A() { Index = 20 };
}
Nevertheless, I find the version with TypedReference better, especially when you need to pass the
information on and on through other functions.
There is one more wonder word, __arglist, which allows you to declare a variadic function whose
parameters can also be of any type.
static void Main(string[] args)
{
  Foo(__arglist(1, 2.0, "3", new A[0]));
}
public static void Foo(__arglist)
{
  ArgIterator iterator = new ArgIterator(__arglist);
  while (iterator.GetRemainingCount() > 0)
  {
    TypedReference typedReference =
      iterator.GetNextArg();
    Console.WriteLine("{0} / {1}",
      TypedReference.ToObject(typedReference),
      TypedReference.GetTargetType(typedReference));
  }
}
It is strange that the foreach statement can't be used out of the box to iterate through the argument
list, and that an argument can't be accessed directly by index. So it's not as cool as C++ or JavaScript
with its arguments :)
function sum() {
  ....
  for (var i = 0; i < arguments.length; i++)
    s += arguments[i]
}
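For comparison, the documented way to accept a variable number of arguments in C# is the params keyword, which hands the arguments over as an ordinary array that supports both foreach and indexing. The sketch below uses an invented method name Count; unlike __arglist, this approach boxes value-type arguments into object.

```csharp
using System;

static class ParamsDemo
{
    public static int Count(params object[] args)
    {
        int count = 0;
        foreach (object arg in args)  // a params array supports foreach
        {
            Console.WriteLine("{0} / {1}", arg, arg.GetType());
            count++;
        }
        return count;
    }

    static void Main()
    {
        Count(1, 2.0, "3", new int[0]);
    }
}
```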
Conclusion
To sum it up, I'd like to say that C++ and C# are highly flexible languages as far as their grammar goes,
and that's why they are convenient to use on the one hand, but don't protect you from typos on the
other. There is an established belief that in C# it's impossible to make the kind of mistakes one makes
in C++, but that's just not true. This article demonstrates rather interesting language features, but the
bulk of errors in C# have nothing to do with them; instead, they typically occur when writing ordinary
conditional expressions, as in the Infragistics project. For example:
public bool IsValid
{
  get {
    var valid =
      double.IsNaN(Latitude) || double.IsNaN(Latitude) ||
      this.Weather.DateTime == Weather.DateTimeInitial;
    return valid;
  }
}
V3001 There are identical sub-expressions 'double.IsNaN(Latitude)' to the left and to the right of the '||'
operator. WeatherStation.cs 25
It is at points like this that human attention tends to weaken, and later you waste a huge amount of
time trying to track down "God-knows-what, God-knows-where". So don't miss the chance to protect
yourself from bugs with the help of the PVS-Studio static code analyzer.