GCC Compiler as a Performance Testing tool for C programs
Symbiosis International University
Symbiosis Institute of Computer Studies and Research (SICSR)
Verification and Validation Techniques: Performance Testing with GCC Compiler
By: Daniel Ilunga Musenge
Academic year: 2014 - 2015
Verification and Validation Techniques: GCC Compiler as a Performance Testing tool
Abstract
This paper introduces the GCC compiler as a performance testing tool. GCC is not normally presented as a performance testing tool, but in this paper we discuss how it can be used as one, based on the objectives of performance testing.
In performance testing, the focus is not on finding bugs in the code under test but on removing its bottlenecks, taking into consideration parameters such as CPU time, memory usage, speed, stability, and scalability.
To achieve this, we use the GCC optimization options. The different levels of optimization are discussed, together with a comparative study of manual and automatic time measurement (the latter using Unix/Linux commands).
1. GCC: Short for GNU Compiler Collection, GCC is a collection of compilers for programming languages including C, C++, Objective-C, Fortran, Java, and Ada. At the time of writing, the current release of GCC is version 4.9.2.
2. Performance Testing:
• Software performance testing is a means of quality assurance (QA). It involves testing software applications to ensure they perform well under their expected workload.
• The features and functionality supported by a software system are not the only concern. A software application's performance, such as its response time, also matters. The goal of performance testing is not to find bugs but to eliminate performance bottlenecks. Bottlenecks are obstructions in a system that degrade overall system performance. Bottlenecking occurs when either coding errors or hardware issues cause a decrease in throughput under certain loads, and it is often caused by one faulty section of code. The key to fixing a bottleneck is to find the section of code that causes the slowdown and fix it there. Performance bottlenecks are usually eliminated either by fixing poorly running processes or by adding hardware.
• The focus of performance testing is checking a software program's:
– Speed - whether the application responds quickly.
– Scalability - the maximum user load the software application can handle. A software product suffers from poor scalability when it cannot handle the expected number of users or when it does not accommodate a wide enough range of users. Load testing should be done to be certain that the application can handle the anticipated number of users.
– Stability - whether the application is stable under varying loads as well as under its expected workload.
• Some common performance bottlenecks are:
– CPU utilization
– Memory utilization
– Operating system limitations, etc.
GCC as a Performance Testing tool:
GCC is not normally described as a testing tool. In this article, we use GCC as a tool for performance testing in particular. GCC can also support other testing techniques, such as black box testing, which we do not demonstrate here.
To begin, here is a quick review of how a C program is compiled with GCC. In your terminal:
    dimsconsultants$ gcc test.c -o test
The '-o' option tells gcc to use 'test' as the output file, and test.c is the C source file from which the executable test is created.
As mentioned above, the goal of performance testing is not to find bugs but to eliminate performance bottlenecks. We shall look at the following parameters for performance testing:
• Processor usage - the amount of time the processor spends executing non-idle threads.
• Memory use - the amount of physical memory available to processes on a computer.
• Response time - the time a system or functional unit takes to react to a given input. In other words, the response time is the total amount of time it takes to respond to a request for service. That service can be anything from a memory fetch, to a disk I/O, to a complex database query, or loading a full web page.
• Throughput - the rate at which a computer or network receives requests, per second.
• Top waits - monitored to determine which wait times can be cut down when dealing with how fast data is retrieved from memory.
• Thread counts - an application's health can be measured by the number of threads that are running and currently active.
• Garbage collection - returning unused memory back to the system. Garbage collection needs to be monitored for efficiency.
1. To check the speed of your C program, we use two techniques:
a) Manual time calculation, as in the examples below.
Test programs:
a) test01.c program

    #include <stdio.h>
    #include <time.h>

    // function declaration
    void sum(int no1, int no2);

    int main() {
        clock_t start = clock(); // starting time
        int num1 = 10;
        int num2 = 10;
        sum(num1, num2);
        clock_t end = clock(); // end time
        // convert to double before dividing so fractions of a second are kept
        double elapsed = (double)(end - start) / CLOCKS_PER_SEC;
        printf("\nElapse time: %.8f\n", elapsed);
        return 0;
    }

    void sum(int no1, int no2) {
        int num1 = no1;
        int num2 = no2;
        printf("%d", (num1 + num2));
    }
b) test02.c program

    #include <stdio.h>
    #include <time.h>

    // function declaration
    void sum(short int no1, short int no2);

    int main() {
        clock_t start = clock(); // starting time
        short int num1 = 10, num2 = 10;
        sum(num1, num2);
        clock_t end = clock(); // end time
        // convert to double before dividing so fractions of a second are kept
        double elapsed = (double)(end - start) / CLOCKS_PER_SEC;
        printf("\nElapse time: %.8f\n", elapsed);
        return 0;
    }

    void sum(short int no1, short int no2) {
        printf("%d", (no1 + no2));
    }
c) test03.c program:

    #include <stdio.h>
    #include <stdlib.h>
    #include <time.h>

    int main(int argc, char *argv[])
    {
        FILE *fp;
        int ch; // int rather than char, so that EOF can be told apart from valid characters
        int count = 0;
        printf("\n Reading a file...\n");
        clock_t start = clock(), end;
        fp = fopen("objective.txt", "r");
        if (fp == NULL)
        {
            printf("\n File not found! fp=NULL\n\n");
        }
        else
        {
            // reading once before the while loop avoids counting the extra
            // value that comes after reaching the end of file
            ch = fgetc(fp);
            while (ch != EOF)
            {
                //putchar(ch);
                ch = fgetc(fp);
                ++count;
            }
            printf("\n the no of char is = %d\n\n", count);
            fclose(fp); // close the file only if it was opened
        }
        printf("Calculating...\n");
        end = clock() - start;
        printf("\n\nElapsed time: %.8f\n\n", ((float)end) / CLOCKS_PER_SEC);
        return 0;
    }
d) test04.c program:

    #include <stdio.h>
    #include <stdlib.h>

    #define SIZE 30

    void swap(int *x, int *y);
    void selection_sort(int *a, const int size);
    void display(int a[], const int size);

    int main(void)
    {
        int a[SIZE] = {8,5,2,3,1,6,9,4,0,7,8,5,2,3,12,36,9,44,102,71,8,45,22,3,11,66,9,4,0,7};

        printf("The array before sorting:\n");
        display(a, SIZE);
        selection_sort(a, SIZE);
        printf("The array after sorting:\n");
        display(a, SIZE);
        return 0;
    }

    /* swap two integers */
    void swap(int *x, int *y)
    {
        int temp;
        temp = *x;
        *x = *y;
        *y = temp;
    }

    /* perform selection sort */
    void selection_sort(int *a, const int size)
    {
        int i, j, min;
        for (i = 0; i < size - 1; i++)
        {
            min = i;
            for (j = i + 1; j < size; j++)
            {
                if (a[j] < a[min])
                {
                    min = j;
                }
            }
            swap(&a[i], &a[min]);
        }
    }

    /* display array content */
    void display(int a[], const int size)
    {
        int i;
        for (i = 0; i < size; i++)
            printf("%d ", a[i]);
        printf("\n");
    }
e) test05.c program:

    /**
     * Test C program
     * Author: Daniel Ilunga
     * Edited: 02/02/2015
     * Version: 1.0
     */
    #include <stdio.h>
    #include <stdlib.h>
    #include <math.h>
    #include "myctype.h"
    #include <time.h>

    void twinprime();
    int prime();
    int frequency_of_primes(int n);
    void str_comb();
    void russianMethod();
    void reverseArray();
    void toupperCase1();

    int main() {
        clock_t start = clock(), end;
        twinprime();     // twin prime
        str_comb();      // str_comb
        russianMethod(); // Russian method of multiplication
        reverseArray();  // reverse array function
        toupperCase1();  // toupperCase1() function
        printf("Calculating...\n");
        long int fq;
        fq = frequency_of_primes(499999);
        printf("\nThe number of primes lower than 500,000 is: %ld\n", fq);
        end = clock() - start;
        printf("\n\nElapsed time: %.8f\n\n", ((float)end) / CLOCKS_PER_SEC);
        return 0;
    }
    void twinprime()
    {
        int no1 = 11;
        int no2 = 13;
        printf("\n ***** Twin Prime number Program ****** \n");
        //printf("\nEnter first number = ");
        //scanf("%d",&no1);
        //printf("\nEnter second number = ");
        //scanf("%d",&no2);
        int diff = 0;
        if (no1 > no2)
        {
            diff = no1 - no2;
            if (diff == 2)
            {
                if (prime(no1) && prime(no2))
                {
                    printf("\n\nTwin Prime!!\n\n");
                }
                else {
                    printf("\n\nNot twin prime!!\n\n");
                }
            }
            if (flag == 1)
                return 0;
            else
                return 1;
        }
    }

    int frequency_of_primes(int n)
    {
        printf("\n\n**** Count of all Prime Numbers < 500000 ****\n\n");
        int i, j;
        int freq = n - 1;
        for (i = 2; i <= n; ++i)
            for (j = sqrt(i); j > 1; --j)
                if (i % j == 0)
                {
                    --freq;
                    break;
                }
        return freq;
    }
    void str_comb() {
        int cnt = 4;
        printf("\n ***** String Combination Program ****** \n\n");
        //printf("Enter no char = ");
        //scanf("%d",&cnt);
        char str[] = {'a','b','c','d'};
        //printf("Enter a string = ");
        //scanf("%s",str);
        for (int i = 0; i < cnt; ++i)
        {
            printf("%c ", str[i]);
        }
        printf("\n");
        char temp = 0;
        int i = 0;
        while (i < cnt)
        {
            for (int j = 0; j < cnt - 1; ++j)
            {
                temp = str[j];
                str[j] = str[j + 1];
                str[j + 1] = temp;
                for (int d = 0; d < cnt; ++d)
                {
                    printf("%c ", str[d]);
                }
                printf("\n");
            }
            i++;
        }
    }
    void russianMethod() {
        int num1 = 20, num2 = 13, num3;
        int j = 0; // counter variable
        int result = 0;
        printf("\n ***** Russian Method of Multiplication Program ****** \n\n");
        //printf("\n Enter the first number = ");
        //scanf("%d",&num1);
        //printf("\n Enter the second number = ");
        //scanf("%d",&num2);
        num3 = num1; // to keep track of the initial value of num1
        int count = 0;
        while (num1 > 0)
        {
            num1 = num1 / 2;
            ++count;
        } // finding the size that we can use for storing the values in the arrays
        //printf("count= %d",count);
        int arrdiv[count];
        int arrmul[count];
        arrdiv[0] = num3;
        arrmul[0] = num2;
        for (j = 1; j < count; j++)
        {
            num3 = num3 / 2;
            num2 = num2 * 2;
            arrdiv[j] = (num3);
            arrmul[j] = (num2);
        }
        for (j = 0; j < count; j++)
        {
            if (arrdiv[j] % 2 != 0)
            {
                result = result + arrmul[j];
            }
        }
        printf("\n Using - Russian method of multiplication the result is = %d\n\n", result);
    }
    void toupperCase1()
    {
        printf("\n\n**** ToUpperCase ****\n\n");
        int EXIT_SIZE = 15, START_SIZE = 0; // the size of the array
        //printf("\n Enter the size of the array = ");
        //scanf("%d",&EXIT_SIZE);
        int i = 0, count = 0;
        int cnt = START_SIZE; // initialisation of cnt
        char *ptr; // declaration of the pointer that will point at the array
        char *temp;
        // calloc zero-fills the array, so the bytes after the input string are defined
        ptr = (char*)(calloc(EXIT_SIZE, sizeof(char)));
        temp = ptr;
        printf("Enter your string = ");
        scanf("%14s", ptr); // reading at most EXIT_SIZE - 1 characters into the array
        for (i = START_SIZE; i < EXIT_SIZE; ++i)
        {
            *(temp + i) = *(ptr + i);
            ++count;
        }
        for (cnt = START_SIZE, i = START_SIZE; cnt < EXIT_SIZE; ++cnt, ++i)
        {
            if ((*(ptr + i)) >= 97 && (*(ptr + i)) <= 122)
            {
                *(ptr + i) = (*(ptr + i) - 32);
            }
            if ((*(ptr + i)) >= 65 && (*(ptr + i)) <= 90)
            {
                *(ptr + i) = *(temp + i);
            }
        }
        printf("\n%s\n\n", ptr);
        printf("The length of the array is = %d\n\n", count);
    }
    void reverseArray() {
        printf("\n\n**** Reverse array ****\n\n");
        int START_SIZE = 0, EXIT_SIZE = 20; // the size of the array
        printf("\n Enter the size of character array = ");
        scanf("%d", &EXIT_SIZE);
        int i = 0, count = 0;
        int cnt = START_SIZE; // initialisation of cnt
        char *ptr;  // pointer to the array that receives the reversed string
        char *temp; // pointer to the array that holds the input string
        // one extra byte for the terminating '\0' written by scanf
        temp = (char*)(calloc(EXIT_SIZE + 1, sizeof(char)));
        printf("Enter your string = ");
        scanf("%s", temp); // reading the elements of the array
        // separate array for the reversed copy
        ptr = (char*)(malloc(EXIT_SIZE * sizeof(char)));
        for (i = (EXIT_SIZE - 1), cnt = START_SIZE; i >= START_SIZE; --i, ++cnt)
        {
            *(ptr + i) = *(temp + cnt);
        }
        START_SIZE = 0;
        for (cnt = START_SIZE; cnt < EXIT_SIZE; ++cnt)
        {
            printf("%c", *(ptr + cnt));
            ++count;
        }
        printf("\n\nThe length of the array is %d\n\n", count);
    }
Compiling and running the test.c program returns the program's output together with the manually calculated elapsed time. The value depends on the operating system on which the program runs as well as on the characteristics of the machine.
    dimsconsultants$ gcc test.c -Wall -o dims
    dimsconsultants$ ./dims
Output:
    Elapse time: 0.00000000
- This elapsed time is measured with clock(), which reports the processor time used by the program rather than wall-clock time. It is therefore comparable to the CPU time (user + sys) reported by the time command, not to the real time.
- The "-Wall" option enables the compiler warnings for the program. To silence the warnings, use "-w". Remember that both options are case sensitive.
b) Automatic time calculation using the time command:
    dimsconsultants$ gcc test.c -Wall -o dims
    dimsconsultants$ time ./dims    # returns the time with three components
Output:
    real 0m0.005s
    user 0m0.001s
    sys  0m0.002s
Note:
- The above program was executed on a machine with the following characteristics:
- Processor: 2.3 GHz Intel Core i7
- Memory: 16 GB 1600 MHz DDR3
This means that the output will vary with the system on which the program is tested or executed. For instance, on a multi-processor machine, a multi-threaded process might have an elapsed time smaller than its total CPU time, because threads or processes may run in parallel. In addition, the reported time statistics come from different sources, so the time recorded for very small tasks can be subject to rounding errors.
The meaning of the above output is explained below.
Real, User and Sys process time statistics:
One of these things is not like the others. Real refers to actual elapsed time; User and Sys refer to CPU time used only by the process.
• Real is wall clock time - the time from start to finish of the call. This is all elapsed time, including time slices used by other processes and time the process spends blocked (for example, if it is waiting for I/O to complete).
• User is the amount of CPU time spent in user-mode code (outside the kernel) within the process. This is only the actual CPU time used in executing the process. Other processes and time the process spends blocked do not count towards this figure.
• Sys is the amount of CPU time spent in the kernel within the process. This means CPU time spent executing system calls within the kernel, as opposed to library code, which still runs in user space. Like 'user', this is only CPU time used by the process. See below for a brief description of kernel mode (also known as 'supervisor' mode) and the system call mechanism.
To find out how much CPU time a process has used for its execution, sum the User time and the Sys time: CPU time = User + Sys. Remember that this is summed across all CPUs, so if the process has multiple threads, it can exceed the wall-clock time reported by Real. In the output we get, the figures include the User and Sys time of all child processes or threads associated with the process.
More about 'sys'
Normally, there are many things that the programmer's code cannot do in user mode, such as allocating memory or requesting I/O directly. Everything within an operating system is under the supervision of the kernel. Some operations, such as malloc, calloc, fread and fwrite, will invoke kernel functions, and that part of their work is counted under 'sys' time. This does not mean that every call to malloc is counted in 'sys' time. The call to malloc does some processing of its own (still counted in 'user' time) and then, somewhere along the way, may call a function in the kernel (counted in 'sys' time). After returning from the kernel call, there is some more time in 'user', and then malloc returns to your code. When the switch to kernel mode happens, and how much of the time is counted in kernel mode (sys time), depends on the implementation of the library. Other seemingly innocent functions might also use malloc and the like in the background, which will again contribute some 'sys' time.
How do I test the Performance of my C Code?
There are two techniques to achieve this:
• Enable a higher level of compiler optimization (-O in gcc), which is discussed in more detail below.
• Build without the debug option (-g in gcc) if it is enabled. In GCC, -g adds debugging information to the output and enlarges the generated files; it is not enabled by default.
Depending on your workload, a suitably optimized build may improve performance by 10 to 50%. The -pg option does not reduce the execution time of the code: it instruments the program for profiling with gprof, which adds some overhead but helps to locate the sections of code where the time is spent.
Brief introduction to the different levels of optimization:
There are seven -O settings, or levels of optimization: -O0, -O1, -O2, -O3, -Os, -Og, and -Ofast. Use only one of them at a time. Some users boast about even better performance obtained by using -O4, -O9, and so on, but in reality -O levels higher than 3 have no additional effect. The compiler may accept CFLAGS like -O4, but it only performs the optimizations for -O3, nothing more.
We now go through the different levels of optimization one by one and describe what they do. With the exception of -O0, each -O setting activates several additional flags, so be sure to read the GCC manual's chapter on optimization options to learn which flags are activated at each -O level, as well as some explanations of what they do.
Let us examine each optimization level:
-O0: This level (the letter "O" followed by a zero) turns off optimization entirely and is the default if no -O level is specified in CFLAGS or CXXFLAGS. This reduces compilation time and can improve debugging information, but some applications will not work properly without optimization enabled. This option is not recommended except for debugging purposes.
-O1: The most basic optimization level. The compiler tries to produce faster, smaller code without taking much compilation time. It is basic, but it should get the job done all the time.
-O2: A step up from -O1 and the recommended level of optimization unless the system has special needs. -O2 activates a few more flags in addition to the ones activated by -O1. With -O2, the compiler attempts to increase code performance without compromising on size and without taking too much compilation time.
-O3: The highest level of optimization possible. It enables optimizations that are expensive in terms of compile time and memory usage. Compiling with -O3 is not a guaranteed way to improve performance and, in many cases, can slow down a system because of larger binaries and increased memory usage. -O3 is also known to break several packages. Using -O3 is not recommended.
-Os: Optimizes code for size. It activates all -O2 options that do not increase the size of the generated code. It can be useful for machines that have extremely limited disk storage space and/or CPUs with small cache sizes.
-Og: A new general optimization level introduced in GCC 4.8. It addresses the need for fast compilation and a superior debugging experience while providing a reasonable level of runtime performance. The overall experience for development should be better than with the default optimization level -O0. Note that -Og does not imply -g; it simply disables optimizations that may interfere with debugging.
-Ofast: New in GCC 4.7. It consists of -O3 plus -ffast-math, -fno-protect-parens, and -fstack-arrays. This option breaks strict standards compliance and is not recommended for use.
As previously mentioned, -O2 is the recommended optimization level. If package compilation fails while not using -O2, try rebuilding with that option.
-pipe: A common flag is -pipe. This flag has no effect on the generated code, but it makes the compilation process faster. It tells the compiler to use pipes instead of temporary files during the different stages of compilation, which uses more memory. On systems with low memory, GCC might get killed; in those cases, do not use this flag.
Now, suppose we compile without any optimization option:
    dimsconsultants$ gcc test.c -o dims
Use the time command to see how long the program takes:
    dimsconsultants$ time ./dims
- Remember that the above commands were tested on a machine with the following characteristics:
- Processor: 2.3 GHz Intel Core i7
- Memory: 16 GB 1600 MHz DDR3
Therefore, the output might differ from one machine to another, but the concept remains the same.
Table 1: Speed Test - before removing the bottlenecks
Program  | Response time (User) | Throughput (Real) | CPU Time (User + Sys) | System time (Sys)
test01.c | 0m0.001s | 0m0.005s | 0m0.003s | 0m0.002s
test02.c | 0m0.001s | 0m0.005s | 0m0.003s | 0m0.002s
test03.c | 0m0.005s | 0m0.009s | 0m0.007s | 0m0.002s
test04.c | 0m0.001s | 0m0.005s | 0m0.003s | 0m0.002s
test05.c | 0m0.535s | 0m9.702s | 0m0.537s | 0m0.002s
Remember that the goal of performance testing is not to find bugs but to eliminate performance bottlenecks. A good question, then, is how we eliminate the performance bottlenecks. We use the optimization levels to eliminate the bottlenecks in our programs without making any change to the code itself, while still obtaining code with better performance:
    dimsconsultants$ gcc test.c -g -o dims
    # or the command below can also be used:
    dimsconsultants$ gcc test.c -o dims -O1    # first level of optimization
Use the time command to see how long the program takes:
    dimsconsultants$ time ./dims
Table 2: Speed Test - after removing the bottlenecks: -O1 (level)
Program  | Response time (User) | Throughput (Real) | CPU Time (User + Sys) | System time (Sys)
test01.c | 0m0.001s | 0m0.004s | 0m0.003s | 0m0.002s
test02.c | 0m0.001s | 0m0.005s | 0m0.003s | 0m0.002s
test03.c | 0m0.005s | 0m0.008s | 0m0.007s | 0m0.002s
test04.c | 0m0.001s | 0m0.004s | 0m0.003s | 0m0.002s
test05.c | 0m0.465s | 0m6.600s | 0m0.468s | 0m0.003s
Table 3: Speed Test - before and after removing the bottlenecks
Program  | Before: User | Real | CPU | Sys | After: User | Real | CPU | Sys
test01.c | 0m0.001s | 0m0.005s | 0m0.003s | 0m0.002s | 0m0.001s | 0m0.004s | 0m0.003s | 0m0.002s
test02.c | 0m0.001s | 0m0.005s | 0m0.003s | 0m0.002s | 0m0.001s | 0m0.004s | 0m0.003s | 0m0.002s
test03.c | 0m0.005s | 0m0.009s | 0m0.007s | 0m0.002s | 0m0.005s | 0m0.008s | 0m0.007s | 0m0.002s
test04.c | 0m0.001s | 0m0.005s | 0m0.003s | 0m0.002s | 0m0.001s | 0m0.004s | 0m0.003s | 0m0.002s
test05.c | 0m0.535s | 0m9.702s | 0m0.537s | 0m0.002s | 0m0.465s | 0m6.600s | 0m0.468s | 0m0.003s
Comparison of different levels of optimization (program: test05.c):
Level  | Response time (User) | Throughput (Real) | CPU Time (User + Sys) | System time (Sys)
-O0    | 0m0.559s | 0m4.879s | 0m0.540s | 0m0.003s
-O1    | 0m0.465s | 0m6.600s | 0m0.468s | 0m0.003s
-O2    | 0m0.460s | 0m5.299s | 0m0.462s | 0m0.002s
-O3    | 0m0.467s | 0m6.179s | 0m0.470s | 0m0.003s
-O4    | 0m0.467s | 0m5.066s | 0m0.470s | 0m0.003s
-Os    | 0m0.458s | 0m5.129s | 0m0.460s | 0m0.002s
-Ofast | 0m0.449s | 0m6.301s | 0m0.451s | 0m0.002s
Other options:
-g     | 0m0.548s | 0m5.615s | 0m0.550s | 0m0.002s
-pg    | 0m0.528s | 0m6.179s | 0m0.531s | 0m0.003s
-pipe  | 0m0.543s | 0m4.812s | 0m0.545s | 0m0.002s
Things to remember:
• -O1: The purpose of the first level of optimization is to produce an optimized image in a short amount of time. These optimizations typically don't require significant amounts of compile time to complete.
    dimsconsultants$ gcc -O1 -o test test.c
An individual optimization can be requested by specifying its name with the -f prefix, as in:
    dimsconsultants$ gcc -fdefer-pop -o test test.c
Note that GCC disables most optimizations when no -O level is given, even if individual -f flags are specified, so such flags are normally combined with an -O level.
We could also enable level 1 optimization and then disable any particular optimization using the -fno- prefix, like this:
    dimsconsultants$ gcc -O1 -fno-defer-pop -o test test.c
This command enables the first level of optimization and then specifically disables the defer-pop optimization.
• -O2: The level -O2 optimizations include all of the -O1 optimizations, plus a large number of others.
    dimsconsultants$ gcc -O2 -o test test.c
• -Os: The special optimization level (-Os, or size) enables all -O2 optimizations that do not increase code size; it puts the emphasis on size over speed. This includes all second-level optimizations except for the alignment optimizations.
    dimsconsultants$ gcc -Os -o test test.c
• -O3: The third and highest level enables even more optimizations, as described in the table below, by putting the emphasis on speed over size. This includes the optimizations enabled at -O2.
    dimsconsultants$ gcc -O3 -o test test.c
Although -O3 and -Ofast can produce fast code, it is recommended to use the second level of optimization, -O2, to increase the performance of your code.
Table 4: Memory usage before removal of bottlenecks
    dimsconsultants$ gcc test01.c -c -o dims
    dimsconsultants$ size dims
Program  | TEXT | DATA | OBJC | Others | Dec  | Hex
test01.c | 312  | 0    | 0    | 64     | 376  | 178
test02.c | 326  | 0    | 0    | 64     | 390  | 186
test03.c | 577  | 0    | 0    | 32     | 609  | 261
test04.c | 975  | 0    | 0    | 128    | 1103 | 44F
test05.c | 3559 | 0    | 0    | 256    | 3815 | EE7
Table 5: Memory usage after removal of bottlenecks
    dimsconsultants$ gcc test01.c -c -o dims -Os
    dimsconsultants$ size dims
Program  | TEXT | DATA | OBJC | Others | Dec  | Hex
test01.c | 231  | 0    | 0    | 62     | 295  | 127
test02.c | 238  | 0    | 0    | 64     | 302  | 12E
test03.c | 448  | 0    | 0    | 32     | 480  | 1E0
test04.c | 734  | 0    | 0    | 128    | 862  | 35E
test05.c | 2071 | 0    | 0    | 256    | 2327 | 917
Program  | Before: TEXT | DATA | OBJC | Others | Dec | Hex | After: TEXT | DATA | OBJC | Others | Dec | Hex
test01.c | 312  | 0 | 0 | 64  | 376  | 178 | 231  | 0 | 0 | 62  | 295  | 127
test02.c | 326  | 0 | 0 | 64  | 390  | 186 | 238  | 0 | 0 | 64  | 302  | 12E
test03.c | 577  | 0 | 0 | 32  | 609  | 261 | 448  | 0 | 0 | 32  | 480  | 1E0
test04.c | 975  | 0 | 0 | 128 | 1103 | 44F | 734  | 0 | 0 | 128 | 862  | 35E
test05.c | 3559 | 0 | 0 | 256 | 3815 | EE7 | 2071 | 0 | 0 | 256 | 2327 | 917
Table 6: Memory usage in a more human readable form
    dimsconsultants$ gcc test01.c -c -o dims
    dimsconsultants$ ls -hl dims    # size in KB
    dimsconsultants$ ls -l dims     # size in Bytes
Program  | Before removing bottlenecks
test01.c | 1192B | 1.2KB
test02.c | 1208B | 1.2KB
test03.c | 1656B | 1.6KB
test04.c | 2368B | 2.3KB
test05.c | 6320B | 6.2KB
Table 7: Memory usage in a more human readable form - Optimized
    dimsconsultants$ gcc test01.c -c -o dims -Os
    dimsconsultants$ ls -hl dims    # size in KB
    dimsconsultants$ ls -l dims     # size in Bytes
Program  | After removing bottlenecks
test01.c | 1120B | 1.1KB
test02.c | 1128B | 1.1KB
test03.c | 1644B | 1.6KB
test04.c | 2124B | 2.1KB
test05.c | 4784B | 4.7KB
Table 8: Memory usage before and after
Program  | Before        | After
test01.c | 1192B | 1.2KB | 1120B | 1.1KB
test02.c | 1208B | 1.2KB | 1128B | 1.1KB
test03.c | 1656B | 1.6KB | 1644B | 1.6KB
test04.c | 2368B | 2.3KB | 2124B | 2.1KB
test05.c | 6320B | 6.2KB | 4784B | 4.7KB
The above commands are used to show by how much the memory usage is reduced. They use the '-c' option, which stops the entire process at the compilation stage, without linking. To build an executable with an optimization level, omit the '-c' option, as shown below. An easy way to look at every stage is the -save-temps option, which keeps the intermediate files produced during the compilation process so that each of them can be examined.
    dimsconsultants$ gcc test01.c -o dims -Os
    dimsconsultants$ size ./dims    # or: ls -lh dims
Note: There is a lot more to explore with the time command, such as the time taken by the gcc command itself to compile the program, and how much memory the program under test takes at every compilation stage.
Conclusion:
We can therefore say that, though not known as a tool for performance testing, the GCC compiler has many capabilities that we should look at and exploit. With the help of various Linux commands, we were able to test parameters such as CPU time, memory usage, speed, stability, scalability and so on.
To achieve this, we used the GCC optimization options; the different levels of optimization were of great help to us. Though these options are of great help, one should choose the level to use based on an understanding of each optimization level.
References:
1. GCC Documentation: https://gcc.gnu.org/onlinedocs/gcc/Optimize-Options.html