STATISTICS
Statistics is the science of conducting studies to collect, organize, summarize, analyze, and draw conclusions from data.
It is also defined as the mathematics of the collection, organization, and interpretation of numerical data, especially the analysis of population characteristics from sample datasets.
STATISTICAL SOFTWARE ENGINEERING
APPLICATIONS OF STATISTICS IN S.E.
Software Measurements
Establishment of measurements, analysis, and forecasting of future events. For example:
• Measurement of bug density
• Measurement of invalid processing of bugs
• Measurement of bugs reported by the client
• Measurement of project issues
• Measurement of rejected baseline requests
STATISTICAL SOFTWARE ENGINEERING
APPLICATIONS OF STATISTICS IN S.E.
Quality Management
• Statistical methods for quality control
• Analysis of bugs, NCs, and issues
STATISTICAL SOFTWARE ENGINEERING
APPLICATIONS OF STATISTICS IN GENERAL
Analysis of results of surveys
Quantitative Research Methodology
• Descriptive statistics
• Correlation
• Regression
• Hypothesis testing
Quality Control
• P charts, control charts
STATISTICAL SOFTWARE ENGINEERING
APPLICATIONS OF STATISTICS IN GENERAL
Analysis of computation of algorithms
• Complexity and performance
Analysis of network traffic
Analysis of CPU and memory utilization
Progress reporting to top management
• Graphs and charts
Prediction and forecasting of future events
STATISTICAL SOFTWARE ENGINEERING
APPLICATIONS OF STATISTICS IN GENERAL
Development of research instruments
• Reliability analysis
• Validity analysis
Analysis of quality of software
TYPES OF STATISTICS
Descriptive Statistics
• Descriptive statistics consists of the collection, organization, summarization, and presentation of data.
• Charts, graphs, tables, mean, median, mode, etc.
Inferential Statistics
• Inferential statistics consists of generalizing from samples to populations, performing hypothesis tests, determining relationships among variables, and making predictions.
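As a small illustration of descriptive statistics, the sketch below summarizes the five open-bug counts that appear in Case Study No. 2 of this deck. It uses Python's standard `statistics` module. The variable names are illustrative only, and adding the range as a third summary is an assumption of this sketch, not something the slides prescribe.

```python
import statistics

# Open-bug counts for the five projects listed in Case Study No. 2.
open_bugs = [500, 600, 350, 265, 1325]

mean_bugs = statistics.mean(open_bugs)        # arithmetic average
median_bugs = statistics.median(open_bugs)    # middle value after sorting
bug_range = max(open_bugs) - min(open_bugs)   # distance between the extremes

print(f"mean={mean_bugs}, median={median_bugs}, range={bug_range}")
```

The mean comes out higher than the median here because one project has a far larger count than the others. A single summary number can therefore hide the shape of the data.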
VARIABLE
A variable is a characteristic or attribute that can assume different values.
For example, if the durations of 30 activities were measured, then duration would be a variable.
CLASSIFICATION OF VARIABLES
QUALITATIVE VARIABLES
Qualitative variables are variables that can be placed into distinct categories according to some characteristic or attribute.
e.g. gender, geographical location of team, nature of project, designation of employee
QUANTITATIVE VARIABLES
Quantitative variables are numerical and can be ordered or ranked.
e.g. professional experience of employee (in years), budget of project, no. of bugs in a release
CLASSIFICATION OF VARIABLES
QUANTITATIVE VARIABLES
Quantitative variables can be further classified into two groups: discrete and continuous.
CLASSIFICATION OF VARIABLES
1. DISCRETE VARIABLES
Discrete variables can be assigned values such as 0, 1, 2, 3 and are said to be countable.
e.g.
• No. of software projects completed by a company
• No. of software engineers in a team (in a matrix-based organization)
CLASSIFICATION OF VARIABLES
2. CONTINUOUS VARIABLES
Continuous variables can assume an infinite number of values between any two specific values. They often include fractions and decimals.
e.g.
• Budget of a software project
• Computed bug density for a release/build
CLASSIFICATION OF VARIABLES w.r.t. MEASUREMENTS
In addition to being classified as qualitative or quantitative, variables can be classified by how they are categorized, counted, or measured.
The measurement scale has four types:
1. Nominal
2. Ordinal
3. Interval
4. Ratio
CLASSIFICATION OF VARIABLES w.r.t. MEASUREMENTS
1. NOMINAL LEVEL OF MEASUREMENT
The nominal level of measurement classifies data into mutually exclusive (non-overlapping) categories in which no order or ranking can be imposed on the data.
e.g.
• Gender
• Projects completed by company
• Skills of employee
• Cities
CLASSIFICATION OF VARIABLES w.r.t. MEASUREMENTS
1.1 DICHOTOMOUS
A dichotomous variable is a special type of nominal variable that has only two possible values.
e.g.
• Gender (Male, Female)
• Unit test result (Pass, Fail)
• Sanity testing result (Pass, Fail)
CLASSIFICATION OF VARIABLES w.r.t. MEASUREMENTS
2. ORDINAL LEVEL OF MEASUREMENT
The ordinal level of measurement classifies data into categories that can be ranked.
Mutually exclusive groups + order
e.g.
• Severity of bugs (Level-1, Level-2, Level-3, Level-4)
• Priority of change request (High, Medium, Low)
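One way to picture an ordinal variable in code is an enumeration whose members can be compared and sorted. The sketch below uses Python's `enum.IntEnum` for the change-request priorities named above. The numeric codes 1 to 3 and the request IDs (`CR-1` and so on) are hypothetical, chosen only for illustration. The codes record rank order and nothing more, so the "distance" between Low and Medium is not defined.

```python
from enum import IntEnum


class Priority(IntEnum):
    # The integer codes only encode rank order; they are arbitrary labels.
    LOW = 1
    MEDIUM = 2
    HIGH = 3


# Hypothetical change requests paired with their priority.
requests = [
    ("CR-1", Priority.MEDIUM),
    ("CR-2", Priority.HIGH),
    ("CR-3", Priority.LOW),
]

# Ranking is meaningful for ordinal data: sort from highest to lowest priority.
ranked = sorted(requests, key=lambda item: item[1], reverse=True)
ranked_ids = [request_id for request_id, _ in ranked]
highest = max(priority for _, priority in requests)

print(ranked_ids, highest.name)
```

Sorting and taking a maximum are safe on ordinal data. Averaging the codes is not, because the gaps between categories are not known to be equal.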
CLASSIFICATION OF VARIABLES w.r.t. MEASUREMENTS
3. INTERVAL LEVEL OF MEASUREMENT
The interval level of measurement ranks data, and precise differences between units of measure do exist; however, there is no meaningful zero.
Interval variables have ordered categories that are equally spaced.
e.g.
• Temperature (73 °F)
• Calculated bug density
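The lack of a meaningful zero can be checked numerically with the temperature example. In the sketch below, the readings 80, 60, and 40 °F are arbitrary values picked for illustration. Equal gaps in Fahrenheit stay equal after conversion to Celsius, but the ratio between two readings changes with the scale, so a statement such as "twice as hot" has no fixed meaning at the interval level.

```python
def fahrenheit_to_celsius(deg_f: float) -> float:
    """Convert a temperature from degrees Fahrenheit to degrees Celsius."""
    return (deg_f - 32.0) * 5.0 / 9.0


# Arbitrary illustrative readings in degrees Fahrenheit.
high_f, mid_f, low_f = 80.0, 60.0, 40.0
high_c, mid_c, low_c = (fahrenheit_to_celsius(t) for t in (high_f, mid_f, low_f))

# Differences are meaningful: equal gaps remain equal on both scales.
gap_upper_c = high_c - mid_c
gap_lower_c = mid_c - low_c

# Ratios are not meaningful: the same two readings give different ratios.
ratio_f = high_f / low_f
ratio_c = high_c / low_c

print(f"gaps in C: {gap_upper_c:.2f}, {gap_lower_c:.2f}")
print(f"ratio in F: {ratio_f:.2f}, ratio in C: {ratio_c:.2f}")
```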
CLASSIFICATION OF VARIABLES w.r.t. MEASUREMENTS
4. RATIO LEVEL OF MEASUREMENT
The ratio level of measurement possesses all the characteristics of interval measurement, and there exists a true zero.
e.g.
• No. of bugs
• Estimated effort for new project
• Duration of project
• Delay of schedule
TYPES OF VARIABLES IN SPSS
Level of measurement          SPSS measure
Nominal variable              Nominal
Ordinal variable              Ordinal
Interval & ratio variables    Scale
SOME MORE DEFINITIONS
Data
Data are the values (measurements or observations) that the variables can assume.
Data Set
A collection of data values forms a data set. Each value in the data set is called a data value or a datum.
SOME MORE DEFINITIONS
Population
A population consists of all subjects (human or otherwise) that are being studied.
Sample
A sample is a group of subjects selected from a population.
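The relationship between a population and a sample can be sketched in a few lines. The example below assumes a hypothetical software house with 200 employees identified by the numbers 1 to 200. It draws a simple random sample of 20 with Python's standard `random` module. The population size, the sample size, and the seed are illustrative choices, and simple random sampling is only one of several ways a sample might be selected.

```python
import random

# Hypothetical population: employee IDs 1..200 of a software house.
population = list(range(1, 201))

# A seeded generator keeps the illustration reproducible.
rng = random.Random(42)

# Simple random sample without replacement: every subject is drawn from
# the population, and no subject is selected twice.
sample = rng.sample(population, k=20)

print(f"population size={len(population)}, sample size={len(sample)}")
```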
CASE STUDY NO. 1 (HR POLICY)
Read the following HR policy of a software house regarding annual increments of employees, and answer the questions.
Employees who meet their deadlines 95-100% of the time usually receive Rs. 20k as an increment in their salary. Employees who meet their deadlines 80-90% of the time usually receive Rs. 10k, and employees who meet their deadlines less than 80% of the time usually receive Rs. 5k as an increment in their salary.
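The policy above can be read as a mapping from one variable (percentage of deadlines met) to another (usual increment). A minimal sketch of that mapping follows. The function name `usual_increment_rs` is hypothetical. The policy as stated gives no figure for percentages above 90 and below 95, so the sketch returns `None` there instead of guessing. It also treats the stated amounts as fixed, although the policy only says employees "usually" receive them.

```python
from typing import Optional


def usual_increment_rs(on_time_pct: float) -> Optional[int]:
    """Return the usual increment in rupees for a given on-time percentage.

    Returns None for the band the stated policy does not cover
    (above 90% and below 95%).
    """
    if not 0 <= on_time_pct <= 100:
        raise ValueError("on_time_pct must be between 0 and 100")
    if on_time_pct >= 95:
        return 20_000
    if 80 <= on_time_pct <= 90:
        return 10_000
    if on_time_pct < 80:
        return 5_000
    return None  # policy is silent between 90% and 95%


for pct in (97, 85, 60, 92):
    print(pct, usual_increment_rs(pct))
```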
CASE STUDY NO. 1 (HR POLICY)
Based on this information, ‘Meeting deadlines’ and ‘Annual increments’ are related. The more often you meet deadlines, the more likely you are to receive a higher increment. If you improve your performance and meet the deadlines of most of your tasks, your annual increment will probably improve.
CASE STUDY NO. 1 (HR POLICY)
QUESTIONS
1. What are the variables under study?
2. What are the data in the study?
3. Are descriptive, inferential, or both types of statistics used?
4. What is the population under study?
5. Was a sample collected? If so, from where?
6. From the information given, comment on the relationship between the variables.
CASE STUDY NO. 1 (HR POLICY)
ANSWERS
1. The variables are ‘Meeting deadlines’ and ‘Annual increments’.
2. The data consist of ‘Percentage of meeting deadlines’ and ‘Amount of increments’.
3. These are descriptive statistics; however, an inferential statement is also present (i.e. “Based on this information, ‘Meeting deadlines’ and ‘Annual increments’ are related”). So inferential statistics are used as well.
CASE STUDY NO. 1 (HR POLICY)
ANSWERS
4. The population under study is the employees of the software house.
5. Not specified.
6. Based on the data, it appears that, in general, the better you meet deadlines, the higher your annual increment will be.
CASE STUDY NO. 2 (PROJECTS QUALITY)
The Quality Management department of a software house has published the number of open bugs of five ‘In-Progress’ software projects during the Annual Quality Review meeting.
Project Name    No. of Open Bugs
Project 1       500
Project 2       600
Project 3       350
Project 4       265
Project 5       1325
CASE STUDY NO. 2 (PROJECTS QUALITY)
QUESTIONS
1. What are the variables under study?
2. Categorize each variable as quantitative or qualitative.
3. Categorize each quantitative variable as discrete or continuous.
4. Identify the level of measurement for each variable.
CASE STUDY NO. 2 (PROJECTS QUALITY)
QUESTIONS
5. ‘Project 4’ shows the minimum number of ‘Open Bugs’. Does that mean ‘Project 4’ is the most successful project among all five projects?
CASE STUDY NO. 2 (PROJECTS QUALITY)
ANSWERS
1. The variables are ‘Project Name’ and ‘No. of Open Bugs’.
2. ‘Project Name’ is a qualitative variable, while ‘No. of Open Bugs’ is a quantitative variable.
3. ‘No. of Open Bugs’ is a discrete variable.
4. ‘Project Name’ is nominal, while ‘No. of Open Bugs’ is ratio.
CASE STUDY NO. 2 (PROJECTS QUALITY)
ANSWERS
5. ‘Project 4’ shows the minimum number of ‘Open Bugs’. However, there may be other things to consider, such as the size of the project, the schedule of the project, and compliance with client requirements. Therefore, the project with the fewest open bugs is not necessarily the most successful project of the company.
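The point of answer 5 can be illustrated by normalizing the open-bug counts by project size. In the sketch below, the open-bug counts come from the case study table. The project sizes in KLOC are entirely hypothetical, invented only to show how the ranking can change. Defining bug density as open bugs per KLOC is likewise an assumption, since the deck does not give a formula.

```python
# Open-bug counts from the Case Study No. 2 table.
open_bugs = {
    "Project 1": 500,
    "Project 2": 600,
    "Project 3": 350,
    "Project 4": 265,
    "Project 5": 1325,
}

# HYPOTHETICAL project sizes in thousands of lines of code (KLOC).
# These figures are not in the case study; they exist only for illustration.
size_kloc = {
    "Project 1": 100,
    "Project 2": 200,
    "Project 3": 50,
    "Project 4": 10,
    "Project 5": 500,
}

# Assumed definition: bug density = open bugs per KLOC.
density = {name: open_bugs[name] / size_kloc[name] for name in open_bugs}

fewest_bugs = min(open_bugs, key=open_bugs.get)
lowest_density = min(density, key=density.get)

print("Fewest open bugs:", fewest_bugs)
print("Lowest bug density:", lowest_density)
```

Under these invented sizes, the project with the fewest open bugs is not the one with the lowest density. A raw count alone therefore does not settle which project is in the best shape.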