1. An Exploratory Study
of the Evolution
of Software Licensing
Massimiliano Di Penta*
Daniel M. German**
Yann-Gaël Guéhéneuc***
Giuliano Antoniol***
*University of Sannio, Italy
**University of Victoria, Canada
***Ecole Polytechnique de Montréal, Canada
2. Motivations
OpenBSD founder and project leader Theo de Raadt removed a
security software package called IP-Filter [written by Darren Reed]
after its author changed its license.
Stephen Shankland, CNET News, 2001/05/30.
Licenses evolve as software does
Failing to account for that can lead to copyright infringement
Decisions on license changes have an impact, as do other decisions on
software evolution
Little attention so far from the scientific community
Need for methods and tools to audit licensing and their
changes
ICSE 2010 - Cape Town, SA 2
3. Outline
Motivating Examples
Licensing monitoring and analysis process
Empirical study definition
Study results
Conclusions and work-in-progress
4. Example: Java
Until November 2006, the license of Java JDK v1.2 said:
“Except as specifically authorized in any
Supplemental License Terms, you may not make
copies of Software, other than a single copy of
Software for archival purposes”
This disallowed the inclusion of Java in Linux distributions
Java 5.0 released under the GPL v2 with the CLASSPATH
exception:
Java could be modified/updated under the GPL v2
Java programs could be released under any license as long as
they satisfy the conditions stated in the CLASSPATH exception
Changing the license of a system can promote and
ease the distribution and reuse of a software
system
5. Example: Mono
Framework produced by Novell to support the .Net API
on non-Microsoft operating systems
Initially distributed under the GPL v2
potential problem when running .Net systems
Considered derivative works of Mono
Required to be also released under the GPL v2
Mono developers changed its license to MIT/X11
The change was also required by HP for its participation in the
project
A change to a more permissive license may
increase the size of the community of
contributors to a FOSS system
6. Example: QT
QT was first released under a non-open source but free license,
called the FreeQT License, and a commercial license
QT became the basis for KDE
QT v2.0 was released under a new license, the Q Public License
→ incompatible with the GPL
The GNOME project was started as a QT-free alternative to KDE
The Harmony project started as a replacement for QT
Licensed under the GPL
Trolltech changed the license of QT v3 to the GPL v2
→The Harmony project was abandoned
Changing the license of a FOSS system to a
more permissive one might cause the abandonment of
a competing system
7. Example: MySQL
In 2004, MySQL AB changed the license of its client
libraries from LGPL v2.1 to GPL v2
to prevent industrial companies from using the libraries within
proprietary products
Unintended consequences:
PHP systems were no longer able to connect to MySQL
PHP license is incompatible with the GPL v2
MySQL addressed this problem by adding the MySQL
FOSS License Exception to the GPL v2
Changing the license of a FOSS system might
have unintended or undesirable consequences for its
legitimate users
8. Empirical Study
Goal: analyze licensing evolution
Purpose: investigating how developers change
licensing statements in source code files
Quality focus: kind of changes occurring in
licensing statements
Perspective: researchers, practitioners
Context: CVS/SVN repositories of
ArgoUML, Eclipse-JDT, the FreeBSD and the
OpenBSD kernels, Mozilla, Samba
9. Research Questions
RQ1: How frequently do the licensing
statements of source files change?
RQ2: To what extent are files changing
their licenses?
RQ3: How are copyright years changed in
licensing statements?
10. Licensing Analysis Method - I
Step 0: Downloading subsequent revisions of source
code files from CVS/SVN
Step 1: Extracting licensing statements
/* -*- Mode: C++; tab-width: 2; indent-tabs-mode: nil; c-basic-offset: 2 -*- */
/* ***** BEGIN LICENSE BLOCK *****
* Version: MPL 1.1/GPL 2.0/LGPL 2.1
*
* The contents of this file are subject to the Mozilla Public License Version
* 1.1 (the "License"); you may not use this file except in compliance with
* the License. You may obtain a copy of the License at
* http://www.mozilla.org/MPL/
….
* Portions created by the Initial Developer are Copyright (C) 2002
* the Initial Developer. All Rights Reserved.
*
* Contributor(s):
* Brian Ryner <bryner@brianryner.com>
….
* decision by deleting the provisions above and replace them with the notice
* and other provisions required by the GPL or the LGPL. If you do not delete
* the provisions above, a recipient may use your version of this file under
* the terms of any one of the MPL, the GPL or the LGPL.
*
* ***** END LICENSE BLOCK ***** */
#include "nsXULAppAPI.h"
#ifdef XP_WIN
#include <windows.h>
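Step 1 can be illustrated with a minimal sketch. It assumes that the licensing statement is the run of comments preceding the first line of code in a C/C++ file, as in the Mozilla example above. The function name `extract_license_statement` is hypothetical, and the extraction rules used in the study may differ.

```python
import re

# Assumption: a licensing statement is the run of /* ... */ and // comments
# that appears before the first line of code in a C/C++ source file.
_LEADING_COMMENT = re.compile(r"\s*(/\*.*?\*/|//[^\n]*)", re.DOTALL)


def extract_license_statement(source: str) -> str:
    """Return the text of the comments that precede the first line of code."""
    parts = []
    pos = 0
    while True:
        match = _LEADING_COMMENT.match(source, pos)
        if match is None:
            break  # first non-comment token reached (e.g. #include)
        parts.append(match.group(1))
        pos = match.end()
    return "\n".join(parts)
```

Applied to the file above, this would return the mode line and the license block, and stop at `#include "nsXULAppAPI.h"`. Comments further down in the file are ignored, which matches the construct-validity caveat that some extracted comments may not belong to a license.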
11. Licensing Analysis Method - II
Step 2: Identifying changes in licensing statements
Differencing tools not suitable
IR Vector Space Models
Cosine similarity
Example: nsBrowserApp.cpp.1.32 vs. nsBrowserApp.cpp.1.33
[Side-by-side view of the two licensing statements (Version: MPL 1.1/GPL 2.0/LGPL 2.1)]
cosine = 1
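The comparison in Step 2 can be sketched as follows. This is a minimal sketch that assumes raw term frequencies as vector weights and a simple word tokenizer; the slides do not state the weighting scheme, so both choices are assumptions, and the function names are hypothetical.

```python
import math
import re
from collections import Counter


def term_vector(text: str) -> Counter:
    """Bag-of-words vector; comment markers and line breaks are discarded."""
    return Counter(re.findall(r"\w+", text.lower()))


def cosine_similarity(a: str, b: str) -> float:
    """Cosine of the angle between the term-frequency vectors of a and b."""
    va, vb = term_vector(a), term_vector(b)
    dot = sum(va[t] * vb[t] for t in va.keys() & vb.keys())
    norm = (math.sqrt(sum(c * c for c in va.values()))
            * math.sqrt(sum(c * c for c in vb.values())))
    return dot / norm if norm else 0.0
```

Because only word counts matter, two statements with the same words but different line breaks or comment markers score 1, whereas a line-based differencing tool would report them as changed. That is presumably why differencing tools are considered unsuitable here.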
12. Licensing Analysis Method - III
Step 3: Classifying licenses
Only on pairs of file revisions where Step 2 identified a change
(cosine < 0.99)
We use FOSSology, which detects licenses using the Binary Symbolic
Alignment Matrix (bSAM)
[Example: the Mozilla license block shown in Step 1 is classified as MPL 1.1/GPL 2.0/LGPL 2.1]
13. Licensing Analysis Method - IV
Step 4: Identifying changes in copyright years
Mining references to years in licensing statements
[Example: in the Mozilla license block shown in Step 1, the year reference appears in the line
"Portions created by the Initial Developer are Copyright (C) 2002"]
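Step 4 can be sketched with a regular expression. The sketch assumes that relevant years are four-digit numbers starting with 19 or 20 on lines that mention "copyright" or "(c)"; the names `copyright_years` and `year_changes` are hypothetical and the mining rules of the study may be more elaborate.

```python
import re

# Assumption: a copyright year is a standalone four-digit number 19xx or 20xx.
_YEAR = re.compile(r"\b(?:19|20)\d{2}\b")
_COPYRIGHT_LINE = re.compile(r"copyright|\(c\)", re.IGNORECASE)


def copyright_years(statement: str) -> set[int]:
    """Collect the years mentioned on copyright lines of a licensing statement."""
    years: set[int] = set()
    for line in statement.splitlines():
        if _COPYRIGHT_LINE.search(line):
            years.update(int(y) for y in _YEAR.findall(line))
    return years


def year_changes(old: str, new: str) -> tuple[set[int], set[int]]:
    """Return (years added, years removed) between two revisions."""
    before, after = copyright_years(old), copyright_years(new)
    return after - before, before - after
```

Restricting the search to copyright lines keeps version numbers such as "GPL 2.0" out of the result. A range such as "2002-2004" contributes only its two endpoints in this sketch.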
14. RQ1: Frequency of licensing changes (all)
[Figure: frequency of licensing changes (y-axis, 0.0 to 1.0) per system (x-axis): ArgoUML, Eclipse-JDT, FreeBSD, OpenBSD, Mozilla, Samba]
OpenBSD, ArgoUML, and FreeBSD have a significantly
higher licensing change-proneness than the other
systems
15. RQ1: Frequency of licensing changes (cos<0.99)
[Figure: frequency of licensing changes (y-axis, 0.0 to 1.0) per system (x-axis): ArgoUML, Eclipse-JDT, FreeBSD, OpenBSD, Mozilla, Samba]
Mozilla has a significantly higher licensing change-proneness
than the other systems
The change from CPL to EPL consists of changing two words:
Common → Eclipse, cpl → epl
16. Distribution of licenses
[Bar chart, Eclipse-JDT: % of files (x-axis, 0 to 100) per license: None, EPL, Others; series: First, Last]
17. Distribution of licenses
[Bar chart, FreeBSD: % of files (x-axis, 0 to 70) per license: BSD UCRegents-style (4-cl BSD), None, Others, Cryptix-style (2-cl BSD), INRIA-OSL (3-cl BSD), BSD (unknown); series: First, Last]
18. Distribution of licenses
[Bar chart, Mozilla: % of files (x-axis, 0 to 100) per license: NPL, None, Others, Dual MPL-GPL; series: First, Last]
19. Distribution of licenses
[Bar chart, Samba: % of files (x-axis, 0 to 90) per license: GPL v2, None, LGPL v2+, Others; series: First, Last]
20. RQ2: Most relevant license changes
Eclipse-JDT
Common Public License v1.0 → Eclipse Public License v1.0 (CHANGE, 2394)
Common Public License v0.5 → Common Public License v1.0 (UPDATE, 808)
Mozilla
NPL → 'NPL v1.1'-style + GPL v2 + LGPL v2.1 (DUAL, 2914)
NPL → 'Dual MPL GPL'-style + MPL (DUAL, 1274)
'Dual MPL GPL'-style + MPL → NPL (BUG, 1194)
Licensing updated as new licenses were developed
Eclipse JDT: CPL 0.5→CPL 1.0→EPL 1.0
IBM has relinquished control of licenses to the Eclipse Foundation
Mozilla: NPL→MPL + GPL (+ LGPL)
NPL allowed Netscape 6 to be released as a proprietary system
MPL only allows redistributing the source code under the MPL
Multiple licenses to deal with incompatibilities
Files wrongly changed to NPL (bug #98089)
21. RQ2: Most relevant license changes
FreeBSD
BSD UCRegents (4-cl BSD) → 'BSD UCRegents'-style (4-cl BSD) (UPDATE, 491)
'BSD UCRegents'-style (4-cl BSD) → 'INRIA-OSL'-style (3-cl BSD) (UPDATE, 300)
OpenBSD
'BSD UCRegents'-style (4-cl BSD) → 'INRIA-OSL'-style (3-cl BSD) (UPDATE, 964)
BSD UCRegents (4-cl BSD) → 'BSD UCRegents'-style (4-cl BSD) (UPDATE, 414)
FreeBSD and OpenBSD are more eclectic than
other projects
Heterogeneous community of contributors
Code frequently imported from external sources
Moving from the 4-clause BSD license to the more permissive
3-clause and 2-clause BSD licenses
22. RQ2: Most relevant license changes
ArgoUML
None → 'Free with copyright clause'-style + 'UC Regents free with copyright clause'-style (ADD, 127)
Samba
None → GPL v2 (ADD, 15)
ArgoUML and Samba kept the same licenses
over the analyzed time span
Change is from None to a simple license
Authors realized the importance of including a license
23. RQ3: Changes in copyright years
[Figure: percentage of updates (y-axis) per year (x-axis); panels: Eclipse-JDT, FreeBSD]
Percentages of files with copyright years updated
Increase over the years for ArgoUML and Eclipse-JDT
Decrease for FreeBSD, OpenBSD, Mozilla, and Samba
24. Why were copyright years changed?
Files for which the copyright years were updated
underwent a significantly higher number of
changes than others
When developers perform substantial changes to a file,
they also update copyright years
Required by copyright regulations
Lack of updates with substantial changes would allow an
infringer to claim “innocent infringement”
Commits explicitly targeted to copyright years
“Updated copyrights”
“Updated copyrights to 2004”
25. Threats to Validity
Construct validity
We might have extracted other comments not belonging to
licenses → this only affects RQ1
FOSSology is somewhat imprecise
→ we improved it and manually checked many licenses
Internal & conclusion validity: N/A
External validity
Fairly large and diverse set of projects (6)
Interesting to study FLOSS code used in industry
Replicability of the study
Raw data available
Tools available
Working data sets available
26. Conclusions and work-in-progress
We proposed a code analysis method to
support lawyers as well as software
engineers
Main findings from the empirical study:
Addition of licenses to files that had none
Moving towards more permissive licenses
Copyright years updated to preserve rights on new code
Work-in-progress:
Analyzing licensing inconsistencies in software distributions
[ICPC 2010]
Identifying licensing for software distributed as binary
[MSR 2010]
Interacting with projects such as Fedora and Mozilla to help fix
licensing inconsistencies and define the new MPL