What is it?
In computing, backwards or downwards compatibility is a general term referring to the ability to read, write, and/or execute input with a certain technology, where that input was designed to be read, written, and/or executed by an older version of the same technology.
Example 1: Java
In the Java programming language, code written according to the Java 1.0 specification will still run with identical results on the current version (*), and it is generally understood that this will be so indefinitely.
Innovations have been made in the succeeding major versions of the Java language, mostly by expanding the available syntax, but the original specification is still 100% valid.
Example 2: PDF
PDFs created in an old PDF format (e.g. PDF 1.0) should be rendered correctly by later viewers.
Newer versions of PDF add new functionality. They don’t remove existing functionality (*). An older PDF viewer might not be able to use that new functionality (e.g. OCG), or might even fail to render the document (e.g. when the cross-reference table is compressed).
Important:
Backward compatibility obviously doesn’t mean you can run newer code in an older environment. For instance: code compiled with Java 8 can cause a java.lang.UnsupportedClassVersionError (Unsupported major.minor version) when you run it on a Java 6 JVM, even if you only use functionality that was already available in Java 6.
In the context of a library, the “input” is the application code calling the API.
“Bug compatibility”: emulating old bugs may be necessary if legacy code depends on the buggy behavior.
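The UnsupportedClassVersionError mentioned above comes from a version number stored in the header of every class file: a 4-byte magic number, followed by a 2-byte minor and a 2-byte major version. A minimal sketch of that check follows. The function names and the hand-built header are illustrative assumptions; the major versions 50 (Java 6) and 52 (Java 8) are the standard values.

```python
import struct

# Class-file major versions for the two releases named above.
JAVA_6_MAJOR = 50
JAVA_8_MAJOR = 52


def class_file_major_version(header):
    """Return the major version stored in the first 8 bytes of a .class file."""
    magic, _minor, major = struct.unpack(">IHH", header[:8])
    if magic != 0xCAFEBABE:
        raise ValueError("not a Java class file")
    return major


def can_load(header, max_supported_major):
    """Hypothetical helper: a JVM rejects class files newer than it supports."""
    return class_file_major_version(header) <= max_supported_major


# Hand-built header standing in for a class compiled with Java 8 defaults.
java8_header = struct.pack(">IHH", 0xCAFEBABE, 0, JAVA_8_MAJOR)

print(can_load(java8_header, JAVA_6_MAJOR))  # False: too new for a Java 6 JVM
print(can_load(java8_header, JAVA_8_MAJOR))  # True
```

The check looks only at the version number, not at the features the code uses, which is why Java 6-only code compiled for Java 8 still fails on a Java 6 JVM.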
BouncyCastle
The Android operating system, as of early 2014, includes a customized version of Bouncy Castle. Due to class name conflicts, this prevents Android applications from including and using the official release of Bouncy Castle as-is.
A third-party project called Spongy Castle distributes a renamed version of the library to work around this issue.
iText 4
A third party created a fork of iText 2.1.7 and named it iText 4.
That’s fine, but unfortunately they also released this unofficial version on Maven using the official iText groupId.
That’s not OK: according to the Apache FAQ, this is in violation of the rules.
Many users upgraded to iText 4 without realizing they were using an unofficial iText version.
Nevertheless, they expected the original iText developers to support this version.
iText Group reclaimed its groupId and this broke the Maven builds of many iText 4 users.
Some developers blamed iText for this, instead of blaming the real culprits.
I removed this example:
OpenSSL
At 11 PM on New Year’s Eve 2011, Stephen Henson received the code from Robin Seggelmann, a respected academic who’s an expert in internet protocols. Henson reviewed the code and added it to the OpenSSL repository.
More than two years later, in April 2014, this patch was discovered to cause a serious security vulnerability known as “Heartbleed”, impacting almost every company using OpenSSL.
Try to fix problems with internal re-implementations
Do not change the API that is exposed to the developers. Sometimes parts of the API were not intended to be exposed.
Java 9 will make it possible to encapsulate internal APIs
Use two methods for the same function, deprecate the old method
Main disadvantage: redundancy
E.g. in PDF: the F operator is equivalent to the f operator.
It’s there for historical reasons and should no longer be used.
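A minimal sketch of the “two methods, deprecate the old one” approach. The function names are hypothetical and not taken from any library: the old entry point keeps working, delegates to the new one, and emits a warning.

```python
import warnings


def fill_path(path):
    """Current API (hypothetical): the one implementation that is maintained."""
    return "fill:" + path


def fill_path_legacy(path):
    """Old name kept for backward compatibility; delegates to fill_path()."""
    warnings.warn(
        "fill_path_legacy() is deprecated; use fill_path() instead",
        DeprecationWarning,
        stacklevel=2,  # point the warning at the caller, not at this wrapper
    )
    return fill_path(path)


# Usage: the old call still returns the same result, plus a warning.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    legacy_result = fill_path_legacy("rect")

print(legacy_result == fill_path("rect"))
print(caught[0].category.__name__)
```

Because the old function only forwards to the new one, the redundancy is limited to the public name; there is a single implementation to maintain.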
XFA
There were dozens of ways to construct a similar form. Even Adobe didn’t succeed in following its own spec. Almost no third party vendor supported XFA because it was too complex.
As a result, XFA was deprecated in ISO 32000-2. A PDF 2.0 file shall not contain any XFA.
Python
An official goal of the Python 3 redesign was to "reduce feature duplication by removing old ways of doing things". An example is the use of placeholders in string templates.
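As an illustration of this kind of duplication, assuming the placeholders meant here are the %-style ones and their str.format() counterparts: both styles below produce the same string. Note that the %-operator is still available in Python 3, so this sketch only shows the overlap, not a removal.

```python
name, pages = "report.pdf", 12

# Older style: %-placeholders filled in with the % operator.
old_style = "%s has %d pages" % (name, pages)

# Newer style: {} placeholders filled in with str.format().
new_style = "{} has {} pages".format(name, pages)

print(old_style == new_style)
```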
iText
E.g. in iText: the HTMLWorker class was deprecated in favor of the XML Worker framework.
HTMLWorker was written for a specific, limited purpose, but people started to use it in a broader way.
We get plenty of questions from people who use HTMLWorker and want us to extend it.
We can’t and won’t do that.
In iText 5, there were several rendering APIs that had a lot of functional overlap but also showed (sometimes subtle) differences in behavior
In iText 7, we introduced a renderer framework that is much easier to maintain
Digital signatures depend on hashing and encryption algorithms
Algorithms are subject to flaws and vulnerabilities: e.g. collision attacks in MD5, SHA-1...
Processing power increases, reducing the time needed to break an encrypted message by brute force
Internationalization: ASCII isn’t sufficient anymore, we need Unicode
Python 2 strings were byte strings with ASCII as the default encoding; in Python 3, the API was changed so that strings are Unicode by default
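A small sketch of why ASCII is not enough, written for Python 3. The sample text is a Hindi greeting chosen for illustration; it is spelled with escape sequences so the snippet does not depend on the encoding of the source file.

```python
# "Namaste" in Devanagari script, written as Unicode escapes.
text = "\u0928\u092e\u0938\u094d\u0924\u0947"

# In Python 3, str holds Unicode code points; bytes is a separate type.
try:
    text.encode("ascii")
    ascii_ok = True
except UnicodeEncodeError:
    ascii_ok = False  # ASCII has no representation for these characters

utf8_bytes = text.encode("utf-8")
round_trip = utf8_bytes.decode("utf-8")

print(ascii_ok)
print(round_trip == text)
```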
Standards and specifications change
For instance: the maximum size of a PDF file used to be 10 gigabytes. This was increased to 1 terabyte, but to achieve this, the PDF needs a compressed cross-reference table with a different structure than before. Old viewers can’t read such a cross-reference table.
Before iText 5.2.0, all byte positions were expressed as int. This limited the maximum file size of a PDF to 2 gigabytes. We changed int to long to support files up to 1 terabyte. No API change was required, but we broke the 5.2.x series by accident. We removed all 5.2.x releases from our repositories.
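The limits above can be checked with a little arithmetic. This is a sketch, not iText code, and the helper names are made up. It assumes the 10 gigabyte limit comes from the 10-digit decimal byte offset in a classic cross-reference entry, and it uses 32-bit and 64-bit signed integers to stand in for Java’s int and long.

```python
import struct

MAX_XREF_OFFSET = 10**10 - 1  # largest value that fits in 10 decimal digits
INT32_MAX = 2**31 - 1         # largest Java int, roughly 2 gigabytes
ONE_TERABYTE = 10**12


def fits_int32(offset):
    """True if the byte offset fits in a 32-bit signed integer (Java int)."""
    try:
        struct.pack(">i", offset)
        return True
    except struct.error:
        return False


def fits_int64(offset):
    """True if the byte offset fits in a 64-bit signed integer (Java long)."""
    try:
        struct.pack(">q", offset)
        return True
    except struct.error:
        return False


three_gib = 3 * 2**30  # an offset inside a 3 GiB file

print(fits_int32(three_gib))        # False: overflows an int
print(fits_int64(three_gib))        # True
print(fits_int64(ONE_TERABYTE))     # True
print(three_gib < MAX_XREF_OFFSET)  # True: still fits a 10-digit offset
```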
Twitter
The easy API provided by Twitter caused a couple of problems
It resulted in huge server loads (we all know the “fail whale”)
It was easy to use the API in ways that weren’t acceptable (spam, abuse)
In 2013, the API was changed to restrict access to authenticated, registered applications
SharePoint
SharePoint 2010: IFilter architecture, for instance to search PDF files in an intelligent way
Adobe (free), PDFlib, and Foxit were better than MS at searching PDF files in SharePoint
SharePoint 2013: IFilter support for PDF is broken
PDF “natively” supported in SharePoint by MS; competing products no longer work, no longer needed
iText 7 versus iText 5
The core design hadn’t changed since 2000.
The original design and first versions were written by a self-taught developer
Not enough refactoring along the way. Breaking compatibility gradually from time to time could have prevented the complete iText 7 rewrite.
Younger code (e.g. digital signatures, parsing PDF) can be reused; the older code needed a rewrite
We gained new insights thanks to people using iText in ways we didn’t expect
E.g. the original developer never imagined that one day he’d have to support Hindi
iText grew organically; many different developers contributed
E.g. form fields: different classes for form creation vs. form filling
E.g. different layout systems: Document.add() vs. ColumnText vs. writing to PdfContentByte
The world has changed since 2000:
People started deploying applications on AWS and GAE: different file system.
People started using iText on Android: monolithic iText vs. limitation of the number of classes
BouncyCastle:
Source code incompatibilities were introduced in version 1.47. This led to numerous problems for all libraries that had a dependency on BouncyCastle 1.x.
The communication wasn’t ideal:
“The next release of BC will be version 2.0. For this reason a lot of things in 1.46 that relate to CMS have been deprecated” (Bouncy Castle Release Notes 1.46, 2011)
“Okay, so we have had to do another release. The issue we have run into is that we probably didn’t go far enough in 1.46, but we are now confident that moving from this release to 2.0 should be largely just getting rid of deprecated methods.” (Bouncy Castle Release Notes 1.47, 2012)
“There has been further clean out of deprecated methods in this release. If your code has previously been flagged as using a deprecated method you may need to change it. The OpenPGP API is the most heavily affected.” (Bouncy Castle Release Notes 1.51, 2014)
The most recent version is currently 1.54, dating from 2015. BouncyCastle 2.0 never happened.
iText 7 also doesn’t strictly use semantic versioning, but we intend to do something similar.
Spring 2 evolving into Spring 3
Configuration in Spring 2 was done using XML
XML used to be popular, but its popularity faded over the years
Spring 3 added configuration through annotations without breaking the API
The original design of the Spring Framework was future-proof
Python 3 versus Python 2
There were several integer sizes in memory, one of which was dependent on the architecture of the underlying processor
The exception framework was inconsistent and idiosyncratic
Comparison of object types was complex and non-intuitive
But also… trade off between being future proof / well designed and pragmatic / performant.
We kept project Arya closed source until we were confident there wouldn’t be any substantial changes to the API
Writing the jump-start tutorial was a pain, because the examples and the text of the tutorial had to be updated on a regular basis
Commons-imaging: our mistake to depend on a snapshot version
* Clirr (http://clirr.sourceforge.net/)
** Clirr Maven plugin
*** console: mvn clirr:check
*** we run it in a separate profile
** Clirr SonarQube plugin
*** browser
*** Tip: create separate SQ dashboard
* JDiff (http://javadiff.sourceforge.net/)
** Javadoc doclet
** generates HTML report of API differences
** published for users
Provide good documentation
Provide conversion tools
Python:
To move from Python 2 to Python 3, a tool called 2to3 was developed to make the transition easier, essentially rewriting Python 2 source code to fit the Python 3 specification
A user-contributed package called six can be added to a Python project so that the same code runs on both Python 2 and Python 3
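The idea behind a compatibility layer such as six can be sketched without the package itself. The names below echo names that six exposes (PY2, text_type, ensure_text), but this is a simplified stand-in written for illustration, not six’s implementation.

```python
import sys

# True when running under Python 2, False under Python 3.
PY2 = sys.version_info[0] == 2

if PY2:
    text_type = unicode  # noqa: F821 (this name only exists in Python 2)
else:
    text_type = str


def ensure_text(value, encoding="utf-8"):
    """Return value as a Unicode string on either Python version."""
    if isinstance(value, text_type):
        return value
    return value.decode(encoding)


# The same call works whether the caller passes bytes or text.
print(ensure_text(b"iText") == ensure_text(u"iText"))
```

Application code then calls ensure_text() everywhere instead of branching on the interpreter version itself.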
Microsoft Word
The .doc and newer .docx document formats for Microsoft Word are not mutually compatible. However, the versions of MS Word released since the introduction of .docx are still able to open and edit .doc files, usually suggesting that the user save the document in the newer format to make the file more future-proof.
Python
A number of Python 3 innovations have been back-ported to the Python 2 project. This takes away the incentive for developers to make the switch at all, because the new features are no longer exclusive to Python 3.
As a consequence of the stability of and continued development on Python 2, uptake of Python 3 has been relatively slow.
iText
No new development in iText 5, only bug fixing
All new functionality will be developed in iText 7
Maintaining multiple versions means fewer resources for new development in the new version.
In open source you cannot prevent or prohibit back-porting. Users may fork. The best way to discourage them from doing so is to convince them with the new version.
Python 3 was released in 2008. The announcement that Python 2 would be supported until 2020 was made in 2014.
That EOL date has changed many times.
UWP: the Universal Windows Platform
Released in 2015
The announcement of UWP felt like a huge marketing event
There were talks and articles on “How to monetize your app”
There was very little technical info
UWP is based on .NET Core, but the .NET Core API wasn’t backward compatible with the .NET Framework
Moreover: it wasn’t ready when UWP was announced. It was a moving target.
A general porting guide was released in February 2016, but it steers clear of giving exact advice on some of the more fundamental changes in the .NET Core reimplementation, such as the cryptography implementation.
This is problematic for vendors confronted with questions such as “Is there UWP support for iText?”
We know what breaks, but it’s not clear how to fix it.
Stream::Close() doesn’t exist anymore: replaced by Stream::Dispose() (from IDisposable)
System.Text.Encoding has changed the names of several properties
ICloneable doesn’t exist anymore
MethodImplOptions.Synchronized doesn’t exist anymore
Serializable and SerializationInfo don’t exist anymore
The entire System.util namespace doesn’t exist anymore
TimeZone doesn’t exist anymore
System.Xml.XmlTextReader and XPath APIs don’t exist anymore
…
Deprecate before deleting
The user thus receives a fair warning that it is unwise to reference that code in their client application, usually with the implication that the faulty code may be deleted at any time
Warn that the code is subject to change
Sun packages: example PKCS#11
The class sun.security.pkcs11.SunPKCS11 was available in the 32-bit version of Java 6, but not in the 64-bit version
Beware of non-technical API changes
iText 2: the MPL/LGPL made it hard to find a working business model
iText 5: the AGPL allowed us to create a business based on a dual licensing model
We had to make sure that users didn’t accidentally upgrade, so we broke the API
The package names were changed from com.lowagie to com.itextpdf
New insights that led to a complete rewrite of iText 7
Make it modular, make it extensible
Some users complained that they only needed a small part of the functionality, but still had to ship the full, monolithic iText jar with their applications.
The original design was extensible, otherwise it wouldn’t have lasted for 16 years, but there was room for improvement.
Allow the introduction of a new business model: iText 7 as a platform.
Remove functional overlap to avoid maintenance hell
You change something in one place, but forget to change it in another place