1. Topic: Selecting a User Store technology for the WSO2 Identity Server
Unless there are alternate suggestions, we’ll meet as per the timeline below to discuss and
decide on the User Store technology to be used in the WSO2 Identity Server. It is my opinion
that we should use a MySQL based User Store over the default LDAP accessed Directory
Server configuration. The meeting invite will contain the Confluence link for related
documentation. Please try to have any input you want considered added to the Confluence
directory before the Kick-off Meeting. We’ll discuss any contributions and make a final decision
by Friday, December 5. After the decision is made, a Position Paper will be created to document
the criteria and factors considered for the IdM User Store decision.
This is an important architectural consideration; the WSO2 IS User Store is a critical component
that must meet critical availability and scalability requirements. Because Authentication and
Authorization of user sessions, and of all requests in all components, are tightly coupled to it,
it is important that the team understands the chosen technology and that there is consensus on
the decision (otherwise I’d have already made the decision).
It is important to reach a decision on December 5 to meet the estimates for the work to
implement the User Store solution so as to not impact work based on the current backlog
prioritizations.
Timeline:
Kick-off Meeting: Monday, December 1 (Time TBD)
Deadline for Comments and Responses: Wednesday, December 3 EOD
Final Decision: Friday, December 5 12:00 to 2:00pm meeting (we’ll break as soon as decision
is made)
Background:
When implementing an Identity Management solution, such as WSO2 Identity Server or any of
the other many products in the Identity Management (IdM) vertical, very often the default
configuration for the user store is a Directory Server accessed via LDAP. While Directory
Servers were an excellent choice in the past for IdM User Data stores, they are a poor choice for
many environments today. This choice not only results in significant additional effort (and cost)
but also has many other disadvantages when compared to User Stores using a Relational
Database.
This document captures some of the decision points for using a Directory Server (such as
OpenLDAP) compared to a Relational Database solution for an IdM User Store, both at small
scale and especially at large scale, where it must meet critical availability and scalability SLAs.
2. LDAP and Directory Servers date from the 1980s and early 1990s (X.500 was standardized in
1988 and LDAP was first specified in 1993), and both their design and implementation hamper,
rather than enable, the agility, scalability, availability and utility of applications using them
for many reasons, some of which are:
LDAP is designed for optimal usage in situations with a high read-to-write ratio; 10:1 or
100:1 is most often quoted as optimal for LDAP based directories. For any Password Policy
that tracks the results of attempted authentications (which is a feature of all IdM
solutions), the Directory Server must update once for every authentication attempt. Idle
and maximum (a.k.a. soft and hard) timeouts are another required feature that usually
requires frequent updates. Many systems also persist session information, including “last
accessed” information, in the user store. The application will therefore use the User Store
in ways that are recognized as less than optimal for a directory.
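
To make the ratio argument concrete, here is a rough back-of-the-envelope sketch in Python. All request counts, and the assumption of two writes per authentication attempt (one for the attempt result, one for “last accessed”), are illustrative assumptions, not measurements from our environment.

```python
def read_write_ratio(reads: int, writes: int) -> float:
    """Return reads per write; higher values suit a directory better."""
    if writes == 0:
        return float("inf")
    return reads / writes


# Hypothetical hourly workload (assumed numbers, for illustration only).
auth_attempts = 10_000      # each attempt reads the user entry once
profile_reads = 20_000      # other lookups of user attributes
profile_updates = 300       # ordinary attribute changes

# Without tracking authentication results, only profile updates are writes.
baseline = read_write_ratio(auth_attempts + profile_reads, profile_updates)

# With a policy that records every attempt plus a "last accessed" update,
# each authentication attempt adds two writes (an assumption).
writes_per_auth = 2
tracked = read_write_ratio(
    auth_attempts + profile_reads,
    profile_updates + auth_attempts * writes_per_auth,
)

print(f"baseline ratio: {baseline:.0f}:1")
print(f"with per-attempt tracking: {tracked:.1f}:1")
```

Under these assumed numbers the workload drops from 100:1 to roughly 1.5:1, well below the range usually quoted as optimal for a directory.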
LDAP is an access protocol (LDAP = Lightweight Directory Access Protocol), not a data
store. LDAP data stores use some storage technology, usually an RDBMS such as an
embedded, small scale Open Source database like H2 or Postgres, in a black-box
configuration. DevOps must support this application and its additional backup, restore,
sizing, HA and other Operational needs through the tools provided, and very often needs
to purchase additional licenses to support the Directory Server User Store. This can be
a significant challenge if the storage engine used for the User Store is not already
supported by DevOps. There are additional recurring costs for the labor to maintain this
additional component, and possibly licensing costs as well. It is best to choose a storage
engine for which we have in-house expertise and which we already support.
Customization of the Data Store for LDAP based Directory Servers is complex and often
not a skill companies have in-house, as it is no longer a common function. Arguably
you can Google how to extend a Directory schema and get examples of how to do it, but I
would not want to extend a schema for other applications in this manner. This difficulty
often leads to applications reusing existing attributes instead of creating appropriately
named attributes (like reusing the “stateOrProvinceName” attribute for a data element not
explicitly accommodated in the default directory schema). This is a poor practice that
should be avoided.
LDAP adds an additional layer of abstraction and latency to your application but doesn’t
offer any advantage for this extra complexity and overhead. Applications such as WSO2
Identity Server can access a JDBC based datastore (directly) or an LDAP datastore.
LDAP Connection Pooling support is non-existent or is very limited; this is an important
scalability and performance concern. No architect would design an application that had
to create a new database connection every time it needed to access the database.
Establishing a new connection is VERY resource intensive and a huge source of
application latency. Establishing a connection usually takes longer than the query you
establish the connection to run. The ability to effectively utilize connection pools is a
vital point to consider.
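
The value of pooling can be sketched in a few lines of Python. This is a minimal illustration, not WSO2 Identity Server's pooling mechanism: the `ConnectionPool` class is hypothetical, and the standard library's `sqlite3` module stands in for the MySQL user store so the example is self-contained.

```python
import queue
import sqlite3
from contextlib import contextmanager


class ConnectionPool:
    """A tiny fixed-size pool: connections are opened once, then reused."""

    def __init__(self, database: str, size: int) -> None:
        self._idle: queue.Queue = queue.Queue(maxsize=size)
        self.opened = 0
        for _ in range(size):
            # check_same_thread=False lets a pooled connection be handed
            # to whichever thread borrows it next.
            self._idle.put(sqlite3.connect(database, check_same_thread=False))
            self.opened += 1

    @contextmanager
    def connection(self):
        conn = self._idle.get()      # blocks if every connection is in use
        try:
            yield conn
        finally:
            self._idle.put(conn)     # return to the pool instead of closing


pool = ConnectionPool(":memory:", size=2)
results = []
for _ in range(100):
    with pool.connection() as conn:
        results.append(conn.execute("SELECT 1").fetchone()[0])

print(f"{len(results)} queries served by {pool.opened} connections")
```

The cost of establishing a connection is paid twice here rather than one hundred times; a production pool would add validation, timeouts and eviction on top of this idea.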
LDAP is not a transactional protocol. Generally, IdM functions (user provisioning, for
example) are closely coupled to other database transactions, and the ability to have
changes to the IdM user store and other schemas participate in the same transaction is
important. Not having transactions means that rolling back an update requires a
compensating transaction to “undo” the update. It is sometimes difficult or impossible to
back out an update via a compensating transaction.
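
As a minimal sketch of what transactional provisioning buys us, the Python example below creates a user and a related record together or not at all. The `users` and `accounts` tables and the `provision` helper are hypothetical, and `sqlite3` stands in for the relational user store.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (user_name TEXT PRIMARY KEY)")
conn.execute(
    "CREATE TABLE accounts (user_name TEXT NOT NULL, plan TEXT NOT NULL)"
)
conn.commit()


def provision(user_name, plan):
    """Create the user and its account together, or not at all."""
    try:
        with conn:  # commits on success, rolls back on any exception
            conn.execute("INSERT INTO users VALUES (?)", (user_name,))
            conn.execute(
                "INSERT INTO accounts VALUES (?, ?)", (user_name, plan)
            )
        return True
    except sqlite3.IntegrityError:
        return False


ok = provision("alice", "standard")
failed = provision("bob", None)   # NOT NULL violation on the second insert

user_count = conn.execute("SELECT COUNT(*) FROM users").fetchone()[0]
print(ok, failed, user_count)
```

The failed call leaves no orphaned `users` row behind, so no compensating “undo” step has to be written, tested, or trusted.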
3. LDAP and Directory Servers do not have DRI, locking, or check constraints even if the
relational database the LDAP implementation is built on supports them.
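
For readers less familiar with declarative referential integrity (DRI) and check constraints, the following Python sketch shows the database itself rejecting bad rows. The table and column names are hypothetical, and `sqlite3` is used only as a convenient stand-in for the relational store.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when asked
conn.execute("CREATE TABLE roles (role_name TEXT PRIMARY KEY)")
conn.execute(
    """
    CREATE TABLE users (
        user_name       TEXT PRIMARY KEY,
        role_name       TEXT NOT NULL REFERENCES roles(role_name),
        failed_attempts INTEGER NOT NULL DEFAULT 0
                        CHECK (failed_attempts >= 0)
    )
    """
)
conn.execute("INSERT INTO roles VALUES ('admin')")
conn.execute(
    "INSERT INTO users (user_name, role_name) VALUES ('alice', 'admin')"
)

rejected = []
bad_rows = [
    ("bob", "no_such_role", 0),   # violates the foreign key
    ("carol", "admin", -1),       # violates the CHECK constraint
]
for row in bad_rows:
    try:
        conn.execute("INSERT INTO users VALUES (?, ?, ?)", row)
    except sqlite3.IntegrityError as exc:
        rejected.append((row[0], str(exc)))

for name, reason in rejected:
    print(f"{name} rejected: {reason}")
```

Both invalid rows are refused by the schema, without any validation code in the calling application.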
Directory Server data has limited Data Typing. There are String, Number (integers
only), Time, Telephone Number, Boolean, Binary, Distinguished Name and Bit String
data types in directory servers. Decimal (and all non-integer numeric) data and
complex types (objects) must be stored as a string or serialized/deserialized and
explicitly cast if used in any application (SQL, Java, Visual Basic…). There are also
limits on searchability and indexability (and on indexing in general), especially for
non-native data types. Relational database (like Oracle) datatypes map to Java SQL
datatypes without any casts.
LDAP has no equivalent structure to stored procedures (and packages). It is desirable
to have the SQL for data input and output abstracted from the calling applications to
minimize the risk and impact to existing applications of future changes to the User
DataStore. Decoupling the release cycles of the database and the Business logic as much
as possible is a more agile approach. Generally, Java applications use Prepared
Statements, so this may be a less important point, but choosing LDAP does eliminate
implementation options.
A Directory Server has minimal Error Handling internally, so external error handlers
must be coded and implemented in all code that calls into the Directory Server. Relational
databases’ Error Handling allows for better and more consistent exception handling,
resolution, and logging, and encapsulates these functions from the calling application.
Data access is vital. When developing or in production, I frequently need to query the
user store. There is no MySQL Workbench, Toad or other similar product for LDAP
based directories. I remember how difficult it was developing using only SQL*Plus;
better tools really do produce better end results. I use Eclipse or IntelliJ now; I do not
write Java classes in Notepad.
Many of the items above, if taken alone, may not be persuasive, but taken in total and compared
to the list of advantages (are there any?) of using a Directory Server, I can’t come to any
conclusion other than Relational Database over Directory Server for an IdM User Store in the
context of this application.