2. Introduction
Your flight plan
About me
Some words about the project’s background
What is a Content Repository?
Why should I use a CR?
Why code it ourselves?
Inside the TYPO3 CR
Where do we stand? Our plans for the future...
Inspiring people to
share
3. Introduction
About me
Born 1977, living (mostly) in Germany
Started out with BASIC on a Commodore 128
Now a PHP addict open to other languages as well
Active member of the TYPO3 Association
Developer with the TYPO3 project
5. Introduction
About TYPO3
One of the leading open-source CMS
Invented by Kasper Skårhøj in 1997
Written in PHP, released under GPL in 2000
Now used by small and large companies around the world
Hundreds of thousands of websites built with TYPO3
Backed by a huge community and the TYPO3 Association
6. Introduction
The future of TYPO3
The current architecture of TYPO3 is becoming outdated
We decided to write TYPO3 5.0 – it soon became clear that we'd
do more than "just write a new CMS"
We started with some groundwork, resulting in the FLOW3
framework – more on that in a minute
We decided to use a CR for the new version
And of course we still have the ultimate goal to come up with a
new TYPO3 CMS...
7. Introduction
TYPO3 5.0 CMS
Successor to TYPO3 v4, which is the result of 10 years of
development
Start from scratch, but keep the soul of TYPO3
Shall
offer lower complexity
make use of advanced PHP features
be more (and more easily) extensible, ...
8. What happened so far
FLOW3
Provides an advanced programming framework with support for
Dependency Injection / Inversion of Control
Aspect Oriented Programming
Component and Package Management
enhanced Reflection
Caching
MVC and more
9. What happened so far
FLOW3
“Best of breed”:
Inspired by the most popular frameworks and toolkits from
Smalltalk, Python, Ruby and Java
Picking the best concepts, skipping the annoyances
Not tied to TYPO3 CMS, can be used for any PHP-based project
Important to you!?
Have a look at the website at flow3.typo3.org
10. Introduction
About the TYPO3 Association
Founded in November 2004 by a group around Kasper Skårhøj
Its goals:
Support TYPO3 development on a more steady basis
Improve the transparency and efficiency of various aspects of
the TYPO3 project
Is funded by members and sponsors
Financed the development of TYPO3 v5 and related projects
until now
11. What is a Content Repository?
12. What is a CR?
Jack Rabbit says
A content repository is a hierarchical content store with support
for structured and unstructured content
In addition to a hierarchically structured storage system,
common services of a content repository are versioning, access
control, full text searching, and event monitoring
Typical applications that use content repositories include
content management, document management, and records
management systems
13. What is a CR?
But it is for Java, no?
The Java Community Process (JCP) is very efficient, not only
when compared to other standardization bodies
The Java Specification Request (JSR) 170 led to the specification of
the Content Repository for Java technology API (JCR)
First JSR with a real open source license (Apache-style)
The API is defined in Java, but can be ported to other languages
No, it’s not only for Java!
14. What is a CR?
Nodes and Properties
A Content Repository (CR) allows the storage and retrieval of
arbitrary content as nodes and properties in a tree structure
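As a rough illustration of "nodes and properties in a tree", here is a minimal Python sketch. The Node class and its add_node/get_node methods are hypothetical names chosen for this example; they are not the JCR or TYPO3CR API.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical node: a name, free-form properties, ordered children."""
    name: str
    properties: dict = field(default_factory=dict)
    children: list = field(default_factory=list)

    def add_node(self, name, **properties):
        """Create a child node below this one and return it."""
        child = Node(name, dict(properties))
        self.children.append(child)
        return child

    def get_node(self, path):
        """Resolve a relative path such as 'pages/home'."""
        node = self
        for part in path.split("/"):
            node = next(c for c in node.children if c.name == part)
        return node


root = Node("")
pages = root.add_node("pages")
pages.add_node("home", title="Welcome", hidden=False)
home_title = root.get_node("pages/home").properties["title"]
```

Content lives in the properties, structure lives in the tree; nothing here fixes which properties a node may carry.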
15. What is a CR?
Workspaces
A repository can contain multiple independent workspaces that
can correspond to each other, allowing comparison
16. What is a CR?
The Basics
The tree structure can be freely defined by the user of the CR
Nodes may be typed with a rigid structure – or free-form
The API abstracts the actual data storage used (RDBMS, ODBMS,
files, ...)
Binary content can be stored and queried as effectively as
textual content
Export to and import from XML are possible
Versioning, locking, transactions, event listeners, ...
18. Why use a CR?
Best of both^Wthree worlds...
19. Why use a CR?
Isn’t that convincing?
20. Why use a CR?
From a coder’s perspective
One well-designed API instead of different ones
Common language and concepts
Properties instead of fields give flexibility
Learn once, use everywhere
Portable code allows easier reuse of existing solutions
Rich set of tools
No more SQL!
21. Why use a CR?
Summary
A content repository provides a robust storage for your content
- be it text, images, or code, structured or unstructured
Knowledge and tools can be reused at will
A Content Repository (CR) promises to solve a lot of problems
A stable standard with a fresh version in the making
SQL has been around for 35+ years, CR has “just started”
22. Why code a CR in PHP?
23. Why code a CR in PHP?
24. Why code a CR in PHP?
No, really...
There are better reasons, of course!
25. Why code a CR in PHP?
Existing implementations
Jackrabbit is the reference implementation, available as open
source from the Apache Foundation
Day CRX is the commercial CR implementation from the
"inventor" of JSR-170, Day Software
Other implementations include eXo JCR and Jeceira (the latter no
longer maintained), among others
JSR-170 connectors exist for Alfresco, BEA Portal Server, IBM
Domino and others
26. Why code a CR in PHP?
PHP ports of the JSR-170/283 API
What about PHP?
Travis Swicegood ported the JSR-170 API to PHP in 2005 -
project is dead
There is a port of the JSR-170 API available in the Jackrabbit
sources, added 2005 - no relevant changes since then
No full port of the JSR-283 API available today
27. Why code a CR in PHP?
What about using what’s there?
We tried to integrate Jackrabbit using the PHP-Java-Bridge
(Almost) every call to Jackrabbit needs to be wrapped for
type conversion, exception mapping, ...
We ran into massive memory issues
More complex to set up and maintain
A dependency on Java is a no-go (not only) for our PHP-based
project
28. Why code a CR in PHP?
Summary
Various implementations exist, mostly in Java
A CR offers a truckload of advantages, we want to leverage
those advantages
No PHP implementation of a CR exists
Using existing non-PHP implementations isn’t an alternative
We need to build our own CR
30. The TYPO3 CR
Three truths about the TYPO3 CR
Goal is a pure PHP implementation of JSR-283
although functionality needed for TYPO3 CMS has priority over
specification compliance for now
Will take advantage of the FLOW3 framework, but not be tied to
the TYPO3 CMS.
Could eventually become the standard CR for the PHP
community?!
31. The TYPO3 CR
Porting the JSR-283 API
Issues
Typing, some Java types simply do not exist in PHP
Constructor overloading is impossible in PHP
Binary data (might be FLOW3 Resource Manager handles
instead of streams)
Interfaces will not be ported up-front, but as we need them
Useful by-product of our development process
32. The TYPO3 CR
Development model
Based on the FLOW3 Framework
Domain Driven Design (will be) used
Use of AOP planned to avoid tight internal coupling
Test Driven Development with Continuous Integration
Automatic checks against coding guidelines
33. The TYPO3 CR
Aspect Oriented Programming
AOP is a programming paradigm
Not a new concept, but still new to PHP
Complements OOP by separating concerns to improve
modularization
OOP modularizes concerns: methods, classes, packages
AOP addresses cross-cutting concerns
37. Aspect Oriented Programming
How AOP sounds
Some language first
Aspects contain advices that you want to add to your
software
Pointcuts expressed by pointcut expressions define where to
add advices to your code
Join points are events in the flow of a program, such as
calling a method or throwing an exception
Targets are the classes and methods being advised by aspects
38. Aspect Oriented Programming
How AOP works
Three steps to AOP use
Write the code for the cross-cutting concern
Define a pointcut expression telling the framework where to
add that code
Get some coffee
The (hard) work is to identify the cross-cutting concerns
and to define the simplest possible pointcut expression
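The first two steps can be sketched in a few lines of Python. This is an illustration of the idea only, not how FLOW3 weaves aspects: the weave() helper, the "Class->method" naming of join points, and the use of a regular expression as the pointcut expression are all assumptions made for this example.

```python
import functools
import re

call_log = []


def log_advice(join_point, args):
    """Step 1: the cross-cutting concern, written once."""
    call_log.append((join_point, args))


def _advise(method, join_point, advice):
    @functools.wraps(method)
    def wrapper(self, *args, **kwargs):
        advice(join_point, args)  # runs before the original method
        return method(self, *args, **kwargs)
    return wrapper


def weave(cls, pointcut, advice):
    """Wrap each public method whose 'Class->method' name matches the pointcut."""
    pattern = re.compile(pointcut)
    for name, member in list(vars(cls).items()):
        join_point = f"{cls.__name__}->{name}"
        if callable(member) and not name.startswith("_") and pattern.fullmatch(join_point):
            setattr(cls, name, _advise(member, join_point, advice))
    return cls


class MailArchive:
    """The target: it knows nothing about logging."""

    def __init__(self):
        self.mails = ["a", "b"]

    def delete_all(self):
        self.mails.clear()

    def count(self):
        return len(self.mails)


# Step 2: the pointcut expression says where the advice applies.
weave(MailArchive, r"MailArchive->delete.*", log_advice)

archive = MailArchive()
archive.count()       # not matched by the pointcut, not logged
archive.delete_all()  # matched, logged
```

Changing what gets logged, or where, means editing the advice or the pointcut expression; MailArchive itself stays untouched.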
39. Aspect Oriented Programming
Example: Logging
It might be good to know who deleted the mail archive of the
last four years
Logging could solve this
A logging aspect added at the right places solves this easily
Using AOP
makes changing the logging a snap
keeps the code clean
40. Aspect Oriented Programming
Example: Security
It would have been even better to not allow deletion of the mail
archive of the last four years...
Security is a complex issue, solving this “right, now” seems
impossible
Using AOP
makes changing the security code easier
allows adding security everywhere, anytime
keeps the code clean
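A security concern differs from logging in that the advice may veto the call. The Python sketch below shows that shape under simplifying assumptions: the current_user dictionary, the role name, and the secure() helper are hypothetical, and a real framework would select targets through pointcut expressions rather than an explicit method list.

```python
import functools

# Hypothetical security context; a real application would obtain this elsewhere.
current_user = {"name": "guest", "roles": set()}


def require_role(role):
    """Build advice that refuses the call unless the current user has the role."""
    def advise(method):
        @functools.wraps(method)
        def wrapper(*args, **kwargs):
            if role not in current_user["roles"]:
                raise PermissionError(
                    f"{current_user['name']} may not call {method.__name__}"
                )
            return method(*args, **kwargs)
        return wrapper
    return advise


def secure(cls, rules):
    """Apply the advice from outside; the target class is not edited."""
    for method_name, role in rules.items():
        setattr(cls, method_name, require_role(role)(getattr(cls, method_name)))
    return cls


class MailArchive:
    def __init__(self):
        self.mails = ["2005", "2006", "2007", "2008"]

    def delete_all(self):
        self.mails.clear()


secure(MailArchive, {"delete_all": "administrator"})

archive = MailArchive()
try:
    archive.delete_all()
    denied = False
except PermissionError:
    denied = True
```

The rules live in one place, so tightening or relaxing them later does not require touching the classes they protect.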
41. The TYPO3 CR
Actual data storage
The underlying storage of the TYPO3CR will be an RDBMS in most
cases
Currently PDO is used to access SQLite
Easy to use for development and unit testing
The use of PDO already enables any PDO-supported database
Specialized DB connectors will follow, using optimized queries,
stored procedures, ...
42. Actual data storage
Data storage techniques
Basically we need to store a simple tree
Read access must be fast, write access should be fast, as the
majority of requests are read requests
Traditional approach as used in TYPO3 today is to store a triplet
(uid,pid,sorting) resulting in an adjacency list
Alternative & sometimes faster methods
Materialized Path
Nested sets, Nested intervals
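To make the adjacency list concrete, here is a small sketch using Python's sqlite3 module in place of PDO. The column names uid, pid and sorting come from the slide; the table name, the name column and the sample rows are assumptions for this example.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE nodes (uid INTEGER PRIMARY KEY, pid INTEGER, sorting INTEGER, name TEXT)"
)
db.executemany(
    "INSERT INTO nodes (uid, pid, sorting, name) VALUES (?, ?, ?, ?)",
    [
        (1, 0, 0, "root"),
        (2, 1, 10, "pages"),
        (3, 1, 20, "files"),
        (4, 2, 10, "home"),
        (5, 2, 20, "about"),
    ],
)


def children(uid):
    """Direct children are a single cheap query."""
    rows = db.execute(
        "SELECT uid, name FROM nodes WHERE pid = ? ORDER BY sorting", (uid,)
    )
    return rows.fetchall()


def subtree(uid):
    """A whole subtree costs one query per visited node in this naive form."""
    result = []
    for child_uid, name in children(uid):
        result.append(name)
        result.extend(subtree(child_uid))
    return result


all_below_root = subtree(1)
```

Writes are trivial (insert one row), but reading a deep subtree needs repeated queries, which is the weakness the alternative methods address.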
43. Actual data storage
Nested sets
Better suited to how RDBMS work internally
Stores numbers determined by preorder tree traversal
Very fast read access, problematic write access
Concurrency demands locking
On average half of all nodes need to be updated on insertion
of a new node
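The trade-off can be seen in a short sketch, again with Python's sqlite3 standing in for PDO. The lft/rgt column names are the conventional ones for nested sets and, like the sample tree, are assumptions of this example rather than the TYPO3CR schema.

```python
import sqlite3

# Sample tree as (name, children) tuples.
tree = ("root", [("pages", [("home", []), ("about", [])]), ("files", [])])

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE nodes (name TEXT PRIMARY KEY, lft INTEGER, rgt INTEGER)")


def number(node, counter=1):
    """Assign lft on the way down and rgt on the way back up."""
    name, kids = node
    lft = counter
    counter += 1
    for kid in kids:
        counter = number(kid, counter)
    db.execute("INSERT INTO nodes VALUES (?, ?, ?)", (name, lft, counter))
    return counter + 1


number(tree)


def descendants(name):
    """Read: the whole subtree in one query, no recursion."""
    rows = db.execute(
        "SELECT c.name FROM nodes p JOIN nodes c ON c.lft > p.lft AND c.rgt < p.rgt "
        "WHERE p.name = ? ORDER BY c.lft",
        (name,),
    )
    return [row[0] for row in rows]


def insert_last_child(parent, name):
    """Write: every number at or right of the insertion point shifts by 2."""
    (rgt,) = db.execute("SELECT rgt FROM nodes WHERE name = ?", (parent,)).fetchone()
    shifted = db.execute("UPDATE nodes SET rgt = rgt + 2 WHERE rgt >= ?", (rgt,)).rowcount
    db.execute("UPDATE nodes SET lft = lft + 2 WHERE lft > ?", (rgt,))
    db.execute("INSERT INTO nodes VALUES (?, ?, ?)", (name, rgt, rgt + 1))
    return shifted  # number of existing rows whose rgt had to change


pages_subtree = descendants("pages")
```

The two UPDATE statements are where concurrent writers collide, which is why locking is needed.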
44. Actual data storage
Speeding up nested sets!?
Write access can be sped up by various approaches like spacing
and variable length indices for the pre/post numbers or by
partitioning the data over more tables
Materialized path works like adjacency list and stores the full
path to the node
Nested intervals are sometimes considered OMPM – “Obfuscated
Materialized Path Method”
All methods have their (dis-)advantages
Finally: DB-specific tricks change the problem!
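For comparison, a materialized path can be sketched as follows (Python's sqlite3 again; the table layout and the trailing-slash path convention are assumptions of this example).

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE nodes (path TEXT PRIMARY KEY, title TEXT)")
db.executemany(
    "INSERT INTO nodes VALUES (?, ?)",
    [
        ("/", "Root"),
        ("/pages/", "Pages"),
        ("/pages/home/", "Home"),
        ("/pages/about/", "About"),
        ("/files/", "Files"),
    ],
)


def descendants(path):
    """Subtree read: a prefix match on the stored path.

    Node names containing % or _ would need escaping for LIKE; ignored here.
    """
    rows = db.execute(
        "SELECT path FROM nodes WHERE path LIKE ? AND path <> ? ORDER BY path",
        (path + "%", path),
    )
    return [row[0] for row in rows]


def ancestors(path):
    """Ancestors need no query; they are encoded in the path itself."""
    if path == "/":
        return []
    result = ["/"]
    for part in path.strip("/").split("/")[:-1]:
        result.append(result[-1] + part + "/")
    return result


below_pages = descendants("/pages/")
above_home = ancestors("/pages/home/")
```

Inserting a node touches one row, but moving or renaming a subtree means rewriting the path of every descendant.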
45. The TYPO3 CR
Querying the TYPO3 CR
Level 1 methods
Using getRootNode() and friends from the API
Using XPath queries (with JSR-283, XPath will be dropped)
Optional methods
Using SQL queries
46. Querying the TYPO3 CR
XPath support for TYPO3CR
With JSR-283, XPath will be dropped
To enable XPath we need
an XPath parser
an efficient way to transform an XPath query into SQL for the
used low-level data structure
The latter is a lot easier when storing the tree as a nested set
The problems caused by this have been mentioned already...
47. XPath support for TYPO3CR
Pre/Post Plane Encoding
Stores numbers determined by preorder and postorder tree traversal
Allows partitioning the nodes into four regions, as shown for node ƒ
Very fast read access, e.g. a single SELECT to query all ancestors of a
node ƒ
SELECT * FROM nodes WHERE pre < ƒ.pre AND post > ƒ.post
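A runnable version of the ancestor query, plus the other three regions, might look like this. It is a sketch under assumptions: the sample tree, the table layout and the axis() helper are invented for illustration, and sqlite3 stands in for PDO. The context node is called "f" here.

```python
import sqlite3

# Sample tree as (name, children) tuples.
tree = ("a", [("b", [("c", [])]), ("d", [("e", []), ("f", [("g", [])])]), ("h", [])])

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE nodes (name TEXT PRIMARY KEY, pre INTEGER, post INTEGER)")

counters = {"pre": 0, "post": 0}


def number(node):
    """pre is assigned when a node is entered, post when it is left."""
    name, kids = node
    pre = counters["pre"]
    counters["pre"] += 1
    for kid in kids:
        number(kid)
    post = counters["post"]
    counters["post"] += 1
    db.execute("INSERT INTO nodes VALUES (?, ?, ?)", (name, pre, post))


number(tree)


def axis(name, pre_op, post_op):
    """Each region around a node is one pair of comparisons.

    pre_op and post_op are only ever the literals '<' or '>' passed below,
    never user input, so interpolating them into the SQL is safe here.
    """
    sql = (
        "SELECT v.name FROM nodes v, nodes f "
        f"WHERE f.name = ? AND v.pre {pre_op} f.pre AND v.post {post_op} f.post "
        "ORDER BY v.pre"
    )
    return [row[0] for row in db.execute(sql, (name,))]


ancestors = axis("f", "<", ">")
descendants = axis("f", ">", "<")
preceding = axis("f", "<", "<")
following = axis("f", ">", ">")
```

These four regions correspond to the XPath axes ancestor, descendant, preceding and following, which is why this encoding makes translating XPath to SQL comparatively easy.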
48. Querying the TYPO3 CR
SQL support for TYPO3CR
To enable SQL we need
a (simple) SQL parser
an efficient way to transform that SQL into equivalent SQL for
the used low-level data structure
This still needs to be investigated; possible approaches:
storing a reference to the parent node
using the pre/post plane only as a cache for XPath read
queries, optimizing the native storage for SQL read queries
49. The TYPO3 CR
Extensions to JSR-283
A vendor may choose to offer additional features in his CR
implementation
The TYPO3CR will offer support for
Persistency through code annotations
Automatic node type generation based on class members
Rules for setting up virtual root nodes based on node types
50. Extensions to JSR-283
Persistency to the CR
Annotations define objects and their properties to be persistable
Properties are stored in the CR according to reflection results
and hints from annotations
The FLOW3 persistence manager is transparently enhanced by
the CR persistence mechanism
An object-to-object mapper does the hard work
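The combination of annotations and reflection can be imitated in Python with dataclass field metadata. Everything named here is hypothetical: persistent(), transient(), the property_name hint and to_properties() only stand in for the annotation-driven mechanism described above and are not FLOW3 or TYPO3CR APIs.

```python
from dataclasses import dataclass, field, fields


def persistent(**hints):
    """Stand-in for an annotation marking a property as persistable."""
    return field(metadata={"persistent": True, **hints})


def transient(default=None):
    """Stand-in for an annotation excluding a property from persistence."""
    return field(default=default, metadata={"persistent": False})


@dataclass
class Customer:
    name: str = persistent()
    email: str = persistent(property_name="emailAddress")  # hint: store under another name
    session_token: str = transient()


def to_properties(obj):
    """Reflect over the object and collect what would be stored on its node."""
    properties = {}
    for f in fields(obj):
        if f.metadata.get("persistent"):
            key = f.metadata.get("property_name", f.name)
            properties[key] = getattr(obj, f.name)
    return properties


stored = to_properties(Customer("Jane", "jane@example.org", session_token="abc"))
```

The domain class carries only declarative hints; the mapping logic sits in one generic function.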
51. Extensions to JSR-283
Automatic node type generation
Persistency stores properties in the CR according to reflection
results and hints from annotations
Node types can be generated automatically if wanted
Manually adding content cannot break the needed structure
Browsing the repository reveals a clear structure
Using content from other applications is less error-prone
Maybe this is utter nonsense - depends on whom you ask :)
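A sketch of deriving a node type from class members, and of using it to guard manually added content, could look like the following Python. The type mapping, the "example" namespace and the dictionary shape of the node type are assumptions for illustration, not the JSR-283 node type definition format.

```python
from dataclasses import dataclass, fields

# Assumed mapping from Python types to JCR-style property type names.
PROPERTY_TYPES = {str: "String", int: "Long", float: "Double", bool: "Boolean"}


@dataclass
class Page:
    title: str
    hits: int
    hidden: bool


def node_type_for(cls, namespace="example"):
    """Derive a node type definition from the class members via reflection."""
    return {
        "name": f"{namespace}:{cls.__name__}",
        "properties": {f.name: PROPERTY_TYPES[f.type] for f in fields(cls)},
    }


def validate(node_type, properties):
    """Reject content whose property names do not fit the generated structure."""
    return set(properties) == set(node_type["properties"])


page_type = node_type_for(Page)
```

With such a type in place, content added by hand or by another application either fits the structure the class expects or is refused.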
52. Extensions to JSR-283
Virtual root nodes
The repository has one root node, added nodes must be placed
somewhere
It might be useful to find all nodes under a common node,
depending on type or other attributes
Such a virtual root node is
like a smart folder or playlist
like a view in an RDBMS
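Taking the view analogy literally, a virtual root node for one node type can be sketched with an actual SQL view (Python's sqlite3; the table layout, paths and the "example:News" type name are assumptions for this example).

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE nodes (path TEXT PRIMARY KEY, nodetype TEXT)")
db.executemany(
    "INSERT INTO nodes VALUES (?, ?)",
    [
        ("/sites/a/news/launch", "example:News"),
        ("/sites/a/pages/home", "example:Page"),
        ("/sites/b/news/update", "example:News"),
    ],
)

# The rule: every node of type example:News shows up under the virtual root.
db.execute(
    "CREATE VIEW virtual_news AS SELECT path FROM nodes WHERE nodetype = 'example:News'"
)


def virtual_children():
    """List the nodes currently collected under the virtual root."""
    rows = db.execute("SELECT path FROM virtual_news ORDER BY path")
    return [row[0] for row in rows]


news_now = virtual_children()
```

Like a smart folder, the collection is never maintained by hand; a node added anywhere in the tree appears under the virtual root as soon as it matches the rule.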
53. The TYPO3 CR
Current status
Currently the code supports a subset of the required features of
levels 1 & 2 and the optional parts of the JSR-283 specification
Basic read & write access
Namespace registration
Node type discovery and registration
Data storage uses the naive approach known from TYPO3 v4
Have a look at the Subversion repository for up-to-date
information
54. The TYPO3 CR
Future plans
Write test
Code
Test
Write test
Code
Test
...
55. The TYPO3 CR
Summary
Implementing the specification is not an easy task, but doable
For the various parts a lot of research has already been done
2008 will see full-time development on the TYPO3 CR
The repository is a major improvement over currently
widespread ways of storing data
The whole PHP community could^Wwill benefit!
56. So long and thanks for the fish
Links
TYPO3 Website
http://typo3.org
TYPO3 Development Website
http://forge.typo3.org
FLOW3 Website
http://flow3.typo3.org
TYPO3 5.0 Subsite
http://typo3.org/gimmefive
57. So long and thanks for the fish
Questions?
Inspiring people to
share beer