AEM Meetup Sydney, 2017-05-31.
A closer look at the content migration tool and its various options. Discussion around how to use the tool for version upgrades and BAU activity (like Blue/Green deployments). Highlighting benefits, potential issues and things to consider when using the tool.
AEM Meetup Sydney - Content Migration with CRX2Oak
1. Content Migration
with CRX2Oak
BY MICHAEL HENDERSON
http://www.filstan.com.au
Michael Henderson
Technical Director
michael@filstan.com.au
2. What’s the problem?
You want to transfer content from one AEM instance to another
You don’t want to build the new instance from a snapshot or backup of the
existing environment
Residual corruption? Dodgy data? Worried about previous versioned/patched
assets that might transfer through via a full clone? Or, just want to start fresh?
Options?
Content package export/import: Large. Slow. No versions. No orphaned
content. No resume.
Content transfer tool: CRX2Oak.
3. What is the tool?
CRX2Oak has been created to migrate content from one AEM repository to
another
Common use-cases for CRX2Oak are:
Upgrade: From an older version of AEM to a newer version
Sidegrade: From one instance to another of the same version but with
architectural changes, e.g. changing the Node store or Blob store
Migration: From one instance to another fresh build of the same version and
same architecture. Content + code release. Sometimes this is referred to as a
“Blue/Green Deployment”
An open source version (without CRX2 upgrade functionality) is available
as the Jackrabbit Oak “oak-upgrade” migration tool
4. What can it do?
Resume Support: If you interrupt the command, you can rerun it later
and it will resume where it left off
Customisable Upgrade Logic: Plug in custom Java classes (CommitHooks,
RepositoryInitializers) to apply your own logic and initial values
Support for Memory Mapped Operations: Improve performance
Selective Migration of Content: Copy only the paths you need
Path Merging: Merge migrated content without clobbering
Version Support: Ability to copy versions including orphans
Open Source Version: oak-upgrade available (no CRX2 support)
Speed: Approx. 1.5GB per min with versions (on c4.xlarge)
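The speed figure above gives a quick way to size a migration window. A minimal sketch, assuming the approximate 1.5GB per minute holds for your hardware and content (it was quoted for a c4.xlarge with versions); the function name and the 300GB repository size are illustrative only.

```shell
#!/bin/sh
# Rough migration-time estimate from the throughput quoted on the slide.
# RATE_GB_PER_MIN is an approximation; measure your own environment first.
RATE_GB_PER_MIN=1.5

# estimate_minutes <repository size in GB> -> whole minutes, rounded up
estimate_minutes() {
    awk -v size="$1" -v rate="$RATE_GB_PER_MIN" 'BEGIN {
        m = size / rate
        r = int(m)
        if (r < m) r++      # round partial minutes up
        print r
    }'
}

estimate_minutes 300   # prints 200
```

Treat the result as a lower bound: reindexing and start-up time after the migration are not included.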
5. What version should I use?
Use the CRX2Oak version that matches (or is as close as possible to) the Oak version of your
destination instance. Check the Oak version at:
http://localhost:4502/system/console/status-Repository%20Apache%20Jackrabbit%20Oak
Download the required version from Adobe’s public repository:
https://repo.adobe.com/nexus/content/groups/public/com/adobe/granite/crx2oak/
WARNING: Using a mismatched Oak version of the tool can corrupt your instance!
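The version check can be scripted before anything touches the repository. A minimal sketch, assuming the jar keeps the crx2oak-<version>...jar naming used in this deck and that "same major.minor" counts as close enough; that threshold is an assumption for illustration, not an Adobe rule, and the Oak version 1.4.1 below is a made-up example value.

```shell
#!/bin/sh
# jar_version <jar filename> -> version embedded in the name, e.g. 1.4.6
jar_version() {
    basename "$1" | sed -n 's/^crx2oak-\([0-9][0-9.]*[0-9]\).*\.jar$/\1/p'
}

# major_minor <version> -> first two components, e.g. 1.4
major_minor() {
    echo "$1" | cut -d. -f1,2
}

# versions_compatible <jar filename> <destination Oak version>
# Succeeds only when a version was found in the jar name and its
# major.minor equals that of the destination Oak version.
versions_compatible() {
    tool=$(major_minor "$(jar_version "$1")")
    oak=$(major_minor "$2")
    [ -n "$tool" ] && [ "$tool" = "$oak" ]
}

# Example: Oak version copied by hand from the destination's status page.
if versions_compatible crx2oak-1.4.6-standalone.jar 1.4.1; then
    echo "versions line up"
else
    echo "version mismatch: do not run the migration"
fi
```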
6. CRX2Oak – Migration Options
7. CRX2Oak – Merge Option
8. CRX2Oak – Version Options
9. Let’s look at the command
Usage: Run via the command-line:
java -jar crx2oak.jar [options] [datastore-options] [source] [destination]
[source] and [destination] are each a directory or a URI:
CRX: /path/to/aem/crx-quickstart/repository
Mongo: mongodb://<hostname>:<port>/<database>
JDBC: jdbc:mysql://<hostname>:<port>/<schema>
Notes:
Both AEM instances NEED to be shut down
Both repositories NEED to be filesystem/script accessible
Docs: https://docs.adobe.com/docs/en/aem/6-2/deploy/upgrade/using-crx2oak.html
Note: Extra options available in help [java -jar <jar-file> -h]
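The accessibility note above can be checked up front. A minimal sketch, assuming directory locations only need to exist and be readable by the current user; URI locations (mongodb://, jdbc:) are passed through unchecked, and whether AEM is actually stopped still has to be confirmed by hand. All function names are illustrative.

```shell
#!/bin/sh
# is_uri <location> -> succeeds for mongodb:// or jdbc: locations
is_uri() {
    case "$1" in
        mongodb://*|jdbc:*) return 0 ;;
        *) return 1 ;;
    esac
}

# check_location <location> -> succeeds for a URI or a readable directory
check_location() {
    if is_uri "$1"; then
        return 0    # remote stores are not probed in this sketch
    fi
    [ -d "$1" ] && [ -r "$1" ]
}

# preflight <source> <destination>
# Reports the first location that fails the check and returns non-zero.
preflight() {
    for loc in "$1" "$2"; do
        if ! check_location "$loc"; then
            echo "not accessible: $loc" >&2
            return 1
        fi
    done
    echo "locations look accessible; confirm both AEM instances are stopped"
}
```

Call it as preflight <source> <destination> and only continue to the crx2oak command when it succeeds.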
10. Larger Example
java
-Dlogback.configurationFile=logback.xml [logging]
-Xmx4g [memory usage]
-jar crx2oak-1.4.6-standalone.jar [CRX2Oak jar version]
/opt/aem/author60/crx-quickstart/repository [source]
/opt/aem/author62/crx-quickstart/repository [destination]
--include-paths=/content,/etc/cloudservices/testandtarget,/etc/designs/x,/etc/designs/y,/etc/designs/z,/etc/tags,/etc/xyz,/home/users/x,/home/users/y,/home/groups/x,/home/groups/y [include paths]
--exclude-paths=/home/users/x/xyz,/home/groups/x/xyz [exclude paths]
--copy-versions=2016-08-29 [ignore versions before 2016-08-29]
--copy-orphaned-versions=2016-08-29 [ignore orphaned versions before 2016-08-29]
--copy-binaries [copy binaries – don’t just reference old DataStore]
--fail-on-error=true [fail if there’s an error]
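Long path lists are easy to get wrong when typed on one line. A minimal sketch of a wrapper that assembles the same kind of command and only prints it (a dry run); the function names are illustrative, the shortened path lists are placeholders, and every option used is taken from the example above.

```shell
#!/bin/sh
# join_paths <path>... -> the paths joined with commas, no spaces
join_paths() {
    ( IFS=,; echo "$*" )
}

# build_command <jar> <source> <destination> <cutoff date> <includes> <excludes>
# Prints the crx2oak command line; nothing is executed.
build_command() {
    printf 'java -Xmx4g -jar %s %s %s' "$1" "$2" "$3"
    printf ' --include-paths=%s --exclude-paths=%s' "$5" "$6"
    printf ' --copy-versions=%s --copy-orphaned-versions=%s' "$4" "$4"
    printf ' --copy-binaries --fail-on-error=true\n'
}

includes=$(join_paths /content /etc/tags /home/users/x /home/groups/x)
excludes=$(join_paths /home/users/x/xyz /home/groups/x/xyz)

build_command crx2oak-1.4.6-standalone.jar \
    /opt/aem/author60/crx-quickstart/repository \
    /opt/aem/author62/crx-quickstart/repository \
    2016-08-29 "$includes" "$excludes"
```

Review the printed line, then paste it into the shell once both instances are shut down.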
11. Things to look out for
AEM structures can change between versions; notably:
From AEM 6.1 onwards user & group paths are hashed (by default), and then stored under a
path with the first letter of their hash (including admin)
After AEM 6.0, workflow instance paths have an additional server node in their path.
/etc/workflow/instances/serverXYZ/YYYY-MM-DD
From AEM 6.2 onwards, /content/campaign paths introduced a concept of area (default =
master)
Content structures that you have created yourself are the safest to migrate
E.g. /content/xyz
Decide whether you want versions and orphaned versions to come across
If not, then exclude them via “--copy-versions=false” and “--copy-orphaned-versions=false”
(will run faster)
If so, specify appropriate dates so you only copy what you need
Don’t migrate package contents. Deploy packages as normal instead
Best to reindex everything post-migration. Delete indexes or run re-index tool
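The version decision above maps onto two flag values. A minimal sketch of a helper that turns that decision into the options used earlier in this deck; the function name and the "none" keyword are illustrative.

```shell
#!/bin/sh
# version_flags <cutoff>
#   none        -> skip versions and orphaned versions entirely (faster)
#   YYYY-MM-DD  -> copy only versions from that date onwards
version_flags() {
    case "$1" in
        none)
            echo "--copy-versions=false --copy-orphaned-versions=false" ;;
        [0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9])
            echo "--copy-versions=$1 --copy-orphaned-versions=$1" ;;
        *)
            echo "expected 'none' or YYYY-MM-DD, got: $1" >&2
            return 1 ;;
    esac
}

version_flags none
version_flags 2016-08-29
```

The date pattern only checks the shape of the value, not whether it is a real calendar date.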