This document discusses content conversion tools for Plone including FunnelWeb and the transmogrify library. FunnelWeb is a recipe and script that crawls websites, filters content, removes templates, and uploads content to Plone. Transmogrify is a Python library for content conversions that includes blueprints for crawling, extracting content, analyzing sites, and uploading to Plone. The document demonstrates using these tools to convert an existing site into a Plone site in an automated manner.
2. dylan@pretaweb.com Plone Conf 2010 Dylan Jay
Content Conversions suck
Large existing sites
Static html or old CMS
Hard to quote on
Content audit
Use Plone to fix content
Convert Docs to Pages (coming...)
History
2008 - Obrien Intranet
2009 – pretaweb.funnelweb (deprecated)
Plone UI > Actions > Import
2010 – transmogrify.* released on PyPI
2010 – collective.developermanual
Sphinx to Plone
2010 – funnelweb Recipe + Script
Thanks – Dylan Jay, Vitaliy Podoba, Rok Garbas, Mikko Ohtamaa, Tim Knap
Making your own blueprint
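from zope.interface import classProvides, implements
from collective.transmogrifier.interfaces import ISectionBlueprint, ISection

class MyBlueprint(object):
    classProvides(ISectionBlueprint)
    implements(ISection)

    def __init__(self, transmogrifier, name, options, previous):
        # previous is the upstream section; iterating it yields items
        self.previous = previous

    def __iter__(self):
        for item in self.previous:
            dosomethingto(item)  # placeholder for your own processing
            yield item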
<utility component=".myblueprint.MyBlueprint"
         name="transmogrify.myblueprint" />
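The chaining idea behind a blueprint can be tried without Plone installed. The following is only a sketch of the pattern, not collective.transmogrifier itself: the classes Source and TitleFromPath, the function run_pipeline, and the use of a "_path" key on each item are assumptions made for illustration.

```python
class Source(object):
    """Hypothetical first section: emits one dict per path."""

    def __init__(self, previous, paths):
        self.previous = previous
        self.paths = paths

    def __iter__(self):
        # Pass through anything from upstream, then add our own items.
        for item in self.previous:
            yield item
        for path in self.paths:
            yield {"_path": path}


class TitleFromPath(object):
    """Hypothetical section: derives a title from the last path segment."""

    def __init__(self, previous):
        self.previous = previous

    def __iter__(self):
        for item in self.previous:
            name = item["_path"].rstrip("/").split("/")[-1]
            item["title"] = name.replace("-", " ").title()
            yield item


def run_pipeline(paths):
    # Each section wraps the previous one, so items are pulled lazily
    # through the whole chain one at a time.
    source = Source(iter(()), paths)
    return list(TitleFromPath(source))
```

Calling run_pipeline(["/about-us"]) pulls a single item through both sections. Because every section is just an iterator over the previous one, adding a step means wrapping the chain once more.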
transmogrify.webcrawler
transmogrify.webcrawler
Crawls site or cache for content
transmogrify.webcrawler.typerecognitor
Sets Plone content type based on MIME type
transmogrify.webcrawler.cache
Saves content to disk
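To make the type recognition step concrete, here is a rough sketch of mapping a MIME type to a Plone content type. It is not the typerecognitor blueprint's code: the TYPE_MAP table, the recognise function and the "_mimetype" and "_type" keys are assumptions, and the MIME type is guessed from the file extension here, whereas a crawler would more likely take it from the HTTP response.

```python
import mimetypes

# Hypothetical mapping from MIME type to Plone content type.
TYPE_MAP = {
    "text/html": "Document",
    "image/png": "Image",
    "image/jpeg": "Image",
}


def recognise(item):
    # Guess the MIME type from the path's extension (an assumption;
    # a real crawl would have the Content-Type header available).
    mime, _encoding = mimetypes.guess_type(item["_path"])
    item["_mimetype"] = mime
    # Anything we do not recognise is uploaded as a generic File.
    item["_type"] = TYPE_MAP.get(mime, "File")
    return item
```

Falling back to File keeps unknown downloads in the site rather than dropping them, which suits a content audit.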
transmogrify.htmlcontentextractor
transmogrify.htmlcontentextractor
Provide XPath for title, description, text etc.
transmogrify.htmlcontentextractor.auto
Guesses XPaths from content
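The XPath approach to stripping a template can be illustrated with the standard library alone. This is a sketch under assumptions, not the htmlcontentextractor blueprint: the RULES table, the extract function and the sample PAGE are invented, and xml.etree only handles well-formed markup and a subset of XPath, so real-world HTML would need a more forgiving parser.

```python
import xml.etree.ElementTree as ET

# Hypothetical rules: field name -> XPath locating it in the page.
RULES = {
    "title": ".//h1",
    "description": ".//p[@class='summary']",
    "text": ".//div[@id='content']",
}


def extract(html, rules=RULES):
    root = ET.fromstring(html)
    fields = {}
    for field, xpath in rules.items():
        node = root.find(xpath)
        if node is not None:
            # Join all nested text so inline markup does not split it.
            fields[field] = "".join(node.itertext()).strip()
    return fields


PAGE = """<html><body>
<div id="header">Site banner</div>
<h1>About us</h1>
<p class="summary">Who we are.</p>
<div id="content"><p>We build <b>Plone</b> sites.</p></div>
</body></html>"""
```

Running extract(PAGE) keeps the three named fields and leaves the banner behind, which is the sense in which the template is removed.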
transmogrify.siteanalyser
transmogrify.siteanalyser.relinker
Moves and renames items, tidies URLs
transmogrify.siteanalyser.title
Guesses page titles
transmogrify.siteanalyser.defaultpage
Moves index pages into folders
transmogrify.siteanalyser.attach
Moves attachments closer to pages
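As one way to picture moving index pages into folders, the sketch below relocates a page that shares its name with a folder. The rule itself is an assumption chosen for illustration, as are the function move_index_pages and the "_origin" and "_defaultpage" keys; the actual defaultpage blueprint may decide differently.

```python
import posixpath


def move_index_pages(items):
    items = list(items)
    # Folders are inferred from the parent paths of all items.
    folders = {posixpath.dirname(item["_path"]) for item in items}
    for item in items:
        stem, ext = posixpath.splitext(item["_path"])
        if ext and stem in folders:
            # The page shares its name with a folder: move it inside
            # and flag it as that folder's default page.
            item["_origin"] = item["_path"]
            item["_path"] = posixpath.join(stem, "index.html")
            item["_defaultpage"] = True
        yield item
```

Recording the old location on the item is what would let a later step rewrite links or create a redirect for the moved page.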
transmogrify.ploneremote
Remoteconstructor
Adds content to Plone via XML-RPC
Remoteschemaupdater
Updates content of existing object
Remotenavigationexcluder
Hides content not in the original site's navigation
Remoteworkflowupdater
Publishes content
Remoteredirector
Creates aliases for items that have moved
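Adding content over XML-RPC could look roughly like the following. This is a sketch, not the ploneremote code: it assumes the remote folder accepts an invokeFactory(type, id) call, and the helpers split_path, build_create_call and create_remote are hypothetical. Only build_create_call can be tried offline, since it just serialises the request.

```python
import xmlrpc.client


def split_path(path):
    """Split '/about/team' into ('/about', 'team')."""
    parent, _, obj_id = path.rstrip("/").rpartition("/")
    return parent, obj_id


def build_create_call(item):
    """Serialise the XML-RPC request that would create the item."""
    _parent, obj_id = split_path(item["_path"])
    # Assumption: the remote folder accepts invokeFactory(type, id).
    return xmlrpc.client.dumps((item["_type"], obj_id),
                               methodname="invokeFactory")


def create_remote(site_url, item):
    """Send the call to a live site (needs a running Plone to try)."""
    parent, obj_id = split_path(item["_path"])
    folder = xmlrpc.client.ServerProxy(site_url.rstrip("/") + parent)
    return folder.invokeFactory(item["_type"], obj_id)
```

Because the call is addressed to the parent folder's URL, parents have to exist before their children are created, which is why ordering the items matters.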
Other blueprints
transmogrify.pathsorter
Puts folders before their content and content in the right order
collective.transmogrifier.sections.condition
Useful to drop certain content
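Both ideas on this slide are small enough to sketch in a few lines. The functions sort_by_path and keep_if are hypothetical stand-ins, not the blueprints themselves, and this sketch orders siblings alphabetically, which may differ from the order the real pathsorter keeps.

```python
def sort_by_path(items):
    # Sorting on the tuple of path segments puts every folder ahead of
    # the items inside it, because a prefix sorts before its extensions.
    return sorted(
        items,
        key=lambda item: tuple(item["_path"].strip("/").split("/")),
    )


def keep_if(items, condition):
    # Yield only the items for which the condition holds; the rest are
    # dropped from the pipeline.
    for item in items:
        if condition(item):
            yield item
```

For example, keep_if(sort_by_path(items), lambda item: not item["_path"].startswith("/tmp")) would drop everything under /tmp and hand the rest on with folders first.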
Where to get it
http://github.com:djay/funnelweb.git
http://github.com:djay/transmogrify.*
PyPI release TBA