Metadata Extraction and Content Transformations
Nick Burch, Software Engineer, Alfresco
twitter: @gagravarr
Introduction – 3 Content Related Services
Covering: Uses, Interfaces, Calling the Services, Java & JavaScript APIs, Demos, Extensions
Apache Tika, Metadata Extractor, Content Transformer, Renditions
The Metadata Extractor Service
What, How, Why?
 Document Metadata is converted into the content model
 Typically used with uploaded binary files
 Upload a PDF, extract out the Title and Description, save these as the properties on the Alfresco Node
 Powered internally by a number of different extractors
 Service picks the appropriate extractor for you
 Since Alfresco 3.4, makes heavy use of Apache Tika
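The flow described above (pick an extractor by mime type, then map the document's raw metadata onto content model properties) can be sketched roughly as follows. This is a minimal, language-neutral illustration in Python, not Alfresco's implementation: the extractor functions, the `EXTRACTORS` registry, the `MAPPING` table and the fixed return values are all hypothetical stand-ins.

```python
from typing import Callable, Dict

# Hypothetical extractors: each takes the raw bytes of a document and
# returns that document's raw metadata. Real extractors would parse the
# file; these stubs return fixed values to keep the sketch self-contained.
def extract_pdf(data: bytes) -> Dict[str, str]:
    return {"Title": "Quarterly Report", "Subject": "Q3 figures"}

def extract_html(data: bytes) -> Dict[str, str]:
    return {"title": "Home page"}

# Registry keyed by mime type, so callers never choose an extractor themselves.
EXTRACTORS: Dict[str, Callable[[bytes], Dict[str, str]]] = {
    "application/pdf": extract_pdf,
    "text/html": extract_html,
}

# Illustrative mapping from raw metadata keys to content model property names.
MAPPING: Dict[str, Dict[str, str]] = {
    "application/pdf": {"Title": "cm:title", "Subject": "cm:description"},
    "text/html": {"title": "cm:title"},
}

def extract_properties(mimetype: str, data: bytes) -> Dict[str, str]:
    """Pick the extractor for the mime type and map its output to properties."""
    extractor = EXTRACTORS.get(mimetype)
    if extractor is None:
        return {}  # no extractor registered for this mime type
    raw = extractor(data)
    mapping = MAPPING[mimetype]
    # Keep only the raw keys that have a mapped property name.
    return {mapping[key]: value for key, value in raw.items() if key in mapping}
```

Calling `extract_properties("application/pdf", pdf_bytes)` would give the properties to save on the node, while an unregistered mime type simply yields no properties.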
The Content Transformation Service
 Driven by mime types, source and destination
 Used to generate plain text versions for indexing
 Used to generate SWF versions for preview
 Used to generate PDF versions for web download
 Powered by a large number of different transformers
 Transformers can be linked together, eg .doc -> .pdf via Open Office, then .pdf -> .swf via pdf2swf
 Since Alfresco 3.4, makes heavy use of Apache Tika
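Linking transformers together, as in the .doc -> .pdf -> .swf example above, amounts to finding a path between a source and a destination mime type. The sketch below shows one way this could work, using a breadth-first search over a small registry. It is an assumption-laden illustration, not how Alfresco selects transformers: the `TRANSFORMERS` table and the `find_chain` function are hypothetical, and "PdfToText" is an invented transformer name.

```python
from collections import deque
from typing import Dict, List, Optional, Tuple

# Hypothetical registry: (source mime type, target mime type) -> transformer name.
TRANSFORMERS: Dict[Tuple[str, str], str] = {
    ("application/msword", "application/pdf"): "OpenOffice",
    ("application/pdf", "application/x-shockwave-flash"): "pdf2swf",
    ("application/pdf", "text/plain"): "PdfToText",  # invented name
}

def find_chain(source: str, target: str) -> Optional[List[str]]:
    """Return the shortest list of transformer names from source to target,
    or None when no chain of registered transformers connects them."""
    if source == target:
        return []
    queue = deque([(source, [])])
    seen = {source}
    while queue:
        current, path = queue.popleft()
        for (src, dst), name in TRANSFORMERS.items():
            if src != current or dst in seen:
                continue
            new_path = path + [name]
            if dst == target:
                return new_path
            seen.add(dst)
            queue.append((dst, new_path))
    return None
```

With this registry, asking for `application/msword` to `application/x-shockwave-flash` returns the two-step chain through PDF, and a request with no connecting transformers returns `None`.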
The Rendition Service
 Or can just alter some content as-is
 Used to manipulate images, eg crop and resize
 Used to generate HTML .docx previews in Web Quick Start
 Often uses the Content Transformation Service
 Replaced the Thumbnail Service
 Renditions are actions
Apache Tika
 Grew out of the Lucene community, now widely used
 Provides detection of files – eg this binary blob is really a Word file
 Plain text, HTML and XHTML versions of a wide range of different file formats
 Consistent Metadata from different files
Tika hides the complexity of the different formats, and presents a simple, powerful API
 Easy to use and extend
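The detection idea mentioned above (working out that a binary blob is really a Word file) commonly relies on signature bytes at the start of the content rather than on the file name. The following is a simplified sketch of that idea only. It does not use Tika and is far less thorough than a real detector: the `MAGIC` table and `detect` function are hypothetical, and the OLE2 signature identifies the container shared by legacy Office formats, not Word specifically.

```python
from typing import List, Tuple

# A few well-known leading byte signatures, paired with a short label.
MAGIC: List[Tuple[bytes, str]] = [
    (b"%PDF-", "pdf"),
    # OLE2 compound document: container for legacy .doc, .xls, .ppt and .msg
    (b"\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1", "ole2"),
    # ZIP: container for .docx, OpenDocument and EPUB, among others
    (b"PK\x03\x04", "zip"),
]

def detect(data: bytes) -> str:
    """Return a label for the first signature that the content starts with."""
    for signature, kind in MAGIC:
        if data.startswith(signature):
            return kind
    return "unknown"
```

A file uploaded as `report.txt` whose bytes begin with the OLE2 signature would still be reported as `ole2`. Telling Word apart from Excel inside that container needs a deeper look at the file, which is the kind of complexity a library such as Tika hides.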
Alfresco 3.3 - Supported Formats
File Formats supported out of the box
 Word, PowerPoint, Excel
 HTML
 Open Document Formats (OpenOffice)
 RFC822 Email
 Outlook .msg Email
 DWG (CAD)
Epub
 RSS and ATOM Feeds
 True Type Fonts
 HTML

Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsKotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticscarlostorres15106
 
Pigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food ManufacturingPigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food ManufacturingPigging Solutions
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationSlibray Presentation
 
Transcript: New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024BookNet Canada
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationRidwan Fadjar
 
Build your next Gen AI Breakthrough - April 2024
Build your next Gen AI Breakthrough - April 2024Build your next Gen AI Breakthrough - April 2024
Build your next Gen AI Breakthrough - April 2024Neo4j
 

Último (20)

APIForce Zurich 5 April Automation LPDG
APIForce Zurich 5 April  Automation LPDGAPIForce Zurich 5 April  Automation LPDG
APIForce Zurich 5 April Automation LPDG
 
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
 
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
 
Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024
 
SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024
 
Bluetooth Controlled Car with Arduino.pdf
Bluetooth Controlled Car with Arduino.pdfBluetooth Controlled Car with Arduino.pdf
Bluetooth Controlled Car with Arduino.pdf
 
costume and set research powerpoint presentation
costume and set research powerpoint presentationcostume and set research powerpoint presentation
costume and set research powerpoint presentation
 
New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
 
Benefits Of Flutter Compared To Other Frameworks
Benefits Of Flutter Compared To Other FrameworksBenefits Of Flutter Compared To Other Frameworks
Benefits Of Flutter Compared To Other Frameworks
 
Key Features Of Token Development (1).pptx
Key  Features Of Token  Development (1).pptxKey  Features Of Token  Development (1).pptx
Key Features Of Token Development (1).pptx
 
Making_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptx
Making_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptxMaking_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptx
Making_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptx
 
CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):
 
Science&tech:THE INFORMATION AGE STS.pdf
Science&tech:THE INFORMATION AGE STS.pdfScience&tech:THE INFORMATION AGE STS.pdf
Science&tech:THE INFORMATION AGE STS.pdf
 
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsKotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
 
Pigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food ManufacturingPigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food Manufacturing
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck Presentation
 
Transcript: New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 Presentation
 
Hot Sexy call girls in Panjabi Bagh 🔝 9953056974 🔝 Delhi escort Service
Hot Sexy call girls in Panjabi Bagh 🔝 9953056974 🔝 Delhi escort ServiceHot Sexy call girls in Panjabi Bagh 🔝 9953056974 🔝 Delhi escort Service
Hot Sexy call girls in Panjabi Bagh 🔝 9953056974 🔝 Delhi escort Service
 
Build your next Gen AI Breakthrough - April 2024
Build your next Gen AI Breakthrough - April 2024Build your next Gen AI Breakthrough - April 2024
Build your next Gen AI Breakthrough - April 2024
 

Metadata Extraction and Content Transformation

  • 1. Metadata Extraction and Content Transformations. Nick Burch, Software Engineer, Alfresco. Twitter: @gagravarr
  • 2. Introduction – 3 Content Related Services. Covering: Uses; Interfaces; Calling the Services; Java & JavaScript APIs; Demos; Extensions. Topics: Apache Tika; Metadata Extractor; Content Transformer; Renditions
  • 3.
  • 4. Document Metadata is converted into the content model
  • 5. Typically used with uploaded binary files
  • 6. Upload a PDF, extract out the Title and Description, save these as the properties on the Alfresco Node
  • 7. Powered internally by a number of different extractors
  • 8. Service picks the appropriate extractor for you
  • 9.
  • 10. Driven by mime types, source and destination
  • 11. Used to generate plain text versions for indexing
  • 12. Used to generate SWF versions for preview
  • 13. Used to generate PDF versions for web download
  • 14. Powered by a large number of different transformers
  • 15. Transformers can be linked together, eg .doc -> .pdf via Open Office, then .pdf -> .swf via pdf2swf
  • 16.
  • 17. Or can just alter some content as-is
  • 18. Used to manipulate images, eg crop and resize
  • 19. Used to generate HTML .docx previews in Web Quick Start
  • 20. Often uses the Content Transformation Service
  • 21. Replaced the Thumbnail Service
  • 22.
  • 23. Grew out of the Lucene community, now widely used
  • 24. Provides detection of files – eg this binary blob is really a word file
  • 25. Plain text, HTML and XHTML versions of a wide range of different file formats
  • 26. Consistent Metadata from different files
  • 27. Tika hides the complexity of the different formats, and presents a simple, powerful API
  • 28.
  • 29.
  • 32. Open Document Formats (OpenOffice)
  • 34.
  • 36. Epub
  • 37. RSS and ATOM Feeds
  • 38. True Type Fonts
  • 40. Images – JPEG, GIF, PNG, TIFF, Bitmap (including EXIF where found)
  • 42.
  • 43. Microsoft Office (Binary) – Word, PowerPoint, Excel, Visio, Publisher, Works
  • 44. Microsoft Office (OOXML) – Word, PowerPoint, Excel
  • 45. MP3 (id3 v1 and v2)
  • 47. Open Document Format (Open Office)
  • 48. Old-style Open Office (.sxw etc)
  • 49.
  • 54. Java class files. And I probably forgot one...!
  • 55. Calling Apache Tika
       // Get a content detector, and an auto-selecting Parser
       TikaConfig config = TikaConfig.getDefaultConfig();
       ContainerAwareDetector detector = new ContainerAwareDetector(config.getMimeRepository());
       Parser parser = new AutoDetectParser(detector);
       // We'll only want the plain text contents
       ContentHandler handler = new BodyContentHandler();
       // Tell the parser what we have
       Metadata metadata = new Metadata();
       metadata.set(Metadata.RESOURCE_NAME_KEY, filename);
       // Have it processed
       parser.parse(input, handler, metadata, new ParseContext());
  • 56. Metadata Extractor – Java Use
       MetadataExtracterRegistry registry =
           (MetadataExtracterRegistry) context.getBean("metadataExtracterRegistry");
       MetadataExtracter extractor = registry.getExtracter("application/vnd.ms-excel");
       Map<QName, Serializable> properties = new HashMap<QName, Serializable>();
       ContentReader reader = contentService.getReader(nodeRef, ContentModel.PROP_CONTENT);
       extractor.extract(reader, properties);
       System.err.println(properties);
  • 57. Metadata Extractor – JavaScript Use
       JavaScript:
       var action = actions.create("extract-metadata");
       action.execute(document);
       Full access is not directly available: you can't get at the raw properties. You can, however, trigger extraction and saving to the node easily. Available via an action.
  • 58. Metadata Extractor – Geo Content Model
       <aspect name="cm:geographic">
         <title>Geographic</title>
         <properties>
           <property name="cm:latitude">
             <title>Latitude</title>
             <type>d:double</type>
           </property>
           <property name="cm:longitude">
             <title>Longitude</title>
             <type>d:double</type>
           </property>
         </properties>
       </aspect>
  • 59. Metadata Extractor – Geo Mapping
       # Namespaces
       namespace.prefix.cm=http://www.alfresco.org/model/content/1.0
       # Geo Mappings
       geolat=cm:latitude
       geolong=cm:longitude
       # Normal Mappings
       author=cm:author
       title=cm:title
       description=cm:description
       created=cm:created
  • 60. Demo: Geo Tagged Image in Share
  • 62.
  • 63. PDF to Image
  • 64. PDF to SWF (for preview)
  • 65. Office File Formats to PDF (via Open Office, using JODConverter in Enterprise)
  • 66. Plain Text and XML to PDF
  • 67. Zip listing to Text
  • 68.
  • 69. Content Transformer – Java Use
       ContentTransformerRegistry registry =
           (ContentTransformerRegistry) context.getBean("contentTransformerRegistry");
       ContentTransformer transformer = registry.getTransformer(
           "application/vnd.ms-excel", "text/csv", new TransformationOptions());
       ContentReader reader = contentService.getReader(sourceNodeRef, ContentModel.PROP_CONTENT);
       // The destination needs a writer, not a reader; "true" updates the node's content property
       ContentWriter writer = contentService.getWriter(destNodeRef, ContentModel.PROP_CONTENT, true);
       transformer.transform(reader, writer);
  • 70. Content Transformer – JavaScript Use
       JavaScript:
       var action = actions.create("transform");
       // Transform into the same folder
       action.parameters["destination-folder"] = document.parent;
       action.parameters["assoc-type"] = "{http://www.alfresco.org/model/content/1.0}contains";
       action.parameters["assoc-name"] = document.name + "transformed";
       action.parameters["mime-type"] = "text/html";
       // Execute
       action.execute(document);
       Full access is not directly available: you can't control which property is transformed, it's always Content. You can control where the transformed version goes. Triggering the transformation is easier than in Java. Available via an action.
  • 71. Custom Tika Parsers - Interface
       public interface Parser {
         Set<MediaType> getSupportedTypes(ParseContext context);
         void parse(InputStream stream, ContentHandler handler,
                    Metadata metadata, ParseContext context)
             throws IOException, SAXException, TikaException;
       }
       The Tika Parser interface is quite simple. You need to provide a list of supported mime types, so that auto-detection can work. Accept an input stream, populate the Metadata object, and fire SAX events to the supplied handler. That's it!
  • 72. Custom Tika Parser – Hello World Parser
       public class HelloWorldParser implements Parser {
         public Set<MediaType> getSupportedTypes(ParseContext context) {
           Set<MediaType> types = new HashSet<MediaType>();
           types.add(MediaType.parse("hello/world"));
           return types;
         }
         public void parse(InputStream stream, ContentHandler handler,
                           Metadata metadata, ParseContext context) throws SAXException {
           XHTMLContentHandler xhtml = new XHTMLContentHandler(handler, metadata);
           xhtml.startDocument();
           xhtml.startElement("h1");
           xhtml.characters("Hello, World!");
           xhtml.endElement("h1");
           xhtml.endDocument();
           metadata.set("hello", "world");
           metadata.set("title", "Hello World!");
         }
       }
  • 73. Custom Command Line Transformer
       <bean id="transformer.worker.helloWorldCMD"
             class="org.alfresco.repo.content.transform.RuntimeExecutableContentTransformerWorker">
         <property name="mimetypeService"><ref bean="mimetypeService"/></property>
         <property name="transformCommand">
           <bean class="org.alfresco.util.exec.RuntimeExec">
             <property name="commandsAndArguments"><map>
               <entry key=".*"><list>
                 <value>/bin/bash</value>
                 <value>-c</value>
                 <value>/bin/echo 'Hello World - ${source}' &gt; ${target}</value>
               </list></entry>
             </map></property>
             <property name="errorCodes"><value>1,127</value></property>
           </bean>
         </property>
         <property name="explicitTransformations">
           <list><bean class="org.alfresco.repo.content.transform.ExplictTransformationDetails">
             <property name="sourceMimetype"><value>text/plain</value></property>
             <property name="targetMimetype"><value>hello/world</value></property>
           </bean></list>
         </property>
       </bean>
       <bean id="transformer.helloWorldCMD"
             class="org.alfresco.repo.content.transform.ProxyContentTransformer"
             parent="baseContentTransformer">
         <property name="worker"><ref bean="transformer.worker.helloWorldCMD"/></property>
       </bean>
  • 74.
  • 75. Use our Tika transformer to turn this back into plain text
  • 76.
  • 78.
  • 79. image – crop, resize, etc
  • 80. freemarker – runs a Freemarker Template against the content of the node
  • 81. html – turns .docx files into clean HTML + images
  • 82. xslt – runs a XSLT Transformation against the content of the node, XML content nodes only!
  • 83.
  • 84. Then set all the parameters against it
  • 85. Finally execute it against a node
  • 86. For very complicated / common renditions, you can save the definition to the data dictionary
  • 87. It can then be retrieved and run
  • 88.
  • 89. QName renditionName = QName.createQName(NamespaceService.CONTENT_MODEL_1_0_URI, "myRendDefn");
  • 91. // Make some changes.
  • 94. // Persist the changes.
  • 96. // Run the Rendition
  • 97.
  • 103.
  • 104. They won’t show up in Share when defining Rules, or in Explorer for running a Custom Action
  • 105. Solution – create a JS Script, or some custom Java
  • 106. Use this from your Rule / to run as an Action
  • 107. No dedicated REST API, but Renditions are available through CMIS
  • 108.
  • 109. This delivers lots of flexibility, and means anyone who can write Custom Actions already knows enough to write Custom Rendition Engines!
  • 111.
  • 113. Demo 3: Word .docx -> HTML & Images (Using Web Quick Start)
  • 115. Learn More: wiki.alfresco.com, forums.alfresco.com, blogs.alfresco.com/wp/nickb/, Twitter: @AlfrescoECM, @Gagravarr
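
The Geo Mapping slide shows a properties file that maps raw extractor keys (geolat, geolong, title, ...) onto prefixed content model properties. As a rough illustration of how such a file could be applied, here is a minimal sketch in plain Java. It is not Alfresco's implementation: the class MappingSketch, the method mapToModel, and the "{uri}localName" string form are assumptions made for this example, and real Alfresco code would use QName objects rather than strings.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Properties;

// Hypothetical sketch: applies a mapping file like the one on the Geo Mapping slide.
final class MappingSketch {
    static final String NS_PREFIX_KEY = "namespace.prefix.";

    /** Maps raw metadata keys to "{namespaceUri}localName" keys; unmapped keys are dropped. */
    static Map<String, Object> mapToModel(String mappingText, Map<String, Object> raw) {
        Properties mapping = new Properties();
        try {
            mapping.load(new StringReader(mappingText));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        Map<String, Object> result = new LinkedHashMap<>();
        for (Map.Entry<String, Object> entry : raw.entrySet()) {
            String target = mapping.getProperty(entry.getKey());
            if (target == null) {
                continue; // no mapping for this raw key
            }
            int colon = target.indexOf(':');
            if (colon < 0) {
                continue; // target is not in prefix:localName form
            }
            String uri = mapping.getProperty(NS_PREFIX_KEY + target.substring(0, colon));
            if (uri == null) {
                continue; // prefix was never declared
            }
            result.put("{" + uri + "}" + target.substring(colon + 1), entry.getValue());
        }
        return result;
    }

    public static void main(String[] args) {
        String mappingText = String.join("\n",
            "namespace.prefix.cm=http://www.alfresco.org/model/content/1.0",
            "geolat=cm:latitude",
            "geolong=cm:longitude",
            "title=cm:title");
        Map<String, Object> raw = new LinkedHashMap<>();
        raw.put("geolat", 51.5);      // made-up sample values
        raw.put("geolong", -0.12);
        raw.put("unmapped", "ignored");
        System.out.println(mapToModel(mappingText, raw));
    }
}
```

Dropping unmapped keys mirrors the idea that only properties named in the mapping reach the node; whether the real extractors behave exactly this way is not stated on the slides.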

Editor's notes

  1. Use devconf-files/geotagged.jpg, AKA quickGEO.jpg from projects/repository/source/test-resources/quick/. Upload it into Share. Go to the details page, and show the EXIF and GEO bits.
  2. Create a script with the code shown, also available as devconf-js/contentTransformHellowWorld.js. Create a simple .txt entry in Explorer. Run the script against it. Show that we get a .bin with content type of hello/world. Maybe view the contents of this. Run the script again. Show that we get a new plain text file, with the simple contents.
  3. Use quick.xls or similar from projects/repository/source/test-resources/quick/. Upload it into Explorer. Run a custom action which is a script:
       var nameBase = document.name.substring(0, document.name.lastIndexOf("."));
       var action = actions.create("transform");
       action.parameters["destination-folder"] = document.parent;
       action.parameters["assoc-type"] = "{http://www.alfresco.org/model/content/1.0}contains";
       action.parameters["assoc-name"] = nameBase + ".txt";
       action.parameters["mime-type"] = "text/plain";
       action.execute(document);
       action.parameters["assoc-name"] = nameBase + ".csv";
       action.parameters["mime-type"] = "text/csv";
       action.execute(document);
       action.parameters["assoc-name"] = nameBase + ".html";
       action.parameters["mime-type"] = "text/xml";
       action.execute(document);
  4. In Share, create a rule on a folder to execute the script below. Ensure the folder that the script writes to isn't a child! Upload an image, at least 600x400 big, into the folder. In Share, use the repo browser to get to the special folder, and show the smaller cropped version.
       var renditionDef = renditionService.createRenditionDefinition("Test", "htmlRenderingEngine");
       renditionDef.parameters["destination-path-template"] = "/Company Home/Test/${name}.html";
       renditionDef.execute(document);
       renditionDef.parameters["destination-path-template"] = "/Company Home/Test/BodyOnly/${name}.html";
       renditionDef.parameters["bodyContentsOnly"] = true;
  5. In Share, create a Video folder, and a Video Drop subfolder. On the subfolder, create a rule. Rule is copy + transform. Target mimetype is flash video. Target directory is the parent (Videos). Don't run in the background. Upload devconf-files/Video.mp4. Wait. Hopefully show the thumbnail of the .mp4 (refresh if needed). Go to the videos directory. Show the large preview image of the new .flv image.
  6. Upload a .docx (including images) to a Web Quick Start folder with .docx extraction enabled via Share. View the HTML in Share, to show what it looks like. Flip to the WCM site, and show it with images. BACKUP: Create a script based on devconf-files/renderTest.js. In Share, create a folder with a rule of execute script. Pick the script. Upload devconf-files/word3imgs.docx. Go to Company Home/Test. Show the html. Show the images. (You can't show the two together without WCM, sorry....)
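
The slides mention that transformers can be linked together, eg .doc -> .pdf via Open Office, then .pdf -> .swf via pdf2swf. The idea of finding such a chain from source and destination mime types can be sketched as a breadth-first search over the available direct transformations. This is only an illustration under stated assumptions: the names TransformChainSketch, Step and findChain are hypothetical, and this is not how the Alfresco registry is actually implemented.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: finds a shortest chain of direct transformations by mime type.
final class TransformChainSketch {
    /** One direct transformation, e.g. "pdf2swf" from application/pdf to SWF. */
    static final class Step {
        final String name;
        final String source;
        final String target;

        Step(String name, String source, String target) {
            this.name = name;
            this.source = source;
            this.target = target;
        }
    }

    /** Returns the steps from source to target, or an empty list if none exist (or none are needed). */
    static List<Step> findChain(List<Step> steps, String source, String target) {
        Map<String, Step> reachedBy = new HashMap<>(); // mime type -> step that first reached it
        Deque<String> queue = new ArrayDeque<>();
        queue.add(source);
        while (!queue.isEmpty()) {
            String current = queue.remove();
            if (current.equals(target)) {
                break;
            }
            for (Step step : steps) {
                boolean unseen = !step.target.equals(source) && !reachedBy.containsKey(step.target);
                if (step.source.equals(current) && unseen) {
                    reachedBy.put(step.target, step);
                    queue.add(step.target);
                }
            }
        }
        // Walk back from the target to the source, then reverse into execution order.
        List<Step> chain = new ArrayList<>();
        String at = target;
        while (!at.equals(source)) {
            Step step = reachedBy.get(at);
            if (step == null) {
                return Collections.emptyList(); // target is unreachable
            }
            chain.add(step);
            at = step.source;
        }
        Collections.reverse(chain);
        return chain;
    }

    public static void main(String[] args) {
        List<Step> steps = new ArrayList<>();
        steps.add(new Step("OpenOffice", "application/msword", "application/pdf"));
        steps.add(new Step("pdf2swf", "application/pdf", "application/x-shockwave-flash"));
        for (Step step : findChain(steps, "application/msword", "application/x-shockwave-flash")) {
            System.out.println(step.name + ": " + step.source + " -> " + step.target);
        }
    }
}
```

Breadth-first search is used here simply because it yields the chain with the fewest steps; a real registry might also weigh transformer reliability or speed, which the slides do not cover.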