Bill Kasdorf
VP and Principal Consultant, Apex Content Solutions
Markup, Metadata,
Formats, and Workflows
How Publishing Works in the Digital Era
Part I
Markup & Metadata
Content
Think of content as the
stuff you can see.
Markup
Think of markup as
the engineering that
makes it work
like a well-oiled machine.
Metadata
Think of metadata
as the oil.
Content
Think first about the content,
not about the publication.
That helps you focus on
what things are,
not what they look like.
That leads to adaptable markup
that you can optimize for
print, online, ebooks, or apps.
Content Analysis
What kind of content is this?
Who needs it? Why? (Later, ask “how?”)
What pieces are meaningful?
What chunks are needed for rendering?
What chunks will people want to point to?
How does one chunk relate to other
chunks . . . across all your publications?
The Goal:
THOUGHTFUL CHUNKING
Vocabulary and Markup:
What to name the components
and how to tag them
for editing,
typesetting,
and digital publishing.
It works best if the same vocabulary
(but not necessarily the same markup syntax)
can be used for all of these
phases of your workflow.
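One way to picture "same vocabulary, different syntax" is a small lookup table that carries each component name from an editing style to an XML tag. The sketch below is only an illustration: the style names, the tag names, and the `paragraphs_to_xml` helper are all hypothetical, not part of any particular publisher's workflow.

```python
import xml.etree.ElementTree as ET

# Hypothetical shared vocabulary: one name per component, expressed as a
# paragraph style name during editing and as an XML tag afterwards.
VOCABULARY = {
    "Chapter Number": "chapter-number",
    "Chapter Title": "chapter-title",
    "Author": "author",
    "Subhead 1": "subhead-1",
}

def paragraphs_to_xml(paragraphs):
    """Turn (style name, text) pairs into a <chapter> element."""
    chapter = ET.Element("chapter")
    for style, text in paragraphs:
        # A KeyError here means a style that is outside the vocabulary.
        child = ET.SubElement(chapter, VOCABULARY[style])
        child.text = text
    return chapter

# Invented sample content.
styled = [
    ("Chapter Number", "1"),
    ("Chapter Title", "Markup and Metadata"),
    ("Author", "A. N. Author"),
]
xml_text = ET.tostring(paragraphs_to_xml(styled), encoding="unicode")
print(xml_text)
```

Because the names line up one to one, an unknown style fails loudly instead of slipping through to typesetting.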
Design: Typography and Layout
Typography is really implied “markup.”
Typography distinguishes the components.
Layout is a navigation guide.
This is a centuries-in-the-making
collection of design conventions.
Design is based on semantic distinctions:
What is this thing? How important is it?
How does it relate to
the other things around it?
What do you see
on this page?
“Huge numeral?”
“24 pt Meta, fl rr?”
“11 pt Charter,
letterspaced?”
“Rag right para
indented on left?”
“12 pt Meta Black all
caps, & sm caps?”
“Bold term?”
I don’t think so. . . .
Here’s what we
“see” on this page:
“Chapter number”
“Chapter title”
“Author’s name”
“Introductory
paragraph”
“Level 1 subhead”
“Level 2 subhead”
“Glossary term”
We see structure and
semantics, not specs.
XML
XML enables
the separation of
structure and semantics
from
rendering, presentation.
Here’s one possible
markup scheme:
<CN> </CN>   “Chapter number”
<CT> </CT>   “Chapter title”
<AU> </AU>   “Author’s name”
<INTRO> </INTRO>   “Introductory paragraph”
<H1> </H1>   “Level 1 subhead”
<H2> </H2>   “Level 2 subhead”
<GLOSS> </GLOSS>   “Glossary term”
That’s XML markup.
Those are “tags.”
You don’t have to use XML.
You do need some form of markup, even if in the form of styles,
to distinguish the components.
XML is the most powerful, future-proof markup.
XML
Extensible Markup Language
Extensible:
Designed to adapt to various
• kinds of documents
• modes of publication
• patterns of access and use
XML
Extensible Markup Language
Markup:
Tagging a document
to provide
• structural information
• semantic information
• formatting information
• supplemental information
XML
Extensible Markup Language
Language:
A standard way to express markup.
Not a set of tags or a vocabulary,
but an agreed-upon way to express
a given vocabulary or tag set.
XML
XML liberates your content
from any particular page design,
any particular reading system,
any particular workflow.
Print, app, ebook, and online:
all from the same XML document!
XML is not a set of tags.
It is a LANGUAGE for expressing:
• Semantic information: what the pieces are
• Structural information:
how the pieces fit together
• Metadata: information about the content
• Presentation information, but only where
semantics and structure don’t apply
. . . creating an unlimited number
of presentations from
a single XML document.
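As a rough illustration of several presentations from one document, the sketch below renders the same tagged chapter twice. It assumes the CN/CT/AU/H1 tag names from the earlier slide; the `<chapter>` wrapper, the sample text, and the two rendering functions are invented for the example.

```python
import xml.etree.ElementTree as ET
from html import escape

# Sample chapter using the tag names from the earlier slide; the
# <chapter> wrapper and the text content are invented for illustration.
SOURCE = """<chapter>
  <CN>1</CN>
  <CT>Markup and Metadata</CT>
  <AU>A. N. Author</AU>
  <H1>Content</H1>
</chapter>"""

# Hypothetical mapping from semantic tags to HTML elements.
HTML_TAGS = {"CN": "p", "CT": "h1", "AU": "p", "H1": "h2"}

def to_html(chapter):
    """Render each element as an HTML element, keeping its name as a class."""
    parts = []
    for el in chapter:
        tag = HTML_TAGS[el.tag]
        parts.append(f'<{tag} class="{el.tag}">{escape(el.text)}</{tag}>')
    return "\n".join(parts)

def to_plain_text(chapter):
    """Render the same elements as plain text, upper-casing the title."""
    lines = []
    for el in chapter:
        lines.append(el.text.upper() if el.tag == "CT" else el.text)
    return "\n".join(lines)

chapter = ET.fromstring(SOURCE)
html_view = to_html(chapter)
text_view = to_plain_text(chapter)
```

Neither function cares what the page looks like in the other output; both read only what each piece is.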
So where do the tags
come from?
Surely you don’t
just make them up.
Wasn’t the whole point
to make the tagging
clear, consistent,
and non-proprietary?
Well, technically,
you can just make them up.
But then only you know what they mean.
As long as you follow the XML rules,
it’s called “well-formed” XML.
It’s better to have a formal specification
(a DTD or other schema), and if your XML
also conforms to that, it’s called
“valid” XML (which is also well-formed).
That lets any XML-based system
interpret and use your markup.
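The difference between well-formed and valid can be sketched in a few lines. Python's standard library checks well-formedness but does not validate against a DTD, so the `TOY_MODEL` dictionary below is only a hand-made stand-in for a real DTD or schema, and `conforms_to_model` is a hypothetical helper rather than true validation.

```python
import xml.etree.ElementTree as ET

def is_well_formed(xml_text):
    """True if the text follows the XML syntax rules (tags closed and nested)."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False

# Stand-in for a DTD: which child elements each element may contain.
# This toy model is an assumption for illustration only; real validation
# would use a DTD or schema with a validating parser.
TOY_MODEL = {
    "chapter": {"CN", "CT", "AU", "H1"},
    "CN": set(), "CT": set(), "AU": set(), "H1": set(),
}

def conforms_to_model(xml_text, model=TOY_MODEL):
    """True if the text is well-formed and uses only the modelled tags."""
    if not is_well_formed(xml_text):
        return False
    root = ET.fromstring(xml_text)
    for parent in root.iter():
        if parent.tag not in model:
            return False
        if any(child.tag not in model[parent.tag] for child in parent):
            return False
    return True

good = "<chapter><CN>1</CN><CT>Markup</CT></chapter>"
made_up = "<chapter><CN>1</CN><TITLE>Markup</TITLE></chapter>"  # invented tag
broken = "<chapter><CN>1</CT></chapter>"  # end tag does not match start tag
```

Here `made_up` is well-formed but does not conform to the model, while `broken` fails both checks.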
DTD
Document Type Definition
A special formal syntax
used to define a particular
type of document
or set of related documents.
It defines a tag set:
the specific tags and how they’re used.
DTD
Elements are the nouns:
e.g., <title> or <blockquote>.
A chunk of content is surrounded by
a “start tag” and an “end tag”:
e.g., <title>This Publication</title>;
and elements must “nest” properly.
Now systems can tell the chunks apart
and process them appropriately.
DTD
Attributes are the adjectives
that describe the elements:
e.g., <title class="title-page">
vs. <title class="chapter">.
Now they can be distinguished,
processed, and rendered differently.
Unique IDs identify “this specific one,” e.g.,
<section class="chapter" id="ch001">.
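A short sketch of how a system might use these attributes and IDs, assuming the element and attribute names shown on the slide. The `<book>` wrapper and the sample titles are invented.

```python
import xml.etree.ElementTree as ET

# Invented sample document using the element and attribute names from the slide.
DOC = """<book>
  <title class="title-page">This Publication</title>
  <section class="chapter" id="ch001">
    <title class="chapter">Markup</title>
  </section>
  <section class="chapter" id="ch002">
    <title class="chapter">Metadata</title>
  </section>
</book>"""

root = ET.fromstring(DOC)

# Attributes tell apart elements that share a name.
chapter_titles = [t.text for t in root.iter("title") if t.get("class") == "chapter"]

# A unique ID picks out "this specific one".
ch002 = root.find(".//section[@id='ch002']")
ch002_title = ch002.find("title").text
```

The ID lookup is what lets a link, an index entry, or another publication point at exactly one chunk.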
DTD
DTDs can also define metadata:
information about the content.
For example:
• Bibliographic information
• Subject codes
• Author and publisher information
• Technical information
• Rights and usage information
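As a minimal sketch of metadata travelling with the content, the record below keeps a few of these kinds of information in a `<meta>` block beside the body. Every element name and value here is a placeholder, not taken from any particular DTD.

```python
import xml.etree.ElementTree as ET

# Invented record: the element names and values below are placeholders,
# not taken from any particular DTD.
DOC = """<book>
  <meta>
    <identifier>example-0001</identifier>
    <subject scheme="example-scheme">publishing</subject>
    <author>A. N. Author</author>
    <publisher>Example Press</publisher>
    <rights>All rights reserved</rights>
  </meta>
  <body><p>The content itself.</p></body>
</book>"""

def read_metadata(xml_text):
    """Collect the children of <meta> into a dict keyed by element name."""
    meta = ET.fromstring(xml_text).find("meta")
    return {el.tag: el.text for el in meta}

metadata = read_metadata(DOC)
```

A system can read the metadata without touching, or even understanding, the body.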
DTD
DTDs (or other types of schemas)
are often called “models.”
Most publishers’ models today
are based on one of a number of
standard models that are
widely used and well known
in a certain “community.”
Some Standard Models
DocBook
A generic book model, initially developed
for technical books and documentation
TEI, the Text Encoding Initiative
Mainly used for textual research
NLM/JATS/BITS
The model for scholarly journals and books
XHTML
The language of the Web and EPUB,
expressed as XML
These each provide a standard, widely used framework
to which a publisher’s specific vocabulary can be added
to address their needs.
Part II
Workflows
We all know what the stages of the
editorial and production workflow are . . .
Design.
Copyediting.
Typesetting.
Artwork.
Indexing.
Quality Control.
Online/Ebook Creation.
. . . but we need to look deeper
to optimize how they work
in any given organization.
They’re usually done in silos.
Which are hard to see into,
and are starting to break down.
Thinking of these stages
in the traditional way
leads to suboptimization.
In today’s digital ecosystem
we need to deconstruct them
in order to optimize:
Who does what?
At what stage(s) of the workflow?
How to best manage the process?
Who Does What?
Do it in-house?
Outsource it?
Automate it?
You can’t answer these questions properly
without deconstructing the categories.
And the answers differ
from publisher to publisher.
At What Stage(s) of the Workflow?
How do these aspects intersect?
How do you avoid duplication and rework?
How do you get out of “loopy QC”?
Getting the right things right upstream
eliminates a lot of headaches downstream.
How Best to Manage the Process?
Balancing predictability and creativity:
where to be strict, and where to be flexible?
How can systems and standards help?
Buy vs. build vs. wing it?
Your systems, partners, and processes
should make it easy for you to do the right work
and keep you from doing the wrong work.
Let’s deconstruct
two key workflow stages
to see what options there are
for optimizing them.
Copyediting
Editing in Word?
Who cleans up the author’s messy MS files?
Who “normalizes” the styling?
Who designs those styles in the first place?
Who checks all the links to figures,
tables, cross references, notes?
Who actually does the intellectual work?
How do the files get trafficked?
What about version control?
Copyediting
Who cleans up the author’s messy MS files?
Who “normalizes” the styling?
• The copyeditor?
• The project or production editor?
• Dedicated in-house file prep team?
• Outsourced to vendor?
• “Normalizing? What’s that?”
Copyediting
Who designs those styles in the first place?
• They need to be aligned with your XML markup
and easy to use by the copyeditor.
Copyediting
Who checks all the links to figures,
tables, cross references, notes?
• The copyeditor?
• An editorial assistant?
• The editorial vendor?
• The typesetter?
• Software?
Copyediting
Who actually does the intellectual work?
• In-house copyeditor?
• Freelance copyeditor?
• An editorial service?
• Full-service comp vendor?
Copyediting
How do the files get trafficked?
What about version control?
• Email files, named whatever. . . .
• Consistent file naming, FTP, transmittals.
• Digital Asset Management System (DAM).
• Content Management System (CMS).
Typesetting
Who determines the tags or style names?
How do the editing styles translate to comp?
Who does the artwork?
How are figures, tables, etc. placed?
Are links preserved or implemented?
How do the files get trafficked?
What about version control?
Typesetting
Who determines the tags or style names?
• Freelance designer, ad hoc?
• Compositor’s own system?
• Publisher’s system?
• XML?
Typesetting
How do the editing styles translate to comp?
• “They don’t.”
• “The typesetter does it, we don’t know what they do.”
• Word styles imported into InDesign.
• Programmatic transforms to XML.
Typesetting
Who does the artwork?
• “The author. Sorta.”
• “Then we fix it in-house.”
• “We send it to an art studio.”
• “The typesetter fixes it.”
• “We make the author fix it.”
• “It depends. . . .”
Typesetting
How are figures, tables, etc. placed?
• Manually based on callouts marked by copyeditor.
• Automatically from XML in Typefi, 3B2.
Typesetting
Are links preserved or implemented?
• “Nope.”
• “The typesetter adds them.”
• “We put them in when we make the ebook.”
• “Yes, they’re in the XML.”
Typesetting
How do the files get trafficked?
What about version control?
• Email files, named whatever. . . .
• Consistent file naming, FTP, transmittals.
• Digital Asset Management System (DAM).
• Content Management System (CMS).
Sound familiar?
Workflow
Workflow is where it all comes together:
A vocabulary that fits your publications.
Markup that makes your content agile.
Metadata that makes it meaningful.
The standards that make it interoperable.
The technologies that fit your capabilities.
Part III
File Formats and Standards
Publications today are composed of
a multitude of files and formats.
Text Files
Metadata
Image Files
Video and Audio Files
Scripts
Fonts
Stylesheets
Deliverable Products
XML is not the whole story!
Some Common Text File Formats
Microsoft Word
Used for most authoring and editing
TeX/LaTeX
Common for math, statistics, engineering
InDesign
The leading design/page layout format
XML
The foundation of most modern publishing
HTML5
The format of the World Wide Web
Some Common Text File Formats
Microsoft Word
Used for most authoring and editing
Ubiquitous but typically undisciplined
Authors do lots of inconsistent, messy things
Style templates work well for editing
Visually distinct styles for elements,
names align with terms in rest of workflow
Old .doc is “binary”; new .docx is XML (in a ZIP package)
Don’t get excited; this “WordML” is full of messy stuff,
but at least it can be worked with
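To show what "can be worked with" might look like, the sketch below pulls paragraph style identifiers out of a .docx, which is a ZIP package whose main text lives in `word/document.xml`. The in-memory file is a stripped-down stand-in rather than a complete Word document, and the style identifiers and the `styled_paragraphs` helper are hypothetical.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML namespace used inside word/document.xml.
W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"

# A stripped-down stand-in for word/document.xml (a real file has far more
# in it); the style identifiers are hypothetical.
DOCUMENT_XML = f"""<w:document xmlns:w="{W}"><w:body>
<w:p><w:pPr><w:pStyle w:val="ChapterTitle"/></w:pPr><w:r><w:t>Markup</w:t></w:r></w:p>
<w:p><w:pPr><w:pStyle w:val="Subhead1"/></w:pPr><w:r><w:t>Content</w:t></w:r></w:p>
<w:p><w:r><w:t>Unstyled text.</w:t></w:r></w:p>
</w:body></w:document>"""

def styled_paragraphs(docx_bytes):
    """Return (style identifier or None, text) for each paragraph in a .docx."""
    with zipfile.ZipFile(io.BytesIO(docx_bytes)) as package:
        root = ET.fromstring(package.read("word/document.xml"))
    result = []
    for p in root.iter(f"{{{W}}}p"):
        style = p.find(f"{{{W}}}pPr/{{{W}}}pStyle")
        name = style.get(f"{{{W}}}val") if style is not None else None
        text = "".join(t.text or "" for t in p.iter(f"{{{W}}}t"))
        result.append((name, text))
    return result

# Build a tiny in-memory package so the sketch runs without a real file.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as package:
    package.writestr("word/document.xml", DOCUMENT_XML)
paragraphs = styled_paragraphs(buffer.getvalue())
```

Paragraphs that come back with no style are the ones someone still has to "normalize."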
Some Common Text File Formats
TeX/LaTeX
Common for math, statistics, engineering
Very specialized
Encountered only in specific disciplines
Often used for authoring + typesetting
Difficult to convert, so publishers often
treat TeX as an outlier and skip XML
Some Common Text File Formats
InDesign
The leading design/page layout format
Ideal for design-intensive publications
Integrated with Adobe’s full toolset, now cloud-based
Structure: paragraph & character styles
Align vocabulary with rest of workflow
Can import and export XML
This is how Typefi and PShift work;
IDML and EPUB export can be problematic
Some Common Text File Formats
XML
The foundation of most modern publishing
Most flexible, future-proof format
Adapts as technologies change
and new products are developed
Optimal for multi-channel delivery
Same XML file for print, ebook, app, & online,
either directly or with automated transformation
Some Common Text File Formats
HTML5
The format of the World Wide Web
Can be expressed as XML: XHTML5
The HTML “tag set” following XML syntax and rules
HTML5 is structure + semantics
Presentation is via CSS (Cascading Style Sheets)
Basis of Open Web Platform and EPUB 3
OWP is a huge collection of standards
that form the Web ecosystem:
HTML5, CSS3, JavaScript, and many more
Some Common Image Formats
TIFF (.tif or .tiff)
“Tagged Image File Format”
JPEG (.jpg or .jpeg)
“Joint Photographic Experts Group”
GIF (.gif)
“Graphics Interchange Format”
PNG (.png)
“Portable Network Graphics”
SVG (.svg)
“Scalable Vector Graphics”
Some Common Image Formats
TIFF (.tif or .tiff)
“Tagged Image File Format”
Mainly used for photos (continuous tone)
“Raster” or “bitmap” (grid of pixels)
Typically “lossless”: keeps all the image data
Primarily for print
Grayscale or CMYK high-resolution images
File sizes are usually quite large, esp. color images
Some Common Image Formats
JPEG (.jpg or .jpeg)
“Joint Photographic Experts Group”
Also mainly for continuous tone images
“Lossy” compression: can adjust balance of
quality and file size
Primarily for online, ebooks, etc.
Time to “load” is a factor (plus device capacity)
Preserve more data when zooming is needed
Some Common Image Formats
GIF (.gif)
“Graphics Interchange Format”
Mainly for line art (diagrams, flat color)
Small file size: designed for online/digital
Lossless compression
Can be animated: “Animated GIF”
[Editorial comment:
also can be annoying. ;-) ]
Some Common Image Formats
PNG (.png)
“Portable Network Graphics”
Created as open-source successor to GIF
Small file size for line art, flat color; offers excellent
quality, good transparency, lossless compression
Can be used for photos or line art.
Better than JPEG at flat color areas,
but PNG photos are larger files than JPEGs
Some Common Image Formats
SVG (.svg)
“Scalable Vector Graphics”
W3C standard XML-based vector format
Vector math based on Adobe’s PDF/PostScript
Searchable, accessible text
No loss of quality when resized
Sharp on laptop, tablet, phone, and when zoomed, like PDF
Not widely or consistently implemented yet,
but should become a dominant image format
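Because SVG is XML, the words in a figure stay as text that software can read. The sketch below, built on an invented two-label figure, shows one way that searchability could be used; the `svg_labels` helper is hypothetical.

```python
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"

# Invented figure: one shape and two text labels.
FIGURE = f"""<svg xmlns="{SVG_NS}" viewBox="0 0 200 100">
  <rect x="10" y="10" width="80" height="40" fill="none" stroke="black"/>
  <text x="20" y="35">Copyediting</text>
  <text x="120" y="35">Typesetting</text>
</svg>"""

def svg_labels(svg_text):
    """Return the text of every <text> element in the drawing."""
    root = ET.fromstring(svg_text)
    return ["".join(t.itertext()) for t in root.iter(f"{{{SVG_NS}}}text")]

labels = svg_labels(FIGURE)
# A search for a word finds it inside the figure, not just in body text.
found = any("typeset" in label.lower() for label in labels)
```

The same labels in a TIFF or JPEG would be pixels, invisible to search and to screen readers.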
. . . and Some Common Proprietary Formats
AI (.ai)
Adobe Illustrator
PSD (.psd)
Photoshop
EPS (.eps)
Encapsulated PostScript
PPT (.ppt)
PowerPoint
WMF/EMF
Windows Metafile / Enhanced Metafile
These are used in production
but don’t belong in deliverable products.
Audio and Video Formats
HTML5 vs. Proprietary
Best: open formats permitted by HTML5
in the <audio> and <video> elements:
they work natively in browsers & e-readers
Proprietary formats like Flash (.swf) and
QuickTime (.mov, .qt) require plug-ins
Ideal: Formats Recommended by EPUB 3
Audio: MP3 and MP4 AAC LC
Video: H.264 and VP8/WebM
(often both due to browser/reading system inconsistency)
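Supplying both video formats usually means one `<video>` element with two `<source>` children, so each system can pick the one it supports. A minimal sketch follows, with placeholder file names and a hypothetical `video_element` helper; the output uses XML syntax, as XHTML does.

```python
import xml.etree.ElementTree as ET

def video_element(sources):
    """Build a <video> element with one <source> child per (file, MIME type)."""
    video = ET.Element("video", {"controls": "controls"})
    for src, mime in sources:
        ET.SubElement(video, "source", {"src": src, "type": mime})
    # Fallback text for systems that can play neither source.
    fallback = ET.SubElement(video, "p")
    fallback.text = "Video is not supported in this reading system."
    return video

# File names are placeholders.
markup = ET.tostring(
    video_element([("clip.mp4", "video/mp4"), ("clip.webm", "video/webm")]),
    encoding="unicode",
)
print(markup)
```

A system tries the sources in order and plays the first type it can handle.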
Scripts
JavaScript
Fundamental to the Open Web Platform
JavaScript Libraries
“Pre-written” scripts to adapt as needed
Most popular: open-source jQuery
Widgets
Interactive features like quizzes, sliders,
“assessments” in educational content,
graphing data from a table, etc.
Fonts
OpenType
Primary font format for print
WOFF
Primary font format for web
Licensing
Know what rights you’ve got!
Obfuscating and Embedding
Enable ebook to contain the fonts it needs
Unicode Fonts
Character encoding of the Web & XML
Fonts
OpenType
Primary font format for print
WOFF
Primary font format for web
The “legal” fonts in EPUB 3
Reading systems are required to handle both,
but many systems just use their own default fonts now
Many fonts available in both formats
WOFF is a “wrapper” for underlying font data/metrics
Fonts
Licensing
Know what rights you’ve got!
Obfuscating and Embedding
Enable ebook to contain the fonts it needs
Need license to embed font in ebook
Beware “free” fonts! “Open License Fonts” are safe
Need “fallbacks” for embedded fonts
“System fonts” are built into a reading system
“Web fonts” require you to be online: not for ebooks
The CSS lets you default to “serif” or “sans-serif”
Embedded fonts for “special characters”
Math, linguistics, quotes from non-Latin languages
Fonts
Unicode Fonts
Character encoding of the Web & XML
All the characters in XML
are Unicode by definition
This enables unambiguous character specification
Word, InDesign, and XML-based systems
all understand and use Unicode
Use Unicode fonts throughout your workflow!
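A small sketch of what "unambiguous character specification" means in practice: however a character is written in the XML source, it resolves to a single Unicode code point. The sample paragraph is invented for the example.

```python
import unicodedata
import xml.etree.ElementTree as ET

# The same character written three ways: directly, as a hexadecimal
# character reference, and as a decimal character reference.
doc = ET.fromstring("<p>\u03b1 &#x3B1; &#945;</p>")
forms = doc.text.split()

# All three resolve to one code point with one official name.
code_points = {f"U+{ord(ch):04X}" for ch in forms}
name = unicodedata.name(forms[0])
```

Whether the matching glyph then displays is a font question, which is why Unicode fonts matter at every stage.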
Stylesheets
Word
A good “styles library” helps add
structure and semantics
InDesign/Quark
Paragraph styles and character styles
ensure consistency, efficiency
Browsers/Ebooks
CSS (Cascading Style Sheets)
Adapts rendering for context/device
Enables “responsive design”
Deliverable Products
PDF
Preserves look of typeset page
Used for printing, online delivery
Doesn’t “reflow” for different screen sizes
EPUB
International standard format
Non-proprietary, works almost everywhere
Reflowable or fixed layout
KF8
Amazon’s proprietary ebook format
Thanks!
Bill Kasdorf
bkasdorf@apexcovantage.com
+1 734 904 6252
@BillKasdorf

Object-Oriented Programming in Java (Module 1)
 
Intro to Graph Theory
Intro to Graph TheoryIntro to Graph Theory
Intro to Graph Theory
 
NISO/NFAIS Joint Virtual Conference: Connecting the Library to the Wider Worl...
NISO/NFAIS Joint Virtual Conference: Connecting the Library to the Wider Worl...NISO/NFAIS Joint Virtual Conference: Connecting the Library to the Wider Worl...
NISO/NFAIS Joint Virtual Conference: Connecting the Library to the Wider Worl...
 

Último

Guide Complete Set of Residential Architectural Drawings PDF
Guide Complete Set of Residential Architectural Drawings PDFGuide Complete Set of Residential Architectural Drawings PDF
Guide Complete Set of Residential Architectural Drawings PDFChandresh Chudasama
 
Welding Electrode Making Machine By Deccan Dynamics
Welding Electrode Making Machine By Deccan DynamicsWelding Electrode Making Machine By Deccan Dynamics
Welding Electrode Making Machine By Deccan DynamicsIndiaMART InterMESH Limited
 
Effective Strategies for Maximizing Your Profit When Selling Gold Jewelry
Effective Strategies for Maximizing Your Profit When Selling Gold JewelryEffective Strategies for Maximizing Your Profit When Selling Gold Jewelry
Effective Strategies for Maximizing Your Profit When Selling Gold JewelryWhittensFineJewelry1
 
How Generative AI Is Transforming Your Business | Byond Growth Insights | Apr...
How Generative AI Is Transforming Your Business | Byond Growth Insights | Apr...How Generative AI Is Transforming Your Business | Byond Growth Insights | Apr...
How Generative AI Is Transforming Your Business | Byond Growth Insights | Apr...Hector Del Castillo, CPM, CPMM
 
digital marketing , introduction of digital marketing
digital marketing , introduction of digital marketingdigital marketing , introduction of digital marketing
digital marketing , introduction of digital marketingrajputmeenakshi733
 
Entrepreneurship lessons in Philippines
Entrepreneurship lessons in  PhilippinesEntrepreneurship lessons in  Philippines
Entrepreneurship lessons in PhilippinesDavidSamuel525586
 
How To Simplify Your Scheduling with AI Calendarfly The Hassle-Free Online Bo...
How To Simplify Your Scheduling with AI Calendarfly The Hassle-Free Online Bo...How To Simplify Your Scheduling with AI Calendarfly The Hassle-Free Online Bo...
How To Simplify Your Scheduling with AI Calendarfly The Hassle-Free Online Bo...SOFTTECHHUB
 
Introducing the Analogic framework for business planning applications
Introducing the Analogic framework for business planning applicationsIntroducing the Analogic framework for business planning applications
Introducing the Analogic framework for business planning applicationsKnowledgeSeed
 
WSMM Media and Entertainment Feb_March_Final.pdf
WSMM Media and Entertainment Feb_March_Final.pdfWSMM Media and Entertainment Feb_March_Final.pdf
WSMM Media and Entertainment Feb_March_Final.pdfJamesConcepcion7
 
NAB Show Exhibitor List 2024 - Exhibitors Data
NAB Show Exhibitor List 2024 - Exhibitors DataNAB Show Exhibitor List 2024 - Exhibitors Data
NAB Show Exhibitor List 2024 - Exhibitors DataExhibitors Data
 
Lucia Ferretti, Lead Business Designer; Matteo Meschini, Business Designer @T...
Lucia Ferretti, Lead Business Designer; Matteo Meschini, Business Designer @T...Lucia Ferretti, Lead Business Designer; Matteo Meschini, Business Designer @T...
Lucia Ferretti, Lead Business Designer; Matteo Meschini, Business Designer @T...Associazione Digital Days
 
Driving Business Impact for PMs with Jon Harmer
Driving Business Impact for PMs with Jon HarmerDriving Business Impact for PMs with Jon Harmer
Driving Business Impact for PMs with Jon HarmerAggregage
 
Traction part 2 - EOS Model JAX Bridges.
Traction part 2 - EOS Model JAX Bridges.Traction part 2 - EOS Model JAX Bridges.
Traction part 2 - EOS Model JAX Bridges.Anamaria Contreras
 
Church Building Grants To Assist With New Construction, Additions, And Restor...
Church Building Grants To Assist With New Construction, Additions, And Restor...Church Building Grants To Assist With New Construction, Additions, And Restor...
Church Building Grants To Assist With New Construction, Additions, And Restor...Americas Got Grants
 
Healthcare Feb. & Mar. Healthcare Newsletter
Healthcare Feb. & Mar. Healthcare NewsletterHealthcare Feb. & Mar. Healthcare Newsletter
Healthcare Feb. & Mar. Healthcare NewsletterJamesConcepcion7
 
The McKinsey 7S Framework: A Holistic Approach to Harmonizing All Parts of th...
The McKinsey 7S Framework: A Holistic Approach to Harmonizing All Parts of th...The McKinsey 7S Framework: A Holistic Approach to Harmonizing All Parts of th...
The McKinsey 7S Framework: A Holistic Approach to Harmonizing All Parts of th...Operational Excellence Consulting
 
20200128 Ethical by Design - Whitepaper.pdf
20200128 Ethical by Design - Whitepaper.pdf20200128 Ethical by Design - Whitepaper.pdf
20200128 Ethical by Design - Whitepaper.pdfChris Skinner
 
PSCC - Capability Statement Presentation
PSCC - Capability Statement PresentationPSCC - Capability Statement Presentation
PSCC - Capability Statement PresentationAnamaria Contreras
 
Pitch Deck Teardown: Xpanceo's $40M Seed deck
Pitch Deck Teardown: Xpanceo's $40M Seed deckPitch Deck Teardown: Xpanceo's $40M Seed deck
Pitch Deck Teardown: Xpanceo's $40M Seed deckHajeJanKamps
 

Último (20)

Guide Complete Set of Residential Architectural Drawings PDF
Guide Complete Set of Residential Architectural Drawings PDFGuide Complete Set of Residential Architectural Drawings PDF
Guide Complete Set of Residential Architectural Drawings PDF
 
Welding Electrode Making Machine By Deccan Dynamics
Welding Electrode Making Machine By Deccan DynamicsWelding Electrode Making Machine By Deccan Dynamics
Welding Electrode Making Machine By Deccan Dynamics
 
Effective Strategies for Maximizing Your Profit When Selling Gold Jewelry
Effective Strategies for Maximizing Your Profit When Selling Gold JewelryEffective Strategies for Maximizing Your Profit When Selling Gold Jewelry
Effective Strategies for Maximizing Your Profit When Selling Gold Jewelry
 
How Generative AI Is Transforming Your Business | Byond Growth Insights | Apr...
How Generative AI Is Transforming Your Business | Byond Growth Insights | Apr...How Generative AI Is Transforming Your Business | Byond Growth Insights | Apr...
How Generative AI Is Transforming Your Business | Byond Growth Insights | Apr...
 
digital marketing , introduction of digital marketing
digital marketing , introduction of digital marketingdigital marketing , introduction of digital marketing
digital marketing , introduction of digital marketing
 
Entrepreneurship lessons in Philippines
Entrepreneurship lessons in  PhilippinesEntrepreneurship lessons in  Philippines
Entrepreneurship lessons in Philippines
 
How To Simplify Your Scheduling with AI Calendarfly The Hassle-Free Online Bo...
How To Simplify Your Scheduling with AI Calendarfly The Hassle-Free Online Bo...How To Simplify Your Scheduling with AI Calendarfly The Hassle-Free Online Bo...
How To Simplify Your Scheduling with AI Calendarfly The Hassle-Free Online Bo...
 
Introducing the Analogic framework for business planning applications
Introducing the Analogic framework for business planning applicationsIntroducing the Analogic framework for business planning applications
Introducing the Analogic framework for business planning applications
 
WSMM Media and Entertainment Feb_March_Final.pdf
WSMM Media and Entertainment Feb_March_Final.pdfWSMM Media and Entertainment Feb_March_Final.pdf
WSMM Media and Entertainment Feb_March_Final.pdf
 
NAB Show Exhibitor List 2024 - Exhibitors Data
NAB Show Exhibitor List 2024 - Exhibitors DataNAB Show Exhibitor List 2024 - Exhibitors Data
NAB Show Exhibitor List 2024 - Exhibitors Data
 
Lucia Ferretti, Lead Business Designer; Matteo Meschini, Business Designer @T...
Lucia Ferretti, Lead Business Designer; Matteo Meschini, Business Designer @T...Lucia Ferretti, Lead Business Designer; Matteo Meschini, Business Designer @T...
Lucia Ferretti, Lead Business Designer; Matteo Meschini, Business Designer @T...
 
Driving Business Impact for PMs with Jon Harmer
Driving Business Impact for PMs with Jon HarmerDriving Business Impact for PMs with Jon Harmer
Driving Business Impact for PMs with Jon Harmer
 
Traction part 2 - EOS Model JAX Bridges.
Traction part 2 - EOS Model JAX Bridges.Traction part 2 - EOS Model JAX Bridges.
Traction part 2 - EOS Model JAX Bridges.
 
Church Building Grants To Assist With New Construction, Additions, And Restor...
Church Building Grants To Assist With New Construction, Additions, And Restor...Church Building Grants To Assist With New Construction, Additions, And Restor...
Church Building Grants To Assist With New Construction, Additions, And Restor...
 
The Bizz Quiz-E-Summit-E-Cell-IITPatna.pptx
The Bizz Quiz-E-Summit-E-Cell-IITPatna.pptxThe Bizz Quiz-E-Summit-E-Cell-IITPatna.pptx
The Bizz Quiz-E-Summit-E-Cell-IITPatna.pptx
 
Healthcare Feb. & Mar. Healthcare Newsletter
Healthcare Feb. & Mar. Healthcare NewsletterHealthcare Feb. & Mar. Healthcare Newsletter
Healthcare Feb. & Mar. Healthcare Newsletter
 
The McKinsey 7S Framework: A Holistic Approach to Harmonizing All Parts of th...
The McKinsey 7S Framework: A Holistic Approach to Harmonizing All Parts of th...The McKinsey 7S Framework: A Holistic Approach to Harmonizing All Parts of th...
The McKinsey 7S Framework: A Holistic Approach to Harmonizing All Parts of th...
 
20200128 Ethical by Design - Whitepaper.pdf
20200128 Ethical by Design - Whitepaper.pdf20200128 Ethical by Design - Whitepaper.pdf
20200128 Ethical by Design - Whitepaper.pdf
 
PSCC - Capability Statement Presentation
PSCC - Capability Statement PresentationPSCC - Capability Statement Presentation
PSCC - Capability Statement Presentation
 
Pitch Deck Teardown: Xpanceo's $40M Seed deck
Pitch Deck Teardown: Xpanceo's $40M Seed deckPitch Deck Teardown: Xpanceo's $40M Seed deck
Pitch Deck Teardown: Xpanceo's $40M Seed deck
 

Digital Publishing Workflow Optimization

  • 1. Bill Kasdorf, VP and Principal Consultant, Apex Content Solutions. Markup, Metadata, Formats, and Workflows: How Publishing Works in the Digital Era
  • 2. Part I Markup & Metadata
  • 3. Content Think of content as the stuff you can see.
  • 4. Markup Think of markup as the engineering that makes it work like a well-oiled machine.
  • 5. Metadata Think of metadata as the oil.
  • 8. Content Think first about the content, not about the publication. That helps you focus on what things are, not what they look like. That leads to adaptable markup that you can optimize for print, online, ebooks, or apps.
  • 15. Content Analysis What kind of content is this? Who needs it? Why? (Later, ask “how?”) What pieces are meaningful? What chunks are needed for rendering? What chunks will people want to point to? How does one chunk relate to other chunks . . . across all your publications? The Goal: THOUGHTFUL CHUNKING
  • 16. Vocabulary and Markup: What to name the components and how to tag them for editing, typesetting, and digital publishing. It works best if the same vocabulary (but not necessarily the same markup syntax) can be used for all of these phases of your workflow.
  • 17. Design: Typography and Layout Typography is really implied “markup.” Typography distinguishes the components. Layout is a navigation guide. This is a centuries-in-the-making collection of design conventions. Design is based on semantic distinctions: What is this thing? How important is it? How does it relate to the other things around it?
  • 18. What do you see on this page?
  • 19. What do you see on this page? “Huge numeral?” “24 pt Meta, fl rr?” “11 pt Charter, letterspaced?” “Rag right para indented on left?” “12 pt Meta Black all caps, & sm caps?” “Bold term?” I don’t think so. . . .
  • 20. Here’s what we “see” on this page: “Chapter number” “Chapter title” “Author’s name” “Introductory paragraph” “Level 1 subhead” “Level 2 subhead” “Glossary term” We see structure and semantics, not specs.
  • 21. XML XML enables the separation of structure and semantics from rendering, presentation.
  • 23. Here’s one possible markup scheme: <CN> </CN> “Chapter number”; <CT> </CT> “Chapter title”; <AU> </AU> “Author’s name”; <INTRO> </INTRO> “Introductory paragraph”; <H1> </H1> “Level 1 subhead”; <H2> </H2> “Level 2 subhead”; <GLOSS> </GLOSS> “Glossary term.” That’s XML markup. Those are “tags.” You don’t have to use XML. You do need some form of markup, even if in the form of styles, to distinguish the components. XML is the most powerful, future-proof markup.
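To make the tag scheme on this slide concrete, here is a minimal Python sketch that parses a small chapter opener tagged with those names and reports what each piece is. The sample text, the `<chapter>` and `<p>` wrapper elements, and the `describe` helper are assumptions added for illustration; they are not from the deck.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample content; only the tag names CN, CT, AU, INTRO, H1, H2,
# and GLOSS come from the slide. <chapter> and <p> are assumed wrappers.
SAMPLE = """<chapter>
  <CN>1</CN>
  <CT>Markup and Metadata</CT>
  <AU>A. N. Author</AU>
  <INTRO>Think first about the content, not about the publication.</INTRO>
  <H1>Content Analysis</H1>
  <H2>Thoughtful Chunking</H2>
  <p>A <GLOSS>chunk</GLOSS> is a piece of content worth naming.</p>
</chapter>"""

# What each tag means, as listed on the slide.
LABELS = {
    "CN": "Chapter number",
    "CT": "Chapter title",
    "AU": "Author's name",
    "INTRO": "Introductory paragraph",
    "H1": "Level 1 subhead",
    "H2": "Level 2 subhead",
    "GLOSS": "Glossary term",
}


def describe(xml_text):
    """Return (meaning, text) pairs for every labeled element, in document order."""
    root = ET.fromstring(xml_text)
    return [(LABELS[el.tag], el.text) for el in root.iter() if el.tag in LABELS]


for meaning, text in describe(SAMPLE):
    print(f"{meaning}: {text}")
```

Nothing in the markup says what font or size to use; a program can still tell the chapter title from the glossary term, which is the point the slide makes.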
  • 25. XML Extensible Markup Language Extensible: Designed to adapt to various • kinds of documents • modes of publication • patterns of access and use
  • 27. XML Extensible Markup Language Language: A standard way to express markup. Not a set of tags or a vocabulary, but an agreed-upon way to express a given vocabulary or tag set.
  • 28. XML XML liberates your content from any particular page design, any particular reading system, any particular workflow. Print, app, ebook, and online: all from the same XML document!
  • 34. XML is not a set of tags. It is a LANGUAGE for expressing: • Semantic information: what the pieces are • Structural information: how the pieces fit together • Metadata: information about the content • Presentation information, but only where semantics and structure don’t apply . . . creating an unlimited number of presentations from a single XML document.
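The idea of many presentations from one XML document can be sketched in a few lines of Python. The source string, the tag-to-HTML mapping, and both rendering functions below are hypothetical and greatly simplified; production workflows typically use XSLT or a composition system rather than hand-written code like this.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical single-source document using a few of the deck's example tags.
SOURCE = "<chapter><CT>Workflows</CT><H1>Copyediting</H1><p>Who does what?</p></chapter>"

# Assumed mapping from semantic tags to HTML elements for an online rendition.
HTML_MAP = {"CT": "h1", "H1": "h2", "p": "p"}


def to_html(xml_text):
    """Render the chapter as an HTML fragment."""
    root = ET.fromstring(xml_text)
    parts = []
    for el in root:
        tag = HTML_MAP[el.tag]
        parts.append(f"<{tag}>{escape(el.text or '')}</{tag}>")
    return "".join(parts)


def to_plain_text(xml_text):
    """Render the same chapter as plain text, with the chapter title in capitals."""
    root = ET.fromstring(xml_text)
    lines = []
    for el in root:
        text = el.text or ""
        lines.append(text.upper() if el.tag == "CT" else text)
    return "\n".join(lines)


print(to_html(SOURCE))
print(to_plain_text(SOURCE))
```

Both outputs come from the same source; only the rendering rules differ, so a design change never touches the content itself.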
  • 35. So where do the tags come from? Surely you don’t just make them up. Wasn’t the whole point to make the tagging clear, consistent, and non-proprietary?
  • 37. Well, technically, you can just make them up. But then only you know what they mean. As long as you follow the XML rules, it’s called “well-formed” XML. It’s better to have a formal specification (a DTD or other schema), and if your XML also conforms to that, it’s called “valid” XML (which is also well-formed). That lets any XML-based system interpret and use your markup.
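The difference between well-formed and valid can be sketched with Python's standard library. Well-formedness is checked by the parser itself. For validity, the sketch below uses a small hand-written content model as a stand-in for a DTD; that model, and the function names, are assumptions for illustration. A real DTD or schema validator also checks element order, occurrence counts, and attributes.

```python
import xml.etree.ElementTree as ET

# Hypothetical content model standing in for a DTD:
# each known tag maps to the set of child tags it may contain.
MODEL = {
    "chapter": {"CT", "AU", "p"},
    "CT": set(),
    "AU": set(),
    "p": {"GLOSS"},
    "GLOSS": set(),
}


def is_well_formed(xml_text):
    """True if the text follows the XML syntax rules (tags close and nest properly)."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


def is_valid(xml_text, model=MODEL):
    """True if the text is well-formed AND uses only tags and nesting the model allows."""
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError:
        return False
    for parent in root.iter():
        if parent.tag not in model:
            return False
        if any(child.tag not in model[parent.tag] for child in parent):
            return False
    return True


good = "<chapter><CT>Title</CT><p>A <GLOSS>term</GLOSS>.</p></chapter>"
made_up = "<chapter><madeup>text</madeup></chapter>"   # well-formed, but not in the model
broken = "<chapter><CT>Title</chapter></CT>"           # tags do not nest properly

for sample in (good, made_up, broken):
    print(is_well_formed(sample), is_valid(sample))
```

The made-up tag passes the first check and fails the second, which matches the slide: you can invent tags and stay well-formed, but only a shared model lets other systems interpret them.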
  • 38. DTD Document Type Definition A special formal syntax used to define a particular type of document or set of related documents. It defines a tag set: the specific tags and how they’re used.
  • 39. DTD Elements are the nouns: e.g., <title> or <blockquote>. A chunk of content is surrounded by a “start tag” and an “end tag”: e.g., <title>This Publication</title>; and elements must “nest” properly. Now systems can tell the chunks apart and process them appropriately.
  • 40. DTD Attributes are the adjectives that describe the elements: e.g., <title class="title-page"> vs. <title class="chapter">. Now they can be distinguished, processed, and rendered differently. Unique IDs identify “this specific one,” e.g., <section class="chapter" id="ch001">.
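As a rough illustration of how attributes and IDs let a system tell similar elements apart, the following Python sketch selects titles by their `class` attribute and looks up one specific section by its `id`. The sample document and helper names are assumptions; the element and attribute names follow the examples on the slide.

```python
import xml.etree.ElementTree as ET

# Hypothetical document built from the slide's examples.
DOC = """<book>
  <title class="title-page">This Publication</title>
  <section class="chapter" id="ch001"><title class="chapter">Markup</title></section>
  <section class="chapter" id="ch002"><title class="chapter">Workflow</title></section>
</book>"""

root = ET.fromstring(DOC)


def titles_with_class(tree_root, cls):
    """Return the text of every <title> whose class attribute equals cls."""
    return [t.text for t in tree_root.iter("title") if t.get("class") == cls]


def find_by_id(tree_root, element_id):
    """Return the one element carrying this id, or None if there is none."""
    return tree_root.find(f".//*[@id='{element_id}']")


print(titles_with_class(root, "title-page"))
print(titles_with_class(root, "chapter"))
print(find_by_id(root, "ch002").find("title").text)
```

The same `<title>` element can now be rendered one way on the title page and another way at a chapter opening, and a link can point at exactly one chapter.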
  • 41. DTD DTDs can also define metadata: information about the content. For example: • Bibliographic information • Subject codes • Author and publisher information • Technical information • Rights and usage information
  • 42. DTD DTDs (or other types of schemas) are often called “models.” Most publishers’ models today are based on one of a number of standard models that are widely used and well known in a certain “community.”
  • 44. Some Standard Models DocBook A generic book model, initially developed for technical books and documentation TEI, the Text Encoding Initiative Mainly used for textual research NLM/JATS/BITS The model for scholarly journals and books XHTML The language of the Web and EPUB, expressed as XML These each provide a standard, widely used framework to which a publisher’s specific vocabulary can be added to address their needs.
  • 46. We all know what the stages of the editorial and production workflow are . . . Design. Copyediting. Typesetting. Artwork. Indexing. Quality Control. Online/Ebook Creation. . . . but we need to look deeper to optimize how they work in any given organization.
  • 47. They’re usually done in silos. Which are hard to see into, and are starting to break down.
  • 48. Thinking of these stages in the traditional way leads to suboptimization. In today’s digital ecosystem we need to deconstruct them in order to optimize: Who does what? At what stage(s) of the workflow? How to best manage the process?
  • 49. Who Does What? Do it in-house? Outsource it? Automate it? You can’t answer these questions properly without deconstructing the categories. And the answers differ from publisher to publisher.
  • 50. At What Stage(s) of the Workflow? How do these aspects intersect? How do you avoid duplication and rework? How do you get out of “loopy QC”? Getting the right things right upstream eliminates a lot of headaches downstream.
  • 51. How Best to Manage the Process? Balancing predictability and creativity: where to be strict, and where to be flexible? How can systems and standards help? Buy vs. build vs. wing it? Your systems, partners, and processes should make it easy for you to do the right work and keep you from doing the wrong work.
  • 52. Let’s deconstruct two key workflow stages to see what options there are for optimizing them.
  • 53. Copyediting Editing in Word? Who cleans up the author’s messy MS files? Who “normalizes” the styling? Who designs those styles in the first place? Who checks all the links to figures, tables, cross references, notes? Who actually does the intellectual work? How do the files get trafficked? What about version control?
  • 54. Copyediting Who cleans up the author’s messy MS files? Who “normalizes” the styling? The copyeditor? The project or production editor? Dedicated in-house file prep team? Outsourced to vendor? “Normalizing? What’s that?”
  • 55. Copyediting Who designs those styles in the first place? They need to be aligned with your XML markup and easy for the copyeditor to use.
  • 56. Copyediting Who checks all the links to figures, tables, cross references, notes? The copyeditor? An editorial assistant? The editorial vendor? The typesetter? Software?
  • 57. Copyediting Who actually does the intellectual work? In-house copyeditor? Freelance copyeditor? An editorial service? Full-service comp vendor?
  • 58. Copyediting How do the files get trafficked? What about version control? Email files, named whatever. . . . Consistent file naming, FTP, transmittals. Digital Asset Management System (DAM). Content Management System (CMS).
  • 59. Typesetting Who determines the tags or style names? How do the editing styles translate to comp? Who does the artwork? How are figures, tables, etc. placed? Are links preserved or implemented? How do the files get trafficked? What about version control?
  • 60. Typesetting Who determines the tags or style names? Freelance designer, ad hoc? Compositor’s own system? Publisher’s system? XML?
  • 61. Typesetting How do the editing styles translate to comp? “They don’t.” “The typesetter does it, we don’t know what they do.” Word styles imported into InDesign. Programmatic transforms to XML.
  • 62. Typesetting Who does the artwork? “The author. Sorta.” “Then we fix it in-house.” “We send it to an art studio.” “The typesetter fixes it.” “We make the author fix it.” “It depends. . . .”
  • 63. Typesetting How are figures, tables, etc. placed? Manually based on callouts marked by copyeditor. Automatically from XML in Typefi, 3B2.
  • 64. Typesetting Are links preserved or implemented? “Nope.” “The typesetter adds them.” “We put them in when we make the ebook.” “Yes, they’re in the XML.”
  • 65. Typesetting How do the files get trafficked? What about version control? Email files, named whatever. . . . Consistent file naming, FTP, transmittals. Digital Asset Management System (DAM). Content Management System (CMS). Sound familiar?
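One of the options named in the typesetting slides is a programmatic transform from editing styles to XML. A minimal Python sketch of that idea follows, assuming the edited manuscript has already been reduced to (style name, text) pairs. The style names, the mapping, and the `<chapter>` wrapper are hypothetical; the sketch reports unmapped styles instead of guessing, since those are the paragraphs someone still has to normalize.

```python
import xml.etree.ElementTree as ET

# Hypothetical mapping from editing style names to XML tags.
STYLE_TO_TAG = {
    "Chapter Number": "CN",
    "Chapter Title": "CT",
    "Author": "AU",
    "Body Text": "p",
}


def styled_paragraphs_to_xml(paragraphs):
    """Convert (style, text) pairs to XML; return the XML and any unmapped styles."""
    chapter = ET.Element("chapter")
    unmapped = []
    for style, text in paragraphs:
        tag = STYLE_TO_TAG.get(style)
        if tag is None:
            # No agreed tag for this style: flag it rather than invent markup.
            unmapped.append(style)
            continue
        ET.SubElement(chapter, tag).text = text
    return ET.tostring(chapter, encoding="unicode"), unmapped


manuscript = [
    ("Chapter Number", "1"),
    ("Chapter Title", "Markup"),
    ("Normal", "stray text in an unnormalized style"),
]
xml_text, problems = styled_paragraphs_to_xml(manuscript)
print(xml_text)
print(problems)
```

This only works when the style vocabulary used in editing lines up with the tag vocabulary used downstream, which is why the earlier slides stress using the same vocabulary across the workflow.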
  • 66. Workflow Workflow is where it all comes together: A vocabulary that fits your publications. Markup that makes your content agile. Metadata that makes it meaningful. The standards that make it interoperable. The technologies that fit your capabilities.
  • 67. Part III File Formats and Standards
  • 68. Publications today are composed of a multitude of files and formats. Text Files Metadata Image Files Video and Audio Files Scripts Fonts Stylesheets Deliverable Products XML is not the whole story!
  • 69. Some Common Text File Formats Microsoft Word Used for most authoring and editing TeX/LaTeX Common for math, statistics, engineering InDesign The leading design/page layout format XML The foundation of most modern publishing HTML5 The format of the World Wide Web
  • 70. Some Common Text File Formats: Microsoft Word. Ubiquitous but typically undisciplined: authors do lots of inconsistent, messy things. Style templates work well for editing: visually distinct styles for elements, with names that align with terms in the rest of the workflow. Old .doc is “binary”; new .docx is XML: don’t get excited; this “WordML” is full of messy stuff, but at least it can be worked with.
  • 71. Some Common Text File Formats: TeX/LaTeX. Very specialized: encountered only in specific disciplines. Often used for authoring + typesetting: difficult to convert, so publishers often treat TeX as an outlier and skip XML.
  • 72. Some Common Text File Formats: InDesign. Ideal for design-intensive publications: integrated with Adobe’s full toolset, now cloud-based. Structure: paragraph & character styles; align vocabulary with rest of workflow. Can import and export XML: this is how Typefi and PShift work; IDML and EPUB export can be problematic.
  • 73. Some Common Text File Formats: XML. Most flexible, future-proof format: adapts as technologies change and new products are developed. Optimal for multi-channel delivery: same XML file for print, ebook, app, & online, either directly or with automated transformation.
  • 74. Some Common Text File Formats: HTML5. Can be expressed as XML (XHTML5): the HTML “tag set” following XML syntax and rules. HTML5 is structure + semantics: presentation is via CSS (Cascading Style Sheets). Basis of Open Web Platform and EPUB 3: OWP is a huge collection of standards that form the Web ecosystem: HTML5, CSS3, JavaScript, and many more.
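The Word slide in this group notes that .docx is XML. A .docx file is a ZIP package whose main text lives in `word/document.xml`, so the paragraph style names can be read with standard tools. The Python sketch below builds a stripped-down package in memory so the example is self-contained; that package is an assumption for illustration and is not a complete file that Word would open. The style name `ChapterTitle` is likewise hypothetical.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# Namespace used by WordprocessingML elements such as w:p and w:pStyle.
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def paragraph_styles(docx_bytes):
    """Return the style name applied to each paragraph (None where no style is set)."""
    with zipfile.ZipFile(io.BytesIO(docx_bytes)) as package:
        root = ET.fromstring(package.read("word/document.xml"))
    styles = []
    for para in root.iter(f"{{{W_NS}}}p"):
        style = para.find(f"{{{W_NS}}}pPr/{{{W_NS}}}pStyle")
        styles.append(style.get(f"{{{W_NS}}}val") if style is not None else None)
    return styles


def make_minimal_package():
    """Build an in-memory ZIP holding only word/document.xml (illustration only)."""
    document = (
        f'<w:document xmlns:w="{W_NS}"><w:body>'
        '<w:p><w:pPr><w:pStyle w:val="ChapterTitle"/></w:pPr>'
        '<w:r><w:t>Markup</w:t></w:r></w:p>'
        '<w:p><w:r><w:t>Unstyled paragraph</w:t></w:r></w:p>'
        '</w:body></w:document>'
    )
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as package:
        package.writestr("word/document.xml", document)
    return buffer.getvalue()


print(paragraph_styles(make_minimal_package()))
```

A list like this is a quick way to see how consistently a manuscript was styled before it moves on to typesetting or conversion.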
  • 75. Some Common Image Formats TIFF (.tif or .tiff) “Tagged Image File Format” JPEG (.jpg or .jpeg) “Joint Photographic Experts Group” GIF (.gif) “Graphics Interchange Format” PNG (.png) “Portable Network Graphics” SVG (.svg) “Scalable Vector Graphics”
  • 76. Some Common Image Formats TIFF (.tif or .tiff) “Tagged Image File Format” JPEG (.jpg or .jpeg) “Joint Photographic Experts Group” GIF (.gif) “Graphics Interchange Format” PNG (.png) “Portable Network Graphics” SVG (.svg) “Scalable Vector Graphics” Mainly used for photos (continuous tone) “Raster”or“bitmap”(gridofpixels) Typically“lossless”:keepsalltheimagedata Primarily for print Grayscale or CMYK high-resolution images File sizes are usually quite large, esp. color images
  • 77. Some Common Image Formats TIFF (.tif or .tiff) “Tagged Image File Format” JPEG (.jpg or .jpeg) “Joint Photographic Experts Group” GIF (.gif) “Graphics Interchange Format” PNG (.png) “Portable Network Graphics” SVG (.svg) “Scalable Vector Graphics” Alsomainlyforcontinuoustoneimages “Lossy”compression:canadjustbalanceof qualityandfilesize Primarily for online, ebooks, etc. Time to “load” is a factor (plus device capacity) Preserve more data when zooming is needed
  • 78. Some Common Image Formats TIFF (.tif or .tiff) “Tagged Image File Format” JPEG (.jpg or .jpeg) “Joint Photographic Experts Group” GIF (.gif) “Graphics Interchange Format” PNG (.png) “Portable Network Graphics” SVG (.svg) “Scalable Vector Graphics” Mainly for line art(diagrams, flat color) Smallfilesize:designedforonline/digital Losslesscompression Can be animated: “Animated GIF” [Editorial comment: also can be annoying. ;-) ]
  • 79. Some Common Image Formats TIFF (.tif or .tiff) “Tagged Image File Format” JPEG (.jpg or .jpeg) “Joint Photographic Experts Group” GIF (.gif) “Graphics Interchange Format” PNG (.png) “Portable Network Graphics” SVG (.svg) “Scalable Vector Graphics” Created as open-source successor to GIF Smallfilesizeforlineart,flatcolor;offersexcellent quality,goodtransparency,losslesscompression Can be used for photos or line art. Better than JPEG at flat color areas, but PNG photos are larger files than JPEGs
  • 80. Some Common Image Formats TIFF (.tif or .tiff) “Tagged Image File Format” JPEG (.jpg or .jpeg) “Joint Photographic Experts Group” GIF (.gif) “Graphics Interchange Format” PNG (.png) “Portable Network Graphics” SVG (.svg) “Scalable Vector Graphics” W3C standard XML-based vector format Vector math based on Adobe’s PDF/PostScript Searchable, accessible text No loss of quality when resized Sharp on laptop, tablet, phone, and when zoomed (like PDF) Not widely or consistently implemented yet, but should become a dominant image format
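Because SVG is XML, its text stays real text that ordinary XML tools can search, rather than pixels. The following is a minimal sketch in Python, using only the standard library, of building a small SVG and then finding its label again. The function names, the label, and the dimensions are illustrative assumptions, not part of the original slides.

```python
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"


def make_label_svg(label, width=200, height=60):
    """Build a tiny SVG: an outlined rectangle with a text label inside."""
    # Register SVG as the default namespace so the output has no prefixes.
    ET.register_namespace("", SVG_NS)
    svg = ET.Element(f"{{{SVG_NS}}}svg", {"viewBox": f"0 0 {width} {height}"})
    ET.SubElement(svg, f"{{{SVG_NS}}}rect", {
        "x": "0", "y": "0",
        "width": str(width), "height": str(height),
        "fill": "none", "stroke": "black",
    })
    text = ET.SubElement(svg, f"{{{SVG_NS}}}text", {"x": "10", "y": "35"})
    text.text = label
    return ET.tostring(svg, encoding="unicode")


def find_text(svg_source):
    """Return every text label in an SVG document, in document order."""
    root = ET.fromstring(svg_source)
    return [el.text for el in root.iter(f"{{{SVG_NS}}}text") if el.text]


svg_doc = make_label_svg("Figure 1: Workflow")  # hypothetical label
labels = find_text(svg_doc)
print(labels)
```

The viewBox attribute is what lets the same file scale to any display size without losing quality; the shapes are described as coordinates, not as a fixed grid of pixels.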
  • 81. . . . and Some Common Proprietary Formats AI (.ai) Adobe Illustrator PSD (.psd) Photoshop EPS (.eps) Encapsulated PostScript PPT (.ppt) PowerPoint WMF/EMF Windows Metafile / Enhanced Metafile These are used in production but don’t belong in deliverable products.
  • 82. Audio and Video Formats HTML5 vs. Proprietary Best: open formats permitted by HTML5 in the <audio> and <video> elements: they work natively in browsers & e-readers Proprietary formats like Flash (.swf) and QuickTime (.mov, .qt) require plug-ins Ideal: Formats Recommended by EPUB 3 Audio: MP3 and MP4 AAC LC Video: H.264 and VP8/WebM (often both, due to browser/reading system inconsistency)
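Supplying both video encodings usually means listing two sources inside one <video> element and letting the browser or reading system pick the first one it can play. A minimal sketch in Python that generates such markup follows; the helper name, file paths, and fallback message are hypothetical and not taken from the slides.

```python
from html import escape


def video_element(sources, fallback_text="Video is not supported here."):
    """Return HTML5 <video> markup with one <source> per (path, MIME type)."""
    lines = ['<video controls="controls">']
    for src, mime in sources:
        # escape() also escapes quotes, so the values are safe in attributes.
        lines.append(f'  <source src="{escape(src)}" type="{escape(mime)}"/>')
    # Shown only when the <video> element itself is not supported.
    lines.append(f"  <p>{escape(fallback_text)}</p>")
    lines.append("</video>")
    return "\n".join(lines)


markup = video_element([
    ("video/intro.mp4", "video/mp4"),    # H.264 encoding (hypothetical file)
    ("video/intro.webm", "video/webm"),  # VP8/WebM encoding (hypothetical file)
])
print(markup)
```

The order of the sources matters: a player tries them top to bottom, so the preferred encoding goes first.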
  • 83. Scripts JavaScript Fundamental to the Open Web Platform JavaScript Libraries “Pre-written” scripts to adapt as needed Most popular: open-source jQuery Widgets Interactive features like quizzes, sliders, “assessments” in educational content, graphing data from a table, etc.
  • 84. Fonts OpenType Primary font format for print WOFF Primary font format for web Licensing Know what rights you’ve got! Obfuscating and Embedding Enable ebook to contain the fonts it needs Unicode Fonts Character encoding of the Web & XML
  • 85. Fonts OpenType Primary font format for print WOFF Primary font format for web Licensing Know what rights you’ve got! Obfuscating and Embedding Enable ebook to contain the fonts it needs UNICODE Fonts Encoding aligns with Web and XML The “legal” fonts in EPUB 3 Reading systems required to handle both, but many systems just use their own default fonts now Many fonts available in both formats WOFF is a “wrapper” for underlying font data/metrics
  • 86. Fonts OpenType Primary font format for print WOFF Primary format for web Licensing Know what rights you’ve got! Obfuscating and Embedding Enable ebook to contain the fonts it needs UNICODE Fonts Encoding aligns with Web and XML Need license to embed font in ebook Beware “free” fonts! “Open License Fonts” are safe Need “fallbacks” for embedded fonts “System fonts” are built into a reading system “Web fonts” require you to be online (not for ebooks) The CSS lets you default to “serif” or “sans-serif” Embedded fonts for “special characters” Math, linguistics, quotes from non-Latin languages
  • 87. Fonts OpenType Primary font format for print WOFF Primary format for web Licensing Know what rights you’ve got! Obfuscating and Embedding Enable ebook to contain the fonts it needs Unicode Fonts Character encoding of the Web & XML All the characters in XML are Unicode by definition This enables unambiguous character specification Word, InDesign, and XML-based systems all understand and use Unicode Use Unicode fonts throughout your workflow!
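"Unambiguous character specification" can be made concrete: every character has a code point and an official name, regardless of which font displays it. A minimal sketch in Python's standard library is below; the helper name and the sample characters are illustrative assumptions.

```python
import unicodedata


def describe(text):
    """List each character as (code point, official Unicode name)."""
    return [(f"U+{ord(ch):04X}", unicodedata.name(ch, "UNNAMED")) for ch in text]


precomposed = "\u00e9"   # e-acute stored as a single character
decomposed = "e\u0301"   # plain "e" followed by a combining acute accent

print(describe(precomposed))
print(describe(decomposed))

# The two spellings look identical on screen but are different sequences.
# Normalizing to NFC makes them compare equal.
same_after_nfc = unicodedata.normalize("NFC", decomposed) == precomposed
print(same_after_nfc)
```

Checking code points this way is one practical route to spotting characters that a given font cannot render before they turn into missing-glyph boxes in a deliverable.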
  • 88. Stylesheets Word A good “styles library” helps add structure and semantics InDesign/Quark Paragraph styles and character styles ensure consistency, efficiency Browsers/Ebooks CSS (Cascading Style Sheets) Adapts rendering for context/device Enables “responsive design”
  • 89. Deliverable Products PDF Preserves look of typeset page Used for printing, online delivery Doesn’t “reflow” for different screen sizes EPUB International standard format Non-proprietary, works almost everywhere Reflowable or fixed layout KF8 Amazon’s proprietary ebook format