Better problem solving through scripting: How to think through your
#eprdctn roadblocks and script your way to efficiency
Do you find yourself repeating the same task over and over? Or feeling certain there is a
way to automate a task but it's just outside of your skill set? Kris Coppieters from Rorohiko
has built a career solving just those kinds of problems. Whether it's scripting a solution from
within InDesign, or using AppleScript to finish off some markup, Kris can show you how to
bring high-level thinking to quick and dirty tasks.
Two categories of automation
When it comes to automation, we can classify most of the software used into two large
categories which I'll call tools and systems.
Tools
Tools are smaller programs which are similar to tools used in various crafts, e.g. carpentry.
In carpentry, you might use a hammer, nails, screws, screwdrivers, saws, CNC machines,
drills... A few of these tools are more complex, and have a wide range of functions, but
many tools are simple and have a very specific function. Some are made for one
particular task.
In #eprdctn, the tools used are text editors, search-and-replace, XML editors like Oxygen,
testing tools, reader programs...
Systems
Systems are more complex than simple tools. They take in some raw materials and spit out
finished results at the other end.
A sawmill can be seen as a system. It processes logs and produces planks, poles, boards...
The people working in the sawmill will use machines and tools to make the system work.
A working system can encompass many processes and sub-processes. Some processes are
automated, some are manual processes.
In #eprdctn, we might have a system that takes in raw data and produces ebooks. In
#eprdctn, and in publishing in general, such systems are often called workflows.
Let's talk about tools and jigs
This presentation is all about tools for your craft.
There's a guy called Dan Erlewine on YouTube, who is a luthier. He has many YouTube
movies on guitar repair, and many of those movies are about clever self-made tools he uses
to work on guitars. He calls them 'jigs'.
https://www.youtube.com/results?search_query=dan+erlewine+jig
I like that term, so I want to coin the term 'jig' for #eprdctn custom tools. You've heard it
here first!
The first step is to take notice when you find yourself repeatedly performing a cumbersome,
difficult, or repetitive task. Then ask yourself whether it's possible to create a jig for that.
Basic Tools
When working in #eprdctn, a lot of work involves editing various text files. More often than
not, the editing will aim to affect the structure of the document, rather than the content:
retagging, restructuring tags, managing CSS classes,...
Another common operation will be unpacking and repacking files. For example, EPUB files
are nothing more than glorified .ZIP files.
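Because an EPUB is a ZIP archive, most scripting languages can look inside one with their standard ZIP support. The following is a minimal Python sketch, not a reference to any tool named in these notes; list_epub_entries and the file name book.epub are hypothetical names chosen for illustration.

```python
import zipfile


def list_epub_entries(epub_path):
    """Return the names of all files stored inside an EPUB (a ZIP archive)."""
    with zipfile.ZipFile(epub_path) as archive:
        return archive.namelist()


# Hypothetical usage, assuming book.epub exists in the current folder:
#     for name in list_epub_entries("book.epub"):
#         print(name)
```

A typical EPUB will show a mimetype entry, a META-INF folder, and a content folder holding the XHTML and CSS files.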
Regular Expressions
One of the tools you want in your toolchest is some familiarity with regular expressions. Once
you master regular expressions, you can use search-and-replace for a fairly wide range of
tasks.
Regular expressions can help with tasks that go beyond a simple search-and-replace; for
example, you could use a search-and-replace with regular expressions for restructuring
HTML.
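As a small illustration of restructuring HTML with a regular expression, here is a sketch in Python. It assumes the source marks headings as paragraphs with a class named heading (a made-up class name) and turns them into h2 elements; promote_heading_paragraphs is a hypothetical helper.

```python
import re


def promote_heading_paragraphs(html):
    """Turn <p class="heading">...</p> into <h2>...</h2>."""
    # (.*?) captures the paragraph content; \1 puts it back in the replacement.
    pattern = r'<p class="heading">(.*?)</p>'
    return re.sub(pattern, r'<h2>\1</h2>', html)


sample = '<p class="heading">Intro</p><p>Text</p>'
print(promote_heading_paragraphs(sample))
```

The pattern only matches paragraphs that sit on a single line and carry exactly that class attribute, which is usually good enough for a quick jig but not for arbitrary HTML.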
Regular expressions are not easy to master. They are very cryptic, almost impossible to
read, and they are not standardized.
Regular expressions are also often loosely referred to as 'GREP' which is a reference to the
Unix command line tool which started it all. GREP = Global Regular Expression Print.
Lack of standardization
You might be using a text editor like BBEdit, Notepad++, Sublime Text,... All of these support
regular expressions, but each of them will have its own unique 'dialect'. They'll be 95% the same
between the different text editors, but there are subtle differences.
You might be using InDesign as the source for EPUB documents. InDesign supports regular
expressions in its Find-Change dialog. And these are in a specific InDesign 'dialect' which has
only 80% similarity when compared to the regular expressions used in common text editors.
InDesign has some fairly unique features with regards to regular expressions, features which
you won't find in your text editor.
It does not end there. InDesign supports a scripting language called ExtendScript which is a
form of JavaScript. ExtendScript supports regular expressions, and guess what? They use yet
another dialect of regular expressions, again quite different to the InDesign regular
expressions as seen in the Find-Change dialog.
Then we have the various scripting languages that could be used for tool creation - PHP,
Perl, Python, JavaScript, awk, sed... There are dozens of them. None of them is 'better' -
whatever works, works.
Again, all of them have regular expressions, and each scripting language will have its own
unique dialect.
Understand the basic principle and use the documentation
To make sense of it all, my recommendation is that you must understand the basic ideas of
regular expressions and how they are constructed. These are well supported in all the
dialects.
Once you understand the basic ideas, you need to consult the documentation and/or use
the facilities of the software at hand to determine the proper expressions.
To match a thin space, for example:
InDesign: ~<
Most text editors: \x{2009}
Referring to matched parenthesized sub-expressions in replacement patterns is another
point of difference. For example, sometimes you need to use $1, and sometimes you need to
use \1 to refer to the first parenthesized subexpression.
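To show how one more dialect handles the same two points, here is a sketch using Python's re module. In Python a thin space can be written as \u2009 and the first group is referenced as \1 in the replacement. The helper names and sample strings are made up for illustration.

```python
import re


def thin_to_regular_space(text):
    """Replace every thin space (U+2009) with an ordinary space."""
    return re.sub("\u2009", " ", text)


def swap_name(text):
    """Rewrite 'Last, First' as 'First Last' using group references."""
    # Python uses \1 and \2 in the replacement; $1 would be inserted literally.
    return re.sub(r"(\w+), (\w+)", r"\2 \1", text)


print(thin_to_regular_space("10\u2009km"))
print(swap_name("Coppieters, Kris"))
```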
Text Editors
Another basic tool you need is a good text editor (or a set of them). Steer clear of word processors or
underpowered tools: don't use MS Word, Apple's TextEdit, or Notepad.exe. None of these
are proper text editors.
Word processors often try to be helpful and 'helpfully' change quotes into curly quotes, or
muck with line endings and character encodings, blithely destroying your HTML structure.
Some editors are much more than that. If you can afford it, you want to have Oxygen XML in
your tool chest. This tool can serve as a text editor but it also understands XML, HTML, CSS...
and will allow you to do much smarter editing with much reduced risk.
Editing XML files in a regular text editor works just fine, but you run the risk of damaging
some finely tuned tag balance, and never know it.
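If no XML-aware editor is at hand, a script can at least catch unbalanced tags after an edit. This is a minimal sketch that uses the XML parser from Python's standard library; is_well_formed is a hypothetical helper. It only checks well-formedness, not validity against any schema.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text):
    """Return True if the text parses as XML, False if the parser rejects it."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


print(is_well_formed("<p><b>ok</b></p>"))
print(is_well_formed("<p><b>broken</p></b>"))
```

This parser is stricter than a browser and may also reject things such as named HTML entities, so treat a failure as a prompt to look at the file rather than as proof of damage.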
Some text editors (e.g. BBEdit, Oxygen, Atom...) are smart enough to handle text files inside
ZIP-ped data files (e.g. EPUB files) and you can edit text files inside an EPUB without needing
to 'crack it open'.
Scripting Languages
Another powerful basic tool is a working knowledge of one or more scripting
languages. There are many out there: the most popular ones are probably Python,
JavaScript, PHP, and Perl.
Most of these scripting languages offer the necessary support for complex operations, e.g.
zipping and unzipping, regular expressions, search-and-replace, XML parsing, accessing data
over HTTP or HTTPS connections, connecting to databases...
A scripting language comes in handy when you're faced with a repetitive task that goes
beyond what you can accomplish with find-and-replace and regular expressions. For
example, when there is some 'if-then' logic that needs to be handled, or some processing that
needs to be done.
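As an example of 'if-then' logic that plain search-and-replace cannot express, the sketch below renames CSS classes according to a lookup table and leaves unknown classes alone. The class names in CLASS_MAP are invented, and rename_classes is a hypothetical helper that only handles class attributes holding a single class name.

```python
import re

# Hypothetical mapping from exported class names to house-style names.
CLASS_MAP = {"Basic-Paragraph": "body-text", "Heading-1": "chapter-title"}


def rename_classes(html, class_map):
    """Rename class="..." values found in class_map; keep all others as-is."""
    def rename(match):
        old_name = match.group(2)
        new_name = class_map.get(old_name)
        if new_name is None:
            # Not in the table: return the original text unchanged.
            return match.group(0)
        return match.group(1) + new_name + match.group(3)

    return re.sub(r'(class=")([^"]*)(")', rename, html)


sample = '<p class="Heading-1">A</p><p class="other">B</p>'
print(rename_classes(sample, CLASS_MAP))
```

Passing a function instead of a replacement string lets the script decide, match by match, what to do.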
Tools
EPUB unpack/repack: eCanCrusher
When working with EPUB, if you have access to tools like BBEdit or Oxygen, you can perform
EPUB-wide operations straight on the EPUB file without ever having to
decompress/recompress it.
But sometimes you want to decompress the EPUB, make some changes, then re-compress
it.
There are multiple tools that do this. eCanCrusher is one of them. It works in simple
drag/drop fashion.
https://www.docdataflow.com/ecancrusher/
To decompress: drag/drop an EPUB file onto the eCanCrusher application icon. A
decompressed EPUB folder will appear.
To recompress: drag/drop an EPUB folder onto the eCanCrusher application icon. A
compressed EPUB file will appear.
To configure: double-click the eCanCrusher application icon.
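The same decompress and recompress steps can also be scripted. The sketch below uses only Python's standard zipfile module and is not a description of how eCanCrusher works internally; unpack_epub and repack_epub are hypothetical helpers. It relies on one EPUB packaging rule: the mimetype file must be the first entry in the archive and must be stored without compression.

```python
import os
import zipfile


def unpack_epub(epub_path, folder):
    """Extract every file in the EPUB into the given folder."""
    with zipfile.ZipFile(epub_path) as archive:
        archive.extractall(folder)


def repack_epub(folder, epub_path):
    """Zip a decompressed EPUB folder back into an EPUB file."""
    with zipfile.ZipFile(epub_path, "w") as archive:
        # The mimetype entry goes first and is stored uncompressed.
        archive.write(os.path.join(folder, "mimetype"), "mimetype",
                      compress_type=zipfile.ZIP_STORED)
        for root, _dirs, files in os.walk(folder):
            for name in sorted(files):
                full_path = os.path.join(root, name)
                # Archive names always use forward slashes.
                arc_name = os.path.relpath(full_path, folder).replace(os.sep, "/")
                if arc_name == "mimetype":
                    continue
                archive.write(full_path, arc_name,
                              compress_type=zipfile.ZIP_DEFLATED)
```

Write the output EPUB somewhere outside the folder being zipped, otherwise the half-written archive would be picked up as one of its own entries.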
Custom Scripts
Another set of tools on your toolbelt can be custom scripts written in a variety of scripting
languages.
Languages like Python, PHP, JavaScript/Node.js... can be used to write scripts that process
individual text files (e.g. XHTML, CSS,...) or complete EPUBs.
None of these is particularly better or worse, and switching to a different language than the
one you already know is rarely beneficial.
All of these scripting languages have features to facilitate handling of XML, pattern
matching, and so on.
There are two hurdles to writing scripts: first of all, installing and configuring the software to
use the scripting language is not always straightforward.
Second, writing scripts is not for the faint of heart, but the rewards are tremendous.
Pick one language, get good at it.
Macintosh
On a standard Mac, some common scripting languages are pre-installed (e.g. PHP, Python
2.7). Installing additional languages is straightforward: open the Terminal window
(Applications -> Utilities -> Terminal.app) then invoke the scripting language. For common
scripting languages the Mac will propose to download and install the necessary command
line tools. For example, when I type python3, which is not installed by default, the Mac
offers to fetch and install it.
For node.js (JavaScript) you need to visit and download from https://nodejs.org
Windows
Windows does not have the more common scripting languages pre-installed. There are
many options to download and install these.
One of the many options is Cygwin, which installs a 'Unix-like' environment on Windows and
allows you to use the same command-line tools as Mac and Linux users:
https://cygwin.com
When you run the Cygwin installer, you'll see a window where you can pick-and-choose and
decide which Linux/Unix tools to install.
I find it easiest to install all PHP-related and all Python-related stuff, and things like zip and
unzip.
Rather than try and be selective, I simply search for 'PHP' and/or 'Python' in the package list
and select the whole 'PHP' and 'Python' collection.
To install Node.js you need to download and run an installer from https://nodejs.org
Creating a script
There are many ways to go about this, and I won't even attempt to list all of them.
Instead, I'll be creating a very simple script from scratch and will run it on some XHTML files.
For the sake of argument, my task is to go through an EPUB file and find the style attribute
associated with the <body> tag, and remove it. Instead, I'll move that style attribute into the
CSS file for the body tag.
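Moving the style into the CSS file means first finding out what the attribute contains. The sketch below is one possible way to do that, assuming the attribute is written with double quotes; body_style_to_css is a hypothetical helper that returns a CSS rule you could paste into the stylesheet.

```python
import re


def body_style_to_css(xhtml):
    """Return a CSS rule built from the <body> style attribute, or None."""
    match = re.search(r'<body[^>]*\sstyle="([^"]*)"', xhtml)
    if match is None:
        return None
    # Drop surrounding spaces and a trailing semicolon before re-wrapping.
    declarations = match.group(1).strip().rstrip(";").strip()
    if not declarations:
        return None
    return "body { " + declarations + "; }"


print(body_style_to_css('<body id="x" style="margin:0; padding:0;">'))
```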
The first step is to experiment a bit. I can decompress the EPUB using eCanCrusher, or I
can use an EPUB-aware text editor like BBEdit, Atom, Oxygen XML.
I'll open one of the xhtml files and set out to find the regular expression pattern that works.
Eventually, I came up with:
(<body[^>]*) style="[^"]*"([^>]*>)
to be replaced by
\1\2
or
$1$2
depending on the dialect of GREP your text editor is using.
If we only need to do one EPUB, we could simply do a search-and-replace all.
But we want to do this to many EPUBs.
The next step is to create a script that will take a file name as a command line parameter,
which then reads the file, performs the search-and-replace, and overwrites the file with the
updated file.
I created a file deleteBodyStyle.php which has the following script:
<?php
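// Read the file whose path was passed as the first command-line argument.
$fileContents = file_get_contents($argv[1]);
// Drop the style attribute from the <body> tag, keeping everything else.
$fileContents = preg_replace('/(<body[^>]*) style="[^"]*"([^>]*>)/', '$1$2', $fileContents);
// Overwrite the original file with the updated contents.
file_put_contents($argv[1], $fileContents);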
This script reads the file content of the file at hand (file path is provided as $argv[1]), does
the search-and-replace, and writes out the updated file contents.
I also created the equivalent Python version in a file deleteBodyStyle.py:
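import sys
import re
# Read the file whose path was passed as the first command-line argument.
with open(sys.argv[1], 'r') as inFile:
    data = inFile.read()
# Drop the style attribute from the <body> tag, keeping everything else.
data = re.sub(r'(<body[^>]*) style="[^"]*"([^>]*>)', r'\1\2', data)
# Overwrite the original file with the updated contents.
with open(sys.argv[1], 'w') as outFile:
    outFile.write(data)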
We can now test these scripts. I will be using a sample EPUB made by means of Adobe
InDesign 2020 from an InDesign sample file called Adobe History.indd that came with
InDesign CS3. It is an excerpt from the book Inside the Publishing Revolution: The Adobe
Story by Pamela Pfiffner.
Before looking at DropToScript, I'll first use the scripts in a manual fashion. That's slightly
cumbersome, but we need to go through it to get a good understanding of what is going on.
I crack open the EPUB exported from InDesign, and then I'll use Terminal on Mac (or Cygwin
Terminal on Windows) to execute the scripts from the command line, using drag/drop to
avoid having to type the path to the xhtml files.
cd Desktop
php deleteBodyStyle.php /Users/kris/Desktop/Adobe\ History/OEBPS/Adobe_History-6.xhtml
python deleteBodyStyle.py /Users/kris/Desktop/Adobe\ History/OEBPS/Adobe_History-7.xhtml
To execute either of these scripts on all .xhtml files, we can use some command-line magic.
For example, for the Python script (for the PHP script, substitute php and deleteBodyStyle.php):
dirToScan="/Users/kris/Desktop/Adobe History/OEBPS/"; ls "$dirToScan"*.xhtml | while read fileToScan; do python deleteBodyStyle.py "$fileToScan"; done
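The loop over all .xhtml files can also live inside the script itself, which avoids shell quoting issues with folder names that contain spaces. This is a sketch under a few assumptions: Python 3, files encoded as UTF-8, and all .xhtml files sitting directly in one folder. strip_body_style and process_folder are hypothetical names.

```python
import re
from pathlib import Path

# Same pattern as in the single-file scripts.
BODY_STYLE = re.compile(r'(<body[^>]*) style="[^"]*"([^>]*>)')


def strip_body_style(text):
    """Remove the style attribute from the <body> tag."""
    return BODY_STYLE.sub(r'\1\2', text)


def process_folder(folder):
    """Update every .xhtml file in the folder; return the names that changed."""
    changed = []
    for path in sorted(Path(folder).glob("*.xhtml")):
        original = path.read_text(encoding="utf-8")  # UTF-8 is an assumption
        updated = strip_body_style(original)
        if updated != original:
            path.write_text(updated, encoding="utf-8")
            changed.append(path.name)
    return changed


# Hypothetical usage:
#     print(process_folder("/Users/kris/Desktop/Adobe History/OEBPS"))
```

Files that do not contain the attribute are left untouched, so the returned list doubles as a quick report of what was modified.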
After adjusting the CSS file, we can recompress the EPUB.
I've not done any error handling. It would be cleaner to also add additional checks (e.g.
verify the file name extension of the file being processed) and error checks (e.g. report any
unexpected circumstances), but for most intents and purposes the above script will work
fine.
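For readers who do want those checks, here is one way the Python script might be extended. It is a sketch, assuming Python 3 and UTF-8 files; the messages, exit codes, and the choice of accepted extensions are illustrative choices, not part of the original script.

```python
import os
import re
import sys

BODY_STYLE = re.compile(r'(<body[^>]*) style="[^"]*"([^>]*>)')


def main(argv):
    """Process one file; return 0 on success, non-zero on a problem."""
    if len(argv) != 2:
        print("usage: deleteBodyStyle.py FILE.xhtml", file=sys.stderr)
        return 2
    path = argv[1]
    # Only touch files that look like (X)HTML.
    if not path.lower().endswith((".xhtml", ".html")):
        print("skipped (not an .xhtml/.html file): " + path, file=sys.stderr)
        return 1
    if not os.path.isfile(path):
        print("file not found: " + path, file=sys.stderr)
        return 1
    with open(path, "r", encoding="utf-8") as in_file:
        data = in_file.read()
    # subn also reports how many replacements were made.
    updated, count = BODY_STYLE.subn(r"\1\2", data)
    if count == 0:
        print("no body style attribute found: " + path)
        return 0
    with open(path, "w", encoding="utf-8") as out_file:
        out_file.write(updated)
    print("removed body style attribute: " + path)
    return 0
```

To run it from the command line, add a final line that calls sys.exit(main(sys.argv)).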
DropToScript Script Wrapper
Once you have a script, you'll often find that you're going through the same motions over
and over:
• Decompress EPUB
• Run the script on a bunch of files (e.g. html files or css files).
• Repackage EPUB
DropToScript manages the decompress/repackage part automatically. Once you have a
script (in PHP, Python, Node JavaScript...) that can process a single file at a time, you can
configure DropToScript to automatically perform the same script on many files, simply by
dragging an EPUB or a collection of file icons onto the DropToScript icon.
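To make the decompress, process, repackage cycle concrete, here is a rough Python sketch of the same idea. It is not how DropToScript is implemented; transform_epub is a hypothetical helper that copies an EPUB entry by entry into a new file and applies a text-processing function to the .xhtml entries, assuming they are UTF-8.

```python
import zipfile


def transform_epub(src_path, dst_path, transform, suffix=".xhtml"):
    """Copy an EPUB to dst_path, applying transform() to matching text files."""
    with zipfile.ZipFile(src_path) as src, \
            zipfile.ZipFile(dst_path, "w") as dst:
        # Entries are copied in their original order, so mimetype stays first.
        for info in src.infolist():
            if info.is_dir():
                continue
            data = src.read(info.filename)
            if info.filename.endswith(suffix):
                text = data.decode("utf-8")  # UTF-8 is an assumption
                data = transform(text).encode("utf-8")
            if info.filename == "mimetype":
                compression = zipfile.ZIP_STORED
            else:
                compression = zipfile.ZIP_DEFLATED
            dst.writestr(info.filename, data, compress_type=compression)
```

The source and destination paths must be different files. Any function that takes a string and returns a string can be passed as transform, for example one that removes the body style attribute.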
DropToScript comes bundled with a number of pre-made useful scripts, but you can easily
add your own.
After downloading it, you need to configure it so it can find the PHP or Python installation
on your computer. You do this by double-clicking the icon of the application.
As an example, you can copy either the deleteBodyStyle.php or deleteBodyStyle.py file into
the DropScripts folder and then drag-drop any EPUB onto DropToScript to have
deleteBodyStyle executed on the text files inside the EPUB.
Stuff I use
cd_to (Mac): https://github.com/jbtule/cdto
Cygwin: https://www.cygwin.com/
Atom Text Editor: https://atom.io/
Notepad++ Text Editor: https://notepad-plus-plus.org/
eCanCrusher: https://www.docdataflow.com/ecancrusher/
DropToScript: https://github.com/BCLibCoop/nnels-a11y-publishing/tree/kris-enhancements-20200318/ReleaseVersions
Guitar Jigs: https://www.youtube.com/results?search_query=Dan+Erlewine+jig
Inside the Publishing Revolution: The Adobe Story:
https://www.amazon.com/Inside-Publishing-Revolution-Adobe-Story/dp/0321115643

Más contenido relacionado

La actualidad más candente

La actualidad más candente (20)

What is Coding
What is CodingWhat is Coding
What is Coding
 
Coding vs programming
Coding vs programmingCoding vs programming
Coding vs programming
 
Envisioning the Future of Language Workbenches
Envisioning the Future of Language WorkbenchesEnvisioning the Future of Language Workbenches
Envisioning the Future of Language Workbenches
 
Best Practices For Writing Super Readable Code
Best Practices For Writing Super Readable CodeBest Practices For Writing Super Readable Code
Best Practices For Writing Super Readable Code
 
Ijet Talk
Ijet TalkIjet Talk
Ijet Talk
 
What Is Coding And Why Should You Learn It?
What Is Coding And Why Should You Learn It?What Is Coding And Why Should You Learn It?
What Is Coding And Why Should You Learn It?
 
computer languages
computer languagescomputer languages
computer languages
 
Introduction to python
Introduction to pythonIntroduction to python
Introduction to python
 
Margareth lota
Margareth lotaMargareth lota
Margareth lota
 
Programming language
Programming languageProgramming language
Programming language
 
Comp2
Comp2Comp2
Comp2
 
MT and Translator's Tools
MT and Translator's ToolsMT and Translator's Tools
MT and Translator's Tools
 
Programming Language
Programming LanguageProgramming Language
Programming Language
 
The Ring programming language version 1.9 book - Part 97 of 210
The Ring programming language version 1.9 book - Part 97 of 210The Ring programming language version 1.9 book - Part 97 of 210
The Ring programming language version 1.9 book - Part 97 of 210
 
Architecting Domain-Specific Languages
Architecting Domain-Specific LanguagesArchitecting Domain-Specific Languages
Architecting Domain-Specific Languages
 
Programming Language
Programming LanguageProgramming Language
Programming Language
 
Web programming UNIT II by Bhavsingh Maloth
Web programming UNIT II by Bhavsingh MalothWeb programming UNIT II by Bhavsingh Maloth
Web programming UNIT II by Bhavsingh Maloth
 
Algorithm vs
Algorithm vsAlgorithm vs
Algorithm vs
 
Programming language
Programming languageProgramming language
Programming language
 
Lec 0 p pl
Lec 0 p plLec 0 p pl
Lec 0 p pl
 

Similar a Better problem solving through scripting: How to think through your #eprdctn roadblocks - Course notes

Classification Of Software
Classification Of SoftwareClassification Of Software
Classification Of Softwarepy7rjs
 
Specification Of The Programming Language Of Java
Specification Of The Programming Language Of JavaSpecification Of The Programming Language Of Java
Specification Of The Programming Language Of JavaKim Moore
 
notes on Programming fundamentals
notes on Programming fundamentals notes on Programming fundamentals
notes on Programming fundamentals ArghodeepPaul
 
Java And Community Support
Java And Community SupportJava And Community Support
Java And Community SupportWilliam Grosso
 
Markdown - friend or foe?
Markdown - friend or foe?Markdown - friend or foe?
Markdown - friend or foe?Ellis Pratt
 
Cmp2412 programming principles
Cmp2412 programming principlesCmp2412 programming principles
Cmp2412 programming principlesNIKANOR THOMAS
 
Cs121 Unit Test
Cs121 Unit TestCs121 Unit Test
Cs121 Unit TestJill Bell
 
lecture2-PerlProgramming
lecture2-PerlProgramminglecture2-PerlProgramming
lecture2-PerlProgrammingtutorialsruby
 
lecture2-PerlProgramming
lecture2-PerlProgramminglecture2-PerlProgramming
lecture2-PerlProgrammingtutorialsruby
 
Reverse Engineering in Linux - The tools showcase
Reverse Engineering in Linux - The tools showcaseReverse Engineering in Linux - The tools showcase
Reverse Engineering in Linux - The tools showcaseLevis Nickaster
 
Procedural Programming Of Programming Languages
Procedural Programming Of Programming LanguagesProcedural Programming Of Programming Languages
Procedural Programming Of Programming LanguagesTammy Moncrief
 
Intro. to prog. c++
Intro. to prog. c++Intro. to prog. c++
Intro. to prog. c++KurdGul
 
Low maintenance perl notes
Low maintenance perl notesLow maintenance perl notes
Low maintenance perl notesPerrin Harkins
 

Similar a Better problem solving through scripting: How to think through your #eprdctn roadblocks - Course notes (20)

Classification Of Software
Classification Of SoftwareClassification Of Software
Classification Of Software
 
Specification Of The Programming Language Of Java
Specification Of The Programming Language Of JavaSpecification Of The Programming Language Of Java
Specification Of The Programming Language Of Java
 
notes on Programming fundamentals
notes on Programming fundamentals notes on Programming fundamentals
notes on Programming fundamentals
 
Unit 1
Unit 1Unit 1
Unit 1
 
JAVA
JAVAJAVA
JAVA
 
Java And Community Support
Java And Community SupportJava And Community Support
Java And Community Support
 
Markdown - friend or foe?
Markdown - friend or foe?Markdown - friend or foe?
Markdown - friend or foe?
 
Presentation-1.pptx
Presentation-1.pptxPresentation-1.pptx
Presentation-1.pptx
 
Cmp2412 programming principles
Cmp2412 programming principlesCmp2412 programming principles
Cmp2412 programming principles
 
Cs121 Unit Test
Cs121 Unit TestCs121 Unit Test
Cs121 Unit Test
 
SYSTEM DEVELOPMENT
SYSTEM DEVELOPMENTSYSTEM DEVELOPMENT
SYSTEM DEVELOPMENT
 
lecture2-PerlProgramming
lecture2-PerlProgramminglecture2-PerlProgramming
lecture2-PerlProgramming
 
lecture2-PerlProgramming
lecture2-PerlProgramminglecture2-PerlProgramming
lecture2-PerlProgramming
 
Python overview
Python overviewPython overview
Python overview
 
Reverse Engineering in Linux - The tools showcase
Reverse Engineering in Linux - The tools showcaseReverse Engineering in Linux - The tools showcase
Reverse Engineering in Linux - The tools showcase
 
Procedural Programming Of Programming Languages
Procedural Programming Of Programming LanguagesProcedural Programming Of Programming Languages
Procedural Programming Of Programming Languages
 
Learning to code in 2020
Learning to code in 2020Learning to code in 2020
Learning to code in 2020
 
Intro. to prog. c++
Intro. to prog. c++Intro. to prog. c++
Intro. to prog. c++
 
Low maintenance perl notes
Low maintenance perl notesLow maintenance perl notes
Low maintenance perl notes
 
Ic lecture8
Ic lecture8 Ic lecture8
Ic lecture8
 

Más de BookNet Canada

Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024BookNet Canada
 
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024BookNet Canada
 
New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024BookNet Canada
 
Transcript: New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024BookNet Canada
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024BookNet Canada
 
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024BookNet Canada
 
Transcript: New from BookNet Canada for 2024: BNC SalesData and LibraryData -...
Transcript: New from BookNet Canada for 2024: BNC SalesData and LibraryData -...Transcript: New from BookNet Canada for 2024: BNC SalesData and LibraryData -...
Transcript: New from BookNet Canada for 2024: BNC SalesData and LibraryData -...BookNet Canada
 
Transcript: Green paths: Learning from publishers’ sustainability journeys - ...
Transcript: Green paths: Learning from publishers’ sustainability journeys - ...Transcript: Green paths: Learning from publishers’ sustainability journeys - ...
Transcript: Green paths: Learning from publishers’ sustainability journeys - ...BookNet Canada
 
Green paths: Learning from publishers’ sustainability journeys - Tech Forum 2024
Green paths: Learning from publishers’ sustainability journeys - Tech Forum 2024Green paths: Learning from publishers’ sustainability journeys - Tech Forum 2024
Green paths: Learning from publishers’ sustainability journeys - Tech Forum 2024BookNet Canada
 
Transcript: Book industry state of the nation 2024 - Tech Forum 2024
Transcript: Book industry state of the nation 2024 - Tech Forum 2024Transcript: Book industry state of the nation 2024 - Tech Forum 2024
Transcript: Book industry state of the nation 2024 - Tech Forum 2024BookNet Canada
 
Book industry state of the nation 2024 - Tech Forum 2024
Book industry state of the nation 2024 - Tech Forum 2024Book industry state of the nation 2024 - Tech Forum 2024
Book industry state of the nation 2024 - Tech Forum 2024BookNet Canada
 
Trending now: Book subjects on the move in the Canadian market - Tech Forum 2024
Trending now: Book subjects on the move in the Canadian market - Tech Forum 2024Trending now: Book subjects on the move in the Canadian market - Tech Forum 2024
Trending now: Book subjects on the move in the Canadian market - Tech Forum 2024BookNet Canada
 
Transcript: Trending now: Book subjects on the move in the Canadian market - ...
Transcript: Trending now: Book subjects on the move in the Canadian market - ...Transcript: Trending now: Book subjects on the move in the Canadian market - ...
Transcript: Trending now: Book subjects on the move in the Canadian market - ...BookNet Canada
 
Transcript: New stores, new views: Booksellers adapting engaging and thriving...
Transcript: New stores, new views: Booksellers adapting engaging and thriving...Transcript: New stores, new views: Booksellers adapting engaging and thriving...
Transcript: New stores, new views: Booksellers adapting engaging and thriving...BookNet Canada
 
Show and tell: What’s in your tech stack? - Tech Forum 2023
Show and tell: What’s in your tech stack? - Tech Forum 2023Show and tell: What’s in your tech stack? - Tech Forum 2023
Show and tell: What’s in your tech stack? - Tech Forum 2023BookNet Canada
 
Transcript: Show and tell: What’s in your tech stack? - Tech Forum 2023
Transcript: Show and tell: What’s in your tech stack? - Tech Forum 2023Transcript: Show and tell: What’s in your tech stack? - Tech Forum 2023
Transcript: Show and tell: What’s in your tech stack? - Tech Forum 2023BookNet Canada
 
Transcript: Redefining the book supply chain: A glimpse into the future - Tec...
Transcript: Redefining the book supply chain: A glimpse into the future - Tec...Transcript: Redefining the book supply chain: A glimpse into the future - Tec...
Transcript: Redefining the book supply chain: A glimpse into the future - Tec...BookNet Canada
 
Redefining the book supply chain: A glimpse into the future - Tech Forum 2023
Redefining the book supply chain: A glimpse into the future - Tech Forum 2023Redefining the book supply chain: A glimpse into the future - Tech Forum 2023
Redefining the book supply chain: A glimpse into the future - Tech Forum 2023BookNet Canada
 

Más de BookNet Canada (20)

Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
 
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
 
New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
 
Transcript: New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
 
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
 
Better problem solving through scripting: How to think through your #eprdctn roadblocks - Course notes

  • 1. Better problem solving through scripting: How to think through your #eprdctn roadblocks and script your way to efficiency Do you find yourself repeating the same task over and over? Or feeling certain there is a way to automate a task but it's just outside of your skill set? Kris Coppieters from Rorohiko has built a career solving just those kinds of problems. Whether it's scripting a solution from within InDesign, or using AppleScript to finish off some markup, Kris can show you how to bring high-level thinking to quick and dirty tasks. Two categories of automation When it comes to automation, we can classify most of the software used into two large categories which I'll call tools and systems. Tools Tools are smaller programs which are similar to tools used in various crafts, e.g. carpentry. In carpentry, you might use a hammer, nails, screws, screwdrivers, saws, CNC machines, drills... A few of these tools are more complex, and have a wide range of functions, but many tools are simple and have a very specific function. Some of these tools are specifically made for a specific task. In #eprdctn, the tools used are text editors, search-and-replace, XML editors like Oxygen, testing tools, reader programs... Systems Systems are more complex than simple tools. They take in some raw materials and spit out finished results at the other end. A sawmill can be seen as a system. It processes logs and produces planks, poles, boards... The people working in the sawmill will use machines and tools to make the system work. A working system can encompass many processes and sub-processes. Some processes are automated, some are manual processes. In #eprdctn, we might have a system that takes in raw data, and produces ebooks. In #eprdctn publishing in general, systems are often called workflows. Let's talk about tools and jigs This presentation is all about tools for your craft. There's a guy called Dan Erlewine on YouTube, who is a luthier. 
He has many YouTube movies on guitar repair, and many of those movies are about clever self-made tools he uses to work on guitars. He calls them 'jigs'. https://www.youtube.com/results?search_query=dan+erlewine+jig I like that term, so I want to coin the term 'jig' for custom #eprdctn tools. You've heard it here first! The first step is to take notice when you find yourself repeatedly performing a cumbersome, difficult, or repetitive task. Then ask yourself whether it's possible to create a jig for that. Basic Tools When working in #eprdctn, a lot of work involves editing various text files. More often than not, the editing will aim to affect the structure of the document, rather than the content: retagging, restructuring tags, managing CSS classes... Another common operation will be unpacking and repacking files. For example, EPUB files are nothing more than glorified .ZIP files.
  • 2. Regular Expressions One of the tools you want in your tool chest is some familiarity with regular expressions. Once you master regular expressions, you can use search-and-replace for a fairly wide range of tasks. Regular expressions can help with tasks that go beyond a simple search-and-replace; for example, you could use a search-and-replace with regular expressions for restructuring HTML. Regular expressions are not easy to master. They are very cryptic, almost impossible to read, and they are not standardized. Regular expressions are also often loosely referred to as 'GREP', which is a reference to the Unix command line tool which started it all. GREP = Global Regular Expression Print. Lack of standardization You might be using a text editor like BBEdit, Notepad++, Sublime Text... All of these support regular expressions, but each of them will have its own unique 'dialect'. They'll be 95% the same between the different text editors, but there are subtle differences. You might be using InDesign as the source for EPUB documents. InDesign supports regular expressions in its Find-Change dialog. And these are in a specific InDesign 'dialect' which has only 80% similarity when compared to the regular expressions used in common text editors. InDesign has some fairly unique features with regards to regular expressions, features which you won't find in your text editor. It does not end there. InDesign supports a scripting language called ExtendScript, which is a form of JavaScript. ExtendScript supports regular expressions, and guess what? They use yet another dialect of regular expressions, again quite different to the InDesign regular expressions as seen in the Find-Change dialog. Then we have the various scripting languages that could be used for tool creation - PHP, Perl, Python, JavaScript, awk, sed... There are dozens of them. None of them is 'better' - whatever works, works. 
Again, all of them have regular expressions, and each scripting language will have its own unique dialect. Understand the basic principle and use the documentation To make sense of it all, my recommendation is that you understand the basic ideas of regular expressions and how they are constructed. These are well supported in all the dialects. Once you understand the basic ideas, you need to consult the documentation and/or use the facilities of the software at hand to determine the proper expressions. To match a thin space, for example: InDesign: ~< Most text editors: \x{2009} Referring to matched parenthesized sub-expressions in replacement patterns is another point of difference. For example, sometimes you need to use $1, and sometimes you need to use \1 to refer to the first parenthesized subexpression. Text Editors Another basic tool you need is a good text editor (or a set of them). Steer clear of word processors or underpowered tools: don't use MS Word, Apple's TextEdit, Notepad.exe... None of these are proper text editors. Word processors often try to be 'helpful' and change quotes into curly quotes, or muck with line endings and character encodings, blithely destroying your HTML structure.
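To make the two dialect differences mentioned above (thin spaces and backreferences) a bit more concrete, here is a minimal sketch of how they look in Python's re dialect. The function names and the sample strings are made up for illustration; other dialects will spell the same ideas differently, so check the documentation of the tool at hand.

```python
import re


def normalize_thin_spaces(text):
    # In Python's re dialect a thin space (U+2009) can be written as \u2009
    # inside the pattern; InDesign would use ~< and many text editors \x{2009}.
    return re.sub(r"\u2009", " ", text)


def em_to_i(html):
    # In a Python replacement string, \1 refers to the first parenthesized
    # sub-expression; PHP's preg_replace would typically use $1 instead.
    return re.sub(r"<em>([^<]*)</em>", r"<i>\1</i>", html)
```

Calling em_to_i('<p><em>Hello</em></p>') should give back the same paragraph with the em tags swapped for i tags, which is a quick way to confirm which backreference syntax a given dialect expects.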
  • 3. Some editors are much more than that. If you can afford it, you want to have Oxygen XML in your tool chest. This tool can serve as a text editor, but it also understands XML, HTML, CSS... and will allow you to do much smarter editing with much reduced risk. Editing XML files in a regular text editor works just fine, but you run the risk of damaging some finely tuned tag balance and never knowing it. Some text editors (e.g. BBEdit, Oxygen, Atom...) are smart enough to handle text files inside ZIP-ped data files (e.g. EPUB files), and you can edit text files inside an EPUB without needing to 'crack it open'. Scripting Languages Another powerful basic tool is a basic understanding of some scripting language or languages. There are many out there: the most popular ones are probably Python, JavaScript, PHP, Perl. Most of these scripting languages offer the necessary support for complex operations, e.g. zipping and unzipping, regular expressions, search-and-replace, XML parsing, accessing data over HTTP or HTTPS connections, connecting to databases... A scripting language comes in handy when you're faced with a repetitive task that goes beyond what you can accomplish with find-and-replace and regular expressions. For example, when there is some 'if-then' logic that needs to be handled, or some processing that needs to be done. Tools EPUB unpack/repack: eCanCrusher When working with EPUB, if you have access to tools like BBEdit or Oxygen, you can perform EPUB-wide operations straight on the EPUB file without ever having to decompress/recompress it. But sometimes you want to decompress the EPUB, make some changes, then re-compress it. There are multiple tools that do this. eCanCrusher is one of them. It works in simple drag/drop fashion. https://www.docdataflow.com/ecancrusher/ To decompress: drag/drop an EPUB file onto the eCanCrusher application icon. A decompressed EPUB folder will appear. 
To recompress: drag/drop an EPUB folder onto the eCanCrusher application icon. A compressed EPUB file will appear. To configure: double-click the eCanCrusher application icon.
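Because an EPUB is essentially a ZIP file, the decompress/recompress steps can also be sketched in a few lines of Python using the standard zipfile module. This is only an illustration and not how eCanCrusher is implemented; the function names are hypothetical. The one EPUB-specific detail assumed here is that the mimetype file goes into the archive first and uncompressed, as the EPUB container format expects.

```python
import os
import zipfile


def unpack_epub(epub_path, out_dir):
    # An EPUB is a ZIP archive, so a plain extract is enough to 'crack it open'.
    with zipfile.ZipFile(epub_path) as zf:
        zf.extractall(out_dir)


def repack_epub(src_dir, epub_path):
    # Assumes epub_path is outside src_dir and that src_dir contains 'mimetype'.
    with zipfile.ZipFile(epub_path, "w") as zf:
        # The mimetype entry is written first and stored without compression.
        zf.write(os.path.join(src_dir, "mimetype"), "mimetype",
                 compress_type=zipfile.ZIP_STORED)
        for root, dirs, files in os.walk(src_dir):
            dirs.sort()  # walk sub-folders in a predictable order
            for name in sorted(files):
                full_path = os.path.join(root, name)
                # Archive names always use forward slashes.
                rel_path = os.path.relpath(full_path, src_dir).replace(os.sep, "/")
                if rel_path == "mimetype":
                    continue  # already added above
                zf.write(full_path, rel_path, compress_type=zipfile.ZIP_DEFLATED)
```

A typical round trip would be unpack_epub('book.epub', 'book_folder'), edit the files, then repack_epub('book_folder', 'book_fixed.epub'); the file names here are placeholders.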
  • 4. Custom Scripts Another set of tools on your toolbelt can be custom scripts written in a variety of scripting languages. Languages like Python, PHP, JavaScript/Node.js... can be used to write scripts that process individual text files (e.g. XHTML, CSS...) or complete EPUBs. None of these is particularly better or worse, and switching to a different language than the one you already know is rarely beneficial. All of these scripting languages have features to facilitate handling of XML, pattern matching, and so on. There are two hurdles to writing scripts: first of all, installing and configuring the software to use the scripting language is not always straightforward. Second, writing scripts is not for the faint of heart, but the rewards are tremendous. Pick one language, get good at it. Macintosh On a standard Mac, some common scripting languages are pre-installed (e.g. PHP, Python 2.7). Installing additional languages is straightforward: open the Terminal window (Applications -> Utilities -> Terminal.app), then invoke the scripting language. For common scripting languages the Mac will propose to download and install the necessary command line tools. For example, if I type python3 and python3 is not installed, the Mac offers to fetch and install it. For Node.js (JavaScript) you need to visit and download from https://nodejs.org
  • 5. Windows Windows does not have the more common scripting languages pre-installed. There are many options to download and install these. One of the many options is Cygwin, which installs a 'Unix-like' environment on Windows and allows you to use the same command-line tools as Mac and Linux users: https://cygwin.com When you run the Cygwin installer, you'll see a window where you can pick-and-choose and decide which Linux/Unix tools to install. I find it easiest to install all PHP-related and all Python-related stuff, and things like zip and unzip.
  • 6. Rather than try to be selective, I simply search for 'PHP' and/or 'Python' in the package list and select the whole 'PHP' and 'Python' collections.
  • 7.
  • 8. To install Node.js you need to download and run an installer from https://nodejs.org Creating a script There are many ways to go about this, and I won't even attempt to list all of them. Instead, I'll be creating a very simple script from scratch and will run it on some XHTML files. For the sake of argument, my task is to go through an EPUB file, find the style attribute associated with the <body> tag, and remove it. Instead, I'll move that style attribute into the CSS file for the body tag. The first step is to experiment a bit. I can decompress the EPUB using eCanCrusher, or I can use an EPUB-aware text editor like BBEdit, Atom, Oxygen XML. I'll open one of the xhtml files and set out to find the regular expression pattern that works. Eventually, I came up with: (<body[^>]*) style="[^"]*"([^>]*>) to be replaced by \1\2 or $1$2 depending on the dialect of GREP your text editor is using. If we only need to do one EPUB, we could simply do a search-and-replace all. But we want to do this to many EPUBs. The next step is to create a script that takes a file name as a command line parameter, reads the file, performs the search-and-replace, and overwrites the file with the updated contents. I created a file deleteBodyStyle.php which has the following script:
  • 9.
    <?php
    // Read the file whose path was passed on the command line
    $fileContents = file_get_contents($argv[1]);
    // Drop the style attribute from the <body> tag
    $fileContents = preg_replace('/(<body[^>]*) style="[^"]*"([^>]*>)/', '$1$2', $fileContents);
    // Overwrite the original file with the updated contents
    file_put_contents($argv[1], $fileContents);

This script reads the content of the file at hand (the file path is provided as $argv[1]), does the search-and-replace, and writes out the updated file contents. I also created the equivalent Python version in a file deleteBodyStyle.py:

    import sys
    import re

    # Read the file whose path was passed on the command line
    with open(sys.argv[1], 'r') as inFile:
        data = inFile.read()
    # Drop the style attribute from the <body> tag
    data = re.sub(r'(<body[^>]*) style="[^"]*"([^>]*>)', r'\1\2', data)
    # Overwrite the original file with the updated contents
    with open(sys.argv[1], 'w') as outFile:
        outFile.write(data)

We can now test these scripts. I will be using a sample EPUB made by means of Adobe InDesign 2020 from an InDesign sample file called Adobe History.indd that came with InDesign CS3. It is an excerpt from the book Inside the Publishing Revolution: The Adobe Story by Pamela Pfiffner. Before looking at DropToScript, I'll first use the scripts in a manual fashion. That's slightly cumbersome, but we need to go through it to get a good understanding of what is going on. I crack open the EPUB exported from InDesign, and then I'll use Terminal on Mac (or Cygwin Terminal on Windows) to execute the scripts from the command line, using drag/drop to avoid having to type the path to the xhtml files.

    cd Desktop
    php deleteBodyStyle.php /Users/kris/Desktop/Adobe\ History/OEBPS/Adobe_History-6.xhtml
    python deleteBodyStyle.py /Users/kris/Desktop/Adobe\ History/OEBPS/Adobe_History-7.xhtml

To execute either of these scripts on all .xhtml files, we can use some command-line magic:

    dirToScan="/Users/kris/Desktop/Adobe History/OEBPS/"; ls "$dirToScan"*.xhtml | while read fileToScan; do php deleteBodyStyle.php "$fileToScan"; done

or

    dirToScan="/Users/kris/Desktop/Adobe History/OEBPS/"; ls "$dirToScan"*.xhtml | while read fileToScan; do python deleteBodyStyle.py "$fileToScan"; done

After adjusting the CSS file, we can recompress the EPUB. I've not done any error handling. It would be cleaner to also add additional checks (e.g. verify the file name extension of the file being processed) and error checks (e.g.
report any unexpected circumstances), but for most intents and purposes the above script will work fine. DropToScript Script Wrapper Once you have a script, you'll often find that you're going through the same motions over and over:
  • Decompress EPUB
  • Run the script on a bunch of files (e.g. html files or css files)
  • Repackage EPUB
DropToScript manages the decompress/repackage part automatically. Once you have a script (in PHP, Python, Node.js JavaScript...) that can process a single file at a time, you can
  • 10. configure DropToScript to automatically run the same script on many files, simply by dragging an EPUB or a collection of file icons onto the DropToScript icon. DropToScript comes bundled with a number of useful pre-made scripts, but you can easily add your own. After downloading it, you need to configure it so it can find the PHP or Python installation on your computer. You do this by double-clicking the icon of the application. As an example, you can copy either the deleteBodyStyle.php or deleteBodyStyle.py file into the DropScripts folder and then drag-drop any EPUB onto DropToScript to have deleteBodyStyle executed on the text files inside the EPUB.
Stuff I use
cd_to (Mac): https://github.com/jbtule/cdto
Cygwin: https://www.cygwin.com/
Atom Text Editor: https://atom.io/
Notepad++ Text Editor: https://notepad-plus-plus.org/
eCanCrusher: https://www.docdataflow.com/ecancrusher/
DropToScript: https://github.com/BCLibCoop/nnels-a11y-publishing/tree/kris-enhancements-20200318/ReleaseVersions
Guitar Jigs: https://www.youtube.com/results?search_query=Dan+Erlewine+jig
Inside the Publishing Revolution: The Adobe Story: https://www.amazon.com/Inside-Publishing-Revolution-Adobe-Story/dp/0321115643
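As a follow-up to the remark about error handling in the deleteBodyStyle scripts, here is a rough sketch of what a more defensive Python variant might look like. It reuses the same regular expression, but the function names, the UTF-8 assumption and the choice to raise ValueError for unexpected file types are all my own illustrative choices, not part of the original scripts or of DropToScript.

```python
import re
from pathlib import Path

# Same pattern as in deleteBodyStyle.py
BODY_STYLE = re.compile(r'(<body[^>]*) style="[^"]*"([^>]*>)')


def delete_body_style(path):
    """Remove the style attribute from <body>; return True if the file changed."""
    path = Path(path)
    # Extra check: refuse files that do not look like (X)HTML.
    if path.suffix.lower() not in (".xhtml", ".html"):
        raise ValueError(f"not an (X)HTML file: {path}")
    # Assumption: files are UTF-8; newline="" keeps line endings untouched.
    with open(path, "r", encoding="utf-8", newline="") as in_file:
        data = in_file.read()
    updated, count = BODY_STYLE.subn(r"\1\2", data)
    if count:
        # Only rewrite the file when something was actually replaced.
        with open(path, "w", encoding="utf-8", newline="") as out_file:
            out_file.write(updated)
    return count > 0


def process_folder(folder):
    """Run delete_body_style on every .xhtml file below folder."""
    changed = []
    for path in sorted(Path(folder).rglob("*.xhtml")):
        if delete_body_style(path):
            changed.append(path)
    return changed
```

Calling process_folder on a decompressed EPUB folder returns the list of files that were modified, which makes it easy to report what happened before repackaging.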