This document discusses algorithms and automation in the content industry. It provides an overview of trends toward increased automation through machine translation, speech recognition/synthesis, and end-to-end automated workflows. While automation offers benefits like increased efficiency, standardization and quality, it also risks cost explosions and data issues if not implemented carefully. The document advocates developing an all-encompassing automation strategy that considers compliance, acceptance, and monitoring in order to maximize the benefits of automation.
Algorithms for the content industry
1. Algorithms in the content industry
How much automation makes sense
Tatsiana Bilimava, berns language consulting
for better language consulting
2. • We are thoroughly familiar with the language industry
• We have no financial ties to software or translation vendors
• We automate and structure your language data processes
3.
Creation process
Translation process
Publication process
Tatsiana Bilimava
+49 (0) 211 22 06 77 11
bilimava@berns-language-consulting.de
www.xing.com/profile/Tatsiana_Bilimava
www.linkedin.com/in/Tatsiana-Bilimava-5900b9139
@bilimava
6.
After 120 years of service weather observers are leaving the mountain - weather observation is now fully automated.
Zugspitze: 31st May 2018
Source: Angelika Warmuth/dpa
Modern Occupations Are Changing
8.
Our Digital Universe Is Expanding Incredibly Fast
1956: 5 MB
2016: 16 ZB
2025: 163 ZB
Source: IDC/Seagate – The Evolution of Data through 2025
10.
Automatable Activities Are Manifold
Source: McKinsey Global Institute – A future that works: Automation, employment, and productivity
11.
A Healthy Mix of Automatic and Manual Activities
Source: McKinsey Global Institute – A future that works: Automation, employment, and productivity
Most occupations will evolve and adapt instead of being automated away
12.
Some Areas Are Less Permeated by Automation
Source: McKinsey Global Institute – The technical potential for automation in the US
Decision-making
Planning/conceptualizing
Unpredictable environments
Creative/empathetic tasks
13.
Factors Affecting Automation
Source: McKinsey Global Institute – The technical potential for automation in the US
1 Technical feasibility
2 Costs to automate
3 Staff availability/costs
4 Extended benefits
5 Social and regulatory issues
15.
Translation Services of the Near Future Are...
“...always on. Happen in the background.”
“...provided without delay. Lower quality is acceptable when speed is essential.”
“...efficient and automated, avoiding redundancy.”
“...the core of a multilingual content strategy.”
16.
General Advantages of Automation
1 Ease of planning
2 Standardization
3 Better quality
4 Higher productivity
EFFICIENCY COSTS
17.
Automation Offers Solutions to Specific Problems
QA keeps you awake at night?
You can’t agree on terminology?
Your deadlines are hard to keep?
Your text volume is exploding?
Your tools do not get along?
Your processes are too manual?
18.
Real-time Translation Everywhere, Connected to Everything
Terminology
Language Data
Tools (CAT, CMS)
MT Engines
Speech Data
Translation Specialist
20.
Integrating Machine Translation
Your roadmap
• Optimize your texts for machine translation
• Evaluate and select a system supplier
• Train the engine for your domain
• Integrate MT into your workflow
Your wins
• You translate fast and at any time
• You expand your language portfolio overnight
• You make human translation more efficient
• You save translation and PM costs
Up to 50% savings with machine translation
No real-time translation without MT
Pre-Processing Training Evaluation Integration
21.
Automating Your End-to-end Process
Creation Translation Review/QA Publication
Your roadmap
• Use the right software combination for your content
• Detail the way content is created and translated
• Automate and test your workflow
Your wins
• Save time and money while improving quality
• Ensure smooth content processes
• Use synergies to your advantage
80 % of all international companies face revenue losses due to translation issues
90 % savings can be achieved through automation
22.
APIs Workflows Testing Go-live
Connecting Your Systems in a Smooth Workflow
Your roadmap
• Draw up a master concept
• Connect all your systems
• Test if everything works as designed
Your wins
• Your systems communicate with each other
• Your workflows run smoothly
• You gain time for text creation, translation etc.
Well-prepared automation makes your processes stable and future-proof
Many superfluous clicks without proper system interfaces
27.
All-encompassing Automation Strategy
Automatic content transfer via APIs
Integrating automated steps into manual routines
Advanced monitoring and reporting tools
Automatic notifications to improve communication
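As a rough illustration of the four points above, here is a minimal Python sketch of a workflow runner that hands content from step to step, records how long each step took (monitoring) and sends a message after every hand-over (notification). The step functions `export_from_cms`, `machine_translate` and `qa_check` are hypothetical stand-ins, not the API of any real CMS, MT engine or QA tool.

```python
import time


def run_workflow(content, steps, notify):
    """Run each step in order, record its duration, and notify after each one."""
    report = []
    for name, step in steps:
        started = time.perf_counter()
        content = step(content)
        report.append({"step": name, "seconds": time.perf_counter() - started})
        notify(f"Step '{name}' finished")
    return content, report


# Hypothetical stand-ins for real systems (assumptions for illustration only).
def export_from_cms(content):
    return content.strip()


def machine_translate(content):
    return f"[MT] {content}"


def qa_check(content):
    if not content:
        raise ValueError("empty segment")
    return content


messages = []
result, report = run_workflow(
    "  Hello world  ",
    [("export", export_from_cms), ("translate", machine_translate), ("qa", qa_check)],
    messages.append,  # in practice this could post to e-mail or chat
)
print(result)
print([entry["step"] for entry in report])
```

A manual step, such as a human review, could be modelled as one more entry in the step list that waits for sign-off, which is one way to integrate automated steps into manual routines.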
30.
Chances & Risks
Automating
Chances:
- Leaner, faster
- Quality boost
- Compliance
- New business
Risks:
- Investing in systems
- Risk of losing data
- High internal workload
- Staff wary of innovation
Not automating
Chances:
- Known routines
- No investments
- No re-training
Risks:
- Slow, rigid, outdated
- Quality issues
- Missing opportunities
- Risk of fines
31.
Action Plan to Get to the Desired State
Draw up current state
• Focus on important aspects
• Focus on finding weaknesses and strengths
Design desired state
• Lose superfluous steps and add useful steps
• Derive system requirements and prioritize
Check and select future system
• Derive criteria for system selection
• Reality check: cost and effort versus usefulness
Ask all relevant people the right questions – and listen well!
• How do they currently work, what is good or not so great?
• What do they need to be more successful?
32.
Memberships, partnerships etc.
• We are German TAUS representatives and sit on the committee of the DQF Manufacturing Group (Automotive) for the development of standard quality assurance procedures in translation processes
• We are members of TAPICC workgroups for the development of open source API standards aiming to improve data exchange between authoring and translation systems
• We are members of tekom, the largest German association for technical communication
• We have a large network of machine translation system providers
33. Thank you very much!
@blcTeam
+49 (0) 211 22 06 77 0
info@berns-language-consulting.de
www.berns-language-consulting.de
www.facebook.com/bernslanguageconsulting
Editor's notes
Automation routines are extensively used in content creation, localization and publishing. However, even the most sophisticated artificial intelligence cannot replace unique human skills. The talk will give an overview of applications, chances and limits of automation and touch upon the role of man in the Second Machine Age.
Help clients reorganise their multilingual content process
CAT tools, interfaces, automation, QA and terminology
After almost 120 years, the German Weather Service is withdrawing its observers from the Zugspitze. Norbert Stadler is one of them - after 40 years. Meteorologists prepare the weather forecast, weather observers collect data: How much has it rained, how much has it snowed, what are the pressure, temperature, humidity, wind direction and speed, how long does the sun shine, and what do the clouds look like? Stadler and his colleagues used to check the weather every half hour. They gave the results of their observation to the DWD headquarters in Offenbach.
Devices have taken on many tasks step by step. Thermometers and air pressure gauges have long been transmitting their values digitally; the duration of sunshine is recorded digitally. "We automate; this continues month after month," says DWD spokesman Uwe Kirsche. People are not completely replaceable. However, the following often applies: "Technology can collect much more data and does it faster."
The development of the steam engine in 1765 marked the first machine age: for the first time, it was possible to produce massive amounts of mechanical power. Now comes the second machine age, in which we experience tremendous progress in digital technologies. The changes will be very beneficial for humans, but there will be new problems as well.
Automation of activities can enable businesses to improve performance by reducing errors and improving quality and speed, and in some cases achieving outcomes that go beyond human capabilities. Automation also contributes to productivity, as it has done historically.
Algorithms are the core of automation routines…
In September 1956 IBM launched the 305 RAMAC, the first ‘SUPER’ computer with a hard disk drive (HDD). The HDD weighed over a ton and stored 5 MB of data.
A zettabyte is one trillion gigabytes.
By 2025 the global datasphere will grow to 163 zettabytes. That’s ten times the 16.1 ZB of data generated in 2016. It corresponds to 40 trillion DVDs, which would reach to the moon and back over 100 million times.
Nearly 20% of the data in the global datasphere will be critical to our daily lives and nearly 10% of that will be hypercritical.
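The figures in these notes can be checked with a few lines of arithmetic. The sketch below uses only the numbers quoted above; the DVD capacity of 4.7 GB (single-layer disc) is an assumption that the notes do not state, and "nearly 20%" and "nearly 10%" are treated as exactly 20% and 10%.

```python
# Back-of-the-envelope check of the datasphere figures quoted in the notes.
ZB_IN_GB = 10**12        # 1 zettabyte = 10^21 bytes = one trillion gigabytes
DATA_2016_ZB = 16.1      # figure quoted from IDC/Seagate
DATA_2025_ZB = 163       # forecast quoted from IDC/Seagate
DVD_CAPACITY_GB = 4.7    # assumption: single-layer DVD

growth_factor = DATA_2025_ZB / DATA_2016_ZB
dvd_count = DATA_2025_ZB * ZB_IN_GB / DVD_CAPACITY_GB
critical_zb = 0.20 * DATA_2025_ZB       # "nearly 20%" treated as 20%
hypercritical_zb = 0.10 * critical_zb   # "nearly 10% of that" treated as 10%

print(f"Growth 2016 -> 2025: {growth_factor:.1f}x")
print(f"DVDs needed: {dvd_count / 1e12:.1f} trillion")
print(f"Critical: {critical_zb:.1f} ZB, hypercritical: {hypercritical_zb:.2f} ZB")
```

Under these assumptions the growth factor comes out at roughly ten, and the DVD count lands in the same order of magnitude as the 40 trillion quoted; the exact count depends on the disc capacity assumed.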
Ambiguity
Highly cultural and contextual understanding
Less than 5 percent of all occupations can be automated entirely using demonstrated technologies, but about 60 percent of all occupations have at least 30 percent of constituent activities that could be automated. On the whole, more occupations will change than will be automated away.
Activities most susceptible to automation involve physical activities in highly structured and predictable environments, as well as the collection and processing of data.
1 Applying expertise to decision making, planning, and creative tasks.
2 Unpredictable physical work (physical activities and the operation of machinery) is performed in unpredictable environments, while in predictable physical work, the environments are predictable.
Of course, it’s not just the technical feasibility…
Companies know the challenges they face and must decide how to set up their automation in the future and whether their own IT systems can still perform their tasks under the upcoming requirements. Companies often lack the means to use IT in accordance with the requirements and to make the related investments. They therefore try to spend their tight budgets as efficiently as possible. Insufficient investment hampers a company's innovation, while excessive investment may not be economical.
Increase efficiency and keep down the costs
To be more specific…
Man-machine cooperation
E.g. you can use out-of-the-box solutions by system providers
More time to do what exactly?
Wordbee uses automation extensively
Escaping data blindness
The General Data Protection Regulation brings lots of important changes in 2018. Privacy (and Data Protection) by design and by default is written into Article 25 of the EU GDPR.
Privacy by Design states that any action a company undertakes that involves processing personal data must be done with data protection and privacy in mind at every step. This includes internal projects, product development, software development, IT systems, and much more. In practice, this means that the IT department, or any department that processes personal data, must ensure that privacy is built into a system during the whole life cycle of the system or process. Up to now, tacking security or privacy features on at the end of a long production process has been fairly standard.
Privacy by Default means that once a product or service has been released to the public, the strictest privacy settings should apply by default, without any manual input from the end user. In addition, any personal data provided by the user to enable a product's optimal use should only be kept for the amount of time necessary to provide the product or service. If more information than necessary to provide the service is disclosed, then "privacy by default" has been breached.
GDPR is not specific about how you implement these changes, but for many organisations adopting a privacy by design approach will require a significant culture change.
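Since the GDPR leaves the implementation open, the following is only one possible reading of "privacy by default" in code: every setting starts at its strictest value, and stored personal data gets a retention limit. The setting names and the 30-day retention period are assumptions made for this sketch, not requirements taken from the regulation.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class PrivacySettings:
    # Strictest options are the defaults; the user must actively opt in.
    share_with_third_parties: bool = False
    use_texts_for_mt_training: bool = False
    retention_days: int = 30  # assumed retention period, not a GDPR figure


def is_due_for_deletion(stored_on, today, settings):
    """True once the data has been kept longer than the retention period."""
    return today - stored_on > timedelta(days=settings.retention_days)


defaults = PrivacySettings()
print(defaults)
print(is_due_for_deletion(date(2018, 5, 1), date(2018, 6, 15), defaults))
print(is_due_for_deletion(date(2018, 5, 25), date(2018, 6, 15), defaults))
```

Keeping the defaults in one place makes it easy to review them, and any deviation from the strict settings has to be an explicit choice.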
Pseudonymization replaces the data that would allow identification with an alias, for example a code. However, there is a separate assignment (e.g. in the form of a table) between the subject and the pseudonym, so that it is ultimately still possible to identify the subject again if this assignment is known.
The GDPR defines pseudonymisation as the processing of personal data in such a way that the data can no longer be attributed to a specific data subject without the use of additional information.
The application of pseudonymization to personal data can reduce the risks to the data subjects concerned and help controllers and processors to meet their data protection obligations. To pseudonymise a data set, the “additional information” must be “kept separately and subject to technical and organisational measures to ensure non-attribution to an identified or identifiable person”.
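The idea of an alias plus a separately kept assignment table can be sketched in a few lines of Python. This is a minimal illustration under assumptions: the record layout, the field name `name` and the alias format `P-…` are invented for the example, and a real setup would store the assignment table in a separate, access-controlled system.

```python
import secrets


def pseudonymize(records, id_field):
    """Replace the identifying field with a random alias.

    Returns the pseudonymized records and the assignment table
    (alias -> original value), which must be stored separately.
    """
    alias_by_value = {}
    assignment_table = {}
    pseudonymized = []
    for record in records:
        value = record[id_field]
        if value not in alias_by_value:
            alias = "P-" + secrets.token_hex(4)
            while alias in assignment_table:  # avoid the unlikely collision
                alias = "P-" + secrets.token_hex(4)
            alias_by_value[value] = alias
            assignment_table[alias] = value
        pseudonymized.append({**record, id_field: alias_by_value[value]})
    return pseudonymized, assignment_table


def reidentify(record, id_field, assignment_table):
    """Re-identification only works for whoever holds the assignment table."""
    return {**record, id_field: assignment_table[record[id_field]]}


records = [
    {"name": "Jane Doe", "text": "Please translate my contract."},
    {"name": "John Roe", "text": "Review request."},
    {"name": "Jane Doe", "text": "Second request."},
]
safe_records, table = pseudonymize(records, "name")
print(safe_records)
```

The same person always receives the same alias, so the data stays usable for analysis, while the names themselves only appear in the assignment table.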
Anonymization is a type of information sanitization whose intent is privacy protection. It is the process of either encrypting or removing personally identifiable information from data sets, so that the people whom the data describe remain anonymous.
It relies on technology that converts clear text data into a non-human-readable and irreversible form, and on encryption techniques in which the decryption key has been discarded. Typical candidates: family names, patronyms, first names, maiden names, aliases; postal addresses, telephone numbers, postal codes and cities; IDs such as social security numbers (e.g. fiscal code in Italy, National Insurance number in UK), bank account details (e.g. IBAN) and credit card numbers. Also noted: valid keys, partial anonymisation.
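For the "removing" variant, a minimal Python sketch could replace some of the listed identifiers with placeholders before a text is passed on, for example to an MT engine. The three regular expressions are simplified assumptions that only cover e-mail addresses, IBANs and phone numbers in common notations; names and postal addresses need more than pattern matching, so this is not a complete anonymization routine.

```python
import re

# Simplified patterns (assumptions for illustration, not exhaustive).
EMAIL = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")
IBAN = re.compile(r"\b[A-Z]{2}\d{2}(?: ?[A-Z0-9]{4}){2,7}(?: ?[A-Z0-9]{1,4})?\b")
PHONE = re.compile(r"\+?\d[\d ()/-]{6,}\d")


def anonymize(text):
    """Irreversibly replace matched identifiers with placeholders."""
    text = EMAIL.sub("[EMAIL]", text)
    text = IBAN.sub("[IBAN]", text)   # before PHONE, so its digits are gone
    text = PHONE.sub("[PHONE]", text)
    return text


sample = (
    "Write to jane.doe@example.com or call +49 (0) 211 555 01 23. "
    "IBAN: DE89 3704 0044 0532 0130 00."
)
print(anonymize(sample))
```

Unlike pseudonymization, no assignment table is kept here, so the original values cannot be restored from the output.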
Distinct from data masking, data encryption translates data into another form, or code, so that only people with access to a secret key (formally called a decryption key) or password can read it.
Listening well means: if people say everything is ‘OK’, ask them to be more specific. What is really working (and how), what isn’t, and why isn’t it great?
We are lazy creatures and make do with workarounds, so it is vital that you give people the time and the chance to think about process and system pitfalls.
At the same time, this is part of a change management process: by asking people for their opinion and view on the current state of affairs, you give them a chance to be part of the change instead of forcing change on them and then working against them instead of with them.
With the help of all experts (users), you draw up the current processes (as-is) and, at the same time, list all weaknesses (things you want to get rid of) and strengths (things you want to keep).
Now design your desired process with this information; from there you can derive the requirements for the new system that will replace the old one.
The requirements list will contain features that your old system has as well as new ones, and the new system may offer more (or less) functionality than the old one, depending on the ideal process for your company in this specific context.
an example...