DH101 2013/2014 course 5 - Project on Venice / Datafication / Regulated representations / XML
1. Digital Humanities 101 - 2013/2014 - Course 5
Digital Humanities Laboratory
Frédéric Kaplan
frederic.kaplan@epfl.ch
2.
Peer reviewing of blog posts has started this week. Your
reviews are expected by the time of next week’s course.
You can change your grades till the last moment.
Digital Humanities 101 - 2013/2014 - Course 5 | 2013
3.
Semester 1 : Content of each course
• (1) 19.09 Introduction to the course / Live Tweeting and Collective note
taking
• (2) 25.09 Introduction to Digital Humanities / Wordpress / First assignment
• (3) 2.10 Introduction to the Venice Time Machine project / Zotero
• 9.10 No course
• (4) 16.10 Digitization techniques / Deadline first assignment
• (5) 23.10 Datafication / Presentation of projects
• (6) 30.10 Pattern recognition / OCR / Deadline peer-reviewing of first
assignment
4.
Semester 1 : Content of each course
• (7) 6.11 Semantic modelling / RDF
• (8) 13.11 Historical Geographical Information Systems, Procedural modelling
/ City Engine / Deadline Project selection
• (9) 20.11 Crowdsourcing / Wikipedia / OpenStreetMap
• (10) 27.11 Cultural heritage interfaces and visualisation / Museographic
experiences
• 4.12 Group work on the projects
• 11.12 Oral exam / Presentation of projects / Deadline Project blog
• 18.12 Oral exam / Presentation of projects
5.
Structure of today's course
• Presentation of the projects for semester 2
• Introduction to datafication and regulated representations (maps + textual
documents)
• A short introduction to a possible content encoding tool : XML
7.
Context : The Venice Atlas. A book and an interactive site.
8.
Each project could be a section of this atlas (population,
politics, timelines, etc.)
9.
What all the projects have in common
• Each project will start with a set of sources. We will suggest some possible
sources on dh101.ch, but you can add others.
• These sources should be digitised (course 4).
• The content of these sources will be transformed into a data model (course 5 +
7 + 8), possibly with automatic processes (course 6).
• This data model will be stored in a database (course 7 + 8), possibly
allowing other contributors to extend or improve the data (course 9).
• This data model will be the basis of both a static visual representation (a set of
images) and an interactive one (an HTML5, Unity or other site) (course
8 + course 10).
10.
What all the projects have in common
• The project will be conducted by groups of 2-3 students
• We will present a list of possible projects (but you can also invent your own
provided that it respects the common features and objectives described on the
previous slide)
11.
What you have to do
• Form groups and choose or invent a project (Deadline 13.11).
• Use Framapad for this : put your name under the project you are interested in.
• Create an independent blog (NOT dh101.ch) for your project (Deadline 11.12,
30 % of your final grade), including :
• The definition of the project objectives and deliverables (100 words)
• A methodology section (how you will approach the digitization, modelling and presentation of
your data) (750 words)
• A project plan with milestones
• Present the project orally as a group on 11.12 or 18.12 (7-minute presentation
+ 3 minutes of questions, 20 % of your final grade).
13.
Dominio da Mar (T1)
Timeline of Dominio da Mar (cities,
fortresses, colonies).
The objective is to summarize
chronologically the Venetian settlements
overseas. You will have to distinguish the
places under direct administration from those
indirectly supervised by Venice. Territories
will appear and disappear over the
centuries.
14.
Dominio da Terra Ferma (T2)
Timeline of Dominio da Terra Ferma. The
goal is to show that Venice was also
powerful on land and controlled the
key sites for trade and money :
rivers, cities, roads.
15.
Political structure (T3)
An evolution of the political and
administrative structure. The political and
administrative structure of Venice is
special. It’s a complex game of control
and counter-control. The objective here is to
visualize and understand how this system
was built over the years and
which events led to its
creation.
16.
Venetian cartography (T4)
History of Venetian mapping from the
Middle Ages to the late Republic : from Fra
Mauro to Albrizzi. Understand the
complex issues involved in mapping and
geographical representations in different
periods. Following the work of prominent
Venetian cartographers through notable
examples available online, visually
highlight the evolution of this craft.
17.
3D and procedural modeling (MP)
18.
A 3D model of the Venetian ships (MP1)
A 3D model of the Venetian ships
(Galleys, Coques, Bucintoro...). The goal
of this project is to reconstruct in 3D
some kinds of ships (including
the inside of the ships !), based on the
documentation gathered by the DHLAB.
19.
Architectural grammars (MP2)
Automatic extraction of building facades
from a picture. The objective of this
project is to build a system that extracts the
architectural grammar of a building from
a single picture and to use the
resulting models to recreate unknown
buildings using procedural approaches.
20.
The Lepanto battle (MP3)
A simulation of the Lepanto battle. The
Lepanto battle is still (with Trafalgar) one
of the greatest naval battles in history.
It is well documented and often painted. The
objective of this project is to enter the
core of the battle and to go beyond
narration or simple 2D visualizations.
21.
Galley rowing (MP4)
How to row a galley. There were different
ways to row. The objective here is to
show in an interactive and didactic
manner the techniques for moving those
giants of the seas.
22.
Facades of Venice (MP5)
A complete model of all the facades of
Venice. The goal of this project is to
create a database of all the facades of all
the buildings of Venice. The starting
point will be some existing 3D models, such as
that of Google Earth, from which low-quality
pictures could be extracted. The
challenge will be to improve these pictures
to create higher-resolution models.
23.
Data mining and pattern recognition (D)
24.
Tourist pictures (D1)
A “Google Street View” of Venice. Based
on a large number of photos taken by
tourists, is it possible to build a kind of
“Google Street View” of Venice ? What
else can we extract from these pictures ?
25.
Ornaments in print (D2)
Matching techniques. Ornaments in print
offer a unique signature to identify the
origin of a printed document. The goal
of the project is to extract the ornaments
present on each page from a database
of documents and to design a
classifier that attributes a given
set of ornaments to a given Venetian
printer. The tool could be used to map
the diffusion of Venetian prints.
26.
Citations of the archive (D3)
Text mining. The goal of this project is to
identify which sections or documents of
the Archivio di Stato are most often used
by scholars. The project could use text
mining techniques on articles or scanned
books to create representations of the
parts of the archive that are used the
most.
28.
Piracy and corsairs (S1)
A representation of the piracy/corsair
areas in the Mediterranean Sea. Pirates
and corsairs operate where high-value
cargoes are in transit. The project can
model one type or the other, or follow some
famous characters. The objective is to
locate the dangerous areas and the
conflicts with the Venetian maritime
routes.
29.
High-value cargo networks (S2)
A representation of the high-value
cargo networks (silk, pepper, spices,
sugar, wood, metal, cotton, slaves...). The
objective is to model the networks for
trading pepper, cotton, salt, slaves ...
from their countries of origin. This project
can easily be divided into several
subprojects, each focusing on one good.
30.
Pilgrimage (S3)
Pilgrimage from Venice to Jerusalem.
Testimonies are a great and
important source of information. The idea
here is to extract from a pilgrim's account
the information about the trip on board a
Venetian galley and to model the trip.
31.
Competing networks (S4)
A representation of the competing trades
at sea (Genoese, Pisans, Catalans,
Spaniards...). Everyone has an archenemy.
Venice had several for quite some time, and
the major one was Genoa. The objective
here is to locate the main ports and
stopovers and to model their shipping
lanes.
32.
Algorithmic models for maritime routes (S5)
Algorithmic models for maritime routes.
The objective is to model itineraries
automatically when the stopovers are
known and to add related data such as
winds, currents, known speeds of the
ships used, etc.
33.
Route planner (S6)
A Mediterranean route planner. Based on
the data available about the Venetian
ships, can we build a Mediterranean route
planner ? If I am in Corfu in June 1342
and want to get to Constantinople, when
can I take a boat ?
35.
Financial networks (F1)
The objective of this project is to model
the complexity of the market and the
incoming and outgoing flows of money in the
Venetian empire.
37.
Venetian prints (P1)
Mapping the Venetian prints in Europe.
Quantitative outlook through mining of
online catalogues. What was printed and
when ? Where is it now ? Query online
catalogues for old Venetian printed books
(i.e. printed before 1797) and build a database
from the results. Make the database accessible via a
geomap, and add a time slider. What can
you conclude about the Venetian printing
industry in the long run ?
38.
Mapping the printing industry inside Venice (P2)
Mapping the printing industry inside
Venice. Take de’ Barbari’s map and make it
interactive with information about the
position of the different printing shops,
academies and other places of culture.
39.
Coevolution of the city with its environment (E)
40.
Acque Alte (E1)
A representation of the Acque Alte. How
can we model the rising water level and the
floods in Venice ?
41.
The Plague (E2)
Venice and the plague. Plague
epidemics were severe during the
Middle Ages, and Venice, as a big city, was
hit badly. The idea is to visualize the
propagation of the disease through the town as
well as the major changes made by the Venetian
administration in order to handle the
epidemics (quarantine, doctors,
lazaretto...).
42.
Life in Venice (L)
43.
Demography (L1)
Representation of the demographic
evolution. Venice was one of the most
populated cities during the Middle Ages.
Some information is available. How did
Venice grow ? Where are the major
incidents ?
44.
Famous characters (L2)
Following a famous character in Venice.
What are the differences between the
Venice of Goldoni and the Venice of
Byron ? Which buildings
could they have visited, and where did they
meet friends and hang out ? Can we
follow them through the town ?
45.
Venetian cryptographies (L3)
Spies, code-crackers and ciphering. Some
of the first code-crackers worked in
Venice, such as Giovanni Soro at the beginning
of the 16th century, known as the father
of modern cryptography. What did
ciphers look like at the time in Venice ?
How and when were they used ?
46.
Visual representations of power (L4)
Visual representation of power : public
ceremony and the enforcement of social
hierarchy. Get a scholarly understanding of
the socio-political implications of public
ceremonies via the literature. Select
meaningful paintings (or other sources),
and build a visual explanation of (some
of) these events. The project could draw
comparisons or highlight
relations and differences.
47.
A Facebook of the Venetian elite (L5)
A Facebook of the Venetian elite. Based
on pictorial and textual sources, create a
database of the Venetian elite, with
images of all the most important
characters.
48.
You can invent your own projects
49.
In the last course we learned how to digitize documents.
Today we are going to learn how to encode the content of a
document in a structured format. Next week we will see
how we can automate this kind of encoding through
pattern recognition.
50.
The topic of today is the transformation of an image into
information : A datafication process.
51.
A special form of encoding is done by palaeographers
when they transcribe documents and produce critical
editions.
52.
There are many kinds of editions, with different focuses.
Some focus only on the textual content (the immaterial
part), others describe also aspects of the document itself
(the material part). It all depends on the goal and
expected usages of the transcription.
53.
In this course, we will take a more general view on this
problem, by introducing the concept of regulated
representation.
54.
Most documents are regulated representations
55.
A regulated representation is a representation governed
by a set of production and usage rules.
56.
Examples of regulated representations
• A list of names
• An accounting table
• A family tree
• A map of a region
• A Census
57.
Maps as regulated representations
• There are conventional presentation rules to follow when creating a map,
like indicating the scale or the direction of the North and conventional
methods to follow in order to create the map contents.
• In terms of usage, one must learn how to read a map. This map-reading
skill also involves many related skills for handling the map, orientating
oneself in front of it, etc. These skills are either taught or learnt by
imitation.
58.
Information is encoded using a given method and
decoded using another method
59.
Using a regulated representation is an act of
communication. Regulated representations impose a
structure to create a channel. On this channel some
information can be transmitted.
61.
Regulated representations change over time
• They usually tend to become more regular.
• The general process behind this regulating tendency is the transformation of
conventions into mechanisms.
• The regulation usually proceeds in two consecutive steps :
• mechanizing the representation production rules
• mechanizing its conventional usages.
• Ultimately, through this process, regulated representations tend to
become machines.
73.
How maps have become machines
• From a tool to a machine : By becoming machines, maps have
internalized their own usage rules. As machines, they offer many more
possibilities than traditional maps. However, these various new modes of
usage are explicitly programmed.
• As maps become machines, they are progressively merged into a global
mechanical system in which a multitude of maps are aggregated into a
single one. As regulated representations get more regular, they tend to
aggregate into unified systems.
74.
How maps have become machines
• The relation of maps to time changes during the mechanization process.
When the map gets fully mechanized, the image of a map becomes just a
transitory state that can be automatically updated at any moment to
reflect more accurately the state of the earth.
• Mechanization changes where the value lies. What is of value in the new
associated economy is not the map contents but the traces of usage left
by the map readers.
75.
In the remaining part of this course, we are going to
consider only textual documents (we will talk about maps
and other kinds of objects in another course)
76.
Regulated textual documents are characterized by a specific
layout or internal structure.
79.
Each family of documents is characterised by a common
structure. Each document is characterised by specific
textual content.
80.
The printing press revolution is an important step in the
history of regulated textual representations
81.
Principle of the printing press
89.
Form ready for printing
90.
Printing press chronology (beginnings)
• Woodblock printing (China, 7th century CE)
• Cast metal movable type (Korea, 13th century CE)
• Paper (rag paper), introduced from Asia, starts to be used in Europe. First paper
mills (14th century)
• Block printing in Europe, esp. cheap devotional publications (beginning of the
15th century) / Block book
91.
Biblia Pauperum, Wood Blocks, No movable types
100.
Printing press chronology (end)
• Iron press printing (Stanhope press, 1800)
• Rotary printing press (1818, Napier)
• Electrotyping (1838)
• Type-composing machine (1841)
• Industrial paper made from wood pulp (1870)
• Linotype machine (1886)
• First Xerox inkjet printer (1955)
• First 3D printer (1984)
101.
How to encode the content of a document ?
102.
Textual documents are multidimensional
• the linguistic dimension (text, grammatical rules)
• the semantic dimension (what the words mean)
• the literary dimension (style, rhetorical features)
• the graphemic dimension (the kind of letter forms used to represent sounds)
• the iconic dimension (the ornaments in the document)
• the codicological dimension (the study of the manuscript itself)
• All these dimensions can be studied separately (cf. Elena Pierazzo on Digital Scholarly
Editing http://www.elenapierazzo.org/)
103.
Encoding the content of a document depends on the
purpose of the study
105.
What is XML and what is it good for ?
106.
What is XML and what is it good for ?
107.
What is XML
• XML stands for eXtensible Markup Language.
• To write in XML you write text with tags : <atag> my text </atag>
• This can be done in any text editor.
• XML is a W3C recommendation.
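As a minimal sketch of the tag syntax on this slide, the snippet below reads a single element with Python's standard `xml.etree.ElementTree` module. The tag name `atag` is the arbitrary one used on the slide; choosing Python as the reading tool is an assumption of this example, not part of the course.

```python
import xml.etree.ElementTree as ET

# One element: an opening tag, some text, and the matching closing tag.
snippet = "<atag> my text </atag>"

element = ET.fromstring(snippet)
tag_name = element.tag            # the name written between < and >
content = element.text.strip()    # the text between the tags, without padding spaces

print(tag_name)
print(content)
```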
108.
4 characteristics of XML
• XML is used to describe data, not to display them. XML does nothing. It
describes.
• XML tags are not predefined. You can define your own tags. This gives you a
lot of freedom to describe the structure you want to describe.
• When you are satisfied with your structure, you can fix your XML language by
writing a DTD (Document Type Definition). Thus, XML allows
fluidity first and rigor afterwards.
• XML is designed to be self-descriptive and easily readable. It is used to write
pivotal descriptions in production chains.
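To illustrate the point that tags are not predefined, here is a small sketch that builds a description with invented tags. The names `ship`, `name`, `homePort` and the attribute `type` are hypothetical, chosen only for this example; XML itself defines none of them.

```python
import xml.etree.ElementTree as ET

# Hypothetical tags: nothing in XML predefines "ship", "name" or "homePort".
ship = ET.Element("ship", attrib={"type": "galley"})
ET.SubElement(ship, "name").text = "Example galley"
ET.SubElement(ship, "homePort").text = "Venice"

# Serialise the structure to plain text. The result only describes the data;
# it says nothing about how the data should be displayed.
xml_text = ET.tostring(ship, encoding="unicode")
print(xml_text)
```

The same structure could later be frozen in a DTD once the group agrees on the tag names.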
109.
Genealogy of XML
• In the 50s, the first computers could not communicate with one another if
they were from different brands.
• In the 60s, IBM creates GML (Generalized Markup Language) to enable data
exchanges and make the data structure explicit. This is a great success. It
becomes a standard : SGML (Standard Generalized Markup Language). The
US federal government adopts it.
• In the 90s, Tim Berners-Lee at CERN creates the HTML language using a
subset of SGML. HTML specializes in displaying data but does not
impose a standard way of describing data. A group of researchers imagines
another language to do this. The first version of XML is ready in 1998.
110.
HTML vs XML
• XML is a markup language like HTML.
• XML is not a replacement of HTML. The two languages have different goals.
• XML is for the transport and the description of structured data.
• XML does nothing. It just describes.
• XML is like a database in plain text.
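The "database in plain text" comparison can be sketched as a simple query over an XML document. The records and the tag names (`fleet`, `ship`, `name`) are invented for this example, and the query uses the limited path syntax supported by Python's standard library.

```python
import xml.etree.ElementTree as ET

# Hypothetical records stored as plain text.
source = """<fleet>
  <ship type="galley"><name>Ship A</name></ship>
  <ship type="cog"><name>Ship B</name></ship>
  <ship type="galley"><name>Ship C</name></ship>
</fleet>"""

root = ET.fromstring(source)

# Select every <ship> whose "type" attribute is "galley", then read its <name>.
galleys = [ship.findtext("name") for ship in root.findall("ship[@type='galley']")]
print(galleys)
```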
111.
Structure of an XML file
114.
With XML, you can create your own tags.
115.
The header specifies the XML version and the encoding
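A short sketch of why the header matters: the same name is stored below in two different byte encodings, and the parser relies on the `encoding` declared in each header to recover the same text. The element name `name` is an assumption of this example.

```python
import xml.etree.ElementTree as ET

# The same content, written to bytes with two different encodings.
# Each header declares the XML version and the encoding actually used.
utf8_doc = '<?xml version="1.0" encoding="UTF-8"?><name>Frédéric</name>'.encode("utf-8")
latin1_doc = '<?xml version="1.0" encoding="ISO-8859-1"?><name>Frédéric</name>'.encode("iso-8859-1")

# The bytes differ, but the parser follows each header and yields the same string.
name_from_utf8 = ET.fromstring(utf8_doc).text
name_from_latin1 = ET.fromstring(latin1_doc).text

print(utf8_doc == latin1_doc)
print(name_from_utf8 == name_from_latin1)
```

A wrong or missing encoding declaration is a common source of garbled accented characters.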
116.
An XML file is like a tree
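The tree structure can be made visible by walking a document from its root and printing each element under its parent. The document and its tag names (`atlas`, `section`, `map`, `timeline`) are hypothetical, loosely inspired by the atlas idea presented earlier.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: one root, two branches, three leaves.
source = """
<atlas>
  <section title="Population">
    <map/>
    <map/>
  </section>
  <section title="Politics">
    <timeline/>
  </section>
</atlas>
"""

def outline(element, depth=0):
    """Return one indented line per element, parents before children."""
    lines = ["  " * depth + element.tag]
    for child in element:
        lines.extend(outline(child, depth + 1))
    return lines

root = ET.fromstring(source)
tree_lines = outline(root)
print("\n".join(tree_lines))
```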
117.
Is this a problem ?
118.
DTD (Document Type Definition)
• A well-formed XML document follows
the general rules of XML syntax.
• A valid XML document follows the
specific rules written in a DTD
(Document Type Definition).
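The first of the two notions, well-formedness, can be checked with any XML parser. The sketch below uses Python's standard library, which is a non-validating parser: it can tell whether a document is well formed but does not check it against a DTD. Validity checking would need a validating tool (the third-party `lxml` library is one option). The tag names and the DTD shown in the comment are hypothetical.

```python
import xml.etree.ElementTree as ET

# For reference, a DTD describing the hypothetical "letter" dialect could read:
#   <!ELEMENT letter (sender, date)>
#   <!ELEMENT sender (#PCDATA)>
#   <!ELEMENT date (#PCDATA)>
# It is not used below: this sketch only tests well-formedness.

def is_well_formed(text):
    """Return True if the text obeys the general XML syntax rules."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False

good = "<letter><sender>A</sender><date>1342</date></letter>"
bad = "<letter><sender>A</date></letter>"  # closing tag does not match the opening tag

print(is_well_formed(good))
print(is_well_formed(bad))
```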
119.
When to use a DTD
• Using a DTD is not mandatory.
• A DTD makes it possible to agree on a common XML dialect.
• Some software can check whether an XML file is valid with respect to a
given DTD.
120.
TEI (Text Encoding Initiative) is a family of special XML
dialects for describing the content of documents
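As a hedged illustration of what TEI markup can look like, the fragment below tags a person and two places in an invented sentence and then lists them. `persName` and `placeName` are TEI elements and the namespace is the TEI one, but the fragment is deliberately incomplete (a full TEI file also requires a `teiHeader`) and the sentence is made up for this example.

```python
import xml.etree.ElementTree as ET

# TEI elements live in this namespace.
TEI_NS = {"tei": "http://www.tei-c.org/ns/1.0"}

# Simplified fragment with an invented sentence; not a complete TEI document.
fragment = """
<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <text>
    <body>
      <p><persName>Marco</persName> sails from <placeName>Venice</placeName>
         to <placeName>Corfu</placeName>.</p>
    </body>
  </text>
</TEI>
"""

root = ET.fromstring(fragment)

# Collect every tagged person and place, wherever they occur in the text.
people = [e.text for e in root.findall(".//tei:persName", TEI_NS)]
places = [e.text for e in root.findall(".//tei:placeName", TEI_NS)]
print(people, places)
```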
121.
CSS and XSLT script
• The way an XML file is displayed can be
specified in a CSS stylesheet.
• A document can also be transformed
using an XSLT script. This is now the
recommended method
124.
XML is a pivotal format
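One way to read "pivotal format" is that a single XML description can feed several outputs. The sketch below converts the same hypothetical records to JSON (for instance for a web interface) and to CSV (for instance for a spreadsheet). The records, tag names and target formats are assumptions of this example.

```python
import csv
import io
import json
import xml.etree.ElementTree as ET

# Hypothetical records; titles, printers and years are placeholders.
source = """
<prints>
  <print year="1501"><title>Book A</title><printer>Printer X</printer></print>
  <print year="1520"><title>Book B</title><printer>Printer Y</printer></print>
</prints>
"""

root = ET.fromstring(source)
records = [
    {"year": p.get("year"), "title": p.findtext("title"), "printer": p.findtext("printer")}
    for p in root.findall("print")
]

# Target 1: JSON.
as_json = json.dumps(records)

# Target 2: CSV, written to an in-memory buffer.
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=["year", "title", "printer"])
writer.writeheader()
writer.writerows(records)
as_csv = buffer.getvalue()

print(as_json)
print(as_csv)
```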
125.
Debate : Is XML the right way to represent the content
of a document ? What are its strengths and weaknesses ?
126.
In two weeks we will learn about a complementary
technique for encoding information : Semantic graphs.