An extensive but high-level look at the principles, methods, and tools of data visualisation for humanities research, illustrated with a couple of case studies.
1. Data Visualisation in the Digital Arts and Humanities
Tools, Methods & Techniques to Put Your Research Data to Work
Shawn Day
Queen’s University Library
2. Objective
‣ To appreciate the rich variety of techniques and tools available to digital humanities scholars for data visualisation and analysis.
This workshop introduces the varied uses of data visualisation in the humanities through examples and case studies, and will hopefully inspire you to some hands-on fun.
4. Upcoming Seminars and Workshops
‣ 18 November - A Survey of Digital Humanities
‣ 2 December - Engaging Your Audience with Your Research Data (Exhibit)
‣ 9 December - Telling Stories with Data – Collections Visualisation for Arts and Humanities Scholars (OMEKA)
‣ January - Digital Project Management
‣ February - Hands On Workshop – Data Visualisation for Presentation
‣ February - Social Scholarship – Tools for Collaborative Research
‣ March - Data Visualisation for Textual and Spatial Analysis
‣ More to come: http://qubdh.co.uk
6. Breakpoint
‣ One of the keys to good visualization is understanding what your immediate (and longer term) goals are.
‣ Are you visualizing data to understand what’s in it, or are you trying to communicate meaning to others?
‣ You - Visualisation for Data Analysis
‣ Share with Others - Visualisation for Presentation
7. Why Visualise?
The Basics
‣
Open Up Large Datasets
‣
Increase Density of Observable Data
‣
Reduce Complexity
‣
Aestheticise Data
‣
Illustrate an Interpretation
‣
Make an Argument
8. Why Visualise?
The Psychology and Physiology
‣
Bypass language centres to tap directly into the visual
cortex;
‣
Leverage ability to recognise patterns - what they call
visual sense-making;
‣
Powerful graphics engines now allow for live data
processing and sophisticated animations and interactive
research environments.
9. Why Visualise?
From a Data Perspective
‣
Can link different formats
‣
Can share more easily with others
‣
Can see new meanings and connections
‣
Sort and re-organize in automated fashion
‣
Manage larger amounts of information
‣
Visualise your results
10. Why Visualise?
For Humanities Research
‣
Work with new data to create new knowledge
‣
Explore data to discover things that used to be unknown,
unknowable or impractical to know
‣
Take a new perspective on the familiar to reveal
previously hidden insights
19. How Could You Use Data Analysis?
‣ “In the Lab” - for your own analysis
‣ Online as part of collaborative groups
‣ Through dissemination, to extend your own work by crowdsourcing
‣ Others?
28. Mechanics of the Process
‣
Generate word frequency (Voyant, TAPoR)
‣
Isolate known farm activities (NLP - LanguageWare)
‣
Collocate to link activity references to time, duration, and
resources (Voyant)
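The first and last steps above can be sketched in a few lines of Python. This is only an illustration of the idea of word frequency and collocation, not how Voyant, TAPoR or LanguageWare work internally. The sample text, the function names and the window size of three words are all assumptions made up for this sketch.

```python
import re
from collections import Counter

# Invented diary-style sample text, used purely for illustration.
SAMPLE = (
    "Monday ploughed the north field all day. "
    "Tuesday ploughed again and sowed oats in the afternoon. "
    "Wednesday sowed oats and mended the fence."
)

def tokenize(text):
    """Lower-case the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())

def word_frequency(tokens):
    """Count how often each token occurs."""
    return Counter(tokens)

def collocates(tokens, keyword, window=3):
    """Count the words found within `window` tokens either side of `keyword`."""
    counts = Counter()
    for i, tok in enumerate(tokens):
        if tok == keyword:
            lo = max(0, i - window)
            hi = min(len(tokens), i + window + 1)
            # Count every neighbour in the window except the keyword itself.
            counts.update(
                t for j, t in enumerate(tokens[lo:hi], start=lo) if j != i
            )
    return counts

tokens = tokenize(SAMPLE)
freq = word_frequency(tokens)
near_sowed = collocates(tokens, "sowed")

print(freq.most_common(5))
print(near_sowed.most_common(5))
```

In a real diary the activity words would come from the NLP step. Here "sowed" is simply picked by hand.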
32. What is the Value of this Visualisation?
‣ Easier to compare over intervals
‣ Multiple vectors with greater granularity in a compressed space
‣ The challenge is to find rich enough source materials to yield substantive datasets
37. Case Study: Occupations of Politicians
‣
What are we studying?
• Self-declared occupations of politicians
‣ Why?
• What bias might they bring to their job?
‣ How?
• Visualising past occupations and mapping them to the political platform of the affiliated party
41. The Result/ New Patterns
‣ The emergence of the professional politician with no private sector experience
‣ Occupational continuity across changes in governing party
‣ http://dev.dho.ie/~sday/dail/index.html
44. The Value of Data Vis for Analysis
‣
New ways of presenting allow new ways of seeing
‣
Hidden patterns become evident
‣
Suggest other hypotheses to test for
‣
Good research raises more questions than answers
46. General Steps in Data Vis for DH
1. Discovery / Acquisition
2. Cleaning / ‘Munging’
3. Analysis / Exploratory Vis
4. Presentation
47. Types of Data to Visualise
‣ Audio Data
‣ Categorical Data
‣ Cartographic Data
‣ Collections
‣ Image Data
• Still
• Moving
‣ Metadata
‣ Multimedia Data
‣ Network Data
• Social
• Other
‣ Numerical Data
‣ Temporal Data
‣ Textual Data
• Narrative
• Qualitative
‣ ????
49. Audio Data
‣
What does sound look like?
Visualisation of "Canada is Really Big" by The Arrogant Worms
http://www.sonicvisualiser.org/
50. Audio Data: The Shape of Song
‣ http://www.turbulence.org/Works/song/index.html
‣ Measuring Musical Patterns using Translucent Arcs
‣ Repetition
Philip Glass, Candyman 2
Madonna, Like a Prayer
51. Audio Data: IBM ‘Glass Engine’
http://www.philipglass.com/glassengine/
52. Categorical Data
‣ Data is grouped into categories based on a qualitative trait.
‣ The resulting data represents the labels of these groups.
‣ Nominal, Ordinal and/or Binary
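A small Python sketch may help to tell the three kinds of categorical data apart. The records, field names and the ordering of education levels below are invented for illustration. The point is only that nominal labels can be counted, ordinal labels can also be ranked, and binary labels have exactly two values.

```python
from collections import Counter

# Invented census-style records, for illustration only.
records = [
    {"occupation": "farmer", "education": "primary", "literate": True},
    {"occupation": "teacher", "education": "tertiary", "literate": True},
    {"occupation": "farmer", "education": "secondary", "literate": True},
    {"occupation": "labourer", "education": "primary", "literate": False},
]

# Nominal: labels with no inherent order, so counting is all we can do.
occupation_counts = Counter(r["occupation"] for r in records)

# Ordinal: labels with a meaningful order, which has to be stated explicitly.
EDUCATION_ORDER = ["primary", "secondary", "tertiary"]
by_education = sorted(
    records, key=lambda r: EDUCATION_ORDER.index(r["education"])
)

# Binary: exactly two possible labels, so a simple share is meaningful.
literate_share = sum(r["literate"] for r in records) / len(records)

print(occupation_counts)
print([r["education"] for r in by_education])
print(literate_share)
```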
56. Digital Collections
‣ Collections of data, images, movies, sound … etc
• Visualise the object in context as part of collection
• Represent the structure of the collection
61. Numerical/Quantitative Data
‣
Does anyone really need me to tell them about this?
• Analysed using statistical methods
• Displayed using tables, charts, histograms and graphs…
62. Social Network Data
‣ Nodes and Edges
‣ Representing relations between objects, and quantifying and qualifying those relations
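As a rough sketch of nodes and edges, the Python below stores a tiny network as an edge list and derives two simple measures. The names, the "letters" relation and the counts are invented. The edge label qualifies the relation, and the weight quantifies it.

```python
from collections import defaultdict

# Invented correspondence network: (person, person, kind of tie, weight).
edges = [
    ("Ada", "Ben", "letters", 3),
    ("Ada", "Cara", "letters", 1),
    ("Ben", "Cara", "letters", 2),
    ("Cara", "Dev", "letters", 5),
]

# Build an undirected adjacency map: node -> {neighbour: weight}.
adjacency = defaultdict(dict)
for a, b, kind, weight in edges:
    adjacency[a][b] = weight
    adjacency[b][a] = weight

# Degree: how many distinct neighbours each node has.
degree = {node: len(nbrs) for node, nbrs in adjacency.items()}

# Strength: the total weight of the ties attached to each node.
strength = {node: sum(nbrs.values()) for node, nbrs in adjacency.items()}

print(degree)
print(strength)
```

Tools such as Gephi compute these and many richer measures for you. The sketch only shows what the underlying data looks like.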
74. Visualizing What?
‣
Basic types of content that we are used to dealing with:
• Text
• Numbers
• Image
• Video
‣ Other, more “complex” stuff:
• Relations, connections, links - a genealogy
• Time and space coords - the path of migratory birds
• Animations – a piece of courseware
• 3D models – the plan of your house
78. Cleaning / Munging
(Normalisation, Format Conversion)
‣ Tools:
• Data Wrangler
• Google Refine
• Mr. Data Converter
‣ Data Wrangler
• Does simple split, clear, and fold/unfold transforms on data
• See example --> Data and Script
‣ Google Refine
• Works with larger datasets
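To make normalisation and format conversion concrete, here is a minimal Python sketch using only the standard library. It does not reproduce what Data Wrangler, Google Refine or Mr. Data Converter do. The messy sample data, the column names and the clean_rows helper are all assumptions invented for this example.

```python
import csv
import io
import json

# Invented messy CSV: stray spaces, mixed case, a blank row, a missing year.
RAW = """Name , Occupation,Year
 mary byrne ,FARMER,1861
John  Smith, farmer ,1861
,,
Ann Lee,Teacher,n/a
"""

def clean_rows(raw_text):
    """Normalise whitespace and case, drop blank rows, and type the year."""
    reader = csv.DictReader(io.StringIO(raw_text))
    cleaned = []
    for row in reader:
        # Trim header names and collapse runs of whitespace inside cells.
        row = {k.strip().lower(): " ".join((v or "").split())
               for k, v in row.items()}
        if not any(row.values()):
            continue  # skip rows where every cell is empty
        row["name"] = row["name"].title()
        row["occupation"] = row["occupation"].lower()
        # Keep the year only when it is a plain number, otherwise mark missing.
        row["year"] = int(row["year"]) if row["year"].isdigit() else None
        cleaned.append(row)
    return cleaned

cleaned = clean_rows(RAW)

# Format conversion: the same cleaned records, serialised as JSON.
as_json = json.dumps(cleaned, indent=2)
print(as_json)
```

Marking the unreadable year as missing, rather than guessing a value, keeps the cleaning step honest about what the source actually says.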
84. Now You’ve Got Data ...
‣ What’s Next?
‣ Data Visualisation in the Analysis Process
‣ Data Visualisation for Presentation
85. General Steps in Data Vis for DH
1. Discovery / Acquisition
2. Cleaning / ‘Munging’
3. Analysis / Exploratory Vis
4. Presentation
86. Breakpoint
‣
Are you visualizing data to understand what’s in it, or are
you trying to communicate meaning to others?
‣
You - Visualisation for Data Analysis
‣
Share with Others - Visualisation for Presentation
87. Google Ngram Viewer
‣
Examine word frequency in digitised books
‣
Currently about 4% of books ever published
‣
In English, Chinese, French, German, Hebrew, Russian, and
Spanish
‣
Changes in word usage
‣
Trends
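The idea behind a word-usage trend can be sketched in Python as a relative frequency per year. This is not how the Google service is implemented, and the two-year mini-corpus below is invented. It only shows why counts are divided by the total number of words in each year before being compared.

```python
from collections import Counter

# Invented mini-corpus keyed by publication year, for illustration only.
corpus = {
    1850: ["the railway opened", "the canal and the railway"],
    1900: ["the railway and the telephone", "the telephone rang"],
}

def relative_frequency(word, texts):
    """Share of all tokens in `texts` that are equal to `word`."""
    tokens = [t for text in texts for t in text.lower().split()]
    return Counter(tokens)[word] / len(tokens) if tokens else 0.0

def trend(word):
    """Relative frequency of `word` for each year, in chronological order."""
    return {year: relative_frequency(word, texts)
            for year, texts in sorted(corpus.items())}

railway_trend = trend("railway")
telephone_trend = trend("telephone")

print(railway_trend)
print(telephone_trend)
```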
89. The Value of Data Vis for Analysis
‣
New ways of presenting allow new ways of seeing
‣
Hidden patterns become evident
‣
Suggest other hypotheses to test for
‣
Good research raises more questions than answers
91. Data Analysis Principles
1. Process is a Way of Thinking, not a Substitute for Thinking
2. Data needs to be considered and reported in Context
3. Look Before you Leap - Get to Know Your Data
4. Question Everything - Collection Process, Bias, etc.
5. Do a Gut Check
6. Coincidence is Not the Same as Causality
7. Just Because Data Exists Doesn’t Mean it’s Relevant
Fern Halper - Seven Guiding Principles
100. Objectives
‣ Consider best practices in sharing research findings using visualisation tools;
‣ Identify and judge between publicly available tools to create and deploy humanities visualisation research products;
‣ Consider data visualisation as part of a larger research discussion.
101. General Steps in Data Vis for DH
‣ Discovery / Acquisition
‣ Cleaning / ‘Munging’
‣ Analysis / Exploratory Vis
‣ Presentation
105. Data Visualisation Lessons from Tufte
‣ Show the Data
‣ Provoke Thought about the Subject at Hand
‣ Avoid Distorting the Data
‣ Present Many Numbers in a Small Space
‣ Make Large Datasets Coherent
‣ Encourage Eyes to Compare Data
‣ Reveal Data at Several Levels of Detail
‣ Serve a Reasonably Clear Purpose
‣ Be Closely Integrated with Statistical and Verbal Descriptions of the Dataset
106. What Visual Techniques Exist?
‣ Connecting your data with the right visualisation
‣ What is your message?
‣ How do we know what we might use?
‣ Start with your Exploratory/Research/Analytical Environment
‣ How do visuals fit into your narrative?
107. What Visual Techniques Exist?
Connecting your data with the right visualisation
Visual Everything
108. Structured Data Presentation Tools
(a tiny subset)
‣ Webservices
• Temporal: TimeFlow
• Google Fusion Tables
• Textual, Spatial and Numeric: Many Eyes
• Temporal: Dipity
• Infographics: Visual.ly
‣ Frameworks
• GraphViz
• Gephi
• Prefuse
• D3
• Processing
• Exhibit (Exercise)
113. Visual.ly
‣ Well-crafted Infographics gaining credibility
‣ The new poster presentation
‣ Data-driven narrative in words and pictures
‣ Visual.ly currently driven by social media
115. Frameworks and Languages
‣ GraphViz
‣ R Programming Language
‣ JIT (JavaScript Infovis Toolkit)
‣ Protovis
‣ D3
‣ Processing
‣ Tableau
‣ Prefuse
‣ Gephi
‣ WEAVE (http://www.oicweave.org/)
‣ Exhibit (Exercise)
116. Graphviz
‣ An Open Source Framework
‣ Mature (1988)
‣ AT&T Labs
‣ Used as a basis for subsequent
‣ A great prototyping and starting point
‣ http://www.graphviz.org/
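Graphviz reads a plain-text graph description in its DOT language, which is part of what makes it handy for prototyping. As a hedged sketch, the Python below writes an undirected DOT graph from an edge list. The to_dot helper and the names in the edge list are invented for this example, and the helper assumes node names contain no double quotes.

```python
# Invented edge list, for illustration only.
edges = [("Ada", "Ben"), ("Ada", "Cara"), ("Cara", "Dev")]

def to_dot(edge_list, name="G"):
    """Return an undirected graph in DOT syntax, one edge per line."""
    lines = [f"graph {name} {{"]
    for a, b in edge_list:
        # "--" is the DOT operator for an undirected edge.
        lines.append(f'    "{a}" -- "{b}";')
    lines.append("}")
    return "\n".join(lines)

dot_text = to_dot(edges)
print(dot_text)
```

Saved to a file such as network.dot, the output could then be rendered with the Graphviz command line, for example `dot -Tpng network.dot -o network.png`.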
117. R Programming Language
‣ Geared towards statistical analysis
‣ More recently has had some powerful graphics frameworks added
‣ Open Source
‣ Typically Command Line but a variety of GUI editors available
‣ > Jeff Rydberg-Cox: R for the Digital Humanities
118. JavaScript InfoVis Toolkit (JIT)
‣ JIT Demos (http://thejit.org/demos/)
‣ The JavaScript InfoVis Toolkit is a complete set of tools to create Interactive Data Visualizations for the Web. It includes JSON loading, animation, 2D point and graph classes and some predefined tree visualization methods.
‣ Smaller datasets in a clean form
‣ Related and Aggregated/Categorised Data
122. ProtoVis
‣ Protovis is a visualization toolkit for JavaScript using SVG. It takes a graphical approach to data visualization, composing custom views of data with simple graphical primitives like bars and dots. These primitives are called marks, and each mark encodes data visually through dynamic properties such as color and position.
‣ Jerome Cukier: ProtoVis Tutorial
‣ Development shifted to D3
‣ ProtoVis still very accessible and usable
125. D3
‣ D3 allows you to bind arbitrary data to a Document Object Model (DOM), and then apply data-driven transformations to the document. As a trivial example, you can use D3 to generate a basic HTML table from an array of numbers. Or, use the same data to create an interactive SVG bar chart with smooth transitions and interaction.
‣ Open Source
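D3 itself is a JavaScript library, so the Python below is not D3 code and uses none of its API. It is only a sketch of the underlying idea in the description above: the same array of numbers is mapped, one datum per element, first to an HTML table and then to a static SVG bar chart. The function names, bar sizes and scale factor are assumptions made up for this example, and the chart has no transitions or interaction.

```python
from html import escape

def html_table(numbers):
    """Map each number to one table row, returned as an HTML string."""
    rows = "".join(f"<tr><td>{escape(str(n))}</td></tr>" for n in numbers)
    return f"<table>{rows}</table>"

def svg_bars(numbers, bar_width=20, gap=5, scale=10):
    """Map each number to one <rect>; assumes a non-empty list of values >= 0."""
    height = max(numbers) * scale
    width = len(numbers) * (bar_width + gap)
    rects = []
    for i, n in enumerate(numbers):
        x = i * (bar_width + gap)
        h = n * scale
        # SVG's y axis points down, so a bar starts at (height - h).
        rects.append(
            f'<rect x="{x}" y="{height - h}" width="{bar_width}" height="{h}" />'
        )
    return (
        f'<svg xmlns="http://www.w3.org/2000/svg" width="{width}" height="{height}">'
        + "".join(rects)
        + "</svg>"
    )

data = [4, 8, 15]
table_markup = html_table(data)
chart_markup = svg_bars(data)
print(table_markup)
print(chart_markup)
```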
127. Processing
‣ Now we are getting serious...
‣ Ben Fry
‣ Like R, has a serious statistical bent
‣ Has a client and development environment, but deploys easily to the web using processing.js
‣ Large and very large datasets
‣ Good with related data
‣ Serious support for aesthetics
‣ Modelling Environment
‣ http://processing.org/
‣ http://www.openprocessing.org/
131. Tableau
‣ Commercial
‣ Offers a Free Public Application
‣ Encourages sharing and focusses on building a narrative around visualisation of your research data
‣ Education and Non-Commercial Licenses available
‣ Mature and evolving rapidly to demonstrate the newest and most exciting visualisation types
134. Gephi
‣ Open Source
‣ Mapping and Visualising Relationships and Networks
‣ An outstanding Visual Development Environment
‣ Multiplatform
‣ Extensible!!
‣ https://gephi.org/
137. Where to go further
‣ DIRT (Digital Research Toolkit)
‣ Timeline Tools
‣ Visualisation in Education
‣ Visual Complexity
‣ DataVis.ca
‣ R: A Tiny Handbook of R - Springer
‣ Using R in DH
‣ MONK
‣ http://datajournalism.stanford.edu/
138. Upcoming Workshops
‣ 18 November - A Survey of Digital Humanities
‣ 2 December - Engaging Your Audience with Your Research Data (Exhibit)
‣ 9 December - Telling Stories with Data – Collections Visualisation for Arts and Humanities Scholars (OMEKA)
‣ January - Digital Project Management
‣ February - Hands On Workshop – Data Visualisation for Presentation
‣ February - Social Scholarship – Tools for Collaborative Research
‣ March - Data Visualisation for Textual and Spatial Analysis
‣ More to come: http://qubdh.co.uk
139. Thank You
Shawn Day - s.day@qub.co.uk - @iridium
The Library/Institute for Collaborative Research in the Humanities
18 University Square
Ground Floor
http://qubdh.co.uk