Presentation by Tessa Hauswedell and Melvin Wevers at DH Benelux 2015. In the presentation, we show how we used computational tools to research how a British newspaper, the Pall Mall Gazette, reported on the British Empire.
Reporting the Empire: The Pall Mall Gazette 1870-1900
1. Reporting the Empire
The Pall Mall Gazette 1870-1900
Tessa Hauswedell, Asymenc, University College London
Melvin Wevers, Translantis, Utrecht University
2. Research Questions
● What insights can distant reading methods provide into
the reporting of empire?
● Can we detect trends and patterns that give us new
insights into how newspapers reported on the colonies,
and on the British Empire overall, to their readers at
home over a longer timescale (30 years, 1870-1900)?
3. The Pall Mall Gazette
● Digitised version from 1870-1900 with high quality OCR
● London-based evening newspaper
● Small circulation, but an influential editor: William Stead
● Newspapers played a vital role in communicating the British Empire at
home, increasingly so from the 1870s onwards with the advent of professional
news agencies and improvements in technology (underwater cables etc.),
which enabled speedier reporting of news.
5. British Newspapers and Empire
● Widespread view amongst historians that the press was
crucial in gaining support for the empire domestically.
● But few actual studies conclusively demonstrate this.
● Studying a smaller newspaper might therefore add more
nuance to the picture.
6. Methods Used
● Four techniques from corpus linguistics:
o Word Frequencies (AntConc)
o Named Entity Recognition (StanfordNER) +
Geomapping (Google Fusion Tables)
o Topic Modeling (Mallet)
o Collocation Analysis (AntConc)
7. DH Workflows (1) NER
● Extract entities: locations over 30-year
period
● January and July newspapers (n>20)
● Used the output to generate a heatmap in
Google Fusion Tables
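The location-extraction step can be sketched roughly as follows. This is a minimal illustration, not the pipeline used for the slides: it assumes the tagger output is available as inline `token/LABEL` text (the slash-tag style that Stanford NER can produce), and the function names and sample sentences are hypothetical. It merges adjacent LOCATION tokens into one place name and prints year, place, count rows of the kind that could be uploaded to a mapping tool.

```python
from collections import Counter


def locations_from_slash_tags(tagged_text):
    """Collect LOCATION entities from 'token/LABEL' text.

    Adjacent LOCATION tokens are merged into one name ("New York").
    Limitation: two different places that directly follow each other
    would also be merged, so the output still needs manual checking.
    """
    entities = []
    current = []
    for item in tagged_text.split():
        token, _, label = item.rpartition("/")
        if label == "LOCATION" and token:
            current.append(token)
        elif current:
            entities.append(" ".join(current))
            current = []
    if current:
        entities.append(" ".join(current))
    return entities


def count_locations(tagged_by_year):
    """Map each year to a Counter of the place names found in its text."""
    return {
        year: Counter(locations_from_slash_tags(text))
        for year, text in tagged_by_year.items()
    }


# Hypothetical tagged snippets, invented for illustration only.
sample = {
    1870: "News/O from/O Paris/LOCATION and/O New/LOCATION York/LOCATION ./O",
    1900: "Troops/O left/O Cape/LOCATION Town/LOCATION for/O "
          "Bloemfontein/LOCATION ./O",
}

counts = count_locations(sample)
for year, counter in sorted(counts.items()):
    for place, n in counter.most_common():
        print(f"{year},{place},{n}")
```

Place names would still have to be geocoded and cleaned by hand before mapping, since NER output on OCR text is error-prone.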
9. AntConc Frequencies (2)
● Count the mentions of cities
● Dominance of Paris > bump in 1870-1871 (Franco-
Prussian War) > not for Berlin, though
● New York most prominent American city
● Fall in mentions of European vis-a-vis American cities
(90 percent in 1870 to 78 percent in 1900)
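A share such as "European vis-a-vis American cities" can be computed along these lines. The sketch below is only an illustration in Python rather than the AntConc workflow itself: the city list, the assignment of cities to regions and the sample sentence are assumptions, and the counts it produces are not results from the corpus.

```python
import re
from collections import Counter

# Hypothetical city list with an assumed region per city.
CITY_REGIONS = {
    "paris": "Europe",
    "berlin": "Europe",
    "new york": "America",
    "chicago": "America",
}


def count_city_mentions(text, cities):
    """Count whole-word, case-insensitive mentions of each city."""
    lowered = text.lower()
    counts = Counter()
    for city in cities:
        pattern = r"\b" + re.escape(city) + r"\b"
        counts[city] = len(re.findall(pattern, lowered))
    return counts


def regional_share(counts, city_regions, region):
    """Percentage of all counted city mentions that fall in `region`."""
    total = sum(counts.values())
    if total == 0:
        return 0.0
    in_region = sum(n for city, n in counts.items()
                    if city_regions[city] == region)
    return 100.0 * in_region / total


# Invented sample text, for illustration only.
text = ("Paris is quiet. Letters from Paris and Berlin arrived; "
        "New York markets rose.")
counts = count_city_mentions(text, CITY_REGIONS)
print(counts)
print(regional_share(counts, CITY_REGIONS, "Europe"))
```

Running the same functions on one text file per year would give a yearly series of shares that can be compared across the period.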
10. DH Workflows (3) - Topic Modeling
● Topic model for each year 1870-1900 (50 topics, 15 words, 200
iterations)
● Using TextVoyant > distinctive locations
o 1878: Russian, Treaty, Russia, Cyprus
o 1882: Egypt, Egyptian
o 1890: Ireland, Chicago, Colony
o 1900: Peking, Paris, China, Bloemfontein, Sydney
● Find the topic words that co-occur with these words
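Looking up the topics that contain the distinctive location words can be sketched as below. The sketch assumes the topic model has been exported as a Mallet topic-keys file (written with the `--output-topic-keys` option), in which each line holds a topic number, a weight and the topic words separated by tabs. The function names are hypothetical and the weights in the sample lines are placeholders; the topic words are taken from the 1878 examples in this presentation.

```python
def parse_topic_keys(lines):
    """Parse tab-separated topic-keys lines into {topic_id: [words]}."""
    topics = {}
    for line in lines:
        parts = line.rstrip("\n").split("\t")
        if len(parts) < 3:
            continue  # skip blank or malformed lines
        topics[int(parts[0])] = parts[2].split()
    return topics


def topics_containing(topics, targets):
    """Return {topic_id: matched words} for topics with any target word."""
    wanted = {t.lower() for t in targets}
    hits = {}
    for topic_id, words in topics.items():
        matched = [w for w in words if w.lower() in wanted]
        if matched:
            hits[topic_id] = matched
    return hits


# Two 1878 topics from the slides; the weight column (1.0) is a placeholder.
sample_lines = [
    "0\t1.0\tjan peace armistice war january interests russian aged "
    "constantinople pasha dec russia england turkish grand",
    "1\t1.0\toct berlin light started cyprus church november collision "
    "envoy asia treaty river gas socialist musical",
]

topics = parse_topic_keys(sample_lines)
hits = topics_containing(topics, ["Russian", "Treaty", "Russia", "Cyprus"])
for topic_id, matched in sorted(hits.items()):
    print(topic_id, matched, "|", " ".join(topics[topic_id]))
```

Matching is case-insensitive, so it also catches capitalised variants in the topic output, such as "PARNELL" next to "Parnell".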
11. Topic Modeling (1878 & 1890)
● jan peace armistice war january interests russian aged constantinople pasha dec russia england turkish
grand
● engraved russian russians forster bread fortifications turk humour gipsy bulgarians organ lewis statue
loud orloff
● oct berlin light started cyprus church november collision envoy asia treaty river gas socialist musical
● feb fleet russia constantinople pope russian meeting conference derby february dardanelles gallipolli
british conditions div
● colonies colony Conference hon penny National labour postage Parnell pupil Bench Museum expenditure
Randolph colonial
● Parnell Mr Irish PARNELL party Ireland leader Rule Koch thle Home leadership Shea Dr political
● Teufel October September Ireland nurses Congress Birchall Comte Church PILLS hospital Hospital life Sept terrier
● calling Victoria Trains Class Returning lead Junction Chicago Fare Fares Isle avenue Guards yards Return
● lIlocblob April Stanley artist Davis licences Goschen ROAD Jones WEDDING painter Chicago Fred PRESENTS
collieries
12. Cluster / Collocation Analysis (AntConc)
● AntConc enables concordancing and
collocation analysis, as well as cluster or
n-gram analysis
● In sets of ten years
13. Cluster Analysis Right, Example Output
Total No. of Cluster Types: 39
Total No. of Cluster Tokens: 1475
Rank  Freq  Range  Cluster
1     742   10     the british
2     354   10     of the british
3     50    10     in the british
4     35    9      to the british
5     20    9      and the british
6     20    9      that the british
7     17    8      part of the british
8     17    6      throughout the british
9     16    7      for the british
10    15    7      parts of the british
11    11    6      portion of the british
12    9     6      all parts of the british
13    9     2      colonies of the british
14    9     4      up the british
15    9     7      with the british
16    8     2      classes throughout the british
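The kind of counting behind such a cluster list can be approximated as follows. This is a rough sketch and not AntConc's own implementation: it assumes the search term sits at the right-hand edge of each cluster, uses cluster sizes of 2 to 5 words, and reports the frequency of each cluster together with its range (the number of files in which it occurs). The tokenisation rule, function name and sample texts are assumptions.

```python
import re
from collections import Counter, defaultdict


def clusters_ending_with(docs, term, min_n=2, max_n=5):
    """Count word clusters of min_n..max_n tokens that end with `term`.

    Returns (frequency, range, cluster) rows, where range is the number
    of documents containing the cluster, sorted by descending frequency.
    """
    freq = Counter()
    files = defaultdict(set)
    for name, text in docs.items():
        tokens = re.findall(r"[a-z']+", text.lower())
        for i, token in enumerate(tokens):
            if token != term:
                continue
            for n in range(min_n, max_n + 1):
                start = i - n + 1
                if start < 0:
                    break  # not enough words to the left
                cluster = " ".join(tokens[start:i + 1])
                freq[cluster] += 1
                files[cluster].add(name)
    rows = [(count, len(files[c]), c) for c, count in freq.items()]
    rows.sort(key=lambda row: (-row[0], -row[1], row[2]))
    return rows


# Invented sample texts, one per file, for illustration only.
docs = {
    "1870": "Throughout the British Empire, and in parts of the British Isles.",
    "1871": "The trade of the British colonies.",
}

for rank, (count, doc_range, cluster) in enumerate(
        clusters_ending_with(docs, "british"), start=1):
    print(rank, count, doc_range, cluster)
```

Because shorter clusters are nested inside longer ones, a cluster such as "the british" is always at least as frequent as "of the british", which matches the pattern in the example output.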
15. Conclusions
● Use of different techniques to create contrast
● Every method has shortcomings
● International outlook of Empire
● We have found strands to further investigate using
directed close-reading keyword searches
Editor's notes
Given what we know from historical research about the "Empire" and the role of newspapers etc., what can digital humanities, or distant reading methods, provide in terms of new insights into the reporting of empire?
[01/06/15 10:09:30] Tessa Carin Hauswedell: That way we wouldn't have to have a great argument that we prove but more talk about in the conclusion about whether certain methods are actually useful/insightful.
[01/06/15 10:10:07] Tessa Carin Hauswedell: So that way also the stuff that we did that did not throw up very conclusive results can become worthwhile.
[01/06/15 10:10:33] Tessa Carin Hauswedell: Because we can say that in this case it didn't work out/show what we wanted. But other methods were more useful, etc
Still needs some cleaning up
Increase of references to Eastern Europe
1880-1890 scramble for Africa:
Africa > South Africa, Sudan, Egypt
not a lot of references to South America and the Middle East.
India, Australia are a constant. Australia more regional
Middle East is not very present > references in region Afghanistan.
Very international for an allegedly “London-based” newspaper.
Concordancing and text analytics
1878
Two main events seem to appear within these topics. First, the Russo-Turkish War (1877-1878). The British forced the Russians to accept the truce that had been offered by the Ottoman Empire. The second event was the Cyprus Convention of 4 June 1878, in which Cyprus was given to Great Britain in exchange for support of the Ottomans during the Congress of Berlin.
1890
The first two topics seem to refer to a widely publicized court case against the Times, which was sued by Irish parliamentarian Charles Stewart Parnell. Later Parnell was involved in a divorce case. He met Katharine O’Shea when she was married to MP Dr. O’Shea. This caused a lot of commotion, and ultimately led to the political downfall of Parnell. The third topic seems to be on the social situation of nurses in the United Kingdom. This was heavily debated in the newspapers. The reference to Birchall involves the murderer Reginald Birchall, who fled the United Kingdom for Canada. In the last topic Chicago most likely appears because of the Chicago World's Fair, which was awarded to the city in 1890 and held in 1893.
This list of topics reveals a more tabloid style of journalism, where the focus is more on social life and intrigue, whereas the first two examples contained a more straightforward, factual style of reporting.
Min = 2 Max = 5 n-gram to the left and right
Thus far we have presented different methods of approaching a diachronic corpus such as this one. Each of these methods presents opportunities, in that it focuses our angle on the newspapers in a different way, and in doing so each adds a piece to the puzzle that we are trying to solve. Yet each also comes with certain shortcomings and failings, which we as researchers always need to correct and counteract. Going back to our research question: how well does each of these methods help us establish the long-term trends in the Pall Mall Gazette's reporting on the British Empire? Can we establish relevant patterns, trends or trajectories?
Visualisations help us to quickly and meaningfully establish what places the Pall Mall reported on, but as we have mentioned, the NER process is error-prone. Further, in order to provide more conclusive res…
By focusing on specific cities, we can add further complexity to the picture and show some shifts.
Topic modeling provides us with a useful step to enter the data on a semantic level and to gain a sense of the corpus we are dealing with. It offers ways to find those debates on Empire/colonies that we are looking for…
Cluster analysis offers a deeper engagement with the text, moving from the topic level to the word level. It allows us to show subtle changes in the way the term British Empire is inflected over a given stretch of time. The results are promising and indicate where a renewed close reading of articles is worthwhile, which in turn can add to our understanding of the way in which newspapers reported on the Empire for their readers at home.