Exploratory data analysis carried out for Open Data Impacts study, MSc dissertation submitted to Oxford Internet Institute, University of Oxford, July 2010.
See http://www.practicalparticipation.co.uk/odi/survey for details and underlying data.
1. Exploratory data analysis: OGD user motivations
The Potential of Open Government Data (OGD) as a Tool in Democratic
Engagement and Reform of Public Services: The Case of Data.gov.uk
A presentation of data as an online appendix to Section 4.1 of the written
research report. To be used in conjunction with the written report only.
Tim Davies, MSc Student, Oxford Internet Institute
Exploratory Analysis: July 2010.
www.practicalparticipation.co.uk/odi/ | tim@timdavies.org.uk | @timdavies
2. What is this presentation?
I have used this presentation as a working notebook whilst bringing together my
exploratory analysis of the motivations of OGD users according to the survey data
from the Open Data Impacts Survey.
It is not a presentation of findings, but shows the process I’ve gone through in
exploratory analysis.
Stages are left in, even if they don’t illuminate the final conclusions, in order to help
ensure a transparent process.
R commands
Analysis carried out in R. Relevant commands given throughout.
These are mainly an aide-mémoire for my own use but, in conjunction with the shared data, should
also support replication of the results.
3. Overview:
• Question: What motivates users of Open Government Data (OGD)?
• Data source: 72 responses from an opportunistically sampled online survey, delivered between May 11th and June 14th 2010 and targeted at users of OGD.
• Data: Responses to a range of Likert scales (‘Strongly Agree’ to ‘Strongly Disagree’) on ‘Attitudes’ towards OGD and ‘Motivations’ for working with OGD. Also responses to a range of statements about OGD use projects from 44 individuals within the sample.
• Analysis: Using correlation analysis, exploratory factor analysis (Bartholomew et al. 2008, 7 - 9; Costello & Osborne 2005; Lawley & Maxwell 1962) and bootstrapped cluster analysis (Suzuki & Shimodaira 2006) in R.
Bartholomew, D.J. et al., 2008. Analysis of Multivariate Social Science Data, 2nd ed., Chapman and Hall.
Becker, R.A., Chambers, J.M. & Wilks, A.R., 1988. The New S Language: A Programming Environment for Data Analysis and Graphics.
Costello, A.B. & Osborne, J.W., 2005. Best practices in exploratory factor analysis: four recommendations for getting the most from your analysis. Practical Assessment, Research & Evaluation, 10(7), 1–9.
Lawley, D.N. & Maxwell, A.E., 1962. Factor Analysis as a Statistical Method. Journal of the Royal Statistical Society. Series D (The Statistician), 12(3), 209–229.
Suzuki, R. & Shimodaira, H., 2006. Pvclust: an R package for assessing the uncertainty in hierarchical clustering. Bioinformatics, 22(12), 1540.
4. Caveats
• Sample size: Too small to allow any strong conclusions to be drawn
• Items: Survey items were based on an initial literature review and exploratory
research, but, due to time constraints, were not subjected to prior pilot testing.
Refinements of the items and new data collection would strengthen the available
data for future analysis.
• Holistic interpretation: The interpretation presented draws upon contextual
information gathered during exploratory research.
• Interactive presentation: I’ve tried to include enough detail in this presentation to
account for my analysis in Section 4.1 of the related dissertation. However, if the
reader feels any analysis is not fully explained or does not provide adequate
justification, questions are welcome on the blog post relating to this presentation at
http://www.practicalparticipation.co.uk/odi/
5. 1: Correlation between motivations
• Q: “People are interested in open data for many different reasons. Thinking
about your own engagement with open government data, how important are
the following motivations for working with or using open government data?”
• A: ‘Not at all important’ to ‘Very Important’ coded on a scale -2 to +2.
• Missing answers: Coded as 0 (neutral)
• Data: OpenDataImpacts-SurveyMotivationData.csv (Available online as Google Spreadsheet)
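• Analysis: R correlation matrix, colour-coded and ordered by strength of correlation. Clusters interpreted.
R commands
> library(gclus)
> dta<-motivations # survey responses coded -2 to +2, one column per motivation
> dta.r<-abs(cor(dta)) # absolute correlations between motivations
> dta.col<-dmat.color(dta.r,colorRampPalette(c("white","orange","red"))(100)) # panel colours by correlation strength
> dta.o<-order.single(dta.r) # order variables so that strongly correlated ones sit next to each other
> cpairs(dta,dta.o,panel.colors=dta.col,gap=.5,main="Variables ordered and colored by correlation")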
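The coding step described on this slide (answers mapped onto -2 to +2, missing answers set to 0) could be sketched in R roughly as follows. This is an illustration only, not the script used for the study: the three middle labels in likert_levels and the helper name code_likert are assumptions, since only the two end-point labels are given here.

```r
# Hypothetical five-point labels: only the first and last appear in the survey text.
likert_levels <- c("Not at all important", "Not very important", "Neutral",
                   "Important", "Very Important")

# Map each answer to -2..+2; anything missing or unrecognised becomes 0 (neutral).
code_likert <- function(x) {
  score <- match(x, likert_levels) - 3L   # positions 1..5 become -2..+2
  score[is.na(score)] <- 0L
  score
}

example <- c("Very Important", NA, "Not at all important", "Important")
code_likert(example)
```

Coding missing answers as neutral keeps all 72 respondents in the correlation matrix, at the cost of pulling correlations slightly towards zero for items with many gaps.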
22. 1: Correlation between motivations
[Figure: pairs plot of the motivation variables, ordered and coloured by correlation, with six clusters marked 1 to 6]
1) Government focussed
Understanding, Efficiency, Accountability. Also linked to ‘make a difference’...
2) Technology innovation focussed
Creating platforms, with semantic web
3) Reward focussed
Recognition or profit. In the open tech community, recognition is a key asset for gaining future work...
4) Digitising government focus
A computerisation movement?
5) Problem solving focus
Develop new skills to meet interesting challenge
6) Social/public sector enterprise
Companies providing local focussed services. E.g. environmental mapping; transport service using OGD.
23. 2: Cluster analysis of motivations
• Bootstrapped cluster analysis shows similar patterns.
• But no strong p values, i.e. weak & overlapping clusters, as seen in correlation analysis.
• Adds curiosity to the government cluster. Fits with participant-observation evidence that interest in government is secondary to interest in technology for many OGD users.
R commands
> library(pvclust)
> motclust<-pvclust(motivations,nboot=100000)
> plot(motclust)
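To give a feel for what "weak and overlapping" means for bootstrapped clusters, the sketch below uses base R only. It is a deliberately simpler technique than the multiscale bootstrap in pvclust: it resamples respondents, re-clusters the items, and records how often each pair of items ends up in the same cluster. The toy data, the item names a1..b3 and the helper cluster_items are all made up for illustration.

```r
set.seed(1)
# Simulated stand-in for the survey matrix: 72 respondents, 6 items scored -2..+2,
# built from two latent motivations so that two item clusters exist by construction.
n <- 72
latent_a <- rnorm(n)
latent_b <- rnorm(n)
to_likert <- function(z) pmax(-2, pmin(2, round(z)))
toy <- data.frame(
  a1 = to_likert(latent_a + rnorm(n, sd = 0.5)),
  a2 = to_likert(latent_a + rnorm(n, sd = 0.5)),
  a3 = to_likert(latent_a + rnorm(n, sd = 0.5)),
  b1 = to_likert(latent_b + rnorm(n, sd = 0.5)),
  b2 = to_likert(latent_b + rnorm(n, sd = 0.5)),
  b3 = to_likert(latent_b + rnorm(n, sd = 0.5))
)

# Cluster the items (columns), using 1 - correlation as the distance between items.
cluster_items <- function(d, k = 2) {
  hc <- hclust(as.dist(1 - cor(d)), method = "average")
  cutree(hc, k = k)
}

# Share of bootstrap resamples in which each pair of items falls in the same cluster.
n_boot <- 200
together <- matrix(0, ncol(toy), ncol(toy), dimnames = list(names(toy), names(toy)))
for (b in seq_len(n_boot)) {
  resample <- toy[sample(nrow(toy), replace = TRUE), ]
  groups <- cluster_items(resample)
  together <- together + outer(groups, groups, "==")
}
stability <- together / n_boot
round(stability, 2)
```

Pairs with a share close to 1 belong together in almost every resample; shares in the middle of the range are the kind of unstable, overlapping grouping that shows up as weak p values in the pvclust output.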
24. 3: Additional analysis
• Whilst a number of exploratory factor analysis solutions can offer insights during interactive exploration of the data, the non-graphical scree test solution suggests between 8 and 10 factors, almost as many as there are items: thus factor analysis is of limited additional use.
• kmeans analysis can be used to look at cluster sizes by discretely dividing the data, but the overlapping nature of the clusters we are identifying suggests this is also of limited utility. Interactively exploring different numbers of clusters and the size of those clusters does however provide some additional insights which can then be checked against other survey data.
• Qualitative reading of particular individual cases in each cluster, and additional exploratory statistical analysis, can support further identification & description of the different forms of motivations OGD users have.
• E.g. those motivated by ‘Specific problems’ want ‘bulk Data’ and are not interested in SPARQL and building platforms; whereas those interested in platform building and developing new skills have strong interest in semantic web technologies.
R commands: non-graphical scree test solutions
> library(nFactors)
> ev<-eigen(cor(motivations))
> ap<-parallel(subject=nrow(motivations),var=ncol(motivations),rep=100,cent=0.05)
> nS <- nScree(x=ev$values, aparallel=ap$eigen$qevpea)
> plotnScree(nS)
R commands: factor analysis
> factmot3<-factanal(motivations,3, rotation="promax", scores="regression") # The three factor solution is the most intuitively interpretable.
> print(factmot3,cutoff=0.3,sort=T,digits=2)
R commands: kmeans analysis
> factkmeans5<-kmeans(t(motivations),5)
> #5 kmeans factors are interpretable
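The idea behind the parallel-analysis part of the scree test can be sketched in base R, without nFactors: eigenvalues of the observed correlation matrix are compared with eigenvalues obtained from random data of the same size. This is a simplified stand-in rather than the nFactors routine itself; the function name parallel_analysis, the 95% quantile and the simulated data are assumptions for illustration.

```r
set.seed(2)
# Simulated data: 72 respondents, 6 items driven by two latent factors.
n <- 72
f1 <- rnorm(n)
f2 <- rnorm(n)
toy <- cbind(f1 + rnorm(n, sd = 0.6), f1 + rnorm(n, sd = 0.6), f1 + rnorm(n, sd = 0.6),
             f2 + rnorm(n, sd = 0.6), f2 + rnorm(n, sd = 0.6), f2 + rnorm(n, sd = 0.6))

parallel_analysis <- function(d, reps = 100, quant = 0.95) {
  observed <- eigen(cor(d), only.values = TRUE)$values
  # Eigenvalues from uncorrelated random data with the same dimensions, one column per replicate.
  random <- replicate(reps, {
    noise <- matrix(rnorm(nrow(d) * ncol(d)), nrow(d), ncol(d))
    eigen(cor(noise), only.values = TRUE)$values
  })
  threshold <- apply(random, 1, quantile, probs = quant)
  # Count leading eigenvalues that exceed what random data would produce.
  list(observed = observed, threshold = threshold,
       n_factors = sum(cumprod(observed > threshold)))
}

pa <- parallel_analysis(toy)
pa$n_factors
```

When the suggested number of factors approaches the number of items, as reported on this slide, the factors summarise very little, which is the reason given for treating factor analysis as of limited additional use.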
31. Analysis of motivations
• O’Reilly blogger Nat Torkington (2010) has suggested there are five types of people with an interest in OGD:
“… [1] low-polling governments who want to see a PR win from opening their data, [2] transparency advocates who want a more efficient and honest government, [3] citizen advocates who want services and information to make their lives better, [4] open advocates who believe that governments act for the people therefore government data should be available for free to the people, and [5] wonks who are hoping that releasing datasets ... will deliver…economic benefits to the country”.
• However, we find quite different clusters of motivations in the survey data:
1) Government focussed
Understanding, Efficiency, Accountability
2) Technology innovation focussed
Creating platforms, with semantic web
3) Reward focussed
Recognition or profit
4) Digitising government focus
A political or computerisation movement?
5) Problem solving focus
Develop new skills to meet interesting challenge
6) Social/public sector enterprise
Companies providing local focussed services
32. Further analysis
• We can take these clusters and create a new matrix of individuals’ rankings against them by summing the individual variables.
• We can then look at correlations between these and other variables.
• For example, looking at correlations between preferred methods of data access and user motivation.
R commands: identify clusters & create new matrix
> c1<-c(7,9,2)
> c2<-c(13,10,3)
> c3<-c(3,11,4)
> c4<-c(9,2,13,10)
> c5<-c(14,8)
> c6<-c(12,6)
> mclusts<-cbind(as.vector(by(motivations[,c1],c(1:72),sum)),
    as.vector(by(motivations[,c2],c(1:72),sum)),
    as.vector(by(motivations[,c3],c(1:72),sum)),
    as.vector(by(motivations[,c4],c(1:72),sum)),
    as.vector(by(motivations[,c5],c(1:72),sum)),
    as.vector(by(motivations[,c6],c(1:72),sum)))
> colnames(mclusts)<-c("GovFocus", "TechFocus", "RewardFocus", "DigitalGov", "ProblemSolving", "Enterprise")
The additional data used is available in Google Docs linked from
http://www.practicalparticipation.co.uk/odi/survey
Download the relevant file as CSV and then import into the correct
variable name with the command
> egassociations<-read.csv(file="filename.csv",header=T)
To remove the line numbers, you can use:
> egassociations<-egassociations[,2:ncol(egassociations)]
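The per-respondent cluster scores can also be built more compactly. The sketch below uses rowSums over a named list of column indices, which should give the same sums as the by(...) calls on this slide. It runs on a randomly generated stand-in (motivations_toy, assumed to have 14 columns) rather than the survey data; the cluster column indices are copied from the slide.

```r
set.seed(42)
# Toy stand-in for the motivations data frame: 72 respondents, 14 items scored -2..+2.
motivations_toy <- as.data.frame(matrix(sample(-2:2, 72 * 14, replace = TRUE), nrow = 72))

# Cluster definitions as on the slide; a column may belong to more than one cluster.
clusters <- list(GovFocus = c(7, 9, 2), TechFocus = c(13, 10, 3),
                 RewardFocus = c(3, 11, 4), DigitalGov = c(9, 2, 13, 10),
                 ProblemSolving = c(14, 8), Enterprise = c(12, 6))

# One score per respondent per cluster: the row sum over that cluster's columns.
mclusts_toy <- sapply(clusters, function(cols) rowSums(motivations_toy[, cols]))
dim(mclusts_toy)
head(mclusts_toy, 3)
```

Because clusters differ in size (two to four items), the summed scores are not on a common scale; correlations are unaffected by this, but direct comparison of raw scores across clusters would need row means instead.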
39. Further analysis: motivations and attitudes
• The attitudes dataset records responses against a number of claims about OGD.
• The table shows how different motivational clusters correlate with particular statements about OGD. Note the strong correlation between ‘Reward focus’ and statements about commercial benefit; and the view among the Government focussed and Problem solving clusters that OGD can drive reform of public services.
• Only Government focussed motivations have any real correlation with claims about accountability.
• The ‘Enterprise’ correlations suggest a stronger ‘Social Enterprise’ interpretation may be appropriate.
[Table: correlation matrix of generated motivational clusters and response to ‘attitudes 1’ questions]
Annotation on one table row: This row is curious. Little evidence elsewhere in study that OGD actually is helping errors to be identified...
R commands:
> cor(attitudes,mclusts) #display formatted in excel
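Calling cor with two arguments returns a cross-correlation matrix: one row per column of the first argument and one column per column of the second. The sketch below shows the shape of the result on made-up data and one way to pick out, for each cluster, the attitude item it correlates with most strongly. The names attitudes_toy, mclusts_toy and att1..att4 are placeholders, and the random numbers carry no meaning.

```r
set.seed(7)
n <- 72
# Hypothetical stand-ins: 4 attitude items and 3 cluster scores for the same 72 respondents.
attitudes_toy <- matrix(sample(-2:2, n * 4, replace = TRUE), n,
                        dimnames = list(NULL, paste0("att", 1:4)))
mclusts_toy <- matrix(rnorm(n * 3), n,
                      dimnames = list(NULL, c("GovFocus", "TechFocus", "RewardFocus")))

# Rows are attitude items, columns are motivational clusters.
xcor <- cor(attitudes_toy, mclusts_toy)
round(xcor, 2)

# For each cluster, the attitude item with the largest absolute correlation.
strongest <- apply(abs(xcor), 2, function(col) rownames(xcor)[which.max(col)])
strongest
```

With 72 respondents, small correlations are well within what chance alone produces, so only the larger entries in a table like this deserve interpretation.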
40. Further analysis: attitudes 2
• The attitudes 2 set of questions have far lower correlations. Overall the consensus on these items was far stronger, so less information can be extracted from their correlation with particular motivations.
• The correlations are not easily interpretable, but suggest that the views within each motivational cluster are not homogeneous. The small (but greater than others) correlation of the Tech and DigitalGov clusters against ‘Data only open when clear demand’ is surprising.
• Attitudes 2 does little to illuminate the interpretation of these motivational clusters.
[Table: correlation matrix of generated motivational clusters and response to ‘attitudes 2’ questions]
R commands:
> cor(attitudes2,mclusts) #display formatted in excel
41. Further analysis: associations
• Correlations against associations provide some further confirmation of the clusters, suggesting the diversity of the ‘Gov Focus’ users, but relatively strong identity of the tech and reward focussed clusters.
[Table: correlation matrix of generated motivational clusters and response to ‘associations’ questions]
R commands:
> cor(associations,mclusts) #display formatted in excel
42. The digital government connection
• Multi-dimensional scaling (MDS) can support visualisation of the relationship between the different motivational clusters.
• Given the data feeding into the clusters is overlapping (1 overlaps 4; 2 overlaps 3 and 4) we may expect some clustering of these when MDS is applied. However, we do see (6) closer to (1), (4) and (2) than may be anticipated, and (5) distant from Digitising Government focus.
R commands (dist is a manually created distance matrix of 1 - correlation matrix values):
> fit<-cmdscale(dist,2)
> plot(fit[,2],fit[,1])
> text(fit[,2],fit[,1],labels=colnames(mclusts),pos=4,offset=0.2)
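The distance matrix used here was created by hand. One way it could be generated directly in R is sketched below, on made-up cluster scores: correlations between the score columns are turned into distances with 1 - correlation and passed to cmdscale. The toy data and the way overlap is induced in the DigitalGov column are assumptions for illustration, not the study data.

```r
set.seed(3)
# Made-up cluster scores for 72 respondents; names follow the mclusts columns.
scores <- matrix(rnorm(72 * 6), 72,
                 dimnames = list(NULL, c("GovFocus", "TechFocus", "RewardFocus",
                                         "DigitalGov", "ProblemSolving", "Enterprise")))
# Mimic overlapping clusters: DigitalGov shares variation with GovFocus and TechFocus.
scores[, "DigitalGov"] <- scores[, "GovFocus"] + scores[, "TechFocus"] + rnorm(72, sd = 0.5)

# Distance between two clusters = 1 - correlation of their scores.
d <- as.dist(1 - cor(scores))

# Classical MDS into two dimensions, then a labelled scatter plot.
fit <- cmdscale(d, k = 2)
plot(fit[, 1], fit[, 2], type = "n", xlab = "Dimension 1", ylab = "Dimension 2")
text(fit[, 1], fit[, 2], labels = rownames(fit))
```

Clusters built from shared items will sit close together in such a plot by construction, so only distances between clusters with no items in common say anything about respondents.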