Open Linked Governmental Data for Citizen Engagement – A workshop about the benefits and restrictions of open linked governmental data and the role of metadata in citizen engagement (Anneke Zuiderwijk, Marijn Janssen, Keith Jeffery, Yannis Charalabidis) #cedem12
1. Open Linked Governmental Data for Citizen Engagement
A workshop about the benefits and restrictions of open linked governmental data and the role
of metadata in citizen engagement
Anneke Zuiderwijk*, Marijn Janssen*, Keith Jeffery**, Yannis Charalabidis***
*Delft University of Technology, The Netherlands
**Science and Technology Facilities Council, United Kingdom
*** University of the Aegean, Greece
CEDEM 2012, May 3-4
2. Agenda
0 Introduction
0 The ENGAGE project
0 Questionnaire
0 Discussion about questionnaire and first results
0 Presentations
0 Anneke Zuiderwijk - Benefits and restrictions on the use of open linked
governmental data from the ENGAGE project
0 Keith Jeffery - The use of metadata for citizen engagement
0 Discussion
CEDEM Workshop, Krems, May 3-4, 2012
3. Introduction
0 Considerable attention is paid to open governmental data (e.g.
EC PSI-directives, national open data platforms, local initiatives)
0 EU Public Sector Information (PSI) directive (European Commission,
2003)
0 “A general framework is needed in order to ensure fair, proportionate and
non-discriminatory conditions for the re-use of [PSI]” (p. 1)
0 “PSI is an important primary material for digital content products and
services” (p. 1)
0 Many directives and implementation guidelines followed
0 Obama administration “establishment of an unprecedented level
of openness of the Government” (Obama, 2009)
0 Open Data Strategy for Europe (European Commission, 2011)
0 “It will be made a general rule that all documents that are made accessible
by public sector bodies can be re-used for any purpose, commercial or
non-commercial, unless protected by third party copyright” (p. 1)
0 “public bodies should not be allowed to charge more than costs triggered
by the individual request for data (marginal costs)” (p. 1)
4. Introduction
0 Open governmental data can be defined as “all stored
data of the public sector which could be made accessible
by government in the public interest without any
restrictions on usage and distribution” (Geiger & Von
Lucke, 2011, p. 185).
0 For example, public sector data can be:
0 Geographic data (e.g. cadastral information)
0 Legal data (e.g. courts decisions, legislation)
0 Meteorological data (e.g. climate data, weather forecasts)
0 Social data (e.g. population, public administration)
0 Transport data (e.g. traffic congestion, work on roads)
0 Business data (e.g. chamber of commerce, patents) (MEPSIR
study, Dekkers et al., 2006)
5. Introduction
0 Information and Communication Technologies (ICT) have the
potential to improve the responsiveness of governments to the
needs of citizens and scientific communities
0 Example: feedback loop (derived from Janssen &
Zuiderwijk, forthcoming)
[Figure: feedback loop between government and the public. Government side: data, "make available?", publishing. Public side: open data, searching, finding, processing, using, discussing, recommending, "participation?".]
6. The ENGAGE project
0 However, significant barriers hinder the effective
exploration, management and distribution of the vast amounts
of available public sector data → ENGAGE project
0 ENGAGE (FP7): An Infrastructure for Open, Linked
Governmental Data Provision towards Research Communities
and Citizens (http://www.engage-project.eu)
0 Main goal: the development and use of a data
infrastructure, incorporating distributed and diverse public
sector information (PSI) resources.
7. The ENGAGE project
0 The ENGAGE project:
0 Opens up diverse government data to researchers
0 European Level (all countries of the EU)
0 Establishes Metadata Standardization framework
0 Provides access and discovery on cross-country datasets
0 Provides feedback to public data agencies
0 The ENGAGE platform will enable researchers and citizens to:
0 Discover and browse datasets across diverse and dispersed
public sector information resources (local, national and
European) in their own language
0 Download the datasets
0 Perform geospatial search of datasets
0 Visualize properly structured datasets in data tables, maps and
charts
8. The ENGAGE Project
0 A European Infrastructure
0 Integrating Public Sector Data
0 Providing Public Sector Information (PSI) to Research Communities
and Citizens
9. The ENGAGE Project - Questionnaire
0 The deposition, access and use of open public sector data should be
improved
0 We are conducting a questionnaire to identify needs regarding the use of
public sector data (e.g. deposit, access and use needs).
0 You are asked to participate in this survey, because you might
(potentially) use open public sector data
0 The results will be used to develop and further specify the
requirements of the ENGAGE e-infrastructure for open data
10. The ENGAGE Project - Questionnaire
0 The information you provide is treated confidentially
0 Completing the questionnaire will take about 10-25 minutes of your
time (14-23 questions)
0 Receive the results of the questionnaire (please leave your contact
details at the end)
0 Attention: each time that the term 'open data' is used in the
questionnaire, this refers to 'open governmental/public sector data'!
11. The ENGAGE Project - Questionnaire
0 Taking the questionnaire.
0 The results of this questionnaire will be used to find out your needs
regarding the use of public sector data and to develop and further
specify the requirements for the ENGAGE e-infrastructure for open
public sector data.
0 Your response is very valuable to us. Thank you very much for
participating in this survey.
12. Questionnaire - Approach
0 Target groups:
0 (Potential) Users of open public sector data
0 Researchers and citizens from all fields of research and all EU-
countries
0 Questions
0 Based on interviews and Unified Theory of Acceptance and Use of
Technology (UTAUT)
0 Background questions (gender, age and function)
0 Current use of open public sector data (use, type of data,
frequency, websites, purpose, ability, usefulness)
0 Metadata (use, benefits, restrictions, needs)
0 Statements about use (perceptions of easiness, expectations,
voluntariness, intentions)
0 Other (comments and contact details)
13. Questionnaire - Approach
0 Questionnaire was put online and:
0 Sent to e-mail lists of conferences (e.g. the E-GOV list)
0 Put on the ENGAGE project website
0 Sent to contacts via LinkedIn
0 Sent to contacts directly (e.g. via the ENGAGE contact list)
0 Sent to organizations that employ researchers that probably
work with open data
0 Sent to contacts working for open data platforms and asked
them to put the link to the questionnaire on their website
(e.g. EPSI-platform and Dutch governmental open data
websites)
0 Questionnaire was printed on paper and used for:
0 Workshops
0 Handing out to conference participants
14. Questionnaire - Approach
0 Aim: at least 246 respondents finishing the questionnaire
(taking into account the confidence level and a margin of error)
0 On April 19, 2012, 129 people started the questionnaire and 60
people (24% of aim) finished the questionnaire
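The slide does not state which confidence level and margin of error lie behind the target of 246. As a hedged illustration only, one common way to derive such a target is Cochran's formula for estimating a proportion. The function name and the 95% / 5% inputs below are assumptions chosen for the example, not the authors' figures; the completion figure uses the 60 and 246 reported above.

```python
import math
from statistics import NormalDist


def cochran_sample_size(confidence: float, margin: float, p: float = 0.5) -> int:
    """Minimum sample size for estimating a proportion in a large population."""
    # Two-sided critical value of the standard normal distribution.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    # p = 0.5 is the most conservative choice (largest required sample).
    return math.ceil(z * z * p * (1 - p) / (margin * margin))


# Illustrative inputs only; the slides do not give the values behind 246.
n_95_5 = cochran_sample_size(confidence=0.95, margin=0.05)

# Progress toward the target reported on the slide (60 of 246 finished).
target, finished = 246, 60
progress = round(100 * finished / target)

print(f"95% confidence, 5% margin: {n_95_5} respondents")
print(f"Progress toward target: {progress}%")
```

A wider margin of error or a lower confidence level gives a smaller target, which is why both inputs need to be reported alongside the number.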
15. Questionnaire – Results – Background
[Bar chart: Gender (%) (N=120); categories: Man, Woman; bar labels as extracted: 75, 25]
[Bar chart: Age (%) (N=120); categories: Under 18, 18-21, 22-25, 26-30, 31-40, 41-50, 51-60, 61 or over; bar labels as extracted: 28, 27, 16, 12, 9, 7, 2, 1]
16. Questionnaire – Results – Background
[Bar chart: Working field (%) (N=118); categories as extracted: Social sciences, Natural sciences, Non-scientific, Non-scientific industry, Other; bar labels as extracted: 42, 20, 15, 14, 8]
[Bar chart: Social sciences* (N=47); categories: Economics, Political science, Sociology; bar labels as extracted: 32%, 28%, 26%]
* Multiple answers possible
17. Questionnaire – Discussion
0 Did you ever use open public sector data?
0 More or less than 75 percent?
0 And which types of open public sector data are used most?
(geographic, legal, meteorological, social, transport, business or
other?)
18. Questionnaire – Results – Type of use
[Bar chart: Did you ever use open data? (N=113); categories: Yes, No, Don't know; bar labels as extracted: 85%, 13%, 2%]
Two paths:
0 Potential users
0 Users
[Bar chart: Which types of open public sector data?* (N=94); categories: Geographic data, Legal data, Meteorological data, Social data, Transport data, Business data, Other data; bar labels as extracted: 76%, 65%, 54%, 47%, 37%, 29%, 12%]
* Multiple answers possible
19. Questionnaire – Discussion
0 How often do you use open public sector data?
0 What did the majority say? (yearly, monthly, weekly, daily?)
20. Questionnaire – Results – Type of use
[Bar chart: How often? (N=93); categories: Never, Yearly or a few times per year, Monthly or a few times per month, Weekly or a few times per week, Daily or multiple times per day, Don't know; bar labels as extracted: 39%, 24%, 24%, 14%, 0%, 0%]
21. Questionnaire – Discussion
0 Which of the following non-European Union open public
sector data sources/websites have you used in the past?
0 What did the majority say? (E.g. data.gov? Which other websites?)
22. Questionnaire – Results – Websites
Use of non-EU websites (N=78) %
United States: www.data.gov 65%
Canada: www.data.gc.ca 12%
New Zealand: http://www.data.govt.nz/ 6%
Australia: http://data.australia.gov.au 14%
Morocco: http://data.gov.ma/ 0%
Moldova: http://data.gov.md 1%
Albania: http://open.data.al 0%
Israel: http://data.gov.il 0%
Kenya: http://www.opendata.go.ke/ 4%
Other, namely: 12%
None of these 28%
I do not remember 6%
23. Questionnaire – Discussion
0 Which of the following European Union open public sector data
sources/websites have you used in the past?
0 What did the majority say? (E.g. data.gov.uk? Which other websites?)
24. Questionnaire – Results – Websites
Use of EU-Websites (N=80) %
Europe: www.epsiplatform.eu 25%
United Kingdom: www.data.gov.uk 53%
France: www.data.gouv.fr 5%
Greece: www.observatory.gr 4%
Netherlands: www.data.overheid.nl 13%
Luxembourg: www.statistiques.public.lu 0%
Italy: www.dati.gov.it 6%
Belgium: http://data.gov.be 1%
Norway: www.data.norge.no 4%
Denmark: www.digitaliser.dk 1%
Estonia: http://www.opendata.ee/ 1%
Spain: http://datos.gob.es 8%
Other, namely: 13%
None of these 14%
I do not remember 5%
25. Questionnaire – Discussion
0 Which other data sources/websites of open public sector data
have you used in the past?
0 Many other websites? Which websites?
26. Questionnaire – Results – Websites
Use of other websites (N=88)
Eurostat
CBS (Dutch statistics office)
United Nations
World Bank
http://diavgeia.gov.gr
http://geodata.gov.gr
http://www.gsis.gr
OECD
Land Registry
Historic weather data
UNESCO
http://daten.berlin.de/
www.norway.no
http://data.wien.gv.at/
http://www.denhaag.nl/opendata
Openstreetmap.org
www.joinup.eu
http://offeneskoeln.de/
maps.geoportal.gov.pl
www.eubusinessregister.com
www.e-practice.eu
And many other websites…
27. Questionnaire – Discussion
0 To which extent are the following purposes important for your
use of open public sector data?
0 Which purposes were mentioned most?
(academic publications, statistical analysis, policy research, non-
scientific and non-policy investigations, political and policy-making
decisions, data linking, news reporting, daily operation in work,
curiosity/recreation, other?)
28. Questionnaire – Results – Purpose
Which purposes are important for use?
Very unimportant Unimportant Neutral Important Very important Don't know
Academic publications 7% 5% 15% 22% 47% 4%
Statistical analysis 4% 4% 10% 23% 56% 3%
Policy research 3% 9% 13% 33% 40% 3%
Investigations (non-scientific, non-policy) 4% 10% 19% 43% 21% 3%
Political and policy-making decisions 7% 9% 12% 41% 30% 1%
Data linking 1% 7% 14% 29% 47% 3%
News reporting 13% 17% 28% 19% 20% 3%
Daily operation in work 7% 11% 28% 24% 27% 3%
Curiosity/recreation 6% 9% 23% 33% 26% 4%
Other 1% 0% 4% 4% 7% 4%
29. Questionnaire – Discussion
0 To which extent are you currently able to perform the following
actions when you use open data?
0 To which extent do you find the following actions useful for
your use of open public sector data?
0 Which actions were mentioned as difficult/useful by the majority of
respondents?
(searching, searching by using an API, finding, finding by the use of
metadata, finding linked material, discover and browse datasets on
different levels in the own language, downloading, downloading
supplementary data?)
30. Questionnaire – Results - Requirements
Current ability to perform the following actions when using open public sector data (N=70)
Very difficult Difficult Neutral Easy Very easy Don't know
Searching 9% 23% 22% 28% 16% 3%
Searching by using an API 7% 28% 19% 13% 4% 27%
Finding (getting data) 11% 31% 27% 21% 5% 5%
Finding by use of metadata 11% 24% 29% 26% 5% 6%
Finding linked material 17% 21% 30% 15% 6% 11%
Discover and browse datasets (different levels, own language) 18% 26% 25% 19% 4% 7%
Downloading 9% 17% 25% 28% 17% 4%
Downloading supplementary data (e.g. metadata) 16% 19% 34% 16% 4% 10%
Assessment of usefulness of performing the following actions when using open public sector data (N=56)
Very useless Useless Neutral Useful Very useful Don't know
Searching 2% 7% 7% 25% 56% 2%
Searching by using an API 2% 2% 15% 25% 36% 20%
Finding (getting data) 2% 4% 11% 17% 65% 2%
Finding by use of metadata 2% 4% 13% 22% 48% 11%
Finding linked material 0% 5% 11% 39% 43% 2%
Discover and browse datasets (different levels, own language) 5% 5% 15% 29% 44% 2%
Downloading 0% 4% 13% 22% 59% 2%
Downloading supplementary data (e.g. metadata) 2% 0% 16% 29% 42% 11%
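One way to read the two tables together is to compare, per action, the share of respondents who find it useful with the share who find it easy. The sketch below is illustrative only: the "gap" measure and all names are assumptions, and because the two tables have different bases (N=70 and N=56) the result is indicative rather than a statistical finding. The percentages are copied from the tables above.

```python
# Per action: (easy + very easy, useful + very useful), in percent,
# copied from the ability table (N=70) and the usefulness table (N=56).
ACTIONS = {
    "Searching": (28 + 16, 25 + 56),
    "Searching by using an API": (13 + 4, 25 + 36),
    "Finding (getting data)": (21 + 5, 17 + 65),
    "Finding by use of metadata": (26 + 5, 22 + 48),
    "Finding linked material": (15 + 6, 39 + 43),
    "Discover and browse datasets": (19 + 4, 29 + 44),
    "Downloading": (28 + 17, 22 + 59),
    "Downloading supplementary data": (16 + 4, 29 + 42),
}


def rank_by_gap(actions):
    """Return (gap, action) pairs, largest usefulness-minus-ease gap first."""
    return sorted(
        ((useful - easy, name) for name, (easy, useful) in actions.items()),
        reverse=True,
    )


ranked = rank_by_gap(ACTIONS)
for gap, name in ranked:
    print(f"{name}: {gap} percentage points")
```

Actions near the top of this ranking are those that respondents value but currently find hard to perform.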
31. Questionnaire – Discussion
0 To which extent are you currently able to perform the following
actions when you use open data?
0 To which extent do you find the following actions useful for
your use of open public sector data?
0 Which actions were mentioned as difficult/useful by the majority of
respondents?
(processing, processing by transforming data/linking data/linking
metadata, visualizing, analyzing, feedback by rating, feedback by
putting needs, uploading, uploading processed data, viewing usage
statistics, getting training?)
32. Current ability to perform the following actions when using open public sector data (N=61)
Very difficult Difficult Neutral Easy Very easy Don't know
Processing 6% 15% 32% 25% 13% 9%
Processing by transforming data 5% 18% 15% 32% 13% 17%
Processing by linking data 15% 24% 22% 17% 5% 17%
Processing by linking metadata 16% 23% 21% 11% 12% 18%
Processing by visualising data in tables, maps and charts 5% 13% 25% 27% 23% 7%
Processing by analysing data 5% 20% 20% 38% 12% 5%
Providing feedback by rating the data 20% 27% 25% 12% 3% 12%
Providing feedback to the data producer by putting needs 15% 25% 20% 15% 5% 20%
Uploading datasets 14% 24% 19% 12% 7% 24%
Uploading processed, enhanced, extended, annotated and/or linked datasets 19% 22% 22% 5% 2% 31%
Viewing usage statistics 12% 25% 25% 8% 7% 22%
Getting training on the use of open data 12% 35% 23% 13% 5% 12%
Assessment of usefulness of performing the following actions when using open public sector data (N=53)
Very useless Useless Neutral Useful Very useful Don't know
Processing 2% 4% 8% 38% 44% 4%
Processing by transforming data 2% 4% 13% 37% 40% 4%
Processing by linking data 2% 4% 20% 33% 37% 4%
Processing by linking metadata 2% 4% 20% 27% 35% 12%
Processing by visualising data in tables, maps and charts 0% 6% 12% 20% 59% 4%
Processing by analysing data 0% 4% 6% 33% 53% 4%
Providing feedback by rating the data 2% 4% 29% 29% 29% 8%
Providing feedback to the data producer by putting needs 4% 8% 19% 23% 40% 6%
Uploading datasets 2% 9% 26% 32% 19% 11%
Uploading processed, enhanced, extended, annotated and/or linked datasets 2% 9% 19% 38% 19% 13%
Viewing usage statistics 4% 8% 22% 33% 20% 14%
Getting training on the use of open data 8% 10% 23% 29% 25% 6%
33. Questionnaire – Discussion
0 Do you currently use metadata in the context of your work or
for other activities?
0 How many respondents used metadata? More or less than 90%?
35. Questionnaire – Discussion
0 When you use metadata for open public sector data in your current
practice, how often do you personally obtain the following benefits
from it?
0 When you use metadata for open public sector data in your current
practice, how often do you personally notice the following problems?
0 Which benefits are mentioned by the majority of respondents?
(metadata can make reuse, interpretation, searching and browsing and
linking easier)
0 Which problems are mentioned by the majority of respondents?
(difficult to interpret, insufficient data about data quality, data gathering
and data measuring, no structure, difficult to search and browse?)
36. Questionnaire – Results - Metadata
Assessment of noticing benefits when using open public sector data (N=47)
Never Rarely Sometimes Often Always Don't know
Metadata can make reusing data easier 0% 2% 18% 42% 36% 2%
Metadata can make interpretation of data easier 0% 2% 13% 33% 50% 2%
Metadata can make searching and browsing data easier 0% 9% 16% 18% 56% 2%
Metadata can make linking data easier 0% 11% 13% 33% 33% 9%
Assessment of noticing problems when using open public sector data (N=53)
Never Rarely Sometimes Often Always Don't know
Insufficient metadata and therefore difficult to interpret the data 0% 7% 25% 57% 5% 7%
Insufficient data about the data quality 0% 0% 35% 47% 14% 5%
Insufficient metadata about data gathering and measuring 0% 0% 30% 49% 16% 5%
Metadata have no structure and are therefore difficult to search and browse 0% 14% 27% 36% 14% 9%
37. Questionnaire – Discussion
0 Which of the following metadata would you like to use when
you use (e.g. search, browse, retrieve and evaluate) open public
sector data?
0 Which metadata?
(e.g. description of dataset, title, creator, publisher, country, source,
type/theme/category, format, language, keywords/tags, geographical/spatial
coverage, temporal coverage, release date, license, linked datasets,
organizations and persons involved, related projects, funding, data
collection period, helpdesk, quality, completeness, parameters used by
software)
38. Questionnaire – Results - Metadata
Type of metadata % Type of metadata %
Description of dataset 95% Linked datasets 86%
Title of dataset 88% Organizations involved in creating the dataset 70%
Creator of dataset 74% Persons involved in creating the dataset 60%
Publisher of dataset 70% Projects related to the dataset 65%
Country where the dataset was created 79% Funding information of the dataset 47%
Source of dataset 86% Data collection period (from-to) 84%
Type/theme/category of data 74% Helpdesk for the dataset 58%
Format of dataset 81% Quality as declared by the data provider 74%
Language used in dataset 72% Quality as declared by the data user (feedback) 74%
Keywords/tags in dataset 84% Completeness of the dataset 81%
Geographical or spatial coverage of dataset 93% Parameters used by software accessing and processing the dataset 53%
Temporal coverage of dataset 81% Other metadata, namely: 21%
Release date of dataset 77% Don't know 0%
License of dataset 67%
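To make the table concrete, the following minimal sketch models a metadata record with several of the frequently requested items and reports which of them a record leaves empty. The class, the field names and the sample record are hypothetical and chosen only for illustration; they are not the ENGAGE metadata schema.

```python
from dataclasses import dataclass, field, fields
from typing import Optional


@dataclass
class DatasetMetadata:
    # Hypothetical field names mirroring items from the table above.
    description: Optional[str] = None
    geographic_coverage: Optional[str] = None
    title: Optional[str] = None
    source: Optional[str] = None
    collection_period: Optional[str] = None
    keywords: list[str] = field(default_factory=list)
    linked_datasets: list[str] = field(default_factory=list)


def missing_fields(record: DatasetMetadata) -> list[str]:
    """Names of metadata fields that are empty or not provided."""
    return [f.name for f in fields(record) if not getattr(record, f.name)]


# Made-up sample record, for illustration only.
record = DatasetMetadata(
    title="Road works 2012",
    description="Planned road works per municipality",
    keywords=["transport"],
)
print(missing_fields(record))
```

A check like this could support the "completeness" and "quality as declared by the data provider" items, since it makes gaps in the metadata visible before a dataset is published.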
39. Questionnaire – discussion
0 Statement: Using open public sector data is of benefit for me
40. Questionnaire – first results
0 Statement: Using open public sector data is of benefit for me
[Bar chart: Using open public sector data is of benefit for me (N=56); categories: Strongly disagree, Disagree, Neutral, Agree, Strongly agree, Don't know; bar labels as extracted: 67%, 31%, 2%, 0%, 0%, 0%]
41. Questionnaire – discussion
0 Statement: Using open public sector data will enable me to
accomplish my research more quickly
42. Questionnaire – first results
0 Statement: Using open public sector data will enable me to
accomplish my research more quickly
[Bar chart: Using open public sector data will enable me to accomplish my research more quickly (N=56); categories: Strongly disagree, Disagree, Neutral, Agree, Strongly agree, Don't know; bar labels as extracted: 51%, 35%, 9%, 5%, 0%, 0%]
43. Questionnaire – discussion
0 Statement: I have the resources necessary to use open public
sector data
44. Questionnaire – first results
0 Statement: I have the resources necessary to use open public
sector data
[Bar chart: I have the resources necessary to use open public sector data (N=56); categories: Strongly disagree, Disagree, Neutral, Agree, Strongly agree, Don't know; bar labels as extracted: 38%, 20%, 18%, 16%, 4%, 4%]
45. Questionnaire – discussion
0 Statement: A specific person or group is available for assistance
with difficulties concerning the use of open public sector data
46. Questionnaire – first results
0 Statement: A specific person or group is available for assistance
with difficulties concerning the use of open public sector data
[Bar chart: A specific person or group is available for assistance with difficulties concerning the use of open public sector data (N=56); categories: Strongly disagree, Disagree, Neutral, Agree, Strongly agree, Don't know; bar labels as extracted: 33%, 20%, 18%, 15%, 11%, 4%]
47. Questionnaire – discussion
0 Statement: It will be easy for me to become skillful at using
open public sector data
48. Questionnaire – first results
0 Statement: It will be easy for me to become skillful at using
open public sector data
[Bar chart: It will be easy for me to become skillful at using open public sector data (N=56); categories: Strongly disagree, Disagree, Neutral, Agree, Strongly agree, Don't know; bar labels as extracted: 45%, 24%, 20%, 7%, 2%, 2%]
49. Questionnaire – discussion
0 Statement: People who are important to me (e.g. colleagues)
think that I should use open public sector data
50. Questionnaire – first results
0 Statement: People who are important to me (e.g. colleagues)
think that I should use open public sector data
[Bar chart: People who are important to me (e.g. colleagues) think that I should use open public sector data (N=56); categories: Strongly disagree, Disagree, Neutral, Agree, Strongly agree, Don't know; bar labels as extracted: 31%, 24%, 24%, 11%, 7%, 4%]
51. Questionnaire – discussion
0 Statement: I intend to use open public sector data in the future
52. Questionnaire – first results
0 Statement: I intend to use open public sector data in the future
[Bar chart: I intend to use open public sector data in the future (N=56); categories: Strongly disagree, Disagree, Neutral, Agree, Strongly agree, Don't know; bar labels as extracted: 55%, 40%, 4%, 2%, 0%, 0%]
53. Presentations
0 Anneke Zuiderwijk - Benefits and restrictions of the use of
open linked governmental data from the ENGAGE project
0 Keith Jeffery - The use of meta-data for citizen engagement
54. Presentations (1) - Benefits
0 Literature overview and two use-cases to identify benefits of
the use of open linked governmental data for the ENGAGE
project
55. Presentations (1) – Benefits (user perspective)
Category Benefits
Political and social Obtaining new insights in the public sector
Creating new ways of understanding problems and
interpreting data
Easier to participate in policy making
More participation and self-empowerment of users
Improvement of policy-making processes
New (innovative) and/or improved governmental services
for users
Improving citizen satisfaction
Improving life-quality of user
56. Presentations (1) – Benefits (user perspective)
Category Benefits
Economical Economic growth
Stimulating innovation
Stimulating scientific progress
Less dependency on other (governmental) organizations
Development of new products and services
Easier to perform research
Easier to do job
Reuse of data and therefore not having to collect the
same data again
Counteracting unnecessary duplication of costs (public
money)
Availability of information for investors and companies
More competition
57. Presentations (1) – Benefits (user perspective)
Category Benefits
Operational and technical Being able to scrutinize data
Creating new data and obtaining new knowledge by
merging, integrating and mashing public and private
data (linked data)
Fair decision-making by enabling comparison
Sustainability of data (no data loss in the long term)
Cooperation with data provider
Ability to use the wisdom of the crowds
58. Presentations (1) – Restrictions (user perspective)
0 However, there are also many restrictions of the use of open
linked governmental data
59. Presentations (1) – Restrictions (user perspective)
Categories Barriers
Task complexity and access restrictions Not able to discover the appropriate data
The data are (temporarily) not available/open
Not having access to the original data (only processed data)
Difficult to search and browse; few central websites
No information about the way access to data may be obtained
No/few central website(s) to request access to data
Prior written permission is required to get access to and reproduce data
Not being free to creatively reuse data because of licences
Registration required before being able to download the data
Having to pay a fee for the data
Language issues
Data about the data (metadata) are not available
Not being aware of the potential use of data
Data are available in various forms, leading to discussion about which source
is the right one
No tooling support or helpdesk
Focus is on making use of single datasets, whereas the real value might come
from combining various datasets
Contradicting outcomes based on the use of the same data
60. Presentations (1) – Restrictions (user perspective)
Categories Barriers
Use and participation No incentives for users
Public organizations do not react to user input
No time to make use of the open data
Lack of knowledge to make sense and therefore to make use of data
Lack of capabilities – users do not have the information capabilities necessary
No statistical knowledge and understanding of the potential and the limitations
of statistics
Data are poorly annotated
Data format is not reusable
Insufficient metadata available
No explanation of the meaning of data
Invalid conclusions based on the reused data
Data formats and datasets are too complex to handle and use easily
Barriers stemming from laws and guidelines
Risk of privacy violation
Risk of disputes and litigation; threat of lawsuits or other violations
Lack of clarity because there is no uniform policy for opening data
61. Presentations (1) – Restrictions (user perspective)
Categories Barriers
Information quality Lack of information
Accuracy/imprecise information
Obsolete data
Information may appear to be irrelevant or benign when viewed in isolation,
but when linked and analyzed collectively it may add value
Too much information to process and not sure what to look at
(Essential) Information is missing
Similar data stored in different systems yield different results
Categories Barriers
Technical Restrictions on data format for deposition and use
Absence of standards (e.g. for architecture)
Lack of metadata standards
No standard software for processing open data
Fragmentation of software and applications
62. Fishbone diagram (Zuiderwijk, Janssen & Choenni, forthcoming)
[Fishbone diagram: impediments of current open data policies. Branches: Political, Economical, Social, Technical (PEST); Data access (A); Data deposition (D); Data use (U). Impediments shown, in extraction order, with the rank numbers from the diagram:]
0 Only a part of the data is available
0 Access requires written permission (1)
0 Data are covered by copyright (act) and other regulations (1)
0 No information about structurally updating data in the future
0 Access requires registration or becoming a member
0 Data are currently not available (2)
0 Legacy system complicates the opening of data (1)
0 Threat of lawsuits or other violations
0 No uniform set of licensing terms for reuse
0 No access to original data (only processed data) (3)
0 No or few visualization facilities
0 Access requires a fee (2)
0 No access to recent data, only out-dated data
0 Access requires (filling a form for) a data request (4)
0 No awareness of data (3)
0 No funding
0 Access requires accepting a variety of use agreements (5)
0 Few central websites (fragmentation of sources)
0 No dialogue between the data-producing public body and data user (4)
0 Data-infrastructure is not easily expandable when PSI-amount increases massively
0 Access is limited to professionals
0 No information about which data will be published in the future
0 Data cannot be found (5)
0 Little knowledge about data quality (1)
0 Unfamiliar with data format
0 Deposition requires registration or becoming a member
0 Use (especially comparability) requires data transformations (2)
0 (Essential) Information is missing
0 Language problems
0 Limited types of data formats accepted
0 Downloading data requires a lot of disk space (3)
0 Users lack capabilities to use data
0 Users cannot make sense of data and extract the knowledge contained within (4)
0 Data about the same topic are displayed in different ways
0 Use (especially linking data) requires domain expertise (5)
0 Difficult to search and browse data
0 Losing track because of size of dataset
0 No tooling support or helpdesk (6)
0 Insufficient metadata (7)
0 Reproductions must comply with standard conditions
0 Metadata have no structure (8)
0 Too few concessions to statistical needs
0 Little attention paid to data gathering
0 Unfamiliar with format of metadata
0 Unfamiliar with (meta)data language
0 Complex to understand data provenance
63. Presentations (1) – Main challenges
0 Rectify fragmentation by creating a single shop for PSI
0 Create open access for all users
0 Create interoperability and provide users with possibilities to
analyse data
0 Create an infrastructure for processing PSI
(Zuiderwijk, Janssen & Choenni, forthcoming)
64. Presentations (2) – Metadata for citizen engagement
0 The survey shows that a key technology for
making open data available is metadata
0 The metadata is used for
0 Discovery (finding appropriate datasets)
0 Contextualising (for what purpose the data was collected,
which project(s), how funded, by whom, which
organisations, any related publications…)
0 Data processing: here detailed domain (or even project)
specific metadata is used to link the software used for
analysis / reporting / visualising to the dataset
65. The Vision: Metadata for Data Model
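[Diagram: three metadata layers]
0 DISCOVERY (DC, eGMS…), labelled "Linked open data"
0 CONTEXT (CERIF), labelled "Formal Information Systems"
0 DETAIL (subject or topic specific), labelled "Formal Information Systems"
0 Arrows: "Generate" between DISCOVERY and CONTEXT, "Point to" between CONTEXT and DETAIL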
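The three layers in this diagram can be illustrated with a small Python sketch. It is only an illustration under stated assumptions: the fields of `ContextRecord` are hypothetical simplifications and not the actual CERIF schema, the dataset values are invented, and reading the "Generate" arrow as deriving discovery metadata from the contextual record is an interpretation of the diagram. Only the Dublin Core element names (`identifier`, `title`, `creator`, `date`, `description`) are taken from an existing standard.

```python
from dataclasses import dataclass, field


@dataclass
class ContextRecord:
    """Contextual metadata (hypothetical fields, loosely inspired by CERIF)."""
    dataset_id: str
    title: str
    purpose: str
    project: str
    funder: str
    organisation: str
    year: int
    publications: list[str] = field(default_factory=list)
    # "Point to": where the detailed, topic-specific metadata lives (assumed to be a URI).
    detail_uri: str = ""


def generate_discovery(ctx: ContextRecord) -> dict[str, str]:
    """Derive a flat discovery record that uses Dublin Core element names."""
    return {
        "identifier": ctx.dataset_id,
        "title": ctx.title,
        "creator": ctx.organisation,
        "date": str(ctx.year),
        "description": ctx.purpose,
    }


def discover(records: list[ContextRecord], keyword: str) -> list[str]:
    """Return ids of datasets whose discovery title or description mentions the keyword."""
    keyword = keyword.lower()
    hits = []
    for ctx in records:
        dc = generate_discovery(ctx)
        if keyword in dc["title"].lower() or keyword in dc["description"].lower():
            hits.append(dc["identifier"])
    return hits


# Invented example record, for illustration only.
example = ContextRecord(
    dataset_id="ds-001",
    title="Road traffic counts",
    purpose="Monitoring congestion on national roads",
    project="Example project",
    funder="Example funder",
    organisation="Example ministry",
    year=2011,
    detail_uri="https://example.org/ds-001/detail",
)
```

Calling `discover([example], "congestion")` finds the dataset through its generated discovery record, while the richer context and the pointer to detailed metadata stay in `ContextRecord`.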
66. Models for an infrastructure
0 The data model, with its metadata as described, is
only one of the relevant models
0 The other models are
0 User model
0 Processing model
0 Resource model
67. The Vision: The Models
[Diagram showing the User Model, Processing Model and Data Model; labels: "Complete cohort of users" and "Complete ICT environment for PSI"]
68. Models
0 User Model: controls the way in which the end-user
interacts with the e-infrastructure.
0 User profile, security certification, privacy;
0 Device and interaction mode preferences (keyboard/mouse through voice and
gesture to brain-connected), language preference;
0 Resource preferences (including contacts) with directories;
0 METADATA
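As a rough illustration of how user-model metadata could steer interaction, the Python sketch below stores a few of the preferences listed above and uses the language preference to choose which dataset title to show. The class, its field names and the `localised_title` helper are all hypothetical; the slides do not specify a schema.

```python
from dataclasses import dataclass, field


@dataclass
class UserProfile:
    """Hypothetical user-model record; field names are illustrative only."""
    user_id: str
    languages: list[str]                      # ordered by preference
    interaction_mode: str = "keyboard/mouse"  # could also be "voice" or "gesture"
    preferred_sources: list[str] = field(default_factory=list)


def localised_title(titles: dict[str, str], profile: UserProfile, fallback: str = "en") -> str:
    """Pick the dataset title in the user's most preferred available language."""
    for lang in profile.languages:
        if lang in titles:
            return titles[lang]
    # No preferred language available: use the fallback, else any title, else nothing.
    return titles.get(fallback) or next(iter(titles.values()), "")
```

For a user who prefers Dutch and then German, a dataset with English and German titles would be shown with its German title.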
69. Models
0 Process Model controls the way processes are constructed
and executed in the e-infrastructure.
0 Services
0 Described for discovery, described for functional and
non-functional (security, privacy, performance)
properties
0 Mobile (deployed in distributed / parallel execution
environments)
0 Open source where possible
0 Service composition
0 Dynamically (re-)composable during execution
0 METADATA
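The idea of services that are described by metadata and then composed can be sketched in Python as follows. This is a minimal sketch under assumptions: the `Service` fields, the type labels (`"csv"`, `"rows"`, `"summary"`) and the greedy chaining strategy are invented for illustration, and dynamic re-composition during execution is not covered.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Service:
    name: str
    consumes: str   # functional description: input type
    produces: str   # functional description: output type
    secure: bool    # one non-functional property
    run: Callable[[object], object]


def compose(services: list[Service], source: str, target: str,
            require_secure: bool = False) -> list[Service]:
    """Greedily chain services from the source type to the target type."""
    candidates = [s for s in services if s.secure or not require_secure]
    chain: list[Service] = []
    current = source
    while current != target:
        step = next((s for s in candidates
                     if s.consumes == current and s not in chain), None)
        if step is None:
            raise LookupError(f"no service consumes {current!r}")
        chain.append(step)
        current = step.produces
    return chain


def execute(chain: list[Service], data: object) -> object:
    """Run the composed services one after another."""
    for service in chain:
        data = service.run(data)
    return data


# Two invented services: parse CSV text, then summarise the rows.
services = [
    Service("parse_csv", "csv", "rows", True,
            lambda text: [line.split(",") for line in text.splitlines()]),
    Service("count_rows", "rows", "summary", True,
            lambda rows: {"rows": len(rows)}),
]
```

Because the chain is built only from the service descriptions, a service can be swapped for another with the same `consumes` and `produces` values without changing the caller.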
70. Models
0 Resource Model catalogs the available computing resources
in the e-infrastructure
0 This allows virtualisation so the user neither knows nor
cares from where the data comes, or where the
processing is done, as long as quality of service is
maintained;
0 Requires updating by resource owners – together with
conditions of use
0 METADATA
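A resource catalogue of this kind could look roughly like the Python sketch below. The class names, the use of latency as the quality-of-service measure and the example values are assumptions made for illustration; the slides do not prescribe a design.

```python
from dataclasses import dataclass


@dataclass
class Resource:
    name: str
    location: str
    latency_ms: float        # advertised quality of service (assumed metric)
    conditions_of_use: str   # supplied and updated by the resource owner


class ResourceCatalogue:
    def __init__(self) -> None:
        self._resources: dict[str, Resource] = {}

    def register(self, resource: Resource) -> None:
        """Owners add or update their entry; the latest registration wins."""
        self._resources[resource.name] = resource

    def select(self, max_latency_ms: float) -> Resource:
        """Pick the fastest resource meeting the QoS bound, wherever it is located."""
        eligible = [r for r in self._resources.values()
                    if r.latency_ms <= max_latency_ms]
        if not eligible:
            raise LookupError("no resource meets the requested quality of service")
        return min(eligible, key=lambda r: r.latency_ms)
```

The caller only states the quality of service it needs; which resource is chosen, and where it is located, is decided by the catalogue.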
71. Discussion
0 I do not have difficulty in explaining why using metadata for
open public sector data may be beneficial
0 I clearly understand how to use metadata for open public
sector data
Editor's notes
- Introduce ourselves.
- Let participants briefly introduce themselves? (Depends on the number of participants.) Otherwise ask for working fields (e.g. science/universities, government, other).
- Especially in recent years, considerable attention has been paid to the demand for opening up governmental data within politics, companies, scientific communities and citizen communities (European Union, 2010).
- An important event in this trend is the release of the EU Public Sector Information (PSI) directive, which presented a common legislative framework regulating how data of public sector bodies are made available for re-use (European Commission, 2003).
- In this directive the European Commission argued that a general framework “is needed in order to ensure fair, proportionate and non-discriminatory conditions for the re-use of [PSI]” (p. 1) and that “PSI is an important primary material for digital content products and services” (p. 1).
- After the launch of the EU directive, also referred to as the PSI directive, many directives and implementation guidelines followed.
- For example, in 2006 the European Commission developed a policy for the re-use of its own information sources, which states that all generally accessible data of the European Commission should become available to everyone, usually for free (European Commission, 2011a).
- Another important event in the development of open data policies is the 2009 statement of the Obama Administration, whose primary goal was the establishment of an unprecedented level of openness of the Government (Obama, 2009). The Obama Administration published an Open Government Directive some months afterwards (The White House, 2009).
- Building on former policies, the European Commission has recently presented an Open Data Strategy for Europe, which sets out clearer rules on making the best use of government-held information (European Commission, 2011b).
- An important change in the Open Data Strategy of 2011, compared to the directives and guidelines released by the EC before, is that “it will be made a general rule that all documents that are made accessible by public sector bodies can be re-used for any purpose, commercial or non-commercial, unless protected by third party copyright” (p. 1). Another important change is that “public bodies should not be allowed to charge more than costs triggered by the individual request for data (marginal costs)” (p. 1). The European Commission will lead by example; the EC will open its PSI for free through a new data portal (European Commission, 2011b).
What are open governmental data? Mention the definition of Geiger & Von Lucke. We adopt this definition as it excludes the publication of data which must remain confidential, are private or contain industrial secrets. Also, this definition shows that open data should be accessible without restrictions on usage and distribution.
Examples of open governmental data. The MEPSIR study defined six main domains for investigation:
1. Business information, including Chamber of Commerce information, official business registers, patent and trademark information and public tender databases;
2. Geographic information, including address information, aerial photos, buildings, cadastral information, geodetic networks, geology, hydrographical data and topographic information;
3. Legal information, including decisions of national, foreign and international courts, national legislation and treaties;
4. Meteorological information, including climate data and models and weather forecasts;
5. Social data, including various types of statistics (economic, employment, health, population, public administration, social);
6. Transport information, including information on traffic congestion, work on roads, public transport and vehicle registration.
This potential can be exploited by viewing the publishing of open data as a process.
- The figure shows the start of the opening of data on the left side, resulting in the publishing of open data on a website.
- Next, the data are released and can be used.
- The public (citizens, businesses, but also other government organizations) takes over the data by searching for and finding it, processing it, visualizing it and discussing the outcomes of this process.
- The outcomes might affect the government, which may result in recommendations for the government.
- In turn, the government can listen to the recommendations, become involved in the discussion about what should be done, or clarify its point of view. As the figure shows, this part is largely underdeveloped, and we did not find any indication that this kind of mechanism exists.
Although the open data movement is guided by PSI directives, strategies and national policies, the open data policies of organizations are accompanied by many impediments. Current open data policies do not seem to facilitate the effective and successful use of open data. The ENGAGE project was started in June 2011 because of these barriers. I will tell more about current barriers in my presentation later during the workshop.
Framework Programme 7 (FP7) shows the attention of the European Commission for open data. ENGAGE is part of FP7.
Main goal: the ENGAGE project aims to create an e-infrastructure to open up public sector data to researchers and citizens. By using the e-infrastructure, researchers will be able to submit, acquire, search and visualize diverse and distributed public sector datasets from all the countries of the European Union.
At this moment, the deposition, access and use of open public sector data are often cumbersome and should be improved. The purpose of this survey is therefore to find out your needs regarding the use of public sector data, such as deposit, access and use needs. You are asked to participate in this survey because you might (potentially) use open public sector data. Even if you do not use data, you could be a potential user and your answers will be helpful to us. The results of this survey will be used to develop and further specify the requirements of the ENGAGE e-infrastructure for open data.
- Completion of this survey is voluntary and the information provided by you participating in this survey is treated in a confidential way. Completing the survey will take about 10-20 minutes of your time. The survey consists of 14-23 questions.
- Ask all participants to fill out the questionnaire. I will present some first results after they have filled out the questionnaire.
Ask which results the participants expect.
- Using open public sector data seems to be very important for almost all purposes.
- Especially statistical analysis, academic publications and data linking are seen as very important.
- News reporting and daily operation in work were assessed as less important.
Most actions are assessed as ‘difficult’ or as ‘not difficult but also not easy’. Only searching and downloading are seen as easy by the majority of respondents. Nevertheless, all actions are assessed as very useful by the majority.
Processing by transforming, visualising and analysing is often assessed as easy. Most actions are assessed as difficult. The actions linking, providing feedback by stating needs, and uploading are probably not performed very often, because quite a few respondents said that they don’t know whether this is currently easy or difficult. All actions were assessed as useful or very useful.
- The respondents stated that they would like to use many types of metadata when they use open public sector data.
- Percentages above 80 are highlighted.
- Funding information and parameters used by software are seen as less important.
An important event in the trends of recent years is the release of the EU Public Sector Information (PSI) directive, which presented a common legislative framework regulating how data of public sector bodies are made available for re-use (European Commission, 2003). In this directive the European Commission argued that a general framework “is needed in order to ensure fair, proportionate and non-discriminatory conditions for the re-use of [PSI]” (p. 1) and that “PSI is an important primary material for digital content products and services” (p. 1).
Based on a literature overview and two use cases, the impediments that open data policies currently encounter are analyzed and categorized in four categories: 1) political, economic, technical and social impediments, 2) data access impediments, 3) data deposition impediments and 4) data use impediments.
- The impediments are categorized using a fishbone diagram.
Another way to organize the restrictions is by the use of a fishbone diagram. In this diagram the impediments that open data policies currently encounter are categorized according to the following categories:
1) political, economic, technical and social impediments
2) data access impediments
3) data deposition impediments
4) data use impediments
Rectify fragmentation by creating a single shop for PSI. A central, complete overview of data sets should be created. The mere existence of this overview is not sufficient when scientific communities are unaware of it; therefore awareness should be created by using dissemination strategies. Services should include the possibility to request access to PSI on this central website in case special permission for access to the PSI is needed.
Create open access for all users. Open data as a philosophy requires that certain data are freely available, without copyright, patents or other mechanisms of control, in a timely and accessible way with few or no impediments. Therefore, open data platforms should offer free access to PSI or access at marginal cost. The access should be realised for all users of open data. When there are issues with access, for instance privacy issues, solutions should be sought for these issues. Besides, there should be clear, uniform use agreements that do not differ per data set. Furthermore, easy access to all web content should be created, including applications of integrated content. In addition, the future should be taken into account: users should get information about which new data will become available in the near future and about structural updates.
Create interoperability and provide users with possibilities to analyse data. It appears to be very important that users of public sector data can obtain metadata to create interoperability; they should be able to obtain data about the data. The metadata are used for discovery, for understanding the data in context and for detailed processing of the dataset(s). These metadata should include clear descriptions of the data and their quality, and should have an evident structure, so that interpretation issues are reduced as much as possible. Meta-tagging can be used to ensure that PSI can be reused without resource-intensive and cumbersome steps.
Create an infrastructure for processing PSI. Data users should be able to use tools to track, (statistically) analyze and visualize the PSI they want to examine. Users should get information about the ontological categories of the data, so that they can make sense of it. The support and advice of experts in the field and other contacts will also contribute to this. The infrastructure should be based on a dialogue between data-producing public bodies and data users.