SlideShare una empresa de Scribd logo
1 de 22
Open data quality:
 a practical view
          Sharon Dawes
        www.ctg.albany.edu

        Open Data Roundtable
   St. Petersburg, October 9, 2012
Open data philosophy
• Make “high value” government data sets available
  in structured, machine readable form
• Provide easy one-stop access to all open data sets
  from all departments
• No fees or other requirements for access or use
• Add value through applications, challenges,
  promotions, data sharing communities
However . . .
• The value of open data lies in data use.
• This value of depends on the users’ point of view
   – Users are individuals and organizations outside
     government who use the data directly or through
     applications created by developers
• Value depends on the quality of the data for a given
  use by a given user.
Sidney Harris, 2012
Sources of data problems
           Conventional wisdom
               Provenance
                Practices
Consequences of the problems
                 Underuse
                   Misuse
                  Non-use
     Shifting costs and responsibilities
Where do open data come from?
Administrative systems

                         Embedded in program operations


              Governed by specific policies and laws

                                  Gathered in particular contexts for
                                  certain internal purposes


 By people with different kinds and
 levels of knowledge and expertise
Case 1: Give me shelter
Case 2: Cadastral records
Case 3: Where does the money go?
data
data




technology
data




technology




             management
data




technology                policy




             management
context




               data




technology                policy




             management
Data quality = fitness for use
• Matters most from the user’s point of
  view
• Depends on the user’s purpose
• Four types of quality:
   –   Intrinsic
   –   Contextual
   –   Representational
   –   Access-related
• Usually involves trade offs
   – e.g., timeliness vs. completeness (Wang & Strong, 1996, Ballou & Pazer, 1995)
Dimensions of data quality
Accessibility                Extent to which data is available, or easily and quickly retrievable
Appropriate Amount of Data   Extent to which the volume of data is appropriate for the task at hand
Believability                Extent to which data is regarded as true and credible
                             Extent to which data is not missing and is of sufficient breadth and depth for the
Completeness
                                  task at hand
Concise Representation       Extent to which data is compactly represented
Consistent Representation    Extent to which data is presented in the same format
Ease of Manipulation         Extent to which data is easy to manipulate and apply to different tasks
Free-of-Error                Extent to which data is correct and reliable
                             Extent to which data is in appropriate languages, symbols, and units, and the
Interpretability
                                  definitions are clear
Objectivity                  Extent to which data is unbiased, unprejudiced, and impartial
Relevancy                    Extent to which data is applicable and helpful for the task at hand
Reputation                   Extent to which data is highly regarded in terms of its source or content




                                                                                                                   Pipino, et al, 2002
Security                     Extent to which access to data is restricted appropriately to maintain its security
Timeliness                   Extent to which the data is sufficiently up-to-date for the task at hand
Understandability            Extent to which data is easily comprehended
Value-Added                  Extent to which data is beneficial and provides advantages from its use
Metadata




637 pages
Data quality “tools”
• For open data providers
      •   Appreciate data as an asset, a source of value
      •   Adopt information policies to preserve and enhance value
      •   Create and maintain metadata to support unknown users
      •   Adopt stewardship practices
• For open data users
      •   Be skeptical, ask questions
      •   Understand the nature and context of the data
      •   Use data sets with caution
      •   Combine data sets with great caution
• A good approach: create and support data communities
Summary
•   A philosophy of openness
•   A government-wide strategy for data access
•   A focus on value generation through data use
•   Realistic appreciation for data problems and limitations
•   Support for hard and soft data quality tools plus active
    user engagement strategies
Thank you

www.ctg.albany.edu

Más contenido relacionado

La actualidad más candente

Open Development, Innovations and IDS Knowledge Services - #okfest #opendev l...
Open Development, Innovations and IDS Knowledge Services - #okfest #opendev l...Open Development, Innovations and IDS Knowledge Services - #okfest #opendev l...
Open Development, Innovations and IDS Knowledge Services - #okfest #opendev l...open-development
 
Interface interoperability
Interface interoperabilityInterface interoperability
Interface interoperabilitymsdanij
 
Providing and sharing disaster information
Providing and sharing disaster informationProviding and sharing disaster information
Providing and sharing disaster informationAlbert Simard
 
Managing your data paget
Managing your data pagetManaging your data paget
Managing your data pagetTERN Australia
 
From Data Sharing to Data Stewardship
From Data Sharing to Data StewardshipFrom Data Sharing to Data Stewardship
From Data Sharing to Data StewardshipICPSR
 
Trust threads : Active Curation and Publishing in SEAD
Trust threads : Active Curation and Publishing in SEADTrust threads : Active Curation and Publishing in SEAD
Trust threads : Active Curation and Publishing in SEADBeth Plale
 
Meeting Federal Research Requirements for Data Management Plans, Public Acces...
Meeting Federal Research Requirements for Data Management Plans, Public Acces...Meeting Federal Research Requirements for Data Management Plans, Public Acces...
Meeting Federal Research Requirements for Data Management Plans, Public Acces...ICPSR
 
Data Mining: Application and trends in data mining
Data Mining: Application and trends in data miningData Mining: Application and trends in data mining
Data Mining: Application and trends in data miningDataminingTools Inc
 
Blockchain in Health Research Overview - Manion
Blockchain in Health Research Overview - ManionBlockchain in Health Research Overview - Manion
Blockchain in Health Research Overview - ManionSean Manion PhD
 
SEAD: Sustainable Environment-Actionable Data - Robert McDonald - RDAP12
SEAD: Sustainable Environment-Actionable Data - Robert McDonald - RDAP12 SEAD: Sustainable Environment-Actionable Data - Robert McDonald - RDAP12
SEAD: Sustainable Environment-Actionable Data - Robert McDonald - RDAP12 ASIS&T
 
Understanding ICPSR - An Orientation and Tours of ICPSR Data Services and Edu...
Understanding ICPSR - An Orientation and Tours of ICPSR Data Services and Edu...Understanding ICPSR - An Orientation and Tours of ICPSR Data Services and Edu...
Understanding ICPSR - An Orientation and Tours of ICPSR Data Services and Edu...ICPSR
 
Inconsistencies in big data
Inconsistencies in big dataInconsistencies in big data
Inconsistencies in big dataminujoseph
 
Who Owns Faculty Data?: Fairness and transparency in UCLA's new academic HR s...
Who Owns Faculty Data?: Fairness and transparency in UCLA's new academic HR s...Who Owns Faculty Data?: Fairness and transparency in UCLA's new academic HR s...
Who Owns Faculty Data?: Fairness and transparency in UCLA's new academic HR s...chloejreynolds
 
Data curation
Data curationData curation
Data curationealtmyer
 
The influence of information security on
The influence of information security onThe influence of information security on
The influence of information security onIJCNCJournal
 
Data and information governance: getting this right to support an information...
Data and information governance: getting this right to support an information...Data and information governance: getting this right to support an information...
Data and information governance: getting this right to support an information...Jisc
 

La actualidad más candente (20)

Big data
Big dataBig data
Big data
 
Open Development, Innovations and IDS Knowledge Services - #okfest #opendev l...
Open Development, Innovations and IDS Knowledge Services - #okfest #opendev l...Open Development, Innovations and IDS Knowledge Services - #okfest #opendev l...
Open Development, Innovations and IDS Knowledge Services - #okfest #opendev l...
 
Interface interoperability
Interface interoperabilityInterface interoperability
Interface interoperability
 
Providing and sharing disaster information
Providing and sharing disaster informationProviding and sharing disaster information
Providing and sharing disaster information
 
Managing your data paget
Managing your data pagetManaging your data paget
Managing your data paget
 
From Data Sharing to Data Stewardship
From Data Sharing to Data StewardshipFrom Data Sharing to Data Stewardship
From Data Sharing to Data Stewardship
 
Trust threads : Active Curation and Publishing in SEAD
Trust threads : Active Curation and Publishing in SEADTrust threads : Active Curation and Publishing in SEAD
Trust threads : Active Curation and Publishing in SEAD
 
Big data
Big dataBig data
Big data
 
Meeting Federal Research Requirements for Data Management Plans, Public Acces...
Meeting Federal Research Requirements for Data Management Plans, Public Acces...Meeting Federal Research Requirements for Data Management Plans, Public Acces...
Meeting Federal Research Requirements for Data Management Plans, Public Acces...
 
Data Mining: Application and trends in data mining
Data Mining: Application and trends in data miningData Mining: Application and trends in data mining
Data Mining: Application and trends in data mining
 
Blockchain in Health Research Overview - Manion
Blockchain in Health Research Overview - ManionBlockchain in Health Research Overview - Manion
Blockchain in Health Research Overview - Manion
 
The SMART Way to Manage Research Data
The SMART Way to Manage Research DataThe SMART Way to Manage Research Data
The SMART Way to Manage Research Data
 
SEAD: Sustainable Environment-Actionable Data - Robert McDonald - RDAP12
SEAD: Sustainable Environment-Actionable Data - Robert McDonald - RDAP12 SEAD: Sustainable Environment-Actionable Data - Robert McDonald - RDAP12
SEAD: Sustainable Environment-Actionable Data - Robert McDonald - RDAP12
 
Ib3514141422
Ib3514141422Ib3514141422
Ib3514141422
 
Understanding ICPSR - An Orientation and Tours of ICPSR Data Services and Edu...
Understanding ICPSR - An Orientation and Tours of ICPSR Data Services and Edu...Understanding ICPSR - An Orientation and Tours of ICPSR Data Services and Edu...
Understanding ICPSR - An Orientation and Tours of ICPSR Data Services and Edu...
 
Inconsistencies in big data
Inconsistencies in big dataInconsistencies in big data
Inconsistencies in big data
 
Who Owns Faculty Data?: Fairness and transparency in UCLA's new academic HR s...
Who Owns Faculty Data?: Fairness and transparency in UCLA's new academic HR s...Who Owns Faculty Data?: Fairness and transparency in UCLA's new academic HR s...
Who Owns Faculty Data?: Fairness and transparency in UCLA's new academic HR s...
 
Data curation
Data curationData curation
Data curation
 
The influence of information security on
The influence of information security onThe influence of information security on
The influence of information security on
 
Data and information governance: getting this right to support an information...
Data and information governance: getting this right to support an information...Data and information governance: getting this right to support an information...
Data and information governance: getting this right to support an information...
 

Similar a Sharon Dawes (CTG Albany) Open data quality: a practical view

Foundation of data quality
Foundation of data qualityFoundation of data quality
Foundation of data qualityKhaled Mosharraf
 
How do you assess the quality and reliability of data sources in data analysi...
How do you assess the quality and reliability of data sources in data analysi...How do you assess the quality and reliability of data sources in data analysi...
How do you assess the quality and reliability of data sources in data analysi...Soumodeep Nanee Kundu
 
SIAS Bio-IT Conference_FINAL
SIAS Bio-IT Conference_FINALSIAS Bio-IT Conference_FINAL
SIAS Bio-IT Conference_FINALJohn Koch
 
BigDataInPractice_EXLPHARMA_KOCH
BigDataInPractice_EXLPHARMA_KOCHBigDataInPractice_EXLPHARMA_KOCH
BigDataInPractice_EXLPHARMA_KOCHJohn Koch
 
Big Data and Goverment Analytics
Big Data and Goverment AnalyticsBig Data and Goverment Analytics
Big Data and Goverment AnalyticsKhaled Ghadban
 
20140826 I&T Webinar_The Proliferation of Data - Finding Meaning Amidst the N...
20140826 I&T Webinar_The Proliferation of Data - Finding Meaning Amidst the N...20140826 I&T Webinar_The Proliferation of Data - Finding Meaning Amidst the N...
20140826 I&T Webinar_The Proliferation of Data - Finding Meaning Amidst the N...Steven Callahan
 
Business Agility Must Be Based on a New Flexible and Agile Data Approach
Business Agility Must Be Based on a New Flexible and Agile Data ApproachBusiness Agility Must Be Based on a New Flexible and Agile Data Approach
Business Agility Must Be Based on a New Flexible and Agile Data ApproachDenodo
 
CARARE: Can I use this data? FAIR into practice
CARARE: Can I use this data? FAIR into practiceCARARE: Can I use this data? FAIR into practice
CARARE: Can I use this data? FAIR into practiceCARARE
 
Crowdsourcing Approaches to Big Data Curation - Rio Big Data Meetup
Crowdsourcing Approaches to Big Data Curation - Rio Big Data MeetupCrowdsourcing Approaches to Big Data Curation - Rio Big Data Meetup
Crowdsourcing Approaches to Big Data Curation - Rio Big Data MeetupEdward Curry
 
Navigating the Complex Terrain of Data Governance in Data Analysis.pdf
Navigating the Complex Terrain of Data Governance in Data Analysis.pdfNavigating the Complex Terrain of Data Governance in Data Analysis.pdf
Navigating the Complex Terrain of Data Governance in Data Analysis.pdfSoumodeep Nanee Kundu
 
03 keynote dillo
03 keynote dillo03 keynote dillo
03 keynote dilloShareCareX
 
Data Profiling: The First Step to Big Data Quality
Data Profiling: The First Step to Big Data QualityData Profiling: The First Step to Big Data Quality
Data Profiling: The First Step to Big Data QualityPrecisely
 
Veritas Managed Enterprise Vault Infographic
Veritas Managed Enterprise Vault InfographicVeritas Managed Enterprise Vault Infographic
Veritas Managed Enterprise Vault InfographicIdeba
 
Veritas Managed Enterprise Vault Infographic
Veritas Managed Enterprise Vault InfographicVeritas Managed Enterprise Vault Infographic
Veritas Managed Enterprise Vault InfographicVeritas Technologies LLC
 
Introduction to data interoperability across the data value chain.pdf
Introduction to data interoperability across the data value chain.pdfIntroduction to data interoperability across the data value chain.pdf
Introduction to data interoperability across the data value chain.pdfAhmedHany Sayed
 
Denodo’s Data Catalog: Bridging the Gap between Data and Business
Denodo’s Data Catalog: Bridging the Gap between Data and BusinessDenodo’s Data Catalog: Bridging the Gap between Data and Business
Denodo’s Data Catalog: Bridging the Gap between Data and BusinessDenodo
 
HITRUST CSF in the Cloud
HITRUST CSF in the CloudHITRUST CSF in the Cloud
HITRUST CSF in the CloudOnRamp
 

Similar a Sharon Dawes (CTG Albany) Open data quality: a practical view (20)

Foundation of data quality
Foundation of data qualityFoundation of data quality
Foundation of data quality
 
How do you assess the quality and reliability of data sources in data analysi...
How do you assess the quality and reliability of data sources in data analysi...How do you assess the quality and reliability of data sources in data analysi...
How do you assess the quality and reliability of data sources in data analysi...
 
SIAS Bio-IT Conference_FINAL
SIAS Bio-IT Conference_FINALSIAS Bio-IT Conference_FINAL
SIAS Bio-IT Conference_FINAL
 
BigDataInPractice_EXLPHARMA_KOCH
BigDataInPractice_EXLPHARMA_KOCHBigDataInPractice_EXLPHARMA_KOCH
BigDataInPractice_EXLPHARMA_KOCH
 
Big Data and Goverment Analytics
Big Data and Goverment AnalyticsBig Data and Goverment Analytics
Big Data and Goverment Analytics
 
Big data
Big dataBig data
Big data
 
20140826 I&T Webinar_The Proliferation of Data - Finding Meaning Amidst the N...
20140826 I&T Webinar_The Proliferation of Data - Finding Meaning Amidst the N...20140826 I&T Webinar_The Proliferation of Data - Finding Meaning Amidst the N...
20140826 I&T Webinar_The Proliferation of Data - Finding Meaning Amidst the N...
 
Business Agility Must Be Based on a New Flexible and Agile Data Approach
Business Agility Must Be Based on a New Flexible and Agile Data ApproachBusiness Agility Must Be Based on a New Flexible and Agile Data Approach
Business Agility Must Be Based on a New Flexible and Agile Data Approach
 
CARARE: Can I use this data? FAIR into practice
CARARE: Can I use this data? FAIR into practiceCARARE: Can I use this data? FAIR into practice
CARARE: Can I use this data? FAIR into practice
 
Crowdsourcing Approaches to Big Data Curation - Rio Big Data Meetup
Crowdsourcing Approaches to Big Data Curation - Rio Big Data MeetupCrowdsourcing Approaches to Big Data Curation - Rio Big Data Meetup
Crowdsourcing Approaches to Big Data Curation - Rio Big Data Meetup
 
ROER4D Open Data Initiative
ROER4D Open Data InitiativeROER4D Open Data Initiative
ROER4D Open Data Initiative
 
Navigating the Complex Terrain of Data Governance in Data Analysis.pdf
Navigating the Complex Terrain of Data Governance in Data Analysis.pdfNavigating the Complex Terrain of Data Governance in Data Analysis.pdf
Navigating the Complex Terrain of Data Governance in Data Analysis.pdf
 
03 keynote dillo
03 keynote dillo03 keynote dillo
03 keynote dillo
 
Data Profiling: The First Step to Big Data Quality
Data Profiling: The First Step to Big Data QualityData Profiling: The First Step to Big Data Quality
Data Profiling: The First Step to Big Data Quality
 
Veritas Managed Enterprise Vault Infographic
Veritas Managed Enterprise Vault InfographicVeritas Managed Enterprise Vault Infographic
Veritas Managed Enterprise Vault Infographic
 
Veritas Managed Enterprise Vault Infographic
Veritas Managed Enterprise Vault InfographicVeritas Managed Enterprise Vault Infographic
Veritas Managed Enterprise Vault Infographic
 
Introduction to data interoperability across the data value chain.pdf
Introduction to data interoperability across the data value chain.pdfIntroduction to data interoperability across the data value chain.pdf
Introduction to data interoperability across the data value chain.pdf
 
Denodo’s Data Catalog: Bridging the Gap between Data and Business
Denodo’s Data Catalog: Bridging the Gap between Data and BusinessDenodo’s Data Catalog: Bridging the Gap between Data and Business
Denodo’s Data Catalog: Bridging the Gap between Data and Business
 
HITRUST CSF in the Cloud
HITRUST CSF in the CloudHITRUST CSF in the Cloud
HITRUST CSF in the Cloud
 
M.Florence Dayana
M.Florence DayanaM.Florence Dayana
M.Florence Dayana
 

Más de Open City Foundation

Открытые данные: лучшие практики (Бишкек, 2015)
Открытые данные: лучшие практики (Бишкек, 2015)Открытые данные: лучшие практики (Бишкек, 2015)
Открытые данные: лучшие практики (Бишкек, 2015)Open City Foundation
 
Vlasov Vitaly - Civic hack night in Bishkek
Vlasov Vitaly - Civic hack night in BishkekVlasov Vitaly - Civic hack night in Bishkek
Vlasov Vitaly - Civic hack night in BishkekOpen City Foundation
 
Open Data Institute - Presentation for workshop
Open Data Institute - Presentation for workshopOpen Data Institute - Presentation for workshop
Open Data Institute - Presentation for workshopOpen City Foundation
 
James Wong - Open data in transport, St.Petersburg
James Wong - Open data in transport, St.PetersburgJames Wong - Open data in transport, St.Petersburg
James Wong - Open data in transport, St.PetersburgOpen City Foundation
 
Jeanne Holm - Using open transport data for innovations
Jeanne Holm - Using open transport data for innovationsJeanne Holm - Using open transport data for innovations
Jeanne Holm - Using open transport data for innovationsOpen City Foundation
 
Михаил Бунчук (Всемирный банк) - Презентация отчета об исследовании открытых ...
Михаил Бунчук (Всемирный банк) - Презентация отчета об исследовании открытых ...Михаил Бунчук (Всемирный банк) - Презентация отчета об исследовании открытых ...
Михаил Бунчук (Всемирный банк) - Презентация отчета об исследовании открытых ...Open City Foundation
 
Яндекс и открытые данные
Яндекс и открытые данныеЯндекс и открытые данные
Яндекс и открытые данныеOpen City Foundation
 
Стив Каддинс - Перспективы создания единой транспортной системы в России
Стив Каддинс - Перспективы создания единой транспортной системы в РоссииСтив Каддинс - Перспективы создания единой транспортной системы в России
Стив Каддинс - Перспективы создания единой транспортной системы в РоссииOpen City Foundation
 
Татьяна Бабурина (Комитет по информатизции и связи ) - Открытые данные Петерб...
Татьяна Бабурина (Комитет по информатизции и связи ) - Открытые данные Петерб...Татьяна Бабурина (Комитет по информатизции и связи ) - Открытые данные Петерб...
Татьяна Бабурина (Комитет по информатизции и связи ) - Открытые данные Петерб...Open City Foundation
 
Александр Еремеев (ФРИИ) Customer development: 5 шагов к пользователю
Александр Еремеев (ФРИИ) Customer development: 5 шагов к пользователюАлександр Еремеев (ФРИИ) Customer development: 5 шагов к пользователю
Александр Еремеев (ФРИИ) Customer development: 5 шагов к пользователюOpen City Foundation
 
Hackathon 2014: One day in Saint-Petersburg
Hackathon 2014: One day in Saint-PetersburgHackathon 2014: One day in Saint-Petersburg
Hackathon 2014: One day in Saint-PetersburgOpen City Foundation
 
Hackathon 2014: Собрание дома
Hackathon 2014: Собрание домаHackathon 2014: Собрание дома
Hackathon 2014: Собрание домаOpen City Foundation
 
Hackathon 2014: Друзья времени
Hackathon 2014: Друзья времениHackathon 2014: Друзья времени
Hackathon 2014: Друзья времениOpen City Foundation
 
права учителя на льготы Open data hackathon
права учителя на льготы Open data hackathonправа учителя на льготы Open data hackathon
права учителя на льготы Open data hackathonOpen City Foundation
 
куда отдать Open data hackathon
куда отдать Open data hackathonкуда отдать Open data hackathon
куда отдать Open data hackathonOpen City Foundation
 
живи и работай Open data hackathon
живи и работай Open data hackathonживи и работай Open data hackathon
живи и работай Open data hackathonOpen City Foundation
 
волшебная карта новостроек Open data hackathon
волшебная карта новостроек Open data hackathonволшебная карта новостроек Open data hackathon
волшебная карта новостроек Open data hackathonOpen City Foundation
 

Más de Open City Foundation (20)

Открытые данные: лучшие практики (Бишкек, 2015)
Открытые данные: лучшие практики (Бишкек, 2015)Открытые данные: лучшие практики (Бишкек, 2015)
Открытые данные: лучшие практики (Бишкек, 2015)
 
Vlasov Vitaly - Civic hack night in Bishkek
Vlasov Vitaly - Civic hack night in BishkekVlasov Vitaly - Civic hack night in Bishkek
Vlasov Vitaly - Civic hack night in Bishkek
 
Open Data Institute - Presentation for workshop
Open Data Institute - Presentation for workshopOpen Data Institute - Presentation for workshop
Open Data Institute - Presentation for workshop
 
James Wong - Open data in transport, St.Petersburg
James Wong - Open data in transport, St.PetersburgJames Wong - Open data in transport, St.Petersburg
James Wong - Open data in transport, St.Petersburg
 
Jeanne Holm - Using open transport data for innovations
Jeanne Holm - Using open transport data for innovationsJeanne Holm - Using open transport data for innovations
Jeanne Holm - Using open transport data for innovations
 
Михаил Бунчук (Всемирный банк) - Презентация отчета об исследовании открытых ...
Михаил Бунчук (Всемирный банк) - Презентация отчета об исследовании открытых ...Михаил Бунчук (Всемирный банк) - Презентация отчета об исследовании открытых ...
Михаил Бунчук (Всемирный банк) - Презентация отчета об исследовании открытых ...
 
Яндекс и открытые данные
Яндекс и открытые данныеЯндекс и открытые данные
Яндекс и открытые данные
 
Стив Каддинс - Перспективы создания единой транспортной системы в России
Стив Каддинс - Перспективы создания единой транспортной системы в РоссииСтив Каддинс - Перспективы создания единой транспортной системы в России
Стив Каддинс - Перспективы создания единой транспортной системы в России
 
Татьяна Бабурина (Комитет по информатизции и связи ) - Открытые данные Петерб...
Татьяна Бабурина (Комитет по информатизции и связи ) - Открытые данные Петерб...Татьяна Бабурина (Комитет по информатизции и связи ) - Открытые данные Петерб...
Татьяна Бабурина (Комитет по информатизции и связи ) - Открытые данные Петерб...
 
Александр Еремеев (ФРИИ) Customer development: 5 шагов к пользователю
Александр Еремеев (ФРИИ) Customer development: 5 шагов к пользователюАлександр Еремеев (ФРИИ) Customer development: 5 шагов к пользователю
Александр Еремеев (ФРИИ) Customer development: 5 шагов к пользователю
 
Smooth
SmoothSmooth
Smooth
 
Hackathon 2014: One day in Saint-Petersburg
Hackathon 2014: One day in Saint-PetersburgHackathon 2014: One day in Saint-Petersburg
Hackathon 2014: One day in Saint-Petersburg
 
Hackathon 2014: Folk art
Hackathon 2014: Folk artHackathon 2014: Folk art
Hackathon 2014: Folk art
 
Hackathon 2014: Dropstuff
Hackathon 2014: DropstuffHackathon 2014: Dropstuff
Hackathon 2014: Dropstuff
 
Hackathon 2014: Собрание дома
Hackathon 2014: Собрание домаHackathon 2014: Собрание дома
Hackathon 2014: Собрание дома
 
Hackathon 2014: Друзья времени
Hackathon 2014: Друзья времениHackathon 2014: Друзья времени
Hackathon 2014: Друзья времени
 
права учителя на льготы Open data hackathon
права учителя на льготы Open data hackathonправа учителя на льготы Open data hackathon
права учителя на льготы Open data hackathon
 
куда отдать Open data hackathon
куда отдать Open data hackathonкуда отдать Open data hackathon
куда отдать Open data hackathon
 
живи и работай Open data hackathon
живи и работай Open data hackathonживи и работай Open data hackathon
живи и работай Open data hackathon
 
волшебная карта новостроек Open data hackathon
волшебная карта новостроек Open data hackathonволшебная карта новостроек Open data hackathon
волшебная карта новостроек Open data hackathon
 

Sharon Dawes (CTG Albany) Open data quality: a practical view

  • 1. Open data quality: a practical view Sharon Dawes www.ctg.albany.edu Open Data Roundtable St. Petersburg, October 9, 2012
  • 2. Open data philosophy • Make “high value” government data sets available in structured, machine readable form • Provide easy one-stop access to all open data sets from all departments • No fees or other requirements for access or use • Add value through applications, challenges, promotions, data sharing communities
  • 3.
  • 4. However . . . • The value of open data lies in data use. • This value of depends on the users’ point of view – Users are individuals and organizations outside government who use the data directly or through applications created by developers • Value depends on the quality of the data for a given use by a given user.
  • 6. Sources of data problems Conventional wisdom Provenance Practices Consequences of the problems Underuse Misuse Non-use Shifting costs and responsibilities
  • 7. Where do open data come from? Administrative systems Embedded in program operations Governed by specific policies and laws Gathered in particular contexts for certain internal purposes By people with different kinds and levels of knowledge and expertise
  • 8. Case 1: Give me shelter
  • 10. Case 3: Where does the money go?
  • 11. data
  • 13. data technology management
  • 14. data technology policy management
  • 15. context data technology policy management
  • 16. Data quality = fitness for use • Matters most from the user’s point of view • Depends on the user’s purpose • Four types of quality: – Intrinsic – Contextual – Representational – Access-related • Usually involves trade offs – e.g., timeliness vs. completeness (Wang & Strong, 1996, Ballou & Pazer, 1995)
  • 17. Dimensions of data quality Accessibility Extent to which data is available, or easily and quickly retrievable Appropriate Amount of Data Extent to which the volume of data is appropriate for the task at hand Believability Extent to which data is regarded as true and credible Extent to which data is not missing and is of sufficient breadth and depth for the Completeness task at hand Concise Representation Extent to which data is compactly represented Consistent Representation Extent to which data is presented in the same format Ease of Manipulation Extent to which data is easy to manipulate and apply to different tasks Free-of-Error Extent to which data is correct and reliable Extent to which data is in appropriate languages, symbols, and units, and the Interpretability definitions are clear Objectivity Extent to which data is unbiased, unprejudiced, and impartial Relevancy Extent to which data is applicable and helpful for the task at hand Reputation Extent to which data is highly regarded in terms of its source or content Pipino, et al, 2002 Security Extent to which access to data is restricted appropriately to maintain its security Timeliness Extent to which the data is sufficiently up-to-date for the task at hand Understandability Extent to which data is easily comprehended Value-Added Extent to which data is beneficial and provides advantages from its use
  • 19. Data quality “tools” • For open data providers • Appreciate data as an asset, a source of value • Adopt information policies to preserve and enhance value • Create and maintain metadata to support unknown users • Adopt stewardship practices • For open data users • Be skeptical, ask questions • Understand the nature and context of the data • Use data sets with caution • Combine data sets with great caution • A good approach: create and support data communities
  • 20.
  • 21. Summary • A philosophy of openness • A government-wide strategy for data access • A focus on value generation through data use • Realistic appreciation for data problems and limitations • Support for hard and soft data quality tools plus active user engagement strategies