
The Data Driven University - Automating Data Governance and Stewardship in Autonomous and Decentralized University Environments


Data Governance and Stewardship require automation of business semantics management at their core in order to achieve data trust between the business and IT communities in an organization. University divisions operate in a highly autonomous and decentralized way, and are often geographically distributed. Hence, they benefit most from a collaborative and agile approach to Data Governance and Stewardship that adapts to their nature.

In this lecture, we start by reviewing the 'C' in ICT and reflect on the dilemma: what is the most important quality of data being shared, truth or trust? We review the wide spectrum of business semantics. We visit the different phases of growing data pain as an organization expands, and we map each phase onto this spectrum of semantics.

Next, we introduce our principles and framework for business semantics management to support Data Governance and Stewardship focusing on the structural (what), processual (how) and organizational (who) components. We illustrate with use cases from Stanford University, George Washington University and Public Science and Innovation Administrations.

Published in: Technology

The Data Driven University - Automating Data Governance and Stewardship in Autonomous and Decentralized University Environments

  1. The Data Driven University: Automating Data Governance & Stewardship in Autonomous & Decentralized Environments. Pieter De Leenheer, PhD, Cofounder and VP Innovation
  2. What we talk about when we talk about no Data Governance (The Problem): "Who approved this?" "I wish these guys spoke our language." "I can't understand this report!" "I've never seen this funding code! Who introduced this?" "Are we sure this definition of 'professor' is correct?" "This rule is different on our campus!" "Are we allowed to share this student data with IR?"
  3. Glossary Search • How frequently do you look up a word for your business? • To what purpose? Clarification, differentiation • What are your main sources? • Hierarchy-based navigation or keyword-based search? • Authoritative truth or trust?
  4. Overview • Data Governance Operating Framework: Data Governance, Data Stewardship, Data Management • Implementations: Stanford University Data Stewardship (SUDS), George Washington University, Brigham Young University • The Bigger Picture: Inter-university Data Governance in the Flanders Research Information Space
  5. Data Governance Framework (diagram labels) Data Governance Council: Governance Operating Model Roles & Responsibilities Processes & Workflow Asset Types & Traceability Data Governance Organization Data Stewardship Activities Data Quality Development IT / Operational Data Management Activities Data Modeling Metadata Lineage Establishes & drives Aligns & Coordinates Reports & Escalates Monitors & Remediates Metadata Scanning Reference Data Authoring Data Integration Collibra Business Semantics Glossary (BSG) Collibra Reference Data Accelerator (RDA) Hierarchy Management Business & Data Definitions Business Traceability Semantic Modeling Mapping Specifications Policy Management Business Rules Data Quality Rules Data Quality Reporting Issue Management Reference Data Crosswalks Master Data Stewardship Data Quality Profiling DQ Defect Resolution Collibra Data Stewardship Manager (DSM) Collibra Platform Other Data Management Vendor products ... https://compass.collibra.com/display/COOK/Data+Governance+Operating+Model
  6. Stanford University Data Stewardship (SUDS) • All materials available at dg.stanford.edu • Establish foundation for Institutional Research • Data Quality: How many faculty do we have? • Context and Meaning: What does faculty mean in which context? How is faculty data structured and where is it stored? • Data Usage Request: Am I allowed to use faculty or student name and age for external reporting?
  7. SUDS: Approach • Decentralized: 1 DG coordinator (also show vacancy); project staff; cross-functional working groups: natural scope and resources; focus on BI reporting, with input from above projects; sign-off by DG coordinator and end user through usage (full cycle) • Step by step; success by success
  8. SUDS: First Success in OBIEE reporting. REST / JSON / CSV / Excel
  9. DG Operating Model • What do we want to capture? Asset Type: Business Terms, Policies, Rules, Code Values. Attribute/Relation Type: Name, Definition, Example, Derivations, Specializations • Who should be involved in this process? Communities: Finance, HR, Student, Research. Domains / subject areas: Task Management Users and User groups • How to execute and monitor the process? Key events and workflow chains. Validation rules
  10. SUDS Data Dictionary Example: 4000+ data elements. Community context: Finance, HR, Research and Student. Custom attribute types and relation types
  11. What attribute- and relation-types do we want to capture? Out of the box but also custom attribute types and relation types
  12. What attribute- and relation-types do we want to capture? • https://stanford.app.box.com/CollibraQuickReference • https://stanford.box.com/UsingCollibraFields
  13. Who is involved in the process? • https://compass.collibra.com/display/COOK/Role+Types Responsible, Accountable, Informed, Consulted
  14. Who? User groups and Dashboards
  15. Who? – User groups and Dashboards
  16. How to execute and monitor? From Best Practice to Auto-Validation Rules. http://web.stanford.edu/dept/pres-provost/cgi-bin/dg/wordpress/?p=577 (generic example – not from SUDS)
  17. How to execute and monitor? • Status Types and Workflows. E.g., for Domains, Terms, Users, and later for Issues and Data Sharing Agreements, we first define a “finite state machine” and then a set of workflows that each define a transition between states. This means workflows can trigger each other and form a complex chain. Business Semantics Glossary statuses (diagram): Candidate, In Progress, Under Review, Accepted, In Revision, Rejected, Deprecated. A term is requested on the domain page. Workflows: (1) Propose Business Term, (2) Edit Business Term, (3) Onboarding Business Term, (4) Deprecate Business Term, (5) Reactivate Business Term
  18. How is it to be governed? Onboarding Workflow (not Stanford content - illustrative example only)
  19. How is it to be governed? Approval Workflow (not Stanford content - illustrative example only)
  20. Stanford DG Program Key Results (from http://web.stanford.edu/dept/pres-provost/cgi-bin/dg/wordpress/wp-content/uploads/2014/11/Stanford_DS_CAIR_v2.pdf) • Understand data from multiple perspectives • Central repository of verified information (and better data infrastructure) • Easier access to information; less reliance on ‘oral tradition’ • Improved data quality, consistency • Increased understanding; thoughtful decision-making around data
  21. SUDS Future Directions • Continue building engagement around data governance (define policy), in addition to data stewardship (enforce policy) • Continue building engagement, especially by executive-level leadership • Continue increasing visibility and consumption of definitions and other metadata
  22. George Washington University (by courtesy of Ron Layne, GWU) • Centralized • Run by the DG Office division of IT • Mapping data dictionaries, rules and metrics, and data sharing agreements • Integration with Informatica Data Quality
  23. Flanders Research Information Space • Providing Scientific Research Information and Services • Easy • Transparent • Open • Timely • Unambiguous • Supported by Data Governance • Qualitative metadata: e.g., definition for project, funding codes, mappings, classifications, etc. • Roles and responsibilities for Information Providers and Stiweto • Collaborative workflows between Information Providers and Stiweto. By courtesy of G. Van Grootel, EWI
  24. FRIS’ Data-driven Innovation Engine. By courtesy of G. Van Grootel, EWI
  25. The Data providers landscape: Universities, Research Institutes, Funders, Others, Strategic Research Centers, University Colleges. By courtesy of G. Van Grootel, EWI
  26. FRIS Metamodel: an example. By courtesy of G. Van Grootel, EWI
  27. Traceability diagram (Node: Description) • JRC (Joint Research Centre): The Business Term representing the Funding Source • Zevende Kader Programma..: The Business Term representing the parent Funding Source • 3723: Generation 1 Funding Code Value • 258: Generation 2 Funding Code Value • G3: The Funding Stream Code Value. By courtesy of G. Van Grootel, EWI
  28. Conclusions • Case by case, success by success • Identify key events and design workflow ‘chains’ to automate governance • To support your specific use case and the growing DG platform, you need to extend asset, relation, and attribute types • Collaboration and business user friendliness • BOK http://compass.collibra.com
  29. Questions For Audience • What percentage of data users need to look up the definition of a term? • What percentage want to know where data around a term is stored? • How many business terms do you have? • Who is in charge of data quality / governance? • What percentage of data definition decisions depend on the business?
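
The "Status Types and Workflows" slide describes defining a finite state machine first and then workflows that each move a term between states, so that workflows can chain. A minimal Python sketch of that idea follows. The status and workflow names come from the slide; the specific edges between statuses are assumptions made for illustration, because the transcript does not preserve the arrows of the diagram, and `run_chain` is a hypothetical helper, not part of any Collibra product.

```python
from enum import Enum
from typing import Optional


class Status(Enum):
    CANDIDATE = "Candidate"
    IN_PROGRESS = "In Progress"
    UNDER_REVIEW = "Under Review"
    ACCEPTED = "Accepted"
    IN_REVISION = "In Revision"
    REJECTED = "Rejected"
    DEPRECATED = "Deprecated"


# Each workflow owns a set of (from_status, to_status) transitions.
# ASSUMPTION: these edges are illustrative only; the slide's diagram
# arrows are not recoverable from the transcript.
# A from_status of None means the term does not exist yet.
WORKFLOWS: dict[str, set[tuple[Optional[Status], Status]]] = {
    "Propose Business Term": {(None, Status.CANDIDATE)},
    "Edit Business Term": {
        (Status.CANDIDATE, Status.IN_PROGRESS),
        (Status.ACCEPTED, Status.IN_REVISION),
    },
    "Onboarding Business Term": {
        (Status.IN_PROGRESS, Status.UNDER_REVIEW),
        (Status.IN_REVISION, Status.UNDER_REVIEW),
        (Status.UNDER_REVIEW, Status.ACCEPTED),
        (Status.UNDER_REVIEW, Status.REJECTED),
    },
    "Deprecate Business Term": {(Status.ACCEPTED, Status.DEPRECATED)},
    "Reactivate Business Term": {(Status.DEPRECATED, Status.CANDIDATE)},
}


def run_chain(steps, status=None):
    """Apply (workflow, target_status) steps in order and return the final status.

    Raises ValueError if a workflow does not define the requested transition.
    """
    for workflow, target in steps:
        if (status, target) not in WORKFLOWS[workflow]:
            raise ValueError(
                f"{workflow!r} cannot move a term from {status} to {target}"
            )
        status = target
    return status


final = run_chain([
    ("Propose Business Term", Status.CANDIDATE),
    ("Edit Business Term", Status.IN_PROGRESS),
    ("Onboarding Business Term", Status.UNDER_REVIEW),
    ("Onboarding Business Term", Status.ACCEPTED),
])
print(final.value)  # prints: Accepted
```

Keeping the transitions in a plain table, separate from the code that applies them, means a governance team could adjust which workflow is allowed to move a term between which statuses without touching the chaining logic.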
