1. Be Certain, Be Trillium Certain
The Changing Data Quality & Data Governance Landscape
a survival guide for data governance & data quality professionals
Trillium Software webinar – Wednesday 12 December
Nigel Turner, VP Information Management Strategy
2. The traditional DQ & Data Governance
Landscape?
2 © Copyright 2012, Trillium Software, Inc. All rights reserved.
3. The future DQ & Data Governance
Landscape?
4. The changing landscape:
potential disruptive eruptions
Big Data
Cloud Computing
Data Virtualization
6. Big Data – what is it?
Set of new concepts, practices & technologies to
manage & exploit digital data
Can be defined as:
“Data that exceeds the processing capability of conventional
database systems. The data is too big, moves too fast, or
doesn’t fit the strictures of your database architecture”
(Source: Edd Dumbill – O’Reilly Community)
Its key premise is that all data has potential value if it
can be collected, analysed and used to generate
actionable insight
7. The characteristics of Big Data - the 3Vs
Volume
• Reflects exponential growth of data – predicted 40-60% per annum
• Today 2.5 quintillion bytes of data are created every day
• 90% of all digital data was created in the last two years
Variety
• Data generated is more varied and complex than before:
– Text, Audio, Images, Machine Generated etc.
• Much of this data is semi-structured or unstructured
• Traditional IT techniques ill equipped to process & analyse it
Velocity
• Data often generated in real time
• Analysis and response need to be rapid, often also in real time
• Traditional BI / DW environments becoming obsolescent – new approaches are needed
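As a rough illustration of what 40-60% annual growth implies, the sketch below computes how long data volumes take to double. It assumes simple compound growth at a constant rate; that model and the helper name `doubling_time_years` are illustrative assumptions, not figures from the slides.

```python
import math


def doubling_time_years(annual_growth: float) -> float:
    """Years for a quantity to double at a constant compound annual growth rate.

    Assumption: growth compounds once a year at a fixed rate.
    """
    return math.log(2) / math.log(1 + annual_growth)


for rate in (0.40, 0.50, 0.60):
    years = doubling_time_years(rate)
    print(f"{rate:.0%} per annum -> doubles in about {years:.2f} years")
```

Under this assumption, data volumes double roughly every one and a half to two years.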
8. What’s different about Big Data?
New technologies which enable distributed & highly
scalable MPP (Massively Parallel Processing), e.g.
Apache Hadoop
MapReduce
NoSQL databases
Strong emphasis on analytical approaches
Emergence of “data science”
Predictive Analytics
Data Mining
The “democratisation” of data
Data made available to all (cf Cloud Computing)
Business and not IT led BI
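The MapReduce model named on this slide can be illustrated with a single-process word count. This is only a sketch of the programming model (map, shuffle, reduce); it does not use the Hadoop API, and the function names are hypothetical. In a real cluster the map and reduce steps would run in parallel across many nodes.

```python
from collections import defaultdict
from itertools import chain


def map_phase(document):
    """Emit a (word, 1) pair for every word in one input split."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle(pairs):
    """Group values by key, as a framework would between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Combine all values emitted for one key."""
    return key, sum(values)


def word_count(documents):
    mapped = chain.from_iterable(map_phase(doc) for doc in documents)
    return dict(reduce_phase(k, v) for k, v in shuffle(mapped).items())


counts = word_count(["big data big value", "data quality"])
print(counts)
```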
9. Where does Big Data come from?
Social media & social networks
Machine generated
Widely known sources
10. Big Data – Foundations of Success
Identifying the right data to solve the business problem
or opportunity
The ability to integrate & match varied data from multiple
data sources
structured, semi-structured, unstructured
Building the right IT infrastructure to support Big Data
applications
Having the right capabilities & skills to exploit the data
11. Big Data – Barriers & Pitfalls
The sheer volume of data – what’s worth using?
Data extraction challenges
The ability to match data from disparate sources /
formats / media
The time taken to integrate new data sources
The risks of mismatching and incorrect identification of
individuals
Legal & regulatory pitfalls
Security concerns – corporate & individual
Lack of skills & expertise
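The mismatching risk listed above can be made concrete with a small fuzzy-matching sketch. It uses Python's standard `difflib.SequenceMatcher`; the names, the thresholds and the helpers `similarity` and `is_match` are hypothetical choices for illustration, not a description of any vendor's matching engine.

```python
from difflib import SequenceMatcher


def similarity(a, b):
    """Case-insensitive similarity ratio between 0.0 and 1.0."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def is_match(a, b, threshold):
    """Treat two names as the same individual if they are similar enough."""
    return similarity(a, b) >= threshold


# Hypothetical records: a likely spelling variant and a likely different person.
variant = ("Jon Smith", "John Smith")
different = ("John Smith", "Joan Smith")

for threshold in (0.85, 0.95):
    print(
        threshold,
        "variant matched:", is_match(*variant, threshold),
        "different person matched:", is_match(*different, threshold),
    )
```

A loose threshold links the two different people as well as the spelling variant, while a strict one rejects both, which is why match rules usually need more evidence than a name alone.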
12. Big Data – the data integration challenge
[Diagram: external data sources (social media, sensors, CS data, email, mobiles) and internal data sources (CRM, billing, ops, sales, prods) feed analytics platforms 1 to n, which produce actionable insight & knowledge.]
13. Big Data – DQ as the key enabler
[Diagram: external data sources (social media, sensors, CS data, email, mobiles) and internal data sources (CRM, billing, ops, sales, prods) pass through a data quality platform (profile, parse, standardise, match, enrich) before reaching analytics platforms 1 to n, which produce actionable insight & knowledge.]
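The five data quality steps in the diagram (profile, parse, standardise, match, enrich) can be sketched as a small pipeline. The record format, the regular expression and every function name below are assumptions made for illustration; a real DQ platform would apply far richer rules at each step.

```python
import re
from collections import Counter

# Hypothetical raw input in a "Name <email>" shape.
RAW = [
    "Jane  SMITH <JANE.SMITH@EXAMPLE.COM>",
    "jane smith <jane.smith@example.com>",
    "Bob Jones <bob@example.org>",
    "no email here",
]

PATTERN = re.compile(r"^\s*(?P<name>[^<]+?)\s*<(?P<email>[^>]+)>\s*$")


def profile(rows):
    """Profile: count rows that do and do not fit the expected shape."""
    return Counter("parseable" if PATTERN.match(r) else "unparseable" for r in rows)


def parse(row):
    """Parse: split a raw string into named fields, or None if it does not fit."""
    found = PATTERN.match(row)
    return found.groupdict() if found else None


def standardise(rec):
    """Standardise: normalise whitespace and letter case."""
    return {
        "name": " ".join(rec["name"].split()).title(),
        "email": rec["email"].strip().lower(),
    }


def match(records):
    """Match: deduplicate on the standardised email (exact key only)."""
    merged = {}
    for rec in records:
        merged.setdefault(rec["email"], rec)
    return list(merged.values())


def enrich(rec):
    """Enrich: derive an extra attribute from existing data."""
    return {**rec, "domain": rec["email"].split("@")[1]}


stats = profile(RAW)
parsed = [p for p in map(parse, RAW) if p is not None]
clean = [enrich(r) for r in match([standardise(p) for p in parsed])]
print(stats)
print(clean)
```

The two spellings of Jane Smith only collapse into one record because standardisation runs before matching, which is the ordering the diagram shows.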
14. Big Data – the DG & DQ impact
• Big Data will depend on data quality to reap its claimed benefits – the GIGO truism
• The democratization of data will expose poor DQ
• The need for Data Governance increases as data becomes more accessible
• Data skills will become more valued for ‘data science’
• Big Data will increase the 3Vs of data
• Control of data becomes more difficult – scope and variety of use increases
• Data standards & business rules become more complex
• Potential legal & regulatory minefield
15. Disruptive eruption 2 –
Cloud Computing
16. Cloud Computing – Alternative Definitions
“Cloud computing is the delivery of computing as a
service rather than a product, whereby shared
resources, software, and information are provided to
computers and other devices as a metered service over
a network (typically the Internet).” (Wikipedia)
“Marketing term for the technologies that provide
computation, software, data access, and storage
services that do not require end-user knowledge of the
physical location or configuration of the system that
delivers the services.” (Trillium Software)
17. Cloud Computing – the Wikipedia view
18. Cloud Computing – Key Elements
Provision of services via the Internet / network
Virtual not physical allocation of resources
Multi-tenanted hosting
Pay as you use - not outright purchase (cf utilities)
Cloud is a disruptive technology as it provides a clear
alternative model to outright purchase of hardware,
platforms & applications
19. Types of clouds & services
Public/private/hybrid options
Public – via the internet
Private – via an intranet
Hybrid – combination
Cloud services
Infrastructure as a service (IaaS)
Platform as a service (PaaS)
Software as a service (SaaS)
et al (XaaS)
20. Cloud Computing: potential benefits (1)
Speed to deploy new applications & services
Greater standardisation
Scalability & elasticity
Lower initial implementation costs – CAPEX to OPEX
Better cost control and lower internal IT costs (e.g.
help desks)
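The CAPEX to OPEX shift can be illustrated with simple break-even arithmetic. All cost figures and function names below are hypothetical and chosen only to show the calculation; real comparisons would also include factors such as discounting, migration cost and usage variability.

```python
def cumulative_cost_on_premise(months, upfront, monthly_running):
    """Outright purchase: one upfront payment plus ongoing running costs."""
    return upfront + monthly_running * months


def cumulative_cost_cloud(months, monthly_fee):
    """Pay as you use: no upfront payment, a recurring fee instead."""
    return monthly_fee * months


def break_even_month(upfront, monthly_running, monthly_fee, horizon=120):
    """First month in which the cloud option has cost more in total, or None."""
    for month in range(1, horizon + 1):
        cloud = cumulative_cost_cloud(month, monthly_fee)
        owned = cumulative_cost_on_premise(month, upfront, monthly_running)
        if cloud > owned:
            return month
    return None


# Hypothetical figures, for illustration only.
month = break_even_month(upfront=60_000, monthly_running=1_000, monthly_fee=3_000)
print(month)
```

With these made-up numbers the cloud option is cheaper in total for the first 30 months, which is the "lower initial implementation costs" benefit; whether it stays cheaper depends entirely on the actual fees.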
21. Cloud Computing: potential benefits (2)
Benefits to SMEs who cannot afford to purchase
Try before you buy options – benefits both
customers & suppliers
Self-service and self-configuration of services
Better and faster user adoption
Potentially improved performance
Automatic data back ups
22. Cloud Computing –
barriers & risks
Data security & privacy concerns
Commercial & operational factors
Application & data integration challenges
Legal & regulatory restrictions
23. Cloud – the role of DQ & DG
Preparing data for migration
• Scoping and scaling data to be migrated
• Evaluating its suitability for integration with other data sources
• Undertaking source data rationalization & cleanse
Migrating to the cloud environment
• Profiling data in advance of data migration
• Enhancing data in preparation for migration
• Maintaining DQ during ETL processes
Managing data in the cloud
• Enforcing business rules to be applied in the Cloud environment
• Auditing data to ensure security, adherence and quality
• Supporting data governance activities
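Profiling data in advance of migration, as listed on this slide, might start with something as simple as the column profile sketched below. The sample records and the helper `profile_column` are hypothetical; the point is only to show the kind of facts (missing values, inconsistent codes) that profiling surfaces before data is moved.

```python
from collections import Counter


def profile_column(rows, column):
    """Basic profile of one column: row count, missing, distinct, most common value."""
    values = [row.get(column) for row in rows]
    present = [v for v in values if v not in (None, "")]
    counts = Counter(present)
    return {
        "rows": len(values),
        "missing": len(values) - len(present),
        "distinct": len(counts),
        "most_common": counts.most_common(1)[0] if counts else None,
    }


# Hypothetical source data with an inconsistent country code and gaps.
customers = [
    {"id": 1, "country": "UK"},
    {"id": 2, "country": "uk"},
    {"id": 3, "country": ""},
    {"id": 4, "country": "UK"},
    {"id": 5},
]

report = profile_column(customers, "country")
print(report)
```

Here the profile shows two missing values and two distinct spellings of what is probably one country, both of which are cheaper to fix at source than after migration.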
24. Cloud Computing – the DG / DQ impact
• DQ / DG will be key to Cloud migration success – before, during and after migration
• Internal and external data integration will become key
• Could improve DQ as fewer devices will hold data
• DQ host and application companies may offer DQaaS
• Cloud will require an enhanced focus on data governance – within and outside the enterprise
• Organisations may lose physical control of data
• DQ SLAs will be needed with data hosts / suppliers
• Legal & regulatory compliance becomes a major challenge
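A DQ SLA with a data host could be monitored along the lines sketched below: measure agreed metrics on each delivery and flag any that fall under the agreed minimum. The metric names, thresholds, email rule and sample batch are all hypothetical assumptions for illustration.

```python
import re

# Deliberately simple email shape check, for illustration only.
EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")

# Hypothetical SLA: minimum acceptable rate per metric.
SLA = {"email_completeness": 0.95, "email_validity": 0.90}


def measure(records):
    """Compute completeness and validity rates for the email field."""
    total = len(records)
    emails = [r.get("email") for r in records]
    filled = [e for e in emails if e]
    valid = [e for e in filled if EMAIL_RE.match(e)]
    return {
        "email_completeness": len(filled) / total if total else 0.0,
        "email_validity": len(valid) / len(filled) if filled else 0.0,
    }


def sla_breaches(metrics, sla):
    """Return the metrics that fall below their agreed minimum."""
    return {name: value for name, value in metrics.items() if value < sla[name]}


batch = [
    {"email": "a@example.com"},
    {"email": "b@example.com"},
    {"email": "not-an-email"},
    {"email": None},
]

metrics = measure(batch)
breaches = sla_breaches(metrics, SLA)
print(metrics)
print(breaches)
```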
25. Disruptive eruption 3 –
Data Virtualization
26. Data virtualization – a simple view
27. Data Virtualization – a less simple view
28. Data virtualization – the essentials
Data is held in a variety of internal and external sources (e.g.
DBMS, DW, Excel etc.)
A middleware layer sits above the data sources
Creates a virtual view at run time and creates temporary
tables in a dedicated server
Processes, assembles and presents the data to the application
layer / device
Benefits claimed:
Hides complexity from users
Flexibility
Speed - as data can be cached in memory
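The middleware idea described above can be sketched in a few lines: a layer that combines two sources when a query is made, without first copying them into a warehouse. In this sketch an in-memory SQLite database stands in for a DBMS and a list of dictionaries stands in for a spreadsheet; the class `VirtualView` and its method are hypothetical, and commercial data virtualization products add query planning, caching and security on top.

```python
import sqlite3


class VirtualView:
    """Joins two sources at query time instead of materialising a copy."""

    def __init__(self, db, spreadsheet_rows):
        self.db = db
        self.spreadsheet_rows = spreadsheet_rows

    def customers_with_region(self):
        # Build a lookup from the spreadsheet-like source at run time.
        regions = {row["customer_id"]: row["region"] for row in self.spreadsheet_rows}
        cursor = self.db.execute("SELECT id, name FROM customers ORDER BY id")
        return [
            {"id": cid, "name": name, "region": regions.get(cid)}
            for cid, name in cursor
        ]


# Stand-in for a DBMS source.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
db.executemany("INSERT INTO customers VALUES (?, ?)", [(1, "Acme"), (2, "Globex")])

# Stand-in for an Excel source.
sheet = [{"customer_id": 1, "region": "EMEA"}]

view = VirtualView(db, sheet)
rows = view.customers_with_region()
print(rows)
```

The customer with no region in the spreadsheet comes back with `None`, a small example of how gaps in source data surface directly in the virtual view.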
29. Data virtualization – the DG / DQ impact
• Will put the focus on DQ & data standardisation as a key enabler to DV interoperability
• To work, will require the deployment of both real time and batch DQ capability
• Will require a Shared Business Vocabulary (SBV) for shared data model and data standards across an organisation
• Need for better DQ in source systems to enable run time integration
• Data is physically held in a wide variety of sources, which makes coherent Data Governance more difficult
• Data at source will be used for multiple applications so common business rules are harder to agree
• Run time integration requires real time DQ – many organisations do not have this capability
31. So what’s the impact of all this on DQ /
DG practitioners?
New Data Quality & Data Governance challenges
What do we need to do?
Changing DQ and DG roles & skills
32. New DQ & Data Governance challenges
The traditional landscape: predominantly batch DQ; customer organisation focus; procedural; focus mainly within the enterprise
The changing landscape: predominantly real time DQ; supplier organisation focus; commercial; growing focus outside the enterprise
33. Changing DQ and DG roles
DQ and Data Governance roles will become more ‘beyond
organisation’ facing – into hosting companies, data &
application suppliers etc.
Many data management and DQ specialists will work with or
evolve into data scientists
DQ and DG people will need to enhance their understanding
of global legal and regulatory environments
Commercial and negotiation skills will become more
important
34. What action should we take?
Identify and get involved in any current or planned Big Data,
Cloud or Data Virtualization initiatives within our
organisations
Ensure that the DQ and DG implications & imperatives of
these initiatives are understood
Participate in any due diligence of potential third party
vendors & providers
Plan for the new DQ and DG challenges that these trends will
pose
35. The changing landscape
Better DQ needs to be achieved in an environment where data will continue to increase by 50% per annum
The claimed benefits of Big Data, Cloud & Data Virtualisation cannot be achieved without renewed emphasis on data quality management & data governance
Data governance becomes increasingly challenging & extends within and outside the enterprise
DQ services will increasingly be offered as DQaaS by vendors and data hosts, and more DQ / DG roles may be outsourced
As DQ practitioners we need to understand, educate and get involved with those in our organisations who are creating the new landscape
36. A final thought…
“It’s not the will to win but the will to prepare to win that makes the difference”
Bear Bryant – US Football Coach, 1913 – 1983