Best Practices in Database Archiving
and Information Lifecycle Management

        An InformationWeek Webcast
                Sponsored by
Webcast Logistics
Today’s Presenter




                   Carl Olofson,
              Research Vice President,
     Application Development and Deployment,
                        IDC
Best Practices in Database Archiving
and Information Lifecycle Management
How ILM Saves Money, Reduces Risk

Carl Olofson
Research Vice President
IDC

May 2011


Copyright IDC. Reproduction is forbidden unless authorized. All rights reserved.
Agenda
The Problem
    Unchecked database growth

    Hidden costs of large databases

    Security and privacy in test data
Information Lifecycle Management
    What is ILM?

    Database archiving
       – Requirements of database archiving
       – Benefits of database archiving

    Test data masking
       – How data is masked
       – Benefits of data masking
Conclusions / Recommendations
© IDC   Visit us at IDC.com and follow us on Twitter: @IDC   Source:/Notes:   May-11   5
Unchecked Database Growth

As a database grows…
    It requires larger indices

    It consumes more storage

    It requires specialized administration to tune

    It needs more processor power to execute queries and updates

The hidden costs include
    More storage administration

    More downtime for reorgs

    Larger batch windows for backups

Polling Question #1

How rapidly is your main production database growing?
    Under 10% per year

    10% per year

    25% per year

    Over 25% per year




Elements of Test Data Management

Selecting the data
    Must be referentially complete subset of the database

    Must reflect realistic patterns of data to ensure valid testing

Protecting sensitive data
    Sensitive data must be masked to prevent unauthorized viewing

    Masked data needs to make sense to the test system.
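A minimal sketch of what "referentially complete" means in practice, assuming a hypothetical two-table schema (`customers`, `orders`) held in an in-memory SQLite database. The table names, sample rows, and helper functions are illustrative assumptions, not part of the webcast or of any product. The idea is that a test subset must carry every parent row its selected child rows point to.

```python
import sqlite3


def build_source() -> sqlite3.Connection:
    """Create a tiny hypothetical production database in memory."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE orders (
            id INTEGER PRIMARY KEY,
            customer_id INTEGER REFERENCES customers(id),
            total REAL
        );
        INSERT INTO customers VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cy');
        INSERT INTO orders VALUES (10, 1, 25.0), (11, 2, 40.0),
                                  (12, 2, 15.0), (13, 3, 60.0);
    """)
    return conn


def subset(conn: sqlite3.Connection, order_ids: list):
    """Return the chosen orders plus every customer row they reference."""
    marks = ",".join("?" for _ in order_ids)
    orders = conn.execute(
        f"SELECT id, customer_id, total FROM orders "
        f"WHERE id IN ({marks}) ORDER BY id",
        list(order_ids),
    ).fetchall()
    # Follow the foreign key so that no selected order is left orphaned.
    parent_ids = sorted({row[1] for row in orders})
    marks = ",".join("?" for _ in parent_ids)
    customers = conn.execute(
        f"SELECT id, name FROM customers WHERE id IN ({marks}) ORDER BY id",
        parent_ids,
    ).fetchall()
    return customers, orders


def is_referentially_complete(customers, orders) -> bool:
    """True when every order's customer_id appears in the customer rows."""
    known = {c[0] for c in customers}
    return all(o[1] in known for o in orders)
```

A real schema has many more relationships to follow, in both directions, which is part of why the selection step is harder than it looks.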




Security and Privacy in Test Data

Normal Security Is Often Suspended for Test Data
    Confidential data could be compromised

    Privacy requirements could be breached

    Corporate policies may be violated

    Contractual requirements and government regulations could lead
        to legal culpability
In-House Masking Is Inadequate
    Simplistic results create unrealistic test data

    Code must be changed as the database changes, an
        unreasonable burden on in-house IT
Polling Question #2

In what role is the person in your organization primarily
responsible for refreshing test data?
    DBA

    Development Manager

    Project Leader

    Developer

    Other




Information Lifecycle Management
(ILM)

[Diagram: the ILM cycle, with Manage at the center, surrounded by Define, Protect, Test, and Archive]


The Basic Elements of ILM

Definition
    Policies governing data creation, management, removal

Security
    Encryption and access control at a granular level

Protection
    Blocking access to sensitive data, including test data

    Test data protection is done through data masking

Archiving
    Removal of inactive data from the live database

    Storage in a compressed, read-only datastore
The Data Masking Challenge

Application testing requirements
    Using simple XXXX or #### or “Lorem ipsum” is usually not
        adequate for robust application testing.

    Data must be representative of actual data in value range and
        distribution.

    Masked data must “make sense”; zip codes correlate to city and
        state, for instance.

    Secured information, such as personal identification, should not
        be inferable from the masked data.

    The fake data should be consistent.
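The requirements above can be illustrated with a small sketch. It assumes a hypothetical customer record and tiny, made-up lookup lists; the function names, field names, and lists are illustrative assumptions, not how any particular masking product works. A keyed hash (HMAC-SHA256) maps each real value to a stable fake one, so the output is consistent across runs and the original cannot be read back without the secret. ZIP code, city, and state are replaced together so they still agree. The sketch does not try to preserve value distributions, which a production tool would also need to address.

```python
import hashlib
import hmac

# Hypothetical reference data; a real tool would use far larger sets.
LOCATIONS = [
    ("10001", "New York", "NY"),
    ("60601", "Chicago", "IL"),
    ("94105", "San Francisco", "CA"),
    ("73301", "Austin", "TX"),
]
FIRST_NAMES = ["Alex", "Blake", "Casey", "Drew", "Emery"]
LAST_NAMES = ["Hill", "Lane", "Moore", "Reed", "Stone"]


def _pick(secret: bytes, field: str, value: str, choices: list):
    """Map a real value to a stable fake one using a keyed hash."""
    message = f"{field}:{value}".encode("utf-8")
    digest = hmac.new(secret, message, hashlib.sha256).digest()
    return choices[int.from_bytes(digest[:8], "big") % len(choices)]


def mask_record(secret: bytes, record: dict) -> dict:
    """Return a masked copy; the input record is left unchanged."""
    masked = dict(record)
    masked["first_name"] = _pick(secret, "first", record["first_name"], FIRST_NAMES)
    masked["last_name"] = _pick(secret, "last", record["last_name"], LAST_NAMES)
    # Replace ZIP, city, and state as one unit so they stay correlated.
    zip_code, city, state = _pick(secret, "zip", record["zip"], LOCATIONS)
    masked.update({"zip": zip_code, "city": city, "state": state})
    return masked
```

Because the mapping depends only on the secret and the original value, the same customer masks to the same fake identity in every table and every refresh, which keeps joins in the test system meaningful.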



Archiving: Types of Data
Reference
    Created in response to a stand-alone event.

    Randomly retrieved without requiring context

    Active until a special event

    Examples: Customer, Patient, Product
Transactional
    Created at the start of a business process.

    Retrieved in the context of a transaction

    Deactivated at the end of a business process.

    Examples: Sales order, treatment, shipment
Streaming
    Created at reception of a streamed item

    Inactive immediately (cannot be updated)


Classes of Data

Active
    Data that is still being updated.

    Includes reference and transactional data.

Inactive
    Data no longer active, but retained for query and reporting

    Includes historical and streamed data

    Historical data is inactive transaction data
           – Sales order completed, revenue recognized
           – Inventory item sold and picked up
           – Patient treatment completed, patient discharged

Buildup of Inactive Data

Hypothetical Example
    Suppose we have a sales order table

    We start the year with 10,000 orders per month

    Orders grow at 1% per month

    Each order takes 60 days to complete (recognize revenue)

    Orders in process are active data

    Completed orders are inactive data
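The hypothetical example can be worked through in a few lines. This sketch makes three simplifying assumptions that the slide does not state: the table starts empty in January, 60 days is treated as two whole months, and monthly order counts are rounded to whole rows. The function name and parameters are illustrative.

```python
def simulate_order_table(months=12, first_month_orders=10_000,
                         monthly_growth=0.01, months_to_complete=2):
    """Return (month, active_rows, inactive_rows, inactive_share) per month."""
    if months_to_complete < 1:
        raise ValueError("months_to_complete must be at least 1")
    created = []  # orders created in each month so far
    rows = []
    for month in range(months):
        created.append(round(first_month_orders * (1 + monthly_growth) ** month))
        # Orders from the most recent months are still in process (active).
        active = sum(created[-months_to_complete:])
        # Everything older has completed and is now inactive.
        inactive = sum(created) - active
        rows.append((month + 1, active, inactive, inactive / (active + inactive)))
    return rows


if __name__ == "__main__":
    for month, active, inactive, share in simulate_order_table():
        print(f"month {month:2d}: active={active:6d} "
              f"inactive={inactive:7d} ({share:.0%})")
```

Under these assumptions the active slice stays near two months of orders while the inactive slice keeps growing, and by December a little over 80% of the rows in the table are inactive.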




Buildup of Inactive Transaction Data

[Chart: Sales Order Table. Rows (0 to 160,000) by month, Jan through Dec, split into Active and Inactive, with the inactive % shown]




Inactive Data Clogs the Database

DBMS Overhead
    Big Indexes

    Storage demand

    Slower queries

    Slower transaction processing

Operational Overhead
    DBA tuning

    Disruption for unload/reload and reorg

    Longer backup batch windows

Polling Question # 3

Think of transaction data that you retain. What is your required
retention period?
    3-5 years

    6-10 years

    Over 10 years

    We don’t have a retention policy




Approaches to “Aging Out” Data
Partitioning
    Move data to low frequency partition on 2nd or 3rd tier storage

    Use local partition indexes to avoid growth of global table indexes

    Perform maintenance operations by physical partition

    Problem: this approach impacts the whole table, and creates a complex
        operational and management challenge that extends across the
        database
Archiving
    Select referentially complete subsets of inactive data

    Move the inactive data to an archiving system outside the database

    Ensure that the archive can support SQL and that queries can, if
        necessary, be executed in an integrated manner with those of the live
        database.
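The archiving steps above can be sketched with SQLite, assuming a hypothetical `orders` / `order_lines` schema. Here a second, attached in-memory database stands in for the external archive store; a real archive would be a separate compressed, read-only system, and all table, column, and function names below are assumptions made for illustration. The sketch moves completed orders together with their line items in one transaction, then shows a query that spans the live and archived data.

```python
import sqlite3


def setup() -> sqlite3.Connection:
    """Create a live database and an attached stand-in archive."""
    conn = sqlite3.connect(":memory:")
    conn.execute("ATTACH DATABASE ':memory:' AS archive")
    conn.executescript("""
        CREATE TABLE main.orders (id INTEGER PRIMARY KEY, status TEXT);
        CREATE TABLE main.order_lines (
            id INTEGER PRIMARY KEY, order_id INTEGER, sku TEXT);
        CREATE TABLE archive.orders (id INTEGER PRIMARY KEY, status TEXT);
        CREATE TABLE archive.order_lines (
            id INTEGER PRIMARY KEY, order_id INTEGER, sku TEXT);
        INSERT INTO main.orders VALUES
            (1, 'completed'), (2, 'open'), (3, 'completed');
        INSERT INTO main.order_lines VALUES
            (1, 1, 'A'), (2, 1, 'B'), (3, 2, 'C'), (4, 3, 'D');
    """)
    return conn


def archive_completed(conn: sqlite3.Connection) -> int:
    """Move completed orders and their lines to the archive; return the count."""
    moved = conn.execute(
        "SELECT COUNT(*) FROM main.orders WHERE status = 'completed'"
    ).fetchone()[0]
    with conn:  # commits on success, rolls back if any statement fails
        # Copy child rows and parent rows so the archived subset is complete.
        conn.execute("""
            INSERT INTO archive.order_lines
            SELECT l.* FROM main.order_lines AS l
            JOIN main.orders AS o ON o.id = l.order_id
            WHERE o.status = 'completed'
        """)
        conn.execute("""
            INSERT INTO archive.orders
            SELECT * FROM main.orders WHERE status = 'completed'
        """)
        # Remove the archived rows from the live tables, children first.
        conn.execute("""
            DELETE FROM main.order_lines
            WHERE order_id IN (SELECT id FROM archive.orders)
        """)
        conn.execute("""
            DELETE FROM main.orders
            WHERE id IN (SELECT id FROM archive.orders)
        """)
    return moved


def all_orders(conn: sqlite3.Connection):
    """Query live and archived orders together, tagging each row's source."""
    return conn.execute("""
        SELECT id, status, 'live' AS source FROM main.orders
        UNION ALL
        SELECT id, status, 'archive' FROM archive.orders
        ORDER BY id
    """).fetchall()
```

Doing the copy and the delete inside one transaction means a failure leaves the live database untouched rather than with half-moved orders.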

Benefits of Archiving

Database benefits
    Faster queries

    Less index maintenance overhead

    Smaller dataspaces and simpler schema than partitioning option

    Requires less CPU; license/maintenance savings for DB and
        applications
Operational benefits
    Less schema maintenance than partitioning option

    Stable backup windows

    Much less data reorganization
Application Retirement

Inactive Applications
    Applications become inactive when they are no longer used, and their
        functions have been migrated elsewhere.

    They commonly still have data that must be retained for corporate
        policy or legal reasons.

    For this reason, enterprises keep them running, maintaining them, and
        paying fees for them even though they are inactive.
Retiring Inactive Applications
    All their data is inactive, so it may be archived altogether

    The archiving system must retain the ability to report on the data.

    The savings in servers, storage, software, and operations costs can be
        very significant.
Critical Requirements of Database
Archiving
DBMS Support
    Must support ongoing versions of major RDBMS including DB2,
        Informix, Oracle, Sybase ASE, Microsoft SQL Server, and
        MySQL
    Must record schema and schema changes to support data
        retrieval even after data definitions have changed.
    Must support SQL and ODBC/JDBC used by applications.

Technical requirements
    Random data retrieval

    Compressed, optimized based on read-only access

    Reasonable performance on 2nd and 3rd tier storage

Data Governance
Purpose is to ensure that data is trustworthy
    Data is well defined, and maintenance is rational

    Original source is known

    Sequence and agents of update are known (provenance)

    Data is valid and consistent

    No unauthorized access has happened

    No sensitive data is visible to unauthorized personnel

    Data is retained as required without compromising performance
Business Benefits
    Database development and management addresses known business needs

    Trade secrets are not exposed and confidences are not compromised

    Ensures contractual and legal requirements compliance

    Reduces risk of actual or opportunity cost due to data-driven application error


ILM and Data Governance

Data Governance
    Uniform Data Definition & Policy Management
Information Lifecycle Management
    Managed Data Selection & Retention
       – Database Subsetting
       – Database Archiving
    Data Protection
       – Test Data Masking
Trust Management
    Validity and Consistency Assurance
       – Data Quality and Profiling
       – Data Cleansing
    Security & Monitoring
       – Access Control and Encryption
       – Provenance Tracking
       – Access Log Analysis



ILM and Database Development and
Management Tools
Database Development and Management Tools (DDMT)
    Software used by DBAs and data managers to manage the size,
        performance, and reliability/recoverability of databases

    Includes DBA tools, database replication software, development
        and optimization software, and database archiving / ILM.
The ILM Segment of the DDMT Market
    Just 4.6% in 2009, but the fastest growing segment; the only
        segment to show positive growth in that tough economic year.

    Projected to show the greatest growth of all DDMT segments to
        2014, with a forecast CAGR of 9.9% from $90 m to $188 m.


What’s IBM’s Share in the ILM Market
Segment
Revenue ($M), Total = $89.9 Million
    IBM: 56%
    Informatica: 13%
    Other: 12%
    HP: 11%
    CA: 4%
    Solix: 4%
Source: IDC, 2010

Conclusions and Recommendations
Conclusions
    Data governance is critical because the utility and trustworthiness of
        enterprise data cannot be left to chance.
    ILM addresses the key dimension of data size management in relation to
        data retention, and test data management.
    These functions cannot be developed and maintained in-house.
Recommendations
    Users should carefully review their data access and retention policies and
        ensure that those policies are carried out.
    In most cases, the best approach to ensuring data retention without
        bloating the databases is to employ database archiving.
    Test data management is not trivial; find professionally developed data
        masking and subsetting tools.
    IBM’s InfoSphere Optim leads the market in addressing these key ILM
        requirements.
Information Management


IBM InfoSphere Optim solutions
Managing data throughout its lifecycle in heterogeneous environments
Discover
    Speed understanding and project time through relationship discovery
        within and across data sources
    Understand sensitive data to protect and secure it
Test Data Management
    Easily refresh & maintain right sized non-production environments,
        while reducing storage costs
    Improve application quality and deploy new functionality more quickly
Data Masking
    Protect sensitive information from misuse & fraud
    Prevent data breaches and associated fines
Data Growth Management
    Reduce hardware, storage & maintenance costs
    Streamline application upgrades and improve application performance
Application Retirement
    Safely retire legacy & redundant applications while retaining the data
    Ensure application-independent access to archive data

[Diagram labels: Production; Discover, Understand, Classify; Subset; Mask; Training, Development, Test; Archive; Retire]

© 2011 IBM Corporation


Managing Data Across its Lifecycle




Discover & Define
    Discover where data resides
    Classify & define data and relationships
    Define policies
Develop & Test
    Develop database structures & code
    Create & refresh test data
    Validate test results
Optimize, Archive & Access
    Enhance performance
    Manage data growth
    Report & retrieve archived data
Consolidate & Retire
    Rationalize application portfolio
    Enable compliance with retention & e-discovery
Information Governance
    Quality Management – Lifecycle – Security & Privacy






You can’t govern what you don’t understand (Discover & Define)

Define business objects for archival and test data applications
    – Automation of manual activities accelerates time to value
Discover data transformation rules and heterogeneous relationships
    – Business insight into data relationships reduces project risk
Identify hidden sensitive data for privacy
    – Provides consistency across information agenda projects

[Diagram: Distributed Data Landscape]




Employ effective test data management practices (Develop & Test)

   • Create targeted, right-sized test environments
   • Substitute sensitive data with fictionalized yet contextually accurate
     data
   • Easily refresh, reset and maintain test environments
   • Compare data to pinpoint and resolve application defects faster
   • Accelerate release schedules

[Diagram: Subset & Mask. A 2 TB Production or Production Clone database feeds right-sized environments: Unit Test (25 GB), Development (25 GB), Integration Test (100 GB), Training (50 GB)]


Archive historical data for data growth management (Optimize, Archive & Access)

[Diagram: Production (Current, Historical, Restored Data) is linked to Data Archives (Reference Data, Historical Data) by Archive and Retrieve; archived data records can be selectively restored]

Universal Access to Application Data
    Mashup Center, Application, Data Find, ODBC / JDBC, XML, Report Writer

Data Archiving is an intelligent process for moving inactive or infrequently
accessed data that still has value, while providing the ability to search and
retrieve the data


Retire redundant and legacy applications (Consolidate & Retire)

Preserve application data in its business context
    – Capture all related data, including transaction details, reference data & associated
      metadata
    – Capture any related reference data that may reside in other application databases
Retire out-of-date packaged applications as well as legacy custom applications
    – Leverage out-of-box support of packaged applications to quickly identify & extract the
      complete business object
Shut down legacy system without a replacement
    – Provide fast and easy retrieval of data for research and reporting, as well as audits
      and e-discovery requests

[Diagram: "Infrastructure before Retirement" shows three separate User, Application, Database, Data stacks; "Archived Data after Consolidation" shows the three Users sharing one Archive Engine and Archive Data store]





Resources to Learn More!


InfoSphere Optim Solutions page:
     http://www-01.ibm.com/software/data/optim/

     –IDC Worldwide Database Development and
      Management Tools 2009 Vendor and Segment Analysis
      Report
     –Whitepaper: Control Application Data Growth Before It
      Controls Your Business
     –Whitepaper: Enterprise Strategies to Improve
      Application Testing
     –InfoSphere Optim Solutions for Custom and Packaged
      Applications Solution Brief

Q&A Session


  Please Submit Your Questions Now

Más contenido relacionado

La actualidad más candente

BizDataX White paper Test Data Management
BizDataX White paper Test Data ManagementBizDataX White paper Test Data Management
BizDataX White paper Test Data Management
Dragan Kinkela
 
593 Managing Enterprise Data Quality Using SAP Information Steward
593 Managing Enterprise Data Quality Using SAP Information Steward593 Managing Enterprise Data Quality Using SAP Information Steward
593 Managing Enterprise Data Quality Using SAP Information Steward
Vinny (Gurvinder) Ahuja
 
E-Business Suite 2 _ Ben Davis _ Achieving outstanding optim data management ...
E-Business Suite 2 _ Ben Davis _ Achieving outstanding optim data management ...E-Business Suite 2 _ Ben Davis _ Achieving outstanding optim data management ...
E-Business Suite 2 _ Ben Davis _ Achieving outstanding optim data management ...
InSync2011
 

La actualidad más candente (20)

Case Studies in Improving Application Performance With Solix Database Archivi...
Case Studies in Improving Application Performance With Solix Database Archivi...Case Studies in Improving Application Performance With Solix Database Archivi...
Case Studies in Improving Application Performance With Solix Database Archivi...
 
Estuate EDM Checklist
Estuate EDM ChecklistEstuate EDM Checklist
Estuate EDM Checklist
 
BizDataX White paper Test Data Management
BizDataX White paper Test Data ManagementBizDataX White paper Test Data Management
BizDataX White paper Test Data Management
 
Faw
FawFaw
Faw
 
Integrating BigInsights and Puredata system for analytics with query federati...
Integrating BigInsights and Puredata system for analytics with query federati...Integrating BigInsights and Puredata system for analytics with query federati...
Integrating BigInsights and Puredata system for analytics with query federati...
 
MDM Institute: Why is Reference data mission critical now?
MDM Institute: Why is Reference data mission critical now?MDM Institute: Why is Reference data mission critical now?
MDM Institute: Why is Reference data mission critical now?
 
Data vault: What's Next
Data vault: What's NextData vault: What's Next
Data vault: What's Next
 
593 Managing Enterprise Data Quality Using SAP Information Steward
593 Managing Enterprise Data Quality Using SAP Information Steward593 Managing Enterprise Data Quality Using SAP Information Steward
593 Managing Enterprise Data Quality Using SAP Information Steward
 
E-Business Suite 2 _ Ben Davis _ Achieving outstanding optim data management ...
E-Business Suite 2 _ Ben Davis _ Achieving outstanding optim data management ...E-Business Suite 2 _ Ben Davis _ Achieving outstanding optim data management ...
E-Business Suite 2 _ Ben Davis _ Achieving outstanding optim data management ...
 
How to Approach Tool Integrations
How to Approach Tool IntegrationsHow to Approach Tool Integrations
How to Approach Tool Integrations
 
Content Centric Applications
Content Centric ApplicationsContent Centric Applications
Content Centric Applications
 
How Real TIme Data Changes the Data Warehouse
How Real TIme Data Changes the Data WarehouseHow Real TIme Data Changes the Data Warehouse
How Real TIme Data Changes the Data Warehouse
 
Data Flux
Data FluxData Flux
Data Flux
 
Benefits of data_archiving_in_data _warehouses
Benefits of data_archiving_in_data _warehousesBenefits of data_archiving_in_data _warehouses
Benefits of data_archiving_in_data _warehouses
 
Présentation IBM InfoSphere Information Server 11.3
Présentation IBM InfoSphere Information Server 11.3Présentation IBM InfoSphere Information Server 11.3
Présentation IBM InfoSphere Information Server 11.3
 
Dynamic Data Masking - Breakthrough Innovation in Application Security
Dynamic Data Masking - Breakthrough Innovation in Application SecurityDynamic Data Masking - Breakthrough Innovation in Application Security
Dynamic Data Masking - Breakthrough Innovation in Application Security
 
Akili Data Integration using PPDM
Akili Data Integration using PPDMAkili Data Integration using PPDM
Akili Data Integration using PPDM
 
Implementing BI & DW Governance
Implementing BI & DW GovernanceImplementing BI & DW Governance
Implementing BI & DW Governance
 
IBM InfoSphere Data Architect 9.1 - Francis Arnaudiès
IBM InfoSphere Data Architect 9.1 - Francis ArnaudièsIBM InfoSphere Data Architect 9.1 - Francis Arnaudiès
IBM InfoSphere Data Architect 9.1 - Francis Arnaudiès
 
Understanding Reference Data with Aaron Zornes
Understanding Reference Data with Aaron ZornesUnderstanding Reference Data with Aaron Zornes
Understanding Reference Data with Aaron Zornes
 

Similar a 525 ibm optim

Turning Business Intelligence Into Actionable Insights
Turning Business Intelligence Into Actionable InsightsTurning Business Intelligence Into Actionable Insights
Turning Business Intelligence Into Actionable Insights
G3 Communications
 

Similar a 525 ibm optim (20)

Subscribing to Your Critical Data Supply Chain - Getting Value from True Data...
Subscribing to Your Critical Data Supply Chain - Getting Value from True Data...Subscribing to Your Critical Data Supply Chain - Getting Value from True Data...
Subscribing to Your Critical Data Supply Chain - Getting Value from True Data...
 
How to Merge the Data Lake and the Data Warehouse: The Power of a Unified Ana...
How to Merge the Data Lake and the Data Warehouse: The Power of a Unified Ana...How to Merge the Data Lake and the Data Warehouse: The Power of a Unified Ana...
How to Merge the Data Lake and the Data Warehouse: The Power of a Unified Ana...
 

525 ibm optim

  • 1. Best Practices in Database Archiving and Information Lifecycle An InformationWeek Webcast Sponsored by
  • 3. Today’s Presenter Carl Olofson, Research Vice President, Application Development and Deployment, IDC
  • 4. Best Practices in Database Archiving and Information Lifecycle Management How ILM Saves Money, Reduces Risk Carl Olofson Research Vice President IDC May 2011 Copyright IDC. Reproduction is forbidden unless authorized. All rights reserved.
  • 5. Agenda
      The Problem
          Unchecked database growth
          Hidden costs of large databases
          Security and privacy in test data
      Information Lifecycle Management
          What is ILM?
          Database archiving
             – Requirements of database archiving
             – Benefits of database archiving
          Test data masking
             – How data is masked
             – Benefits of data masking
      Conclusions / Recommendations
      © IDC Visit us at IDC.com and follow us on Twitter: @IDC May-11
  • 6. Unchecked Database Growth
      As a database grows…
          It requires larger indices
          It consumes more storage
          It requires specialized administration to tune
          It needs more processor power to execute queries and updates
      The hidden costs include
          More storage administration
          More downtime for reorgs
          Larger batch windows for backups
  • 7. Polling Question #1
      How rapidly is your main production database growing?
          Under 10% per year
          10% per year
          25% per year
          Over 25% per year
  • 8. Elements of Test Data Management
      Selecting the data
          Must be a referentially complete subset of the database
          Must reflect realistic patterns of data to ensure valid testing
      Protecting sensitive data
          Sensitive data must be masked to prevent unauthorized viewing
          Masked data needs to make sense to the test system
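The "referentially complete subset" requirement on slide 8 can be illustrated with a small sketch. It is not taken from the webcast: the `customers` and `orders` tables, the helper names, and the use of SQLite are all assumptions chosen for illustration. The idea is that once a set of parent rows is selected, every child row that references them is copied too, and nothing else, so the test database contains no orphaned foreign keys.

```python
import sqlite3

# Hypothetical two-table schema used only for this illustration.
SCHEMA = """
CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE orders (
    id INTEGER PRIMARY KEY,
    customer_id INTEGER NOT NULL REFERENCES customers(id),
    total REAL
);
"""


def new_db():
    """Create an empty in-memory database with the demo schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn


def copy_subset(source, target, customer_ids):
    """Copy the chosen customers and all of their orders into target."""
    ids = list(customer_ids)
    marks = ",".join("?" for _ in ids)
    customers = source.execute(
        f"SELECT id, name FROM customers WHERE id IN ({marks})", ids
    ).fetchall()
    orders = source.execute(
        f"SELECT id, customer_id, total FROM orders "
        f"WHERE customer_id IN ({marks})", ids
    ).fetchall()
    with target:  # commit both inserts together, or neither
        target.executemany("INSERT INTO customers VALUES (?, ?)", customers)
        target.executemany("INSERT INTO orders VALUES (?, ?, ?)", orders)
    return len(customers), len(orders)


def orphan_orders(conn):
    """Count orders whose customer row is missing (should be 0)."""
    return conn.execute(
        "SELECT COUNT(*) FROM orders o "
        "LEFT JOIN customers c ON c.id = o.customer_id "
        "WHERE c.id IS NULL"
    ).fetchone()[0]
```

A real schema has many more relationships, which is why the deck argues for tooling that discovers them rather than hand-written copy scripts; the sketch only shows what "complete" means for a single parent-child pair.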
  • 9. Security and Privacy in Test Data
      Normal Security Is Often Suspended for Test Data
          Confidential data could be compromised
          Privacy requirements could be breached
          Corporate policies may be violated
          Contractual requirements and government regulations could lead to legal culpability
      In-House Masking Is Inadequate
          Simplistic results create unrealistic test data
          Code must be changed as the database changes, an unreasonable burden on in-house IT
  • 10. Polling Question #2
      In what role is the person in your organization primarily responsible for refreshing test data?
          DBA
          Development Manager
          Project Leader
          Developer
          Other
  • 11. Information Lifecycle Management (ILM)
      [Diagram: Define, Archive, Manage, Protect, Test]
  • 12. The Basic Elements of ILM
      Definition
          Policies governing data creation, management, removal
      Security
          Encryption and access control at a granular level
      Protection
          Blocking access to sensitive data, including test data
          Test data protection is done through data masking
      Archiving
          Removal of inactive data from the live database
          Storage in a compressed, read-only datastore
  • 13. The Data Masking Challenge
      Application testing requirements
          Using simple XXXX or #### or “lorem ipsum” filler is usually not adequate for robust application testing.
          Data must be representative of actual data in value range and distribution.
          Masked data must “make sense”; zip codes correlate to city and state, for instance.
          Secured information, such as personal identification, should not be inferable from the masked data.
          The fake data should be consistent.
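Two of the requirements on slide 13, consistency and masked data that "makes sense", can be sketched in a few lines. This is an illustrative sketch only, not how any commercial masking tool works: the key, the lookup lists, and the function names are assumptions. A keyed hash (HMAC) maps each real value to the same replacement every time, and the zip code, city, and state are replaced as one unit so they stay correlated.

```python
import hashlib
import hmac

# Hypothetical secret; in practice it would be kept outside the test system.
SECRET_KEY = b"demo-key-change-me"

# Small illustrative pools; a real tool would use much larger lookup sets.
FAKE_NAMES = ["Alex Morgan", "Jamie Lee", "Sam Patel", "Chris Novak"]
FAKE_LOCATIONS = [
    ("10001", "New York", "NY"),
    ("60601", "Chicago", "IL"),
    ("94105", "San Francisco", "CA"),
]


def _bucket(value, size):
    """Map a value to a stable index in range(size) using a keyed hash."""
    digest = hmac.new(SECRET_KEY, value.encode("utf-8"), hashlib.sha256).digest()
    return int.from_bytes(digest[:8], "big") % size


def mask_customer(record):
    """Return a copy of record with name and location replaced.

    The same input always yields the same output (consistency), and the
    zip/city/state triple is swapped as a unit so it stays plausible.
    """
    name = FAKE_NAMES[_bucket(record["name"], len(FAKE_NAMES))]
    zip_code, city, state = FAKE_LOCATIONS[
        _bucket(record["zip"], len(FAKE_LOCATIONS))
    ]
    return {**record, "name": name, "zip": zip_code, "city": city, "state": state}
```

With pools this small, many real customers collapse onto the same fake name, so the sketch does not preserve value distribution, which is one of the other requirements listed on the slide.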
  • 14. Archiving: Types of Data
      Reference
          Created in response to a stand-alone event
          Randomly retrieved without requiring context
          Active until a special event
          Examples: Customer, Patient, Product
      Transactional
          Created at the start of a business process
          Retrieved in the context of a transaction
          Deactivated at the end of a business process
          Examples: Sales order, treatment, shipment
      Streaming
          Created at reception of a streamed item
          Inactive immediately (cannot be updated)
  • 15. Classes of Data
      Active
          Data that is still being updated
          Includes reference and transactional data
      Inactive
          Data no longer active, but retained for query and reporting
          Includes historical and streamed data
          Historical data is inactive transaction data
             – Sales order completed, revenue recognized
             – Inventory item sold and picked up
             – Patient treatment completed, patient discharged
  • 16. Buildup of Inactive Data
      Hypothetical Example
          Suppose we have a sales order table
          We start the year with 10,000 orders per month
          Orders grow at 1% per month
          Each order takes 60 days to complete (recognize revenue)
          Orders in process are active data
          Completed orders are inactive data
  • 17. Buildup of Inactive Transaction Data
      [Chart: Sales Order Table. Rows (0 to 160,000) by month, Jan through Dec, split into Active and Inactive, with Inactive % overlaid]
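The hypothetical sales order example on slides 16 and 17 can be reproduced with a short calculation. The sketch below is an illustration, not the model behind the original chart: it assumes that "60 days to complete" means an order stays active for the month it is placed and the following month, and the function and parameter names are invented here.

```python
def inactive_buildup(start_orders=10_000, monthly_growth=0.01,
                     months=12, active_months=2):
    """Return (month, active_rows, inactive_rows, inactive_share) per month.

    Assumption: an order is active for `active_months` calendar months
    (the month it is created plus the next one, approximating 60 days)
    and inactive afterwards. Nothing is ever deleted from the table.
    """
    new_orders = [start_orders * (1 + monthly_growth) ** m for m in range(months)]
    rows = []
    for m in range(months):
        total = sum(new_orders[: m + 1])
        first_active = max(0, m - active_months + 1)
        active = sum(new_orders[first_active : m + 1])
        inactive = total - active
        rows.append((m + 1, round(active), round(inactive), inactive / total))
    return rows


if __name__ == "__main__":
    for month, active, inactive, share in inactive_buildup():
        print(f"month {month:2d}: active={active:6d} "
              f"inactive={inactive:7d} inactive share={share:.0%}")
```

Under these assumptions the active row count grows only with order volume, while the inactive rows accumulate month after month and make up most of the table by year end.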
  • 18. Inactive Data Clogs the Database
      DBMS Overhead
          Big indexes
          Storage demand
          Slower queries
          Slower transaction processing
      Operational Overhead
          DBA tuning
          Disruption for unload/reload and reorg
          Longer backup batch windows
  • 19. Polling Question #3
      Think of transaction data that you retain. What is your required retention period?
          3-5 years
          6-10 years
          Over 10 years
          We don’t have a retention policy
  • 20. Approaches to “Aging Out” Data
      Partitioning
          Move data to low frequency partition on 2nd or 3rd tier storage
          Use local partition indexes to avoid growth of global table indexes
          Perform maintenance operations by physical partition
          Problem: this approach impacts the whole table, and creates a complex operational and management challenge that extends across the database
      Archiving
          Select referentially complete subsets of inactive data
          Move the inactive data to an archiving system outside the database
          Ensure that the archive can support SQL and that queries can, if necessary, be executed in an integrated manner with those of the live database
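The three archiving steps on slide 20 (select a referentially complete set of inactive rows, move them out of the live database, keep them queryable with SQL) can be sketched with SQLite. This is a toy illustration under stated assumptions: the `orders` and `order_lines` tables, the `archive` schema name, and the function names are hypothetical, and an attached in-memory SQLite database stands in for a separate archive store.

```python
import sqlite3


def open_live_with_archive():
    """Open a live database and attach a second database as 'archive'."""
    conn = sqlite3.connect(":memory:")
    conn.execute("ATTACH DATABASE ':memory:' AS archive")
    conn.executescript("""
        CREATE TABLE main.orders (
            id INTEGER PRIMARY KEY, status TEXT, completed_on TEXT);
        CREATE TABLE main.order_lines (
            id INTEGER PRIMARY KEY, order_id INTEGER, sku TEXT);
        CREATE TABLE archive.orders (
            id INTEGER PRIMARY KEY, status TEXT, completed_on TEXT);
        CREATE TABLE archive.order_lines (
            id INTEGER PRIMARY KEY, order_id INTEGER, sku TEXT);
    """)
    return conn


def archive_completed(conn, cutoff):
    """Move orders completed before cutoff, with their lines, to the archive.

    Parent and child rows move in one transaction, so the live database
    and the archive each stay referentially complete.
    """
    inactive = ("SELECT id FROM main.orders "
                "WHERE status = 'completed' AND completed_on < ?")
    with conn:  # commit everything or roll everything back
        conn.execute(f"INSERT INTO archive.orders "
                     f"SELECT * FROM main.orders WHERE id IN ({inactive})",
                     (cutoff,))
        conn.execute(f"INSERT INTO archive.order_lines "
                     f"SELECT * FROM main.order_lines "
                     f"WHERE order_id IN ({inactive})", (cutoff,))
        conn.execute(f"DELETE FROM main.order_lines "
                     f"WHERE order_id IN ({inactive})", (cutoff,))
        deleted = conn.execute(
            "DELETE FROM main.orders "
            "WHERE status = 'completed' AND completed_on < ?", (cutoff,))
    return deleted.rowcount


def all_order_ids(conn):
    """Integrated query spanning live and archived orders."""
    return [row[0] for row in conn.execute(
        "SELECT id FROM main.orders UNION ALL "
        "SELECT id FROM archive.orders ORDER BY id")]
```

The `UNION ALL` query stands in for the "integrated" access the slide asks for; a production archive would also need the compression, schema versioning, and tiered-storage behavior listed among the critical requirements later in the deck.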
  • 21. Benefits of Archiving
      Database benefits
          Faster queries
          Less index maintenance overhead
          Smaller dataspaces and simpler schema than partitioning option
          Requires less CPU; license/maintenance savings for DB and applications
      Operational benefits
          Less schema maintenance than partitioning option
          Stable backup windows
          Much less data reorganization
  • 22. Application Retirement
      Inactive Applications
          Applications become inactive when they are no longer used and their functions have been migrated elsewhere.
          They commonly still have data that must be retained for corporate policy or legal reasons.
          For this reason, enterprises keep running them, maintaining them, and paying fees for them even though they are inactive.
      Retiring Inactive Applications
          All their data is inactive, so it may be archived altogether.
          The archiving system must retain the ability to report on the data.
          The savings in servers, storage, software, and operations costs can be very significant.
  • 23. Critical Requirements of Database Archiving
      DBMS Support
          Must support ongoing versions of major RDBMS including DB2, Informix, Oracle, Sybase ASE, Microsoft SQL Server, and MySQL
          Must record schema and schema changes to support data retrieval even after data definitions have changed
          Must support SQL and ODBC/JDBC used by applications
      Technical requirements
          Random data retrieval
          Compressed, optimized based on read-only access
          Reasonable performance on 2nd and 3rd tier storage
  • 24. Data Governance
      Purpose is to ensure that data is trustworthy
          Data is well defined, and maintenance is rational
          Original source is known
          Sequence and agents of update are known (provenance)
          Data is valid and consistent
          No unauthorized access has happened
          No sensitive data is visible to unauthorized personnel
          Data is retained as required without compromising performance
      Business Benefits
          Database development and management addresses known business needs
          Trade secrets are not exposed and confidences are not compromised
          Ensures compliance with contractual and legal requirements
          Reduces risk of actual or opportunity cost due to data-driven application error
  • 25. ILM and Data Governance
      [Diagram: Data Governance at the top, resting on Uniform Data Definition & Policy Management, which spans Information Lifecycle Management and Trust Management. Box labels beneath: Managed Data Selection & Retention; Data Security & Protection; Validity and Consistency Assurance; Monitoring; Database Subsetting; Database Archiving; Test Data Masking; Data Access Control and Encryption; Data Quality and Profiling; Data Cleansing; Provenance Tracking; Access Log Analysis]
  • 26. ILM and Database Development and Management Tools
      Database Development and Management Tools (DDMT)
          Software used by DBAs and data managers to manage the size, performance, and reliability/recoverability of databases
          Includes DBA tools, database replication software, development and optimization software, and database archiving / ILM
      The ILM Segment of the DDMT Market
          Just 4.6% in 2009, but the fastest growing segment; the only segment to show positive growth in that tough economic year
          Projected to show the greatest growth of all DDMT segments to 2014, with a forecast CAGR of 9.9% from $90 million to $188 million
  • 27. What’s IBM’s Share in the ILM Market Segment
      [Pie chart: Revenue ($M), Total = $89.9 Million. IBM 56%, Informatica 13%, Other 12%, HP 11%, CA 4%, Solix 4%]
      Source: IDC, 2010
  • 28. Conclusions and Recommendations
      Conclusions
          Data governance is critical because the utility and trustworthiness of enterprise data cannot be left to chance.
          ILM addresses the key dimension of data size management in relation to data retention, and test data management.
          These functions cannot be developed and maintained in-house.
      Recommendations
          Users should carefully review their data access and retention policies and ensure that those policies are carried out.
          In most cases, the best approach to ensuring data retention without bloating the databases is to employ database archiving.
          Test data management is not trivial; find professionally developed data masking and subsetting tools.
          IBM’s InfoSphere Optim leads the market in addressing these key ILM requirements.
  • 30. Information Management: IBM InfoSphere Optim solutions
      Managing data throughout its lifecycle in heterogeneous environments
      Discover
          Speed understanding and project time through relationship discovery within and across data sources
          Understand sensitive data to protect and secure it
      Test Data Management
          Easily refresh & maintain right-sized non-production environments, while reducing storage costs
          Improve application quality and deploy new functionality more quickly
      Data Masking
          Protect sensitive information from misuse & fraud
          Prevent data breaches and associated fines
      Data Growth Management
          Reduce hardware, storage & maintenance costs
          Streamline application upgrades and improve application performance
      Application Retirement
          Safely retire legacy & redundant applications while retaining the data
          Ensure application-independent access to archive data
      [Diagram labels: Discover, Understand, Classify, Subset, Mask, Archive, Retire; environments: Production, Development, Test, Training]
      © 2011 IBM Corporation
  • 31. Managing Data Across its Lifecycle
      Discover & Define
          Discover where data resides
          Classify & define data and relationships
          Define policies
      Develop & Test
          Develop database structures & code
          Create & refresh test data
          Validate test results
      Optimize, Archive & Access
          Enhance performance
          Manage data growth
          Report & retrieve archived data
      Consolidate & Retire
          Rationalize application portfolio
          Enable compliance with retention & e-discovery
      Information Governance: Quality Management – Lifecycle – Security & Privacy
  • 32. You can’t govern what you don’t understand
      Discover & Define
          Define business objects for archival and test data applications
             – Automation of manual activities accelerates time to value
          Discover data transformation rules and heterogeneous relationships
             – Business insight into data relationships reduces project risk
          Identify hidden sensitive data for privacy
             – Provides consistency across information agenda projects
      [Diagram: Distributed Data Landscape]
  • 33. Employ effective test data management practices
      Develop & Test
          Create targeted, right-sized test environments
          Substitute sensitive data with fictionalized yet contextually accurate data
          Easily refresh, reset and maintain test environments
          Compare data to pinpoint and resolve application defects faster
          Accelerate release schedules
      [Diagram: a 2 TB Production or Production Clone database is subset and masked into Development, Unit Test, Training, and Integration Test environments of 25 GB to 100 GB each]
  • 34. Archive historical data for data growth management
      Optimize, Archive & Access
          Data Archiving is an intelligent process for moving inactive or infrequently accessed data that still has value, while providing the ability to search and retrieve the data
          Can selectively restore archived data records
      [Diagram: Production (Current, Historical, Reference Data) is archived to Archives (Historical Data), from which Restored Data can be retrieved. Universal Access to Application Data: Application, ODBC / JDBC, XML, Report Writer, Mashup Center, Data Find]
  • 35. Retire redundant and legacy applications
      Consolidate & Retire
          Preserve application data in its business context
             – Capture all related data, including transaction details, reference data & associated metadata
             – Capture any related reference data that may reside in other application databases
          Retire out-of-date packaged applications as well as legacy custom applications
             – Leverage out-of-box support of packaged applications to quickly identify & extract the complete business object
          Shut down legacy system without a replacement
             – Provide fast and easy retrieval of data for research and reporting, as well as audits and e-discovery requests
      [Diagram: Infrastructure before Retirement shows three separate stacks of User, Application, Database, and Data. Archived Data after Consolidation shows users reaching a single Archive Engine and its Archive Data]
  • 36. Resources to Learn More!
      InfoSphere Optim Solutions page: http://www-01.ibm.com/software/data/optim/
             – IDC Worldwide Database Development and Management Tools 2009 Vendor and Segment Analysis Report
             – Whitepaper: Control Application Data Growth Before It Controls Your Business
             – Whitepaper: Enterprise Strategies to Improve Application Testing
             – InfoSphere Optim Solutions for Custom and Packaged Applications Solution Brief
  • 37. Q&A Session Please Submit Your Questions Now