The Future of Cummins Data
Warehousing Architecture and
Strategy
Pragnya Balamurukesan
Graham Cenko
Michael Khamis
Pavithra Thevasenapathy
Crimson 3
Agenda
Our Understanding
Data Warehousing Trends
Recommendations
Risks and Mitigations
Financials
Implementation Timeline
Conclusion
Our Understanding
Cummins has six Data Warehouses on the Oracle Exadata platform, a Data Lake environment in Hadoop and a Teradata warehousing appliance, which are not integrated
The current Data Warehouse architecture and strategy do not meet the business intelligence or future needs of the company
What Data Warehouse architecture and strategy would meet Cummins’ needs and support future growth initiatives?
Future trends that should be incorporated into Cummins’
Data Warehousing strategy
• Cloud Data Warehouse
• Business Intelligence Tools
• Big Data: Big Data Analytics, Hadoop Platform
• Real-Time Data: Streaming, Analytics & Reporting
• Consolidation: Physical, Logical
Foley, John. “The Top 10 Trends in Data Warehousing.” Forbes. 10 March 2014
Satell, Matt. “The Future of Data Warehousing: 7 Industry Experts Share Their Predictions.” BetterBuys. 5 November 2014
Cummins should adopt this Data Warehouse architecture
to satisfy future trends and growth initiatives
• Data Sources: Cloud, Files, Office files, Web services, Social Feeds, Sensor, Web logs
• Enterprise Information Management: BPM, ECM, CEM, Discovery, Info exchange
• Data Warehouse, Hadoop, Stream Computing
• Master Data Management
• Data Virtualization
• Business Intelligence Tools: Reporting, Statistical analysis, Visualization
Cummins should take these five actions to achieve the
recommended Data Warehouse architecture
• Move certain databases from Oracle Data Warehouse to Teradata Active Data Warehouse Private Cloud
• Implement Hadoop-as-a-Service using Google Compute Engine and MapR
• Adopt Cisco Composite Data Virtualization Platform
• Add IBM InfoSphere Streams, Tableau and Spotfire to the Business Intelligence & Analytics tools
• Governance
Cummins should move certain Databases from Oracle Exadata to Teradata Active Data Warehouse Private Cloud
ORACLE: Corporate, Components, Engine, Power Gen, Distribution
TERADATA ADW PRIVATE CLOUD (EDW): Components, Power Gen, Engine, Distribution
Supply chain, Logistics, Sales, Marketing, Inventory & Operational data
BENEFITS
• Active events: Customer-sales representative interaction, worker in shipping & receiving
• Active load: Arrival of damaged critical supplies
• Active enterprise integration: Fitting into existing portals, Web services, SOA components
• Active workload management: Controlling mixed workloads
• Active availability: Increasing the DW availability from business critical to mission critical
• Active access: Out-of-stock situation, inventory manager makes decisions
Teradata (2015) “Enabling the Agile Enterprise with Active Data Warehousing”
Cummins should adopt Teradata private cloud for the
following reasons
Challenges in Public Cloud
Worldwide private cloud adoption - Forbes
Consolidate to Teradata private ADW
Reduced costs through server utilization
Pay for what you use, when you need it
Faster: less than five minutes
Elastic performance
Quick decision making
Leading Healthcare company saves 4.3 billion, delivering 250,000 self-service reports, improving performance by 10x
Government agency queries that took 20 hours to run now run in 15 minutes
Why private cloud model?
• High Active Performance
• Effortless Scalability
• Operational Availability
• Enterprise Concurrency
• Investment Protection
Success stories
Characteristics of Teradata ADW private cloud
Benefits of Teradata ADW private cloud
Teradata News Release (2012) “Teradata Active Data Warehouses Provide Private Cloud Benefits”
Cummins should implement Hadoop-as-a-Service using
Google Compute Engine and MapR
Google Cloud Storage
<cluster> [Master]: MapR CLDB (Container Location Database)
<cluster> 000 [Worker]: MapR FileServer
<cluster> 001 [Worker]: MapR FileServer
<cluster> nnn [Worker]: MapR FileServer
DATA FLOW
1. An application downloads data file from Google Cloud Storage and pushes it to MapR-FS
2. The CLDB distributes the file to MapR-FS based on the query
3. The result of the query is written to the file on Google Cloud Storage
FEATURES
1. Operational Intelligence
2. Enterprise Data Hub
3. Internet of Things
4. Security and Risk Management
5. Marketing Optimization
MapR (2014) “MapR, Hive, and Pig on Google Compute Engine”
Cummins should implement Hadoop-as-a-Service
using MapR for the following reasons
Why we chose cloud deployment?
• Cost
• Scalability
• Enhanced productivity
• Collaboration
• Elasticity
• Efficiency
Why we chose MapR?
Criteria | MapR | Cloudera | Hortonworks
Data Ingest | Batch and streaming writes | Batch | Batch
HBase Performance | Consistent low latency | Latency spikes | Latency spikes
High Availability | Self healing across multiple failures | Single failure recovery | Single failure recovery
Replication | Data + metadata | Data | Data
File IO | Read/write | Append only | Append only
Write level authentication | Kerberos, Native | Kerberos | Kerberos
Robert D. Schneider (2014) “Hadoop Buyer’s Guide, Ubuntu”
Cummins should implement Composite Data
Virtualization Platform to provide a unified logical view
of all the data
Logical view of Cisco Composite
• BI & Analytic tools
• Data Virtualization Platform (Cisco Information Server): Abstract, Federate, Cache, Optimizer, Discovery
• Traditional, Big data & cloud sources: Operational Stores, SaaS Applications, Data Warehouses and Marts
Features
• Instant access to all data
• End-to-end data management
• Faster response to BI & Analytics
Unified logical enterprise view of all the data
David Bescmer. Jan 2014. Cisco Data Virtualization
Cummins should install Composite Data Virtualization
Platform for the following reasons
Criteria | Composite | Informatica | IBM
Federated query language | 3 | 2 | 2
Caching | 3 | 2 | 2
Profiling | 3 | 1 | 2
Metadata support | 3 | 1 | 1
Customer base | 3 | 2 | 2
Compatibility with existing technologies | 3 | 2 | 2
Total | 18/18 | 10/18 | 11/18
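The totals in the vendor matrix can be recomputed with a short script. This is a minimal Python sketch: the scores are copied from the slide, while the dictionary layout, the function name `vendor_totals` and the constant `MAX_PER_CRITERION` are illustrative assumptions (the 3-point maximum is inferred from six criteria being reported out of 18).

```python
# Scores copied from the vendor matrix; the data layout is an illustrative assumption.
MAX_PER_CRITERION = 3  # inferred: six criteria, totals reported out of 18

scores = {
    "Federated query language": {"Composite": 3, "Informatica": 2, "IBM": 2},
    "Caching": {"Composite": 3, "Informatica": 2, "IBM": 2},
    "Profiling": {"Composite": 3, "Informatica": 1, "IBM": 2},
    "Metadata support": {"Composite": 3, "Informatica": 1, "IBM": 1},
    "Customer base": {"Composite": 3, "Informatica": 2, "IBM": 2},
    "Compatibility with existing technologies": {"Composite": 3, "Informatica": 2, "IBM": 2},
}


def vendor_totals(matrix):
    """Sum each vendor's score across all criteria."""
    totals = {}
    for per_vendor in matrix.values():
        for vendor, score in per_vendor.items():
            totals[vendor] = totals.get(vendor, 0) + score
    return totals


totals = vendor_totals(scores)
max_total = MAX_PER_CRITERION * len(scores)
for vendor, total in totals.items():
    print(f"{vendor}: {total}/{max_total}")
```

All criteria are weighted equally here, as the slide's totals imply; a weighted version would multiply each score by a per-criterion weight before summing.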
Benefits of Virtualization
• Profit Growth
• Risk Reduction
• Technology Optimization
• Staff Productivity
• Time-to-Solution Acceleration
Cisco “Data Virtualization”
Cummins should reevaluate their existing BI Toolset
and purchase Tableau and Spotfire for visualization and
analytics
Existing - Reporting
• Action: Continue using OBIEE and MSBI for reporting. Phase out the other four traditional platforms
• Benefit: Reduced licensing and training costs, standardized reports and less complexity
Tableau - Visualization
• Action: Purchase Tableau Online for an easy-to-use data visualization platform that is designed for business end users
• Benefit: Enables self-service BI for the entire organization, no support from IT needed
Tibco Spotfire - Statistical Analysis
• Action: Purchase Tibco Spotfire Platform for advanced analytical capabilities to be used by business analysts
• Benefit: Predictive and prescriptive analytical capabilities and ability to consume structured and unstructured data
Tibco Software Company. “Tibco Spotfire Platform.” 15 December 2015
Tableau. “Tableau Online.” 15 December 2015
Cummins should adopt IBM InfoSphere Streams to
enable real time business intelligence
Avadhoot Patwardhan (2015) “Introduction: Real-Time Analytics on Data in Motion”
Aladdabigdata (2015) “Real-time Analytics using IBM InfoSphere Streams”
ACQUIRE: Real-time data from several different streams having different formats
ANALYZE: The data in real time using applications developed by either Cummins or IBM
ACT: On the Business Intelligence delivered in real time
Benefits of Stream Computing
• Integrated Development Environment
• Scale-Out Runtime
• Analytic Toolkits
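The acquire, analyze, act pattern can be illustrated with plain Python generators. This is only a conceptual sketch, not InfoSphere Streams code (Streams applications are built with IBM's own tooling); the record formats, the field names `engine` and `temp_c`, and the 100 °C alert threshold are all hypothetical.

```python
def acquire(*streams):
    """ACQUIRE: normalize records arriving in different formats into one shape."""
    for stream in streams:
        for record in stream:
            if isinstance(record, dict):
                # Hypothetical dict-shaped (JSON-like) source
                yield {"engine": record["engine"], "temp_c": float(record["temp_c"])}
            else:
                # Hypothetical "engine,temperature" text source
                engine, temp = record.split(",")
                yield {"engine": engine, "temp_c": float(temp)}


def analyze(records, limit_c=100.0):
    """ANALYZE: flag each reading as it passes through, without storing the stream."""
    for record in records:
        yield {**record, "alert": record["temp_c"] > limit_c}


def act(records):
    """ACT: turn flagged readings into messages someone can respond to."""
    return [f"ALERT {r['engine']}: {r['temp_c']} C" for r in records if r["alert"]]


json_like = [{"engine": "E1", "temp_c": 95}, {"engine": "E2", "temp_c": 104.5}]
csv_like = ["E3,99.0", "E4,120"]
alerts = act(analyze(acquire(json_like, csv_like)))
print(alerts)
```

Because each stage is a generator, a record flows through all three stages before the next one is read, which mirrors the idea of analyzing data in motion rather than after it has been loaded into a warehouse.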
Cummins should establish the following teams for effective
governance over the Data Warehouse initiative
Change Management
• Comprised of senior managers and supervisors of each business unit
• Communicates change to the company and each business unit
• Manages training of employees
Vendor Management
• Comprised of Cummins IT professionals
• Assigns tasks to vendors while monitoring the performance of each vendor
• Renegotiates contracts
Support Team
• Comprised of Cummins IT technicians for each business unit
• Groups will be assigned to each layer of the architecture
BICC Team
• Comprised of business managers from each business unit
• Champions BI technologies by defining standards, business alignment, project prioritization and management
Information Governance
• Comprised of a C-suite member, IT professionals, business managers, a paralegal, and members from each business unit
• Manages information throughout its lifecycle
IT Steering Committee: Business & IT Leaders
It will take 3 years for Cummins to implement the
recommended Data Warehouse strategy
Timeline: Year 1, Year 2, Year 3
The project will cost Cummins $11,370,000 and result in the following benefits
Emission control: Using real-time data to track emission of engines, increasing the quality of Cummins engines
Investment in the right technologies: Using BI tools to predict where market trends in engine technology are headed
Leading projects in major markets: Using BI tools to improve alignment with organization strategy
Benefits
Business Value is derived from the actions taken as a result of the analysis enabled by the BI tools
Cost Savings: ~$2 million (Cloud storage, Operating Expense, and People)
Cost
Software | $1,400,000
Hardware | $675,000
Cloud Storage | $65,000
Tools | $5,750,000
End user Training | $200,000
Cost of Administration | $200,000
Maintenance Support | $2,680,000
External Contract | $400,000
Total Costs | $11,370,000
*See appendix for detailed cost description and more sources
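The stated total can be checked against the line items with a few lines of Python. This sketch assumes the cost labels and dollar amounts on the slide pair up in the order they are listed; the variable names are illustrative.

```python
# Line items from the cost slide, paired with amounts in listed order (an assumption).
cost_items = {
    "Software": 1_400_000,
    "Hardware": 675_000,
    "Cloud Storage": 65_000,
    "Tools": 5_750_000,
    "End user Training": 200_000,
    "Cost of Administration": 200_000,
    "Maintenance Support": 2_680_000,
    "External Contract": 400_000,
}

total = sum(cost_items.values())
shares = {name: amount / total for name, amount in cost_items.items()}
largest = max(cost_items, key=cost_items.get)

print(f"Total: ${total:,}")
print(f"Largest item: {largest} ({shares[largest]:.1%} of total)")
```

Under that pairing the items sum to the $11,370,000 quoted in the slide title, with Tools as the largest single category.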
Global expansion: Using BI tools to find existing and potentially new areas with demand that is not being exploited
Potential Business Value Benefits
Sallem,Rita. Sept. 2012, “Customer rate their BI /vendors on Costs.”
Sheffield, Glen. March 2015, “How much does Teradata warehouse Cost.”
Risks and Mitigations
Risk: Data may be breached when it is stored in the Teradata cloud
Mitigation: Teradata is partnered with Protegrity and utilizes tokenization technology, which is applied to data before it enters the warehouse
Risk: The Cisco data virtualization platform can raise data security concerns because all the business data is used by this platform
Mitigation:
1. The manager that resides in the Cisco Information Server takes care of security, metadata, source code and more.
2. The IT security team of Cummins will be given training on the new security policies, data governance and data standards.
3. The Change Management team will make sure that there is effective communication between vendor management, in-house IT teams and the C-suite about security measures
Risk: The data stored in Google Compute Engine or being used by MapR’s services may be breached
Mitigation: MapR is equipped with authentication mechanisms (Kerberos, Native), authorization mechanisms (Access Control Expressions, Unix File Permissions, Access Control Lists), encryption mechanisms (Over-the-Wire Encryption, Encryption at Rest, Field-Level Encryption, Format-preserving Encryption and Masking) and governance guidelines
Risk: Employees responsible for reporting, visualization or analytics may become dissatisfied while learning new tools
Mitigation: Reporting tools will remain the same and it will be the Change Management Team’s responsibility to set the tone from the top
Risk: Inconsistent data from legacy systems will remain in the new Data Warehousing Architecture
Mitigation: Information Governance Team and MDM tool will ensure consistent and reliable data across platforms and databases
Teradata. “Our Partners.” 2015
MapR (2014) “MapR, Hive, and Pig on Google Compute Engine”
Following these recommendations will lead to a successful
data warehouse architecture that has the capabilities to
allow users to make intelligent business decisions
Data Warehouse architecture and strategy that meets business needs and future trends
• Move certain Databases from Oracle to Teradata Active Data Warehouse Private Cloud
• Re-evaluate existing BI Toolset and purchase Tableau and Spotfire for visualization and analytics
• Establish robust governance for effective use of the Data Warehouse initiative
• Implement Cisco Composite Data Virtualization Platform to provide unified logical view of all the data
• Implement Hadoop-as-a-Service using Google Compute Engine and MapR
Appendix
Hadoop
• Why MapR?
• Why Hadoop-as-a-Service?
• Security
• MapR Architecture
Enterprise Information Management
• Capabilities
• Architecture
• Why OpenText?
Master Data Management
Business Intelligence Tools
• Vendor Matrix
• Analytical maturity model
IBM InfoSphere Streams
• Why InfoSphere?
• Security
CISCO Composite Virtualization layer
• Functionalities
• Why virtualization?
• Why Composite?
• CISCO Architectures
• Success stories
Teradata
• Characteristics
• Why Private Cloud?
• Operational Intelligence
• Security
Information Governance team
Costs
• Components
• Tools
• Category
• Savings
Why not the Oracle Exadata proposal
Comparative study of MapR, Cloudera,
Hortonworks and Forrester’s ranking
Robert D. Schneider (2014) “Hadoop Buyer’s Guide, Ubuntu”
Experfy.com
Benefits of moving Hadoop to the
cloud
1. Cost: The on-premise model for deploying Hadoop would require a large number of servers, electricity and a housing facility, whereas cloud deployment would be more cost effective since it offers better scalability and Cummins pays only for what it uses.
2. Scalability: The on-premise model would require the time-consuming addition of physical servers. The cloud offers massively scalable services extremely quickly.
3. Enhanced productivity: Using a cloud-based Hadoop platform would enable data access anytime from anywhere, therefore providing greater and faster access to data.
4. Collaboration: A cloud-based Hadoop platform would enable seamless collaboration across the business units. Since syncing and sharing of files would be simultaneous, the collaboration would be real time.
5. Elasticity: On-premise Hadoop clusters cannot be added or removed quickly, whereas Hadoop-as-a-Service can increase or decrease the number of clusters (instances) as per demand.
6. Handling batch jobs: The on-premise Hadoop model has scheduled jobs that process the incoming data on a fixed, temporal basis. Hadoop-as-a-Service can be optimized by having appropriately sized clusters available for the jobs to run.
7. Simplifying Hadoop operations: In the on-premise model, as clusters are consolidated there is no resource isolation for different users. Hadoop-as-a-Service allows provisioning of clusters with different configurations and characteristics. Therefore management of a multi-tenant environment is simplified.
Hadoop Security
MapR offers several capabilities to help Cummins secure their data. At the product level MapR
prevents unauthorized access to secure the Hadoop and NoSQL data. At the solution level MapR
offers deployment of a large-scale anomaly detection solution that alerts you to network intrusion,
phishing, and other cyberattacks.
Authentication is performed through
1. Kerberos Integration
2. Native authentication
Authorization is the configuration of permissions for users. The authorization mechanisms offered
by MapR are
1. Access Control Expressions
2. Unix File Permissions
3. Access Control Lists
Hadoop Security
MapR also accounts for regulatory compliance and therefore provides four types of auditing, which are
1. maprcli commands that are related to cluster management
2. Authentications to the MapR Control System (MCS)
3. Operations on directories and files
4. Operations on MapR-DB tables
As an additional means of preventing unauthorized access of sensitive data, MapR supports
encryption. The encryption mechanisms available are
1. Over-the-Wire Encryption
2. Encryption at Rest
3. Field-Level Encryption
4. Format-preserving Encryption and Masking
MapR also supports features that facilitate effective data governance. Among these are
1. Data Integration
2. Security
3. Data Lineage
4. Information Lifecycle Management
5. Auditing.
Security in MapR
Kerberos Authentication, Native Authentication
Authorization, Auditing, Encryption
MapR (2014) “MapR, Hive, and Pig on Google Compute Engine”
Detailed MapR architecture
MapR (2014) “MapR, Hive, and Pig on Google Compute Engine”
Capabilities of the Enterprise
Information Management suite
Enterprise Content Management: Information management of all types and sources of data, throughout its life cycle
Business Process Management: Rapid modeling and automation of process applications and the ability to constantly improve them
Customer Experience Management: Using information to build rich customer experiences that support collaboration, build relationships and provide support on any channel such as web, mobile etc.
Information exchange: Exchanging information with any party and system securely and verifiably
Discovery: Ability to find and learn about the right information at the right time and place, independent of its location
OpenText (2015) “OpenText Process Suite Platform Architecture”
Four layers of the EIM solution
Gartner declares OpenText to be a leader in
Enterprise Content Management
https://en.wikipedia.org/wiki/Enterprise_information_management
Master Data Management
5 Steps to implementing MDM
1. Document: Identify sources while defining master data
2. Analyze: Evaluate the way the data flows in addition to defining transformation rules
3. Construct: Build the actual MDM warehouse according to the architecture/rules created
4. Implement: Populate the data warehouse
5. Sustain: Make sure policies and compliance are upheld through Cummins governance structure
Reasons for having Master Data Management
• Standardization of data
• Source identification
• Data classification
• Employee information management
• Product information management
• Eliminate duplicated data
Master Data Management adds business value because it organizes master data, making it possible to have effective BI tools. This in turn enables the tools, when used properly, to provide information for business decisions.
https://www.quora.com/What-is-the-best-master-data-management-software
Buyer’s Matrix for BI Tools
Solutions Review. “2016 Solutions Review Matrix Report.” 2015
Analytical Maturity Model
“As an analytics platform, Spotfire offers you a variety of add-on capabilities as the sophistication of your environment grows, or as you climb up the analytics maturity curve, so to speak.”
- Rishi Bhatnagar from Syntelli Solutions
Analytics Maturity Curve from Tom Davenport
Bhatnagar, Rishi. “How Much Does Spotfire Cost?” Syntelli Solutions. 25 July 2015
IBM InfoSphere Stream example
Example of streaming data sources
associated with smart meters
Typical Streams runtime deployment of a
streaming application
IBM Analytics (2015) “Top industry use cases for stream computing”
IBM Analytics (2015) “IBM Streams”
Forrester gives IBM high scores
Forrester Wave : Big Data Streaming Analytics Platforms, Q3 ‘14
Mike G., Rowan C. (2014) “The Forrester Wave™: Big Data Streaming Analytics Platforms, Q3 2014”
InfoSphere Security
Security is provided in InfoSphere Streams through user authorization and authentication.
User authorization is managed through Access Control Lists, which contain the roles and their access rights.
User authentication is done using either an LDAP server or a PAM authentication service.
Authentication keys, session timeouts and client authentication for web management services are some of the mechanisms adopted.
CEP vs IBM InfoSphere
Discovery, optimize and caching for
composite
Discovery:
1. Introspect available data
2. Discover hidden relationships
3. Model individual view/service
4. Validate view/service
5. Modify as required
Benefits
• Automates difficult work
• Improves time to solution
• Increases object reuse
Optimization :
1. Application invokes request
2. Optimized query (single statement) executes
3. Deliver data in proper form
Benefits:
• Up-to-the-minute data
• Optimized performance
• Less replication required
Caching :
1. Cache essential data
2. Application invokes request
3. Optimized query (leveraging cached data) executes
4. Deliver data in proper form
http://www.compositesw.com/products-services/data-discovery/
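The optimization and caching steps above can be pictured with a small Python sketch of a virtual view that joins two sources on request and reuses cached rows. It is a conceptual illustration only and does not use or represent the Cisco Information Server API; the class `VirtualView`, the sample sources and the field names are hypothetical.

```python
import time


class VirtualView:
    """Toy federated view: joins rows from named sources and caches each source."""

    def __init__(self, sources, ttl_seconds=60.0):
        self.sources = sources      # name -> callable returning a list of row dicts
        self.ttl = ttl_seconds      # how long cached rows stay fresh
        self._cache = {}            # name -> (fetch time, rows)
        self.source_calls = 0       # counts trips to the underlying systems

    def _rows(self, name):
        cached = self._cache.get(name)
        now = time.monotonic()
        if cached and now - cached[0] < self.ttl:
            return cached[1]        # cache hit: no load on the source system
        self.source_calls += 1
        rows = self.sources[name]()
        self._cache[name] = (now, rows)
        return rows

    def query(self, left, right, key):
        """Answer one request by joining two sources on a shared key."""
        index = {row[key]: row for row in self._rows(right)}
        return [{**row, **index[row[key]]} for row in self._rows(left) if row[key] in index]


# Hypothetical sources standing in for a warehouse table and a Hadoop extract.
def warehouse_orders():
    return [{"engine_id": 1, "orders": 40}, {"engine_id": 2, "orders": 15}]


def hadoop_sensor_summary():
    return [{"engine_id": 1, "avg_temp_c": 92.5}, {"engine_id": 2, "avg_temp_c": 88.0}]


view = VirtualView({"orders": warehouse_orders, "sensors": hadoop_sensor_summary})
first = view.query("orders", "sensors", key="engine_id")
second = view.query("orders", "sensors", key="engine_id")  # served from the cache
print(first)
print(view.source_calls)
```

The application sees one joined result without the data being copied into a new physical store, and the second request does not touch the source systems while the cache is fresh.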
Business case for virtualization
• Profit Growth – Data virtualization delivers the information your
organization requires to increase revenue and reduce costs.
• Risk Reduction – Data virtualization’s up-to-the-minute business
insights help you manage business risk and reduce compliance
penalties. Plus data virtualization’s rapid development and quick
iterations lower your IT project risk.
• Technology Optimization – Data virtualization improves utilization
of existing server and storage investments. And with less storage
required, hardware and governance savings are substantial.
• Staff Productivity – Data virtualization’s easy-to-use, high-
productivity design and development environments improve your
staff effectiveness and efficiency.
• Time-to-Solution Acceleration – Your data virtualization projects
are completed faster so business benefits are derived sooner. Lower
project costs are an additional agility benefit
http://www.compositesw.com/data-virtualization/
Virtualization versus Cloud
• Security – Integrating data in the cloud means putting the entire data of the business in the cloud, which is a huge risk.
• Capacity management – Peak times, holiday sales
• Redundancy of data without complete utilization of hardware resources
• In-house capabilities to handle
http://www.businessnewsdaily.com/5791-virtualization-vs-cloud-computing.html
Key benefits of composite
PROVIDES INSTANT ACCESS TO ALL DATA:
• Complete information – Business needs the complete picture. Cisco’s data federation technology virtually
integrates data from multiple sources, without the cost and overhead of physical data consolidation.
• Up-to-the-minute information – Cisco’s query optimization algorithms and techniques are fastest in the industry,
delivering the timely information business requires without impacting source system performance.
• Fit-for-purpose information – Cisco’s powerful data abstraction functions simplify complex data, transforming it
from native structures and syntax into easy-to-understand business views and data services
RESPOND FASTER TO ANALYTIC AND BI TRENDS:
• Streamlined process – Building business views and data services in Cisco is far faster, with far fewer moving parts,
than building physical data stores and filling them using ETL.
• Rapid IT response – Cisco’s reusable views and services, flexible data virtualization architecture, and automated
impact analysis provide the IT agility required to keep pace with business change.
• Quick iterations – Prototyping new solutions is far faster with Cisco DV. Cisco’s rapid development tools surface
live data in just minutes, enabling extraordinary business and IT collaboration.
END TO END DATA MANAGEMENT :
• Data Discovery – Cisco’s introspection and unique-in-the-industry data discovery uncover existing information
assets, unlocking them for valuable new uses.
• Standards-based – Cisco’s numerous standards-based access and delivery options support all the information
types business users require.
• Data Governance – Information is a critical asset. To maximize control, Cisco’s data governance centralizes
metadata management, ensures data security, improves data quality and provides full auditability and lineage
http://www.compositesw.com/products-services/data-virtualization-platform/
Vendor evaluation matrix for Composite
Criteria | Composite | Informatica | IBM | Denodo
Federated query technology | 5 | 4 | 3 | 1
Scalability | 5 | 4 | 5 | 4
Data quality | 4 | 5 | 5 | 4
Maintenance and support | 4 | 5 | 4 | 4
Caching | 5 | 4 | 4 | 2
Profiling | 5 | 4 | 3 | 2
Costs | 3 | 1 | 1 | 4
Version upgrades | 4 | 3 | 2 | 3
Complexity of integrated portfolio management | 4 | 3 | 2 | 3
Metadata support | 5 | 4 | 4 | 2
Area of skills and best practice documentation | 4 | 3 | 3 | 2
Customer base | 5 | 4 | 4 | 3
Agility | 5 | 4 | 4 | 3
Time to value | 5 | 4 | 4 | 3
Compatibility with existing technologies | 5 | 4 | 4 | 4
Forrester ranking | 5 | 4 | 4 | 3
Master data management | 4 | 5 | 5 | 4
Total | 72 | 65 | 61 | 55
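The column totals of the detailed matrix can be recomputed from the individual scores. The Python sketch below copies the scores as printed; the list layout and names are illustrative assumptions. Summing the rows as printed gives 77 for Composite and 51 for Denodo, which differs from the listed totals of 72 and 55, while Informatica (65) and IBM (61) match, so one or more scores or totals on the slide may have been mistyped. The vendor ranking is the same either way.

```python
VENDORS = ["Composite", "Informatica", "IBM", "Denodo"]

# (criterion, scores in VENDORS order), copied from the matrix as printed
rows = [
    ("Federated query technology", (5, 4, 3, 1)),
    ("Scalability", (5, 4, 5, 4)),
    ("Data quality", (4, 5, 5, 4)),
    ("Maintenance and support", (4, 5, 4, 4)),
    ("Caching", (5, 4, 4, 2)),
    ("Profiling", (5, 4, 3, 2)),
    ("Costs", (3, 1, 1, 4)),
    ("Version upgrades", (4, 3, 2, 3)),
    ("Complexity of integrated portfolio management", (4, 3, 2, 3)),
    ("Metadata support", (5, 4, 4, 2)),
    ("Area of skills and best practice documentation", (4, 3, 3, 2)),
    ("Customer base", (5, 4, 4, 3)),
    ("Agility", (5, 4, 4, 3)),
    ("Time to value", (5, 4, 4, 3)),
    ("Compatibility with existing technologies", (5, 4, 4, 4)),
    ("Forrester ranking", (5, 4, 4, 3)),
    ("Master data management", (4, 5, 5, 4)),
]
listed_totals = {"Composite": 72, "Informatica": 65, "IBM": 61, "Denodo": 55}

# Sum each vendor's column across all criteria
computed = {
    vendor: sum(scores[i] for _, scores in rows)
    for i, vendor in enumerate(VENDORS)
}
ranking = sorted(computed, key=computed.get, reverse=True)

for vendor in VENDORS:
    note = "" if computed[vendor] == listed_totals[vendor] else "  (differs from listed total)"
    print(f"{vendor}: computed {computed[vendor]}, listed {listed_totals[vendor]}{note}")
```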
Cisco’s Data Virtualization Platform
Cisco Information Server: Development Environment, Runtime Server Environment, Management Environment
Components: Discovery, Studio, Adapters, Manager, Monitor, Active Cluster
Data sources: XML, Packaged Apps, RDBMS, Excel Files, Data Warehouse, OLAP Cubes, Hadoop / “Big Data”, XML Docs, Flat Files, Web Services
Use cases: Data Warehouse Extend / Offload; Governance, Risk & Compliance; Business Intelligence; Customer Experience Management; Mergers & Acquisitions; Single View of Enterprise Data; Supply Chain Management; Analytics
http://www.compositesw.com/products-services/data-virtualization-platform/
Composite creates virtual marts, views
and services
http://www.compositesw.com/data-virtualization/virtual-data-marts/
http://www.compositesw.com/data-virtualization/operational-data-stores/
Success stories of Composite
Company | Before | After
Qualcomm | BI projects took 3 - 4 months | Days/Weeks
Pfizer | Management requests for data took weeks | Hours/Days
Northern Trust | 100% data replication | 20% replication
http://www.slideshare.net/CiscoPublicSector/composite-data-virtualization
Characteristics of Teradata ADW
Private cloud
Main characteristics of Teradata ADW Private cloud include:
• Virtualized resources – Teradata virtualizes all processing and storage so users do not have to be concerned about the location or availability of system resources – only that they are getting timely answers to all their business questions automatically without performance penalty.
• Business analytics – a Teradata Data Lab makes it easier for business users to explore unique data sets or prototype new analytic ideas.
• Consistent performance – enables IT to meet business user service level agreements and to ensure user satisfaction by leveraging Teradata’s industry leading workload management as well as key technologies such as hybrid storage and columnar.
• Elasticity – delivers the analytic resources dynamically and in real time as business user demand increases and decreases.
• Scalability – enables the environment to scale seamlessly across multiple dimensions including number of users, number of queries, and data volumes with support for data scalability up to 92 petabytes.
http://www.teradata.com/News-Releases/2012/Teradata-Active-Data-Warehouses-Provide-Private-Cloud-Benefits-Today/?LangType=1033&LangSelect=true
Features of Teradata ADW private
cloud
• Active access – high-speed inquiries, analysis, or alerts retrieved from the
ADW and delivered to operational users, devices, or systems.
• Active events – operational events that need to be continuously
monitored, filtered, and alerts sent based on business rules.
• Active load – high-frequency data loading throughout the business day to
ensure data are fresh enough to support active access and active events.
• Active enterprise integration – links the ADW to existing applications,
portals, Web services, service-oriented architectures, and the enterprise
service bus.
• Active workload management – dynamic management of operational and
strategic workloads in the same database, ensuring response times and
maximum throughput.
• Active availability – increasing the data warehouse availability from
business critical to mission critical.
http://www.teradata.com/resources/white-papers/Enabling-the-Agile-Enterprise-with-Active-Data-Warehousing-eb4931/?LangType=1033&LangSelect=true
Private cloud adoption
http://www.datamation.com/cloud-computing/what-is-private-cloud.html
Teradata provides operational
intelligence
http://www.teradata.com/resources/white-papers/Enabling-the-Agile-Enterprise-with-Active-Data-Warehousing-eb4931/?LangType=1033&LangSelect=true
Teradata provides operational
intelligence - Framework
http://www.teradata.com/resources/white-papers/Enabling-the-Agile-Enterprise-with-Active-Data-Warehousing-eb4931/?LangType=1033&LangSelect=true
Security in Teradata
Teradata’s Active Data Warehouse can make data available predictably and
securely by leveraging Protegrity’s Vaultless Tokenization technology.
Tokenization is applied to the sensitive data before it enters the
warehouse, using the enterprise’s own security policies. This provides a
security layer for all information in the database wherever it flows,
without affecting the business’s ability to perform rapid analysis on that
data. The solution relies upon Protegrity’s patent-pending Vaultless
Tokenization, which deploys a very small set of lookup tables of random
values without having to store either the sensitive data or the tokens.
Tokenized data can be mined and manipulated by business processes
without having to return the data to its original form, improving
accessibility and performance while keeping the data protected.
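The idea of replacing sensitive values with tokens drawn from a small set of random lookup tables, without storing the originals, can be illustrated with a toy example. This Python sketch is not Protegrity's algorithm and is not secure; the per-position digit tables, the function names and the seed are all illustrative assumptions chosen only to show that tokens can keep the original format and be reversed by whoever holds the tables.

```python
import random
import string

DIGITS = string.digits


def build_tables(positions, seed):
    """Create one random digit-substitution table per character position.

    A seeded random.Random keeps the example repeatable; it is NOT
    cryptographically secure and is used here for illustration only.
    """
    rng = random.Random(seed)
    tables = []
    for _ in range(positions):
        shuffled = list(DIGITS)
        rng.shuffle(shuffled)
        tables.append("".join(shuffled))
    return tables


def tokenize(value, tables):
    """Replace each digit using the table for its position (format is preserved)."""
    return "".join(tables[i][int(ch)] for i, ch in enumerate(value))


def detokenize(token, tables):
    """Reverse the substitution; only holders of the tables can do this."""
    return "".join(str(tables[i].index(ch)) for i, ch in enumerate(token))


tables = build_tables(positions=9, seed=2015)
account = "123456789"  # hypothetical sensitive value
token = tokenize(account, tables)
print(token)
print(detokenize(token, tables))
```

Because the token has the same length and character set as the original, downstream queries and joins can run on tokenized columns without returning the data to its original form, which is the property the slide highlights.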
http://www.teradata.com/partners/Protegrity-USA/?LangType=1033&LangSelect=true
Information Governance Team
• Legal: Works with IT on policy-driven issues such as compliance and
privacy.
• Records/compliance/audit: Handles records compliance, document
workflow, and archiving strategies, and makes sure that policy is
carried out enterprise-wide.
• IT: Handles the more technical issues, making sure policies are
configured in the systems architecture.
• Info Security: Ensures that sensitive data is held in secure
repositories and does not leak into unsecured areas.
• Business Unit: Helps spread policy and compliance information to the
rest of its BU.
Managing information through its
lifecycle and supporting the
organization’s strategy, operations,
regulatory, legal, risk and
environmental requirements.
This team will manage records,
business intelligence and MDM
policies, rules and
Cost of each component
Hadoop: $4,000 per node for support
• Software is a one-time cost
• Cloud is ~$600 per TB
MDM
• $13,000 per collaboration server user (2) assuming $500 per user assuming 20 users
Teradata: $2,000 per TB
• $2.5 million for in-house support
OpenText
• $2,000 per user
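As a rough illustration of how the per-unit prices above turn into a component budget, the sketch below multiplies each unit price by a quantity. The unit prices come from this slide; the node, terabyte and user counts are hypothetical placeholders, not Cummins figures, and the MDM line is left out because its pricing basis is unclear in the source.

```python
# Unit prices in USD as listed on the slide.
UNIT_PRICES_USD = {
    "hadoop_support_per_node": 4_000,
    "cloud_storage_per_tb": 600,
    "teradata_per_tb": 2_000,
    "opentext_per_user": 2_000,
}


def estimate_component_costs(quantities):
    """Return the cost per component plus a grand total for the given quantities."""
    costs = {name: UNIT_PRICES_USD[name] * qty for name, qty in quantities.items()}
    costs["total"] = sum(costs.values())
    return costs


# Hypothetical quantities, chosen only to make the arithmetic easy to follow.
example_quantities = {
    "hadoop_support_per_node": 10,   # nodes
    "cloud_storage_per_tb": 100,     # TB
    "teradata_per_tb": 50,           # TB
    "opentext_per_user": 20,         # users
}

example_costs = estimate_component_costs(example_quantities)
for name, amount in example_costs.items():
    print(f"{name}: ${amount:,}")
```

Swapping in real node counts and data volumes would show which unit price dominates the budget as the platform scales.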
Cost of tools
Cost of each category
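The category figures on the financials slide (slide 17) can be cross-checked against the stated project total of $11,370,000. The sketch below assumes the eight amounts map to the eight category labels in the order they appear on that slide; the extraction flattened the table, so that pairing is an assumption.

```python
# Category costs in USD from slide 17; the label-to-amount pairing is assumed
# from the order in which labels and amounts appear on the slide.
CATEGORY_COSTS_USD = {
    "Software": 1_400_000,
    "Hardware": 675_000,
    "Cloud Storage": 65_000,
    "Tools": 5_750_000,
    "End user Training": 200_000,
    "Cost of Administration": 200_000,
    "Maintenance Support": 2_680_000,
    "External Contract": 400_000,
}
STATED_TOTAL_USD = 11_370_000

computed_total = sum(CATEGORY_COSTS_USD.values())

# Share of the total taken by each category, as a fraction between 0 and 1.
category_shares = {
    name: amount / computed_total for name, amount in CATEGORY_COSTS_USD.items()
}

print(f"Computed total: ${computed_total:,}")
print(f"Matches stated total: {computed_total == STATED_TOTAL_USD}")
for name, share in sorted(category_shares.items(), key=lambda kv: -kv[1]):
    print(f"{name}: {share:.1%}")
```

The amounts sum to the stated total, and under the assumed pairing the tools category accounts for roughly half of the project cost.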
Cost Savings
These cost savings are based on how much cheaper it is to store data in the cloud
than on premises.
The operating expense savings are an estimate derived from the increased number of
projects Cummins will be able to do with proper BI tools.
People cost savings are derived from the smaller number of people who will have to
provide support.
Cost Sources
Components
http://googlecloudplatform.blogspot.com/2015/07/understanding-
https://blogs.oracle.com/datawarehousing/entry/updated_price_com
http://estore.gemini-systems.com/ibm/software-
http://sheffieldview.com/2015/03/11/how-much-does-a-teradata-data-warehouse-appliance-cost/
https://core.opentext.com/pricing.html
Tools
http://www.ciosummits.com/Online_Assets_IT_Central_Station_Business_Intelligence_Tools_Report.pdf
http://www.tableau.com/gartner-business-intelligence-costs
http://www.practicaldb.com/data-visualization-consulting/tableau-vs-spotfire/
https://www.betterbuys.com/bi/roi-business-intelligence/
Our recommended solution is better than the
previously proposed Oracle Exadata solution for the
following reasons
• Future trends like cloud, big data, consolidation across platforms and real-time
analytics are not supported by Oracle Exadata.
• High scalability
• High availability
• 90-95% resource utilization
• Data management
• Can easily respond to changing BI and analytic trends
• Cost savings: cuts maintenance and support costs, hardware costs, labor costs,
etc.
• Hadoop cloud with MapR technologies has major advantages: efficiency,
collaboration, scalability, etc.
• Moving operational data to Teradata can provide near-real-time data
warehousing, which supports intelligent business decisions
• Cummins’ end goal is a single source of truth for its data, with availability, data
quality and usability, which is met by the Cisco Composite data virtualization platform.

Crimson 3 - Final case presentation

  • 1. The Future of Cummins Data Warehousing Architecture and Strategy Pragnya Balamurukesan Graham Cenko Michael Khamis Pavithra Thevasenapathy 1 Crimson 3
  • 2. Agenda Crimson 3 2 Our Understanding Data Warehousing Trends Recommendations Risks and Mitigations Financials Implementation Timeline Conclusion
  • 3. Our Understanding Crimson 3 3 Cummins has six Data Warehouses on the Oracle Exadata platform, a Data Lake environment in Hadoop and a Teradata warehousing appliance, which are not integrated The current Data Warehouse architecture and strategy does not meet the business intelligence or future needs of the company What Data Warehouse architecture and strategy would meet Cummins’ needs and support future growth initiatives? Our Understanding DW trends Recommendations Timeline Financials Risks and Mitigations Conclusion
  • 4. Future trends that should be incorporated into Cummins’ Data Warehousing strategy Crimson 3 4 Cloud Data Warehouse Business Intelligence Tools Big Data Big Data Analytics Hadoop Platform Real-Time Data Streaming Analytics & Reporting Consolidation Physical Logical Foley, John. “The Top 10 Trends in Data Warehousing.” Forbes. 10 March 2014 Satell, Matt. “The Future of Data Warehousing: 7 Industry Experts Share Their Predictions. BetterBuys. 5 November 2014 Our Understanding DW trends Recommendations Timeline Financials Risks and Mitigations Conclusion
  • 5. Cummins should adopt this Data Warehouse architecture to satisfy future trends and growth initiatives Crimson 3 5 Cloud Files Office files Web services Social Feeds Sensor Web logs Data Sources Enterprise Information Management BPM ECM CEM Discovery Info exchange Data Warehouse Hadoop Stream Computing Master Data Management Data Virtualization Reporting Statistical analysis Visualization Business Intelligence Tools Our Understanding DW trends Recommendations Timeline Financials Risks and Mitigations Conclusion
  • 6. Cummins should take these five actions to achieve the recommended Data Warehouse architecture Crimson 3 6 Governance Our Understanding DW trends Recommendations Timeline Financials Risks and Mitigations Conclusion Move certain databases from Oracle Data Warehouse to Teradata Active Data Warehouse Private Cloud Implement Hadoop-as-a-Service using Google Compute Engine and MapR Adopt Cisco Composite Data Virtualization Platform Add IBM InfoSphere Stream, Tableau and Spotfire to the Business Intelligence & Analytics tools
  • 7. Crimson 3 7 TERADATA ADW PRIVATE CLOUD EDW Components Power Gen Engine Distribution Active events Customer-sales representative interaction, worker in shipping & receiving Active load Arrival of damaged critical supplies Active enterprise integration Fitting into existing portals, Web services, SOA components Active workload management Controlling mixed workloads Active availability Increasing the DW availability from business critical to mission critical Active access Out-of-stock situation, inventory manager makes decisions ORACLE CorporateComponents Engine Power Gen Distribution Supply chain, Logistics, Sales, Marketing, Inventory & Operational data Cummins should move certain Databases from Oracle Exadata to Teradata Active Data Warehouse Private Cloud Our Understanding DW trends Recommendations Timeline Financials Risks and Mitigations Conclusion Rec 1 Rec 2 Rec 3 Rec 4 Rec 5 BENEFITS Teradata(2015) “Enabling the Agile Enterprise with Active Data Warehousing”
  • 8. Cummins should adopt Teradata private cloud for the following reasons Crimson 3 8 Challenges in Public Cloud Worldwide private cloud adoption- Forbes Consolidate to Teradata private ADW Reduced costs through server utilization Pay what you use ,when you need Faster less than five minutes Elastic performance Quick decision making Leading Healthcare company saves 4.3 billion, delivering 250,000 self service reports, improving performance by 10x Government agency which took 20 hours for running queries can run in 15 minutes Why private cloud model ? • High Active Performance • Effortless Scalability • Operational Availability • Enterprise Concurrency • Investment Protection Success stories Characteristics of Teradata ADW private cloud Benefits of Teradata ADW private cloud Teradata News Release (2012) Teradata Active Data Warehouses Provide Private Cloud Benefits Our Understanding DW trends Recommendations Timeline Financials Risks and Mitigations Conclusion Rec 1 Rec 2 Rec 3 Rec 4 Rec 5
  • 9. Cummins should implement Hadoop-as-a-Service using Google Compute Engine and MapR Crimson 3 9 Google Cloud Storage MapR MapR CLDB (Container Location Database) <cluster> [Master] MapR MapR FileServer <cluster> 000 [Worker] <cluster> 001 [Worker] <cluster> nnn [Worker] MapR MapR FileServer MapR MapR FileServer 1 1 An application downloads data file from Google Cloud Storage and pushes it MapR-FS2 2 The CLDB distributes the file to MapR-FS based on the query 3 3 The result of the query is written to the file on Google Cloud Storage DATA FLOW FEATURES 1 2 3 4 5 Operational Intelligence Enterprise Data Hub Internet of Things Security and Risk Management Marketing Optimization MapR (2014) “MapR, Hive, and Pig on Google Compute Engine” Our Understanding DW trends Recommendations Timeline Financials Risks and Mitigations Conclusion Rec 1 Rec 2 Rec 3 Rec 4 Rec 5
  • 10. Cummins should implement Hadoop-as-a-Service using MapR for the following reasons Crimson 3 10 Cost Scalability Enhanced productivity Collaboration Elasticity Efficiency MapR Cloudera Hortonworks Data Ingest Batch and streaming writes Batch Batch HBase Performance Consistent low latency Latency spikes Latency spikes High Availability Self healing across multiple failures Single failure recovery Single failure recovery Replication Data + metadata Data Data File IO Read/write Append only Append only Write level authentication Kerberos, Native Kerberos Kerberos Vendor Criteria Robert D. Schneider (2014) “Hadoop Buyer’s Guide, Ubuntu” Why we chose cloud deployment? Why we chose MapR? Our Understanding DW trends Recommendations Timeline Financials Risks and Mitigations Conclusion Rec 1 Rec 2 Rec 3 Rec 4 Rec 5
  • 11. Cummins should implement Composite Data Virtualization Platform to provide a unified logical view of all the data Crimson 3 Operational Stores SaaS Applications Data Warehouses and Marts Data Virtualization Platform Abstract Federate Cache Cache Optimizer Discovery Traditional, Big data & cloud sources Cisco Information Server Instant Access to all data End-to-end data management Faster response to BI & Analytics Features BI & Analytic tools Logical view of Cisco Composite Unified logical enterprise view of all the data Our Understanding DW trends Recommendations Timeline Financials Risks and Mitigations Conclusion David Bescmer. Jan 2014. Cisco Data Virtualization Rec 1 Rec 2 Rec 3 Rec 4 Rec 5
  • 12. Cummins should install Composite Data Virtualization Platform for the following reasons Crimson 3 12 Composite Informatica IBM Federated Query language 3 2 2 Caching 3 2 2 Profiling 3 1 2 Metadata support 3 1 1 Customer base 3 2 2 Compatibility with existing technologies 3 2 2 Total 18/18 10/18 11/18 Vendor Criteria Profit Growth Risk Reduction Technology Optimization Staff Productivity Time-to-Solution Acceleration Benefits of Virtualization Our Understanding DW trends Recommendations Timeline Financials Risks and Mitigations Conclusion Rec 1 Rec 2 Rec 3 Rec 4 Rec 5 Cisco “Data Virtualization”
  • 13. Cummins should reevaluate their existing BI Toolset and purchase Tableau and Spotfire for visualization and analytics Crimson 3 13 Existing - Reporting •Action: Continue using OBIEE and MSBI for reporting. Phase out the other four traditional platforms •Benefit: Reduced licensing and training costs, standardized reports and less complexity Tableau - Visualization •Action: Purchase Tableau Online for an easy-to-use data visualization platform that is designed for end business users •Benefit: Enables self-service BI for the entire organization, no support from IT needed Tibco Spotfire – Statistical Analysis •Action: Purchase Tibco Spotfire Platform for advanced analytical capabilities to be used by business analysts •Benefit: Predictive and prescriptive analytical capabilities and ability to consume structured and unstructured data Tibco Software Company. “Tibco Spotfire Platform.” 15 December 2015 Tableau. “Tableau Online.” 15 December 2015 Our Understanding DW trends Recommendations Timeline Financials Risks and Mitigations Conclusion Rec 1 Rec 2 Rec 3 Rec 4 Rec 5
  • 14. Cummins should adopt IBM InfoSphere Streams to enable real time business intelligence Crimson 3 14 Avadhoot Patwardhan (2015) “Introduction: Real-Time Analytics on Data in Motion” Aladdabigdata (2015)Real-time Analytics using IBM InfoSphere Streams ACQUIRE Real time data from several different streams having different formats ANALYZE The data in real time using applications developed by either Cummins or IBM ACT On the Business Intelligence delivered in real time Integrated Development Environment Scale – Out Runtime Analytic Toolkits Benefits of Stream Computing Our Understanding DW trends Recommendations Timeline Financials Risks and Mitigations Conclusion Rec 1 Rec 2 Rec 3 Rec 4 Rec 5
  • 15. Cummins should establish the following teams for effective governance over the Data Warehouse initiative Crimson 3 15 Change Management • Comprised of senior managers and supervisors of each business unit • Communicate change to the company and each business unit • Manage training of employees Vendor Management • Comprised of Cummins IT professionals • Assigns tasks to vendors while monitoring the performance of each vendor • Re-negotiating contracts Support Team • Comprised of Cummins IT technicians for each business unit • Groups will be assigned to each layer of the architecture BICC Team • Comprised of business managers from each business unit • Champion BI technologies defining standards, business alignment, project prioritization and management Information Governance • Comprised of C-suite member, IT professionals, business managers, paralegal, and members from each business unit • Manage information throughout its lifecycle IT Steering Committee Business & IT Leaders Our Understanding DW trends Recommendations Timeline Financials Risks and Mitigations Conclusion Rec 1 Rec 2 Rec 3 Rec 4 Rec 5
  • 16. It will take 3 years for Cummins to implement the recommended Data Warehouse strategy Crimson 3 16 Year 2Year 1 Year 3 Our Understanding DW trends Recommendations Timeline Financials Risks and Mitigations Conclusion
  • 17. The project will cost Cummins $11,370,000 and result in the following benefits Crimson 3 17 Emission control Using real time data to track emission of engines, Increasing the quality of Cummins engines Investment in the right technologies Using BI tools to predict where market trends in engine technology are headed Leading projects in major markets Using BI tools to improve alignment with organization strategy Benefits Business Value is derived from the actions taken as a result of the analysis enabled by the BI tools Cost Savings: ~$2 million Cloud storage, Operating Expense, and People Software Hardware Cloud Storage Tools End user Training Cost of Administration Maintenance Support External Contract Total Costs $ 1,400,000 $ 675,000 $ 65,000 $ 5,750,000 $ 200,000 $ 200,000 $ 2,680,000 $ 400,000 $ 11,370,000 *See appendix for detailed cost description and more sources Cost Global expansion Using BI tools to find existing and potentially new areas with demand that is not being exploited Potential Business Value Benefits Sallem, Rita. Sept. 2012, “Customer rate their BI /vendors on Costs.” Sheffield, Glen. March 2015, “How much does Teradata warehouse Cost.” Our Understanding DW trends Recommendations Timeline Financials Risks and Mitigations Conclusion
  • 18. Risks and Mitigations Crimson 3 18 Risk Mitigation Data may be breached when we store it in the Teradata cloud Teradata is partnered with Protegrity and utilizes Tokenization technology which is applied to data before entering into the warehouse Data virtualization Cisco platform can bring up data security concerns because all the business data is used by this platform 1. The manager that resides in the Cisco Information Server takes care of security, metadata, source code and more. 2. The IT security team of Cummins will be given training on the new security policies and data governance, data standards. 3. Change management team will make sure that there is effective communication between the vendor management, in-house IT teams and C-suite level about security measures The data stored in Google Compute Engine or being used by MapR’s services may be breached MapR is equipped with authentication mechanisms (Kerberos, Native), authorization mechanisms (Access Control Expressions, Unix File Permissions, Access Control Lists), encryption mechanisms (Over-the-Wire Encryption, Encryption at Rest, Field-Level Encryption, Format-preserving Encryption and Masking) and governance guidelines Employees responsible for reporting, visualization or analytics may become dissatisfied while learning new tools Reporting tools will remain the same and it will be the Change Management Team’s responsibility to set the tone from the top Inconsistent data from legacy systems will remain in the new Data Warehousing Architecture Information Governance Team and MDM tool will ensure consistent and reliable data across platforms and databases Our Understanding DW trends Recommendations Timeline Financials Risks and Mitigations Conclusion Teradata. “Our Partners.” 2015 MapR (2014) “MapR, Hive, and Pig on Google Compute Engine”
  • 19. Following these recommendations will lead to a successful data warehouse architecture that has the capabilities to allow users to make intelligent business decisions Crimson 3 19 Our Understanding DW trends Recommendations Timeline Financials Risks and Mitigations Conclusion Data Warehouse architecture and strategy that meets business needs and future trends Move certain Databases from Oracle to Teradata Active Data Warehouse Private Cloud Re-evaluate existing BI Toolset and purchase Tableau and Spotfire for visualization and analytics Establish robust governance for effective use of the Data Warehouse initiative Implement Cisco Composite Data Virtualization Platform to provide unified logical view of all the data Implement Hadoop-as-a- Service using Google Compute Engine and MapR
  • 20. Appendix Crimson 3 20 Hadoop Why MapR? Why Hadoop-as-a-Service? Security MapR Architecture Enterprise Information Management Capabilities Architecture Why OpenText? Master Data Management Business Intelligence Tools Vendor Matrix Analytical maturity model IBM InfoSphere Streams Why InfoSphere? Security CISCO Composite Virtualization layer Functionalities Why virtualization? Why Composite? CISCO Architectures Success stories Teradata Characteristics Why Private Cloud? Operational Intelligence Security Information Governance team Costs Components Tools Category Savings Why not the Oracle Exadata proposal
  • 21. Comparative study of MapR, Cloudera, Hortonworks and Forrester’s ranking Crimson 3 21 Robert D. Schneider (2014) “Hadoop Buyer’s Guide, Ubuntu” Experfy.com
  • 22. Benefits of moving Hadoop to the cloud Crimson 3 22 1. Cost : The on-premise model for deploying Hadoop would require a large number of servers, electricity as well as a housing facility. Whereas the cloud deployment would be more cost effective since it offers better scalability and pay only for what you use. 2. Scalability : The on-premise model would require time consuming addition of physical servers. The cloud offers massively scalable services extremely quickly 3. Enhanced productivity : Using a cloud based Hadoop platform would enable data access anytime from anywhere, therefore providing greater and faster access to data 4. Collaboration : A cloud based Hadoop platform would enable seamless collaboration across the business units. Since syncing and sharing of files would be simultaneous, the collaboration would be real time 5. Elasticity : Hadoop clusters cannot be added or removed quickly, whereas Hadoop-as-a- service has the ability to increase or decrease number of clusters (instances) as per demand 6. Handling Batch jobs : The on-premise Hadoop model has scheduled jobs that process the incoming data on a fixed, temporal basis. The Hadoop-as-a-Service can be optimized by having the appropriate sized clusters available for the jobs to run 7. Simplifying Hadoop operations : In the on-premise model, as clusters are consolidated there is no resource isolation for different users. Hadoop-as-a-Service allows provisioning of clusters with different configurations and characteristics. Therefore management of a multi-tenant environment is simplified
  • 23. Hadoop Security Crimson 3 23 MapR offers several capabilities to help Cummins secure their data. At the product level MapR prevents unauthorized access to secure the Hadoop and NoSQL data. At the solution level MapR offers deployment of a large-scale anomaly detection solution that alerts you to network intrusion, phishing, and other cyberattacks. Authentication is performed through 1. Kerberos Integration 2. Native authentication Authorization is the configuration of permissions for users. The authorization mechanisms offered by MapR are 1. Access Control Expressions 2. Unix File Permissions 3. Access Control Lists
  • 24. Hadoop Security Crimson 3 24 MapR also accounts for regulatory compliance and therefore provides four types of auditing which are 1. maprcli commands that are related to cluster management 2. Authentications to the MapR Control System (MCS) 3. Operations on directories and files and Operations on MapR-DB tables. As an additional means of preventing unauthorized access of sensitive data, MapR supports encryption. The encryption mechanisms available are 1. Over-the-Wire Encryption 2. Encryption at Rest 3. Field-Level Encryption 4. Format-preserving Encryption and Masking MapR also supports features that facilitate effective data governance. Among these are 1. Data Integration 2. Security 3. Data Lineage 4. Information Lifecycle Management 5. Auditing.
  • 25. Security in MapR: Kerberos Authentication, Native Authentication. MapR (2014) “MapR, Hive, and Pig on Google Compute Engine”
  • 26. Security in MapR: Authorization, Auditing, Encryption. MapR (2014) “MapR, Hive, and Pig on Google Compute Engine”
  • 27. Security in MapR. MapR (2014) “MapR, Hive, and Pig on Google Compute Engine”
  • 28. Detailed MapR architecture. MapR (2014) “MapR, Hive, and Pig on Google Compute Engine”
  • 29. Detailed MapR architecture. MapR (2014) “MapR, Hive, and Pig on Google Compute Engine”
  • 30. Detailed MapR architecture. MapR (2014) “MapR, Hive, and Pig on Google Compute Engine”
  • 31. Capabilities of the Enterprise Information Management suite Enterprise Content Management: Information management for all types and sources of data throughout its life cycle. Business Process Management: Rapid modeling and automation of process applications, and the ability to improve them continuously. Customer Experience Management: Using information to build rich customer experiences that support collaboration, build relationships, and provide support on any channel, such as web or mobile. Information Exchange: Exchanging information with any party or system securely and verifiably. Discovery: The ability to find and learn about the right information at the right time and place, independent of its location. OpenText (2015) “OpenText Process Suite Platform Architecture”
  • 32. Four layers of the EIM solution
  • 33. Gartner declares OpenText to be a leader in Enterprise Content Management https://en.wikipedia.org/wiki/Enterprise_information_management
  • 34. Master Data Management 5 steps to implementing MDM 1. Document: Identify sources while defining master data 2. Analyze: Evaluate how the data flows and define transformation rules 3. Construct: Build the actual MDM warehouse according to the architecture and rules created 4. Implement: Populate the data warehouse 5. Sustain: Make sure policies and compliance are upheld through the Cummins governance structure. Reasons for having Master Data Management • Standardization of data • Source identification • Data classification • Employee information management • Product information management • Elimination of duplicated data. MDM adds business value because it organizes master data, which makes effective BI tools possible. Used properly, those tools can then deliver information for business decisions. https://www.quora.com/What-is-the-best-master-data-management-software
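One of the reasons listed for MDM, eliminating duplicated data, can be sketched in a few lines. The example below is only an illustration under stated assumptions: the record layout (`name`, `source`) and the matching rule (case- and whitespace-insensitive name match, first record wins) are hypothetical, and a real MDM product would apply far richer matching and survivorship rules.

```python
def normalize(name):
    """Build a match key: lower-case and collapse runs of whitespace."""
    return " ".join(name.lower().split())


def deduplicate(records):
    """Keep the first record seen for each normalized name.

    `records` is an iterable of dicts with a "name" field
    (a hypothetical layout chosen for this sketch).
    """
    masters = {}
    for record in records:
        masters.setdefault(normalize(record["name"]), record)
    return list(masters.values())


if __name__ == "__main__":
    suppliers = [
        {"name": "Acme  Corp", "source": "ERP"},
        {"name": "acme corp", "source": "CRM"},
        {"name": "Beta Ltd", "source": "CRM"},
    ]
    for master in deduplicate(suppliers):
        print(master)
```

Keeping the first record is the simplest possible survivorship rule; the "Analyze" step on the slide is where an actual rule would be agreed.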
  • 35. Buyer’s Matrix for BI Tools Solutions Review. “2016 Solutions Review Matrix Report.” 2015
  • 36. Analytical Maturity Model “As an analytics platform, Spotfire offers you a variety of add-on capabilities as the sophistication of your environment grows, or as you climb up the analytics maturity curve, so to speak.” - Rishi Bhatnagar from Syntelli Solutions. Figure: Analytics Maturity Curve from Tom Davenport. Bhatnagar, Rishi. “How Much Does Spotfire Cost?” Syntelli Solutions. 25 July 2015
  • 37. IBM InfoSphere Streams example Figures: Example of streaming data sources associated with smart meters; typical Streams runtime deployment of a streaming application. IBM Analytics (2015) “Top industry use cases for stream computing”; IBM Analytics (2015) “IBM Streams”
  • 38. Forrester gives IBM high scores Forrester Wave: Big Data Streaming Analytics Platforms, Q3 ‘14. Mike G., Rowan C. (2014) “The Forrester Wave™: Big Data Streaming Analytics Platforms, Q3 2014”
  • 39. InfoSphere Security Security is provided in InfoSphere Streams through user authorization and authentication. User authorization is managed through Access Control Lists, which contain the roles and their access rights. User authentication is done using either an LDAP server or a PAM authentication service. Authentication keys, session timeouts and client authentication for web management services are some of the mechanisms adopted.
  • 40. CEP vs IBM InfoSphere
  • 41. Discovery, optimization and caching for Composite Discovery: 1. Introspect available data 2. Discover hidden relationships 3. Model individual view/service 4. Validate view/service 5. Modify as required. Benefits: • Automates difficult work • Improves time to solution • Increases object reuse. Optimization: 1. Application invokes request 2. Optimized query (single statement) executes 3. Deliver data in proper form. Benefits: • Up-to-the-minute data • Optimized performance • Less replication required. Caching: 1. Cache essential data 2. Application invokes request 3. Optimized query (leveraging cached data) executes 4. Deliver data in proper form. http://www.compositesw.com/products-services/data-discovery/
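The caching steps on this slide (cache essential data, serve the application's request from the cache when possible, fall back to the source otherwise) can be sketched as a small time-limited cache in front of a query function. This is an illustration only, not Composite's caching engine: the class name `CachedView`, the `fetch` callable and the `ttl_seconds` parameter are assumptions introduced for the example.

```python
import time


class CachedView:
    """Serve repeated requests from a cache, refreshing entries after a TTL.

    `fetch` stands in for the query against the underlying source;
    `clock` is injectable so the expiry logic can be exercised without waiting.
    """

    def __init__(self, fetch, ttl_seconds=60.0, clock=time.monotonic):
        self._fetch = fetch
        self._ttl = ttl_seconds
        self._clock = clock
        self._cache = {}  # key -> (expires_at, value)

    def get(self, key):
        now = self._clock()
        entry = self._cache.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]           # cache hit: source is not queried
        value = self._fetch(key)      # cache miss or expired: query the source
        self._cache[key] = (now + self._ttl, value)
        return value


if __name__ == "__main__":
    def slow_source(region):
        print("querying source for", region)
        return {"region": region, "open_orders": 42}  # made-up sample data

    view = CachedView(slow_source, ttl_seconds=30)
    view.get("EMEA")   # queries the source
    view.get("EMEA")   # served from the cache
```

The trade-off is the usual one: a longer TTL reduces load on source systems, while a shorter one keeps data closer to the "up-to-the-minute" behaviour the optimization path provides.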
  • 42. Business case for virtualization • Profit Growth – Data virtualization delivers the information your organization requires to increase revenue and reduce costs. • Risk Reduction – Data virtualization’s up-to-the-minute business insights help you manage business risk and reduce compliance penalties. Plus data virtualization’s rapid development and quick iterations lower your IT project risk. • Technology Optimization – Data virtualization improves utilization of existing server and storage investments. And with less storage required, hardware and governance savings are substantial. • Staff Productivity – Data virtualization’s easy-to-use, high-productivity design and development environments improve your staff effectiveness and efficiency. • Time-to-Solution Acceleration – Your data virtualization projects are completed faster so business benefits are derived sooner. Lower project costs are an additional agility benefit. http://www.compositesw.com/data-virtualization/
  • 43. Virtualization versus Cloud • Security – With data integration in the cloud, putting all of the business’s data in the cloud is a huge risk. • Capacity management – Peak times, holiday sales • Redundancy of data without complete utilization of hardware resources • In-house capabilities to handle http://www.businessnewsdaily.com/5791-virtualization-vs-cloud-computing.html
  • 44. Key benefits of Composite PROVIDES INSTANT ACCESS TO ALL DATA: • Complete information – Business needs the complete picture. Cisco’s data federation technology virtually integrates data from multiple sources, without the cost and overhead of physical data consolidation. • Up-to-the-minute information – Cisco’s query optimization algorithms and techniques are the fastest in the industry, delivering the timely information business requires without impacting source system performance. • Fit-for-purpose information – Cisco’s powerful data abstraction functions simplify complex data, transforming it from native structures and syntax into easy-to-understand business views and data services. RESPOND FASTER TO ANALYTIC AND BI TRENDS: • Streamlined process – Building business views and data services in Cisco is far faster, with far fewer moving parts, than building physical data stores and filling them using ETL. • Rapid IT response – Cisco’s reusable views and services, flexible data virtualization architecture, and automated impact analysis provide the IT agility required to keep pace with business change. • Quick iterations – Prototyping new solutions is far faster with Cisco DV. Cisco’s rapid development tools surface live data in just minutes, enabling extraordinary business and IT collaboration. END TO END DATA MANAGEMENT: • Data Discovery – Cisco’s introspection and unique-in-the-industry data discovery uncover existing information assets, unlocking them for valuable new uses. • Standards-based – Cisco’s numerous standards-based access and delivery options support all the information types business users require. • Data Governance – Information is a critical asset. To maximize control, Cisco’s data governance centralizes metadata management, ensures data security, improves data quality and provides full auditability and lineage. http://www.compositesw.com/products-services/data-virtualization-platform/
  • 45. Vendor evaluation matrix for Composite
      Criteria                                         Composite  Informatica  IBM  Denodo
      Federated query technology                           5           4        3      1
      Scalability                                          5           4        5      4
      Data quality                                         4           5        5      4
      Maintenance and support                              4           5        4      4
      Caching                                              5           4        4      2
      Profiling                                            5           4        3      2
      Costs                                                3           1        1      4
      Version upgrades                                     4           3        2      3
      Complexity of integrated portfolio management        4           3        2      3
      Metadata support                                     5           4        4      2
      Area of skills and best practice documentation       4           3        3      2
      Customer base                                        5           4        4      3
      Agility                                              5           4        4      3
      Time to value                                        5           4        4      3
      Compatibility with existing technologies             5           4        4      4
      Forrester ranking                                    5           4        4      3
      Master data management                               4           5        5      4
      Total                                               72          65       61     55
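As a cross-check of the evaluation matrix, the per-criterion scores can be summed per vendor. The sketch below simply transcribes the 17 rows of the slide (the fourth vendor is written as Denodo) and adds up each column; the names `SCORES`, `VENDORS` and `column_totals` are introduced for the example. Summing the scores as transcribed gives 77, 65, 61 and 51, which agrees with the stated totals for Informatica and IBM but not with the 72 and 55 stated for Composite and the fourth vendor. The sketch only reports the computed sums and does not try to determine which figure on the slide is off; the ranking of the four vendors is the same either way.

```python
VENDORS = ("Composite", "Informatica", "IBM", "Denodo")

# Scores transcribed from the slide, one tuple per criterion,
# in the same vendor order as VENDORS.
SCORES = {
    "Federated query technology": (5, 4, 3, 1),
    "Scalability": (5, 4, 5, 4),
    "Data quality": (4, 5, 5, 4),
    "Maintenance and support": (4, 5, 4, 4),
    "Caching": (5, 4, 4, 2),
    "Profiling": (5, 4, 3, 2),
    "Costs": (3, 1, 1, 4),
    "Version upgrades": (4, 3, 2, 3),
    "Complexity of integrated portfolio management": (4, 3, 2, 3),
    "Metadata support": (5, 4, 4, 2),
    "Area of skills and best practice documentation": (4, 3, 3, 2),
    "Customer base": (5, 4, 4, 3),
    "Agility": (5, 4, 4, 3),
    "Time to value": (5, 4, 4, 3),
    "Compatibility with existing technologies": (5, 4, 4, 4),
    "Forrester ranking": (5, 4, 4, 3),
    "Master data management": (4, 5, 5, 4),
}


def column_totals(scores):
    """Sum each vendor's column across all criteria."""
    totals = [0] * len(VENDORS)
    for row in scores.values():
        for i, value in enumerate(row):
            totals[i] += value
    return dict(zip(VENDORS, totals))


if __name__ == "__main__":
    for vendor, total in column_totals(SCORES).items():
        print(f"{vendor}: {total}")
```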
  • 46. Cisco’s Data Virtualization Platform Development Environment Cisco Information Server Runtime Server Environment Management Environment XML Packaged Apps RDBMS Excel Files Data Warehouse OLAP Cubes Hadoop / “Big Data” XML Docs Flat Files Web Services Data Warehouse Extend / Offload Governance, Risk & Compliance Business Intelligence Customer Experience Management Mergers & Acquisitions Single View of Enterprise Data Supply Chain Management Analytics Discovery Studio Adapters Manager Monitor Active Cluster http://www.compositesw.com/products-services/data-virtualization-platform/
  • 47. Cisco’s Data Virtualization Platform http://www.compositesw.com/products-services/data-virtualization-platform/
  • 48. Composite creates virtual marts, views and services http://www.compositesw.com/data-virtualization/virtual-data-marts/ http://www.compositesw.com/data-virtualization/operational-data-stores/
  • 49. Success stories of Composite
      Company         Before                                     After
      Qualcomm        BI projects took 3-4 months                Days/weeks
      Pfizer          Management requests for data took weeks    Hours/days
      Northern Trust  100% data replication                      20% replication
      http://www.slideshare.net/CiscoPublicSector/composite-data-virtualization
  • 50. Characteristics of Teradata ADW Private cloud The main characteristics of the Teradata ADW private cloud include: • Virtualized resources – Teradata virtualizes all processing and storage so users do not have to be concerned about the location or availability of system resources – only that they are getting timely answers to all their business questions automatically without performance penalty. • Business analytics – a Teradata Data Lab makes it easier for business users to explore unique data sets or prototype new analytic ideas. • Consistent performance – enables IT to meet business user service level agreements and to ensure user satisfaction by leveraging Teradata’s industry-leading workload management as well as key technologies such as hybrid storage and columnar. • Elasticity – delivers the analytic resources dynamically and in real time as business user demand increases and decreases. • Scalability – enables the environment to scale seamlessly across multiple dimensions including number of users, number of queries, and data volumes with support for data scalability up to 92 petabytes. http://www.teradata.com/News-Releases/2012/Teradata-Active-Data-Warehouses-Provide-Private-Cloud-Benefits-Today/?LangType=1033&LangSelect=true
  • 51. Features of Teradata ADW private cloud • Active access – high-speed inquiries, analysis, or alerts retrieved from the ADW and delivered to operational users, devices, or systems. • Active events – operational events that need to be continuously monitored and filtered, with alerts sent based on business rules. • Active load – high-frequency data loading throughout the business day to ensure data are fresh enough to support active access and active events. • Active enterprise integration – links the ADW to existing applications, portals, Web services, service-oriented architectures, and the enterprise service bus. • Active workload management – dynamic management of operational and strategic workloads in the same database, ensuring response times and maximum throughput. • Active availability – increasing the data warehouse availability from business critical to mission critical. http://www.teradata.com/resources/white-papers/Enabling-the-Agile-Enterprise-with-Active-Data-Warehousing-eb4931/?LangType=1033&LangSelect=true
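The "active events" feature on this slide (monitor operational events, filter them, and send alerts based on business rules) follows a simple pattern that can be sketched independently of Teradata. In the example below the event fields, rule names and thresholds are all hypothetical, and the function `alerts` is introduced for illustration; it is not a Teradata API.

```python
def alerts(events, rules):
    """Yield (rule_name, event) for every event that satisfies a rule.

    `rules` maps a rule name to a predicate taking one event (a dict).
    """
    for event in events:
        for name, predicate in rules.items():
            if predicate(event):
                yield name, event


if __name__ == "__main__":
    # Hypothetical business rules and sample events.
    rules = {
        "engine_overheat": lambda e: e["sensor"] == "engine_temp" and e["value"] > 110,
        "low_stock": lambda e: e["sensor"] == "inventory" and e["value"] < 5,
    }
    events = [
        {"sensor": "engine_temp", "value": 95},
        {"sensor": "engine_temp", "value": 120},
        {"sensor": "inventory", "value": 3},
    ]
    for rule_name, event in alerts(events, rules):
        print(rule_name, event)
```

Because `alerts` is a generator, it can consume an unbounded event stream and emit alerts as they occur, which matches the continuous monitoring the slide describes.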
  • 52. Private cloud adoption http://www.datamation.com/cloud-computing/what-is-private-cloud.html
  • 53. Teradata provides operational intelligence http://www.teradata.com/resources/white-papers/Enabling-the-Agile-Enterprise-with-Active-Data-Warehousing-eb4931/?LangType=1033&LangSelect=true
  • 54. Teradata provides operational intelligence - Framework http://www.teradata.com/resources/white-papers/Enabling-the-Agile-Enterprise-with-Active-Data-Warehousing-eb4931/?LangType=1033&LangSelect=true
  • 55. Security in Teradata Teradata’s Active Data Warehouse can make data available predictably and securely by leveraging Protegrity’s Vaultless Tokenization technology. Tokenization is applied to the sensitive data before it enters the warehouse, using the enterprise’s own security policies. This provides a security layer for all information in the database wherever it flows, without affecting the business’s ability to perform rapid analysis on that data. The solution relies upon Protegrity’s patent-pending Vaultless Tokenization, which deploys a very small set of lookup tables of random values without having to store either the sensitive data or the tokens. Tokenized data can be mined and manipulated by business processes without having to return the data to its original form, improving accessibility and performance while keeping the data protected. http://www.teradata.com/partners/Protegrity-USA/?LangType=1033&LangSelect=true
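The idea of tokenizing with a small set of random lookup tables, rather than a vault that stores every value and its token, can be illustrated with a toy example. The sketch below is emphatically not Protegrity's algorithm, whose details are not given on the slide: it assumes a simple per-position digit substitution, uses hypothetical function names, and is far too weak for real use. It only shows that tokens can keep the original format and be reversed from the tables alone, with no stored mapping of values to tokens.

```python
import random

DIGITS = "0123456789"


def build_tables(num_tables=4, seed=None):
    """Create a few random digit-substitution tables (toy illustration only).

    A real system would use a cryptographically secure source of randomness
    and a much stronger construction.
    """
    rng = random.Random(seed)
    tables = []
    for _ in range(num_tables):
        shuffled = list(DIGITS)
        rng.shuffle(shuffled)
        tables.append(shuffled)
    return tables


def tokenize(value, tables):
    """Replace each digit via the table chosen by its position; keep other characters."""
    out = []
    for i, ch in enumerate(value):
        if ch in DIGITS:
            out.append(tables[i % len(tables)][int(ch)])
        else:
            out.append(ch)
    return "".join(out)


def detokenize(token, tables):
    """Invert tokenize using the same tables; nothing else needs to be stored."""
    out = []
    for i, ch in enumerate(token):
        if ch in DIGITS:
            out.append(str(tables[i % len(tables)].index(ch)))
        else:
            out.append(ch)
    return "".join(out)


if __name__ == "__main__":
    tables = build_tables(seed=2015)
    token = tokenize("4111-1111-1111-1111", tables)
    print(token)                      # same length and dash positions as the input
    print(detokenize(token, tables))  # recovers the original value
```

Because the token has the same shape as the original value, downstream joins and aggregations on the tokenized column keep working, which is the property the slide highlights.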
  • 56. Information Governance Team • Legal: Works with IT; driven by policy issues such as compliance and privacy. • Records/compliance/audit: Deals with record compliance, document workflow, and archiving strategies, and makes sure that policy is carried out enterprise-wide. • IT: Helps with the more technical issues, making sure policies are configured in the systems architecture. • Info Security: Ensures that sensitive data is held in secure repositories and does not leak into unsecured areas. • Business Unit: Helps spread policy and compliance information to the rest of the BU. Managing information through its lifecycle and supporting the organization’s strategy, operations, regulatory, legal, risk and environmental requirements. This team will manage records, business intelligence and MDM policies, rules and
  • 57. Cost of each component Hadoop: • $4,000 per node for support • Software is a one-time cost • Cloud is ~$600 per TB. MDM: • $13,000 per collaboration server user (2), assuming $500 per user and 20 users. Teradata: • $2,000 per TB • $2.5 million for in-house support. OpenText: • $2,000 per user
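The unit costs on this slide can be turned into a rough estimate once quantities are chosen. The sketch below uses only the unit prices stated on the slide for Hadoop, Teradata and OpenText; the node, terabyte and user counts are hypothetical inputs, the function name `estimate_costs` is introduced for illustration, and the MDM line is left out because its pricing basis is ambiguous in the transcript. Whether these figures are annual or one-time is not stated on the slide, so no time period is assumed.

```python
# Unit prices as stated on the slide (US dollars).
HADOOP_SUPPORT_PER_NODE = 4_000
HADOOP_CLOUD_PER_TB = 600          # the slide gives this as approximate
TERADATA_PER_TB = 2_000
TERADATA_IN_HOUSE_SUPPORT = 2_500_000
OPENTEXT_PER_USER = 2_000


def estimate_costs(hadoop_nodes, cloud_tb, teradata_tb, opentext_users):
    """Multiply the slide's unit prices by caller-supplied (hypothetical) quantities."""
    costs = {
        "hadoop_support": HADOOP_SUPPORT_PER_NODE * hadoop_nodes,
        "hadoop_cloud_storage": HADOOP_CLOUD_PER_TB * cloud_tb,
        "teradata_storage": TERADATA_PER_TB * teradata_tb,
        "teradata_support": TERADATA_IN_HOUSE_SUPPORT,
        "opentext_licenses": OPENTEXT_PER_USER * opentext_users,
    }
    costs["total"] = sum(costs.values())
    return costs


if __name__ == "__main__":
    # Quantities below are made up purely to show the arithmetic.
    for item, amount in estimate_costs(10, 100, 50, 20).items():
        print(f"{item}: ${amount:,}")
```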
  • 59. Cost of each category
  • 60. Cost Savings These cost savings are based on how much cheaper it is to store data in the cloud than outside it. Operating expenses are an estimate derived from the increased number of projects Cummins will be able to carry out with proper BI tools. People cost savings are derived from the smaller number of people needed to provide support.
  • 62. Our recommended solution is better than the previously proposed Oracle Exadata solution for the following reasons • Future trends such as cloud, big data, consolidation across platforms and real-time analytics are not supported by Oracle Exadata • High scalability • High availability • 90-95% resource utilization • Data management • Can easily respond to changing BI and analytic trends • Cost savings – reduced maintenance and support, hardware, and labor costs • Hadoop in the cloud with MapR technologies has major advantages, including efficiency, collaboration and scalability • Moving operational data to Teradata can provide near-real-time data warehousing, which supports intelligent business decisions • Cummins’ end goal is a single version of the truth for its data, with availability, data quality and usability, which is met by the Cisco Composite data virtualization platform.

Editor's notes

  1. g
  2. g
  3. g
  4. pb
  5. pb
  6. pb
  7. pt
  8. pt
  9. g
  10. pb
  11. m
  12. M http://www.gartner.com/newsroom/id/2970917 http://datadoghouse.typepad.com/data_doghouse/2015/01/bi-analytic-trends-of-2015-best-business-value-storyboarding-becomes-best-practice-for-bi-design.html
  13. g
  14. g
  15. g