8. Take a look at a data processing “pipeline”
Generation
Collection & storage
Analytics & computation
Collaboration & sharing
9. What has changed in this pipeline
Data is available everywhere, contains customer insight and costs little to generate, but...
Generation
Collection & storage
Analytics & computation
Collaboration & sharing
11. Big Gap in turning data into actionable information
12. The Explosion of Data
Existing Challenges with Analytics
The Cloud
13. Challenge 1: Capex Intensive
Provision all your infrastructure and tools before you get results
Cost of your infrastructure dictates what analytics you can perform
Source: Oracle technology global price list 11/1/2012
14. Most data never makes it to a data warehouse
The Data Analysis Gap
Enterprise Data is growing at over 50% yearly
Data Warehousing is growing at less than 10% yearly
[Chart: Enterprise Data vs. Data in Warehouse, 1990 to 2020]
Sources:
Gartner: User Survey Analysis: Key Trends Shaping the Future of Data Center Infrastructure Through 2011
IDC: Worldwide Business Analytics Software 2012–2016 Forecast and 2011 Vendor Shares
Most data is left on the floor
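To make the gap concrete, here is a minimal sketch of how the two growth rates compound. It assumes exactly 50% and 10% yearly growth (the slide says "over 50%" and "less than 10%", so the real gap would be wider) and a hypothetical starting point where all enterprise data sits in the warehouse. The function names are illustrative only.

```python
def grow(start: float, yearly_rate: float, years: int) -> float:
    """Compound `start` at `yearly_rate` (0.5 means 50%) for `years` years."""
    return start * (1.0 + yearly_rate) ** years


def warehoused_share(years: int, data_rate: float = 0.50, dw_rate: float = 0.10) -> float:
    """Share of enterprise data held in the warehouse after `years`.

    Assumption: at year 0 the warehouse holds 100% of the data.
    """
    return grow(1.0, dw_rate, years) / grow(1.0, data_rate, years)


if __name__ == "__main__":
    for years in (0, 5, 10):
        share = warehoused_share(years)
        print(f"after {years:2d} years: {share:.1%} of enterprise data is in the warehouse")
```

Under these assumptions the warehoused share falls below 5% within ten years, which is the sense in which most data is "left on the floor".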
15. Challenge 2: Hard to set up, manage and scale
Setup takes months of planning and work
Extending your data warehouse can be heavy on time and cost
Managing a data analytics platform requires expensive staff
Complex tuning and management skills required
Enterprises average between 3 and 4 DBAs per data warehouse
Gartner: Critical factors in calculating the data warehouse TCO, July 2009
16. Very hard to move up the stack
These make it extremely hard to
move up the Business Intelligence
Maturity Stack
17. The Explosion of Data
Existing Challenges with Analytics
The Cloud
24. Value proposition of the AWS cloud
No Upfront Investment: Replace capital expenditure with variable expense
Low ongoing cost: Customers leverage our economies of scale
Flexible capacity: No need to guess capacity requirements and overprovision
37 PRICE REDUCTIONS
Speed and agility: Infrastructure in minutes not weeks
Focus on business: Not undifferentiated heavy lifting
Global Reach: Go global in minutes and reach a global audience
25. Architected for Enterprise Security Requirements
“The Amazon Virtual Private Cloud
[Amazon VPC] was a unique option that
offered an additional level of security and
an ability to integrate with other aspects of
our infrastructure.”
Dr. Michael Miller, Head of HPC for R&D
26. Gartner Magic Quadrant for Cloud Infrastructure as a Service
(August 19, 2013)
Gartner “Magic Quadrant for Cloud Infrastructure as a Service,” Lydia Leong, Douglas Toombs, Bob Gill, Gregor Petri, Tiny Haynes, August 19, 2013. This Magic Quadrant graphic was published by Gartner, Inc. as part of a
larger research note and should be evaluated in the context of the entire report. The Gartner report is available upon request from Steven Armstrong (asteven@amazon.com). Gartner does not endorse any vendor, product or
service depicted in its research publications, and does not advise technology users to select only those vendors with the highest ratings. Gartner research publications consist of the opinions of Gartner's research organization
and should not be construed as statements of fact. Gartner disclaims all warranties, expressed or implied, with respect to this research, including any warranties of merchantability or fitness for a particular purpose.
27. Summarizing the problem and the opportunity
The Explosion of Data: Data is a competitive edge
Existing challenges with analytics: Hard and expensive to set up, manage and scale
The Cloud: Lowers cost and improves agility
28. The Solution
Data Analytics in the Cloud
Easy and inexpensive to get started
Easy to set up, scale and manage
Low cost to enable analytics on all your data
Open and flexible
29. Technology Process View
[Diagram: Data source 1 to Data source n and unstructured data sources -> Extract, Transform, Load and Cleanse -> Data warehouse -> Analytics]
The diagram above shows functional architecture components of any data warehousing
project.
30. Source systems
31. Data Integration
32. The Data Warehouse
33. Business Intelligence and Analytics
34. Data Analytics - Technology Stack
Amazon Redshift
Data Integration
Data Warehouse
AWS Cloud
Business Intelligence
36. Data warehousing done the AWS way
Deploy
• Easy to provision
• Pay as you go, no up front costs
• Fast, cheap, easy to use
• SQL
37. Customer quotes
“Queries that used to take hours came back in seconds. Our analysts
are orders of magnitude more productive.”
“Redshift is twenty times faster than Hive…The cost saving is even
more impressive…Our analysts like [it] so much they don’t want to go
back.”
“[Amazon Redshift] took an industry famous for its opaque pricing,
high TCO and unreliable results and completely turned it on its head.”
“Team played with Redshift today and concluded it is awesome. Unindexed complex queries returning in < 10s.”
38. Amazon Redshift lets you start small and grow big
Extra Large Node (HS1.XL): 3 spindles, 2 TB, 16 GB RAM, 2 cores
  Single Node (2 TB)
  Cluster 2-32 Nodes (4 TB – 64 TB)
Eight Extra Large Node (HS1.8XL): 24 spindles, 16 TB, 128 GB RAM, 16 cores, 10 GigE
  Cluster 2-100 Nodes (32 TB – 1.6 PB)
Note: Nodes not to scale
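As a rough sizing aid, the node limits above can be turned into a small calculation. This is only a sketch based on the raw per-node capacities on this slide; it ignores compression, free-space headroom and workload, and the names `NODE_TYPES` and `nodes_needed` are hypothetical.

```python
import math

# Figures taken from the slide: storage per node and cluster size limits.
NODE_TYPES = {
    "HS1.XL": {"tb_per_node": 2, "single_node": True, "min_cluster": 2, "max_cluster": 32},
    "HS1.8XL": {"tb_per_node": 16, "single_node": False, "min_cluster": 2, "max_cluster": 100},
}


def nodes_needed(data_tb, node_type):
    """Return the smallest node count whose raw capacity holds `data_tb`.

    Returns None when the data does not fit in the largest cluster
    of that node type.
    """
    spec = NODE_TYPES[node_type]
    nodes = max(1, math.ceil(data_tb / spec["tb_per_node"]))
    if nodes == 1 and spec["single_node"]:
        return 1  # a single XL node is a valid configuration
    nodes = max(nodes, spec["min_cluster"])
    if nodes > spec["max_cluster"]:
        return None
    return nodes


if __name__ == "__main__":
    for tb in (2, 10, 100, 1600):
        print(tb, "TB ->", {name: nodes_needed(tb, name) for name in NODE_TYPES})
```

For example, 10 TB fits in a 5-node XL cluster, while 100 TB exceeds the 64 TB XL limit and needs 8XL nodes.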
39. Amazon Redshift Pricing – Singapore & Sydney
Price per hour for XL node (US$)
On-Demand: $1.25
1 Year Reservation: $0.75
3 Year Reservation: $0.45
Simple Pricing
Number of Nodes x Cost per Hour
No charge for Leader Node
Pay as you go
40. So for example...
• 1 XL node reserved for 3 years:
= $0.45 x number of hours in a month
= about $340 per month
• 1 XL node cluster gives you:
• 2 Cores
• 16 GB RAM
• 2 TB Disk
• Plus 2 TB storage in S3 for backups & snapshots
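The "number of nodes x cost per hour" rule can be checked with a few lines of code. This sketch uses the XL prices quoted for Singapore and Sydney and assumes 730 hours in an average month (8,760 hours per year divided by 12); the names `HOURLY_PRICE_USD` and `monthly_cost` are illustrative. With these assumptions one 3-year reserved XL node comes to roughly $328.50 per month, and about $334.80 in a 31-day month, so the $340 figure is best read as a rounded estimate.

```python
# Hourly XL node prices in US$ as listed for Singapore and Sydney.
HOURLY_PRICE_USD = {
    "on_demand": 1.25,
    "reserved_1y": 0.75,
    "reserved_3y": 0.45,
}

HOURS_PER_AVERAGE_MONTH = 8760 / 12  # assumption: 730 hours


def monthly_cost(nodes, plan, hours_per_month=HOURS_PER_AVERAGE_MONTH):
    """Monthly cost = number of nodes x cost per hour x hours in the month.

    The leader node is not counted because it is not charged.
    """
    return nodes * HOURLY_PRICE_USD[plan] * hours_per_month


if __name__ == "__main__":
    for plan in HOURLY_PRICE_USD:
        print(f"1 XL node, {plan}: ${monthly_cost(1, plan):,.2f} per month")
    print(f"1 XL node, reserved_3y, 31-day month: ${monthly_cost(1, 'reserved_3y', 24 * 31):,.2f}")
```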
41. Amazon Redshift is easy to use
• Provision in minutes
• Monitor query performance
• Point and click resize
• Built in security
• Automatic backups
42. Use cases
• Reporting Data-warehouse behind an OLTP system
• Data Mart to take load off the existing data warehouse
• Log file analysis for clickstream or gaming data (e.g.
Advertising, Retail, Gaming)
• Query-able archive for data compliance (e.g. Telco - Call
detail Records)
• Machine generated sensor data analysis (e.g. Utility smart meters, Resources - equipment failure prediction)
• As a data analytics system for live data (Gaming,
Advertising)
43. Flexibility & choice are key in the Cloud
Amazon Partner Network
(Technology Partners)
Deployment & Administration
Application Services
Compute
Storage
Database
Networking
AWS Global Infrastructure
47. Informatica:
The Industry Leader in Cloud Integration
#1 by Customer Count
2000+ companies
#1 by Customers/Analysts
AppExchange
Gartner
#1 by Data Processed
+40B transactions/month
#1 by Connectivity
Informatica Cloud Marketplace
52. Cloud Integration Customer Success Stories
Data Migration
App Integration
Consolidated Smith Barney and Morgan Stanley data on Day 1 of merger
Synchronizing Salesforce CRM with Netsuite and other business apps
Managers didn’t lose momentum in ongoing recruiting efforts
1.5M rows of data synchronized daily
iPaaS *(Build)
Extend PowerCenter
Decreased operational issues from 70% to 30% of IT workload
Reduce time to build and distribute connectivity to 3rd party data sources
Enabled faster, more accurate decision-making based on timely, trusted data
Customize cloud integration templates to execute sophisticated integration workflows
Hybrid deployment gives integration flexibility and scalability to meet various use cases
Data Replication
Lowered time and resources needed for integrations by 80%
53. Informatica Cloud
The Industry’s Most Comprehensive Cloud Integration
and Data Management Solution
Cloud Process Automation
Guiding users to work efficiently with the data
Cloud Data Quality and MDM
Delivering the “Single Customer View”
Cloud Integration
Connecting your cloud apps
57. Challenges with Traditional Approaches to Cloud Integration
Mainframe-based Integration
Prism
ETI
Client/Server-based Integration
Cloud-based Integration
58. Move to the Cloud…
IT transitions from skeptic to partner to driver
Cloud First (IT Led)
Increasing IT involvement in Cloud decision making
Business-IT Collaboration
LOB Led (IT Approved)
LOB Owned (Outside of IT)
2012-2013
Pre-2010
2010-2012
2013
59. Cloud is the Reality in the Enterprise
Large, Accelerating Market
4-6x growth rate of on-premise IT
20-27% CAGR
$20-40B market
SaaS largest category
PaaS fastest growing
(Forrester)
Led by Large Enterprises
76% enterprises have a formal cloud strategy
(Forrester)
(Forrester, IDC, Gartner, 451Group)
Driven by IT
90% Cloud decisions and operations involve IT (IDC)
60% of all companies using SaaS w/in 12 months (Forrester)
84% of net new software is now SaaS (IDC)
74% using cloud will increase cloud spend > 20% (IDC)
66% SaaS POs signed by IT (IDC)
60. Informatica Cloud and Amazon Redshift:
Enabling cost-effective data warehousing
• Redshift Connector pre-release announced in February
• General availability in August 2013
InformaticaCloud.com/Amazon-Redshift
61. What did it use to take…
• Budget large capital expenditure
• Schedule a sales meeting with Oracle, IBM, Teradata, etc…
• Formal POC (Proof of Concept)
• Procure software and hardware
• Install and set up
• Start project
62. What it takes now…
• Go to the web and sign up
• Start project!
64. Informatica Cloud Amazon Redshift demonstration
[Diagram: Informatica Cloud (Metadata Mappings), Firewall and Informatica Cloud Secure Agent, annotated with steps 1 to 6]
1 Build mapping and execute job
2 Retrieve Account Data
3 Put Account Data into Flat File
4 Transfer compressed Flat File to S3
5 Initiate copy from S3
6 Load data into Amazon Redshift
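Steps 3 to 5 of the demonstration can be pictured with a small sketch. This is not the Informatica Cloud connector's code, which the slides do not show. It only illustrates writing rows to a delimited flat file, compressing it with gzip, and composing a Redshift COPY statement that points at the file's S3 location. The table name, bucket, delimiter and credentials placeholder are all assumptions, and the S3 upload and SQL execution are left out.

```python
import csv
import gzip
import io


def rows_to_gzipped_flat_file(rows, delimiter="|"):
    """Steps 3-4 (in part): write rows to a delimited flat file and gzip it."""
    text = io.StringIO()
    writer = csv.writer(text, delimiter=delimiter, lineterminator="\n")
    writer.writerows(rows)
    return gzip.compress(text.getvalue().encode("utf-8"))


def build_copy_statement(table, s3_uri, delimiter="|"):
    """Step 5: compose a COPY command that loads the gzipped file from S3.

    '<aws-credentials>' is a placeholder; supply real credentials out of band.
    """
    return (
        f"COPY {table} FROM '{s3_uri}' "
        "CREDENTIALS '<aws-credentials>' "
        f"GZIP DELIMITER '{delimiter}';"
    )


if __name__ == "__main__":
    # Hypothetical account rows standing in for the data retrieved in step 2.
    accounts = [("001", "Acme Corp"), ("002", "Globex")]
    payload = rows_to_gzipped_flat_file(accounts)
    print(f"compressed flat file: {len(payload)} bytes")
    print(build_copy_statement("account", "s3://example-bucket/account.csv.gz"))
```

Loading through S3 and COPY, rather than row-by-row inserts, lets the cluster ingest the file in parallel, which is the point of step 6.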
65. Best practices to remember…
• The Amazon S3 bucket that holds the data files must be created in the same region as your cluster
– Files are deleted from the Amazon S3 bucket when upload is complete
• Choose a batch size where the number of batches matches the number of slices in your cluster
– Each XL node has 2 slices, each 8XL node has 16
– If you have a 2-node XL cluster (4 slices) and 40,000 rows of data, choose a batch size of 10,000
– The Informatica Cloud Redshift connector can maximize Amazon’s parallel processing capabilities this way
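The batch-size rule above is simple arithmetic: count the slices in the cluster, then divide the row count by that number. A minimal sketch, using the slices-per-node figures from this slide (the function name is illustrative):

```python
import math

# Slices per node as stated on the slide.
SLICES_PER_NODE = {"XL": 2, "8XL": 16}


def batch_size(total_rows, node_type, node_count):
    """Rows per batch so that the number of batches equals the number of slices."""
    slices = SLICES_PER_NODE[node_type] * node_count
    return math.ceil(total_rows / slices)


if __name__ == "__main__":
    # The slide's example: 2 XL nodes x 2 slices = 4 slices, 40,000 rows.
    print(batch_size(40_000, "XL", 2))
```

Rounding up keeps the batch count from exceeding the slice count when the rows do not divide evenly.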
66. Next Steps
• Get started with Amazon Redshift
• Get started with Informatica Cloud
– InformaticaCloud.com
• Learn more about our Redshift Connector
– InformaticaCloud.com/Amazon-Redshift
85. Jaspersoft: The Intelligence Inside
Embeddable Architecture: Open web standard architecture makes integration with any app easy to perform
Cloud Ready: Multi-tenant architecture, 100s of SaaS customers, top selling BI solution on Amazon
Full Self-Service BI Suite: Address all user requirements with interactive reports, dashboards, analysis, and data integration
Affordable: Up to 80% less than traditional BI platforms while delivering significant power & capabilities
Proven Platform: Millions of users, 380,000 community members, deployed in 130,000+ applications
93. … with a World-Class BI Platform
Reporting, Dashboards, Visualization, OLAP Analysis
Columnar-Based In-Memory Engine
Business Metadata Layer
Data Integration
Data Virtualization
Direct
Extensive APIs: HTTP, SOAP, REST
100% Web Standards: CSS, .JS, .JSP, Java
HTML5 Browser, Native Mobile Apps
Data Connectivity to Any Data
RDS
Redshift
EMR
SaaS
On-Premises
100. Jaspersoft Pro on AWS
• Jaspersoft is the first BI service that you can buy per hour
– No user limitations, no monthly fee
– Less than $1 per hour
• First BI service to automatically connect to your AWS data
– 10 minutes from launch to visualizing your data in RDS or Redshift
– AWS Security Integration
• Released February 2013
– Over 500 customers