2. THE DISRUPTION OF AMAZON REDSHIFT
Source: Amazon Redshift
A fast and powerful, petabyte-scale data warehouse that is:
A Lot Faster
A Lot Cheaper
A Lot Simpler
Runs on SQL
3. TYPICAL REDSHIFT USE-CASE
▪ Data from production databases and
other sources are imported into Amazon
Redshift on an hourly/daily basis
▪ Data then gets aggregated, summarized,
and derived inside Amazon Redshift
▪ The transformed data is then analyzed,
explored, and presented to business users
through internal web interfaces.
[Diagram: Event-based data, third-party APIs, and production DBs → IMPORT → TRANSFORM → PRESENT → User 1, User 2, User 3]
4. CHALLENGES FOR REDSHIFT USERS
Data Import:
▪ Write scripts to load data into Redshift (split data,
upload to S3, load into tables)
▪ Manually manage and performance-tune Redshift
tables (sort keys, distribution keys, compression types)
Data Transformation:
▪ Write scripts to create aggregated/summarized
tables from existing tables
▪ Manage the scheduling of the tasks
Data Extraction:
▪ Spend time building a web front end to
present data
▪ Spend time extracting and emailing data to users
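The Data Import steps listed above (split data, upload to S3, load into tables) can be sketched roughly as follows. This is a minimal illustration, not the script any particular team uses: the `users` table, the bucket name, and the IAM role ARN are hypothetical placeholders, and the actual S3 upload is left out (it would need an S3 client library). The DDL string shows where a distribution key, sort key, and column compression would be declared.

```python
import csv
import io

def split_rows(rows, parts):
    """Distribute rows round-robin into `parts` chunks of near-equal size."""
    chunks = [[] for _ in range(parts)]
    for i, row in enumerate(rows):
        chunks[i % parts].append(row)
    return chunks

def to_csv(rows):
    """Serialize rows to CSV text, one line per row."""
    buf = io.StringIO()
    csv.writer(buf, lineterminator="\n").writerows(rows)
    return buf.getvalue()

def copy_statement(table, s3_prefix, iam_role):
    """Build a COPY command that loads every S3 object sharing the prefix."""
    return f"COPY {table} FROM '{s3_prefix}' IAM_ROLE '{iam_role}' FORMAT AS CSV;"

# Hypothetical table definition with the tuning knobs named in the slide.
USERS_DDL = """
CREATE TABLE users (
  id   INTEGER,
  name VARCHAR(64) ENCODE lzo
)
DISTKEY (id)
SORTKEY (id);
"""

# Example data: 10 rows split into 4 files, ready to be uploaded to S3.
rows = [(i, f"user{i}") for i in range(10)]
chunks = split_rows(rows, 4)
files = {f"users/part-{n:04d}.csv": to_csv(c) for n, c in enumerate(chunks)}

# Placeholder bucket and role; replace with real values before running COPY.
copy_sql = copy_statement(
    "users",
    "s3://example-bucket/users/part-",
    "arn:aws:iam::123456789012:role/ExampleRole",
)
```

Splitting the input into several files lets Redshift load them in parallel, which is why the split step comes before the upload.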
5. THE HOLISTICS APPROACH
An end-to-end UI on top of Amazon Redshift to
manage your entire data pipeline process.
▪ Import: HOLISTICS DATA INGESTION (Source Data)
  ▪ Import Production Databases
  ▪ Import Spreadsheet from Users
▪ Transform: HOLISTICS REDSHIFT IN-MEMORY DATA TRANSFORMATION
  ▪ Aggregation
  ▪ Materialization
  ▪ Summarization
▪ Present: HOLISTICS DATA PRESENTATION AND EXPLORATION
  ▪ Report Creation
  ▪ Report Filters
  ▪ Report Charts
  ▪ Report Dashboards
  ▪ Pivot Table
  ▪ Scheduled Emails
  ▪ Annotations
  ▪ Access Control
  ▪ Audit Trails
Icon made by Freepik from www.flaticon.com
6. FEATURE HIGHLIGHTS
▪ Imports:
  ▪ Loading of production databases (PostgreSQL) into Redshift
  ▪ Uploading of user-generated data (CSV, Excel) into Redshift
▪ Management & Transformations:
  ▪ Scheduled transformations within Redshift
  ▪ Manage Redshift tables with a simple UI
▪ Reporting & Presentation:
  ▪ SQL-friendly reports & dashboards, KPI reporting
  ▪ Annotations
  ▪ Self-service Pivot Table
  ▪ Automated report emails
  ▪ Powerful Access Control Management
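A scheduled transformation of the kind listed above usually amounts to rebuilding a summary table from a raw table on a timer. The sketch below shows one possible shape of such a job. It uses Python's built-in `sqlite3` as a local stand-in for Redshift so it can run anywhere, and the `events` and `daily_event_summary` tables are hypothetical. It does not show how Holistics itself implements transformations.

```python
import sqlite3

def refresh_daily_summary(conn):
    """Rebuild the summary table from the raw events table."""
    with conn:  # commit on success, roll back on error
        conn.execute("DROP TABLE IF EXISTS daily_event_summary")
        conn.execute(
            """
            CREATE TABLE daily_event_summary AS
            SELECT event_date, event_type, COUNT(*) AS event_count
            FROM events
            GROUP BY event_date, event_type
            """
        )

# Stand-in for the warehouse: an in-memory database with a few raw events.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (event_date TEXT, event_type TEXT)")
conn.executemany(
    "INSERT INTO events VALUES (?, ?)",
    [
        ("2015-01-01", "signup"),
        ("2015-01-01", "signup"),
        ("2015-01-01", "purchase"),
        ("2015-01-02", "signup"),
    ],
)

# A scheduler (for example cron) would call this hourly or daily.
refresh_daily_summary(conn)
summary = conn.execute(
    "SELECT event_date, event_type, event_count "
    "FROM daily_event_summary ORDER BY event_date, event_type"
).fetchall()
```

Because the job drops and recreates the table, running it twice gives the same result, which keeps retries after a failed run simple.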
7. FEATURES – BY ROLE
▪ Analysts & Admins:
  ▪ Flexible report creation: Create reports, charts and dashboards for business users (using SQL)
  ▪ Data transformations: Create and manage scheduled aggregations within Redshift
▪ Business Users:
  ▪ View reports: Browse and view reports
  ▪ Scheduled report emails: Receive regular report updates via email
  ▪ Custom data uploads: Get custom data (Excel, CSV) into Redshift
  ▪ Pivot Table: Explore and interact with prepared data sets
8. HIGHLIGHTS & BENEFITS
1. Focus on your core activities
2. Runs on SQL
3. Powerful Data Access Control
4. Fully Utilize the Power of Amazon Redshift
5. Cost-effective
9. FOCUS ON YOUR CORE ACTIVITIES
▪ Engineers work on core product development instead of peripheral
reporting tools.
▪ Analysts spend more time providing new insights into data, and less time on
manual repetitive information requests from users.
▪ Business users spend less time waiting for information and more time
exploring what they need.
10. RUNS ON SQL
▪ For Report Creation
  ▪ Friendly for novice SQL analysts to create simple reports
  ▪ Flexible for power SQL analysts to create complex reports
  ▪ Easy to hire and train new analysts
▪ Extended syntax to complement SQL
  ▪ Report Filters
  ▪ Advanced data access control
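To illustrate the general idea of a report filter layered on top of SQL, here is a minimal sketch in which the analyst writes the report query once and viewers supply filter values at run time. The placeholder style (`:country`, `:start_date`) is SQLite's named-parameter syntax, used here only as a stand-in. It is not the Holistics extended syntax, which the slides do not show, and the `orders` table is hypothetical.

```python
import sqlite3

# Report query written once by the analyst; filters are named placeholders.
REPORT_SQL = """
SELECT country, SUM(amount) AS revenue
FROM orders
WHERE (:country IS NULL OR country = :country)
  AND order_date >= :start_date
GROUP BY country
ORDER BY country
"""

# Default values mean "no filtering" when a viewer leaves a filter empty.
DEFAULT_FILTERS = {"country": None, "start_date": "0000-01-01"}

def run_report(conn, filters):
    """Run the report with viewer-supplied filter values bound as parameters."""
    unknown = set(filters) - set(DEFAULT_FILTERS)
    if unknown:
        raise ValueError(f"unknown filters: {sorted(unknown)}")
    params = {**DEFAULT_FILTERS, **filters}
    return conn.execute(REPORT_SQL, params).fetchall()

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (country TEXT, order_date TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [
        ("SG", "2015-01-05", 100),
        ("SG", "2015-02-01", 50),
        ("VN", "2015-01-20", 70),
        ("VN", "2014-12-31", 30),
    ],
)

all_rows = run_report(conn, {})
sg_only = run_report(conn, {"country": "SG"})
recent = run_report(conn, {"start_date": "2015-01-15"})
```

Binding filter values as parameters, rather than pasting them into the SQL text, keeps viewer input from altering the query itself.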
11. FLEXIBLE AND POWERFUL DATA ACCESS CONTROL
▪ Works well for any combination of user group permissions
▪ Assign up to 5 levels of data access control
  ▪ Data Source Level
  ▪ User Level
  ▪ User Group Level
  ▪ Column Level
  ▪ Row Level
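As a rough illustration of how column-level and row-level access control can combine for a user group, consider the sketch below. The `GroupPolicy` class, its fields, and the sample data are hypothetical and only show the concept. The slides do not describe how Holistics enforces these levels.

```python
from dataclasses import dataclass, field

@dataclass
class GroupPolicy:
    """Hypothetical per-group policy: visible columns plus required row values."""
    allowed_columns: set
    row_filter: dict = field(default_factory=dict)  # column -> required value

def apply_policy(rows, policy):
    """Keep rows matching every row filter, then keep only the allowed columns."""
    visible = []
    for row in rows:
        if all(row.get(col) == val for col, val in policy.row_filter.items()):
            visible.append(
                {k: v for k, v in row.items() if k in policy.allowed_columns}
            )
    return visible

rows = [
    {"region": "SG", "customer": "A", "revenue": 100},
    {"region": "VN", "customer": "B", "revenue": 70},
]

# A regional sales group sees only its own rows and no customer names.
sg_sales = GroupPolicy(allowed_columns={"region", "revenue"},
                       row_filter={"region": "SG"})
# An admin group sees every column of every row.
admins = GroupPolicy(allowed_columns={"region", "customer", "revenue"})

sg_view = apply_policy(rows, sg_sales)
admin_view = apply_policy(rows, admins)
```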
12. GET MORE OUT OF REDSHIFT
▪ Cost savings: Direct your investment into Redshift to get the highest
performance per dollar for your analytics infrastructure
▪ Maximize resource utilization: Fully leverage the in-memory processing
power of Redshift to run your data transformations and queries
▪ Simple UI: A user interface that simplifies management of your data pipeline
activities in Redshift
▪ No need to move your data outside of Redshift
13. COST-EFFECTIVE
Costs avoided by working directly on Redshift with SQL:
▪ Computing resource costs for your BI tool (CPU, memory)
▪ Cost of storage in your BI tool for data that is already in Redshift
▪ Cost of training and hiring for specific BI skill sets
In February 2013, Amazon launched Amazon Redshift, a fast, cheap, and fully managed data warehouse that makes it simple and cost-effective to store and analyze all your data with existing business intelligence tools.
Since then, its growth has been phenomenal, and companies have been onboarding Amazon Redshift into their data platforms.
Companies (especially technology startups) can now have a data warehouse of their own for less than US$1,000/TB/year.