July 2017 Meeting of the Denver AWS Users' Group

David McDaniel

July 2017 Meeting slides on Amazon Redshift.

AWS Users’ Group Updates
David “Mac” McDaniel
Sr. Solution & Cloud Architect - Independent Consultant
david@mobile-360.com
LinkedIn: https://www.linkedin.com/in/davidbmcdaniel
Twitter: @CloudKegGuy, @ServerlessJava
Twitter list: https://twitter.com/CloudKegGuy/lists/aws/members

Getting Connected
Slack Channel: https://DenverAWSUsersGroup.slack.com
You will need an invitation to join; please email me at david@mobile-360.com.
We are now listed on the AWS User Groups site:
https://aws.amazon.com/usergroups/americas/
We are sponsored by CloudAcademy! They have a free portal for our members at:
https://cloudacademy.com/aws-usergroup/?code=newawsugs
We are also sponsored by, and a member of, the official Global AWS Communities!
See them at https://awsug.support

What we’re going to do tonight
1. Describe Amazon Redshift
2. Talk about how it’s different from regular SQL Databases
3. Talk about storage options for Redshift
a. Standard Disk-based storage
b. Spectrum and S3 (CSV & Parquet) storage
4. Describe ways to load data
a. S3, EMR, DynamoDB or Remote Hosts
5. Compare to Athena

What is Redshift?
Amazon Redshift is a fast, fully managed data warehouse that makes it simple and cost-effective to analyze
all your data using standard SQL and your existing Business Intelligence (BI) tools. It allows you to run
complex analytic queries against petabytes of structured data, using sophisticated query optimization,
columnar storage on high-performance local disks, and massively parallel query execution. Most results come
back in seconds. With Amazon Redshift, you can start small for just $0.25 per hour with no commitments and
scale out to petabytes of data for $1,000 per terabyte per year, less than a tenth the cost of traditional
solutions.
Amazon Redshift also includes Redshift Spectrum, allowing you to directly run SQL queries against exabytes
of unstructured data in Amazon S3. No loading or transformation is required, and you can use open data
formats, including CSV, TSV, Parquet, Sequence, and RCFile. Redshift Spectrum automatically scales query
compute capacity based on the data being retrieved, so queries against Amazon S3 run fast, regardless of
dataset size.
A 4x compression improvement in Redshift was recently announced.

How Redshift is Different
Redshift is a column-oriented database, whereas regular SQL databases are row-oriented. This
means that Redshift stores the values of each column together rather than the values of each row. This can be
hugely beneficial when a query processes many rows but only a few columns, which is typical in BI and analytical
processing. Many data warehouse databases are denormalized to reduce joins, so their tables
are very wide (many columns) to provide the most value, even though individual queries use only a
small number of columns.
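
To make the row vs. column distinction concrete, here is a minimal Python sketch. It is a toy model, not how Redshift is implemented: the table, column names, and values are invented, and "values read" simply counts the fields each layout has to touch to sum one column.

```python
# Toy comparison of row-oriented and column-oriented layouts.
# The table, column names, and values are invented for illustration.

rows = [
    {"order_id": 1, "customer": "acme", "region": "west", "amount": 120.0},
    {"order_id": 2, "customer": "globex", "region": "east", "amount": 75.5},
    {"order_id": 3, "customer": "initech", "region": "west", "amount": 42.0},
]

# Column-oriented layout: one list per column, so a column's values sit together.
columns = {name: [row[name] for row in rows] for name in rows[0]}


def total_amount_row_store(table):
    """Sum one column in a row layout; every field of every row is touched."""
    values_read = 0
    total = 0.0
    for row in table:
        values_read += len(row)  # the whole row is read to reach one field
        total += row["amount"]
    return total, values_read


def total_amount_column_store(table):
    """Sum one column in a column layout; only that column is touched."""
    amounts = table["amount"]
    return sum(amounts), len(amounts)


row_total, row_reads = total_amount_row_store(rows)
col_total, col_reads = total_amount_column_store(columns)
print(f"row layout:    total={row_total}, values read={row_reads}")
print(f"column layout: total={col_total}, values read={col_reads}")
```

Both layouts produce the same total, but the row layout touches all four fields of every row while the column layout touches only the `amount` values. The gap grows with the width of the table, which is why wide, denormalized warehouse tables tend to favor columnar storage.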

Storage Options
1. Local Disk Storage
a. Traditional, SSD-based.
b. Ties compute to storage.
c. Must make FULL read-only copies to scale.
2. S3 - Used with Redshift Spectrum
a. Uses Amazon Athena metadata to understand files in S3.
b. Decouples storage from compute.
c. Still must make read-only copies, but of metadata only, so smaller and faster to scale.
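
As a rough illustration of the Spectrum option, the Python sketch below assembles the two SQL statements that typically register S3 data with Redshift: `CREATE EXTERNAL SCHEMA` and `CREATE EXTERNAL TABLE`. The schema, database, table, bucket, and IAM role names are placeholders invented for this example, and the helpers only build strings; they do no escaping and do not connect to a cluster.

```python
# Sketch: build Redshift Spectrum DDL as strings.
# All names, the S3 path, and the role ARN are hypothetical placeholders.
# No escaping is done, so do not pass untrusted input to these helpers.


def external_schema_sql(schema, catalog_db, iam_role_arn):
    """DDL that maps a Redshift schema onto an external data catalog database."""
    return (
        f"CREATE EXTERNAL SCHEMA {schema}\n"
        f"FROM DATA CATALOG DATABASE '{catalog_db}'\n"
        f"IAM_ROLE '{iam_role_arn}'\n"
        "CREATE EXTERNAL DATABASE IF NOT EXISTS;"
    )


def external_table_sql(schema, table, column_defs, s3_location):
    """DDL that describes Parquet files in S3; only metadata is created."""
    cols = ",\n    ".join(f"{name} {sql_type}" for name, sql_type in column_defs)
    return (
        f"CREATE EXTERNAL TABLE {schema}.{table} (\n    {cols}\n)\n"
        "STORED AS PARQUET\n"
        f"LOCATION '{s3_location}';"
    )


role = "arn:aws:iam::123456789012:role/ExampleSpectrumRole"  # placeholder ARN
schema_sql = external_schema_sql("spectrum", "demo_db", role)
table_sql = external_table_sql(
    "spectrum",
    "sales",
    [("sale_id", "INTEGER"), ("amount", "DECIMAL(10,2)")],
    "s3://example-bucket/sales/",
)
print(schema_sql)
print(table_sql)
```

Neither statement copies data into the cluster. The files stay in S3 and only their description is stored, which is the sense in which storage is decoupled from compute.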

How do we load data?
Multiple ways:
1. Preferred way: Use COPY command to load data from files in one of many
formats from:
a. S3
b. EMR
c. Remote EC2 Hosts
d. DynamoDB Tables
2. Use DML:
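
For the COPY route, a minimal Python sketch of what the statements can look like for two of the sources listed above (S3 and DynamoDB). The table names, bucket path, and IAM role ARN are placeholders invented for this example, and the helpers only build SQL text for a client to run; they do no escaping. The format list is deliberately limited to formats that need no extra argument.

```python
# Sketch: build Redshift COPY statements as strings.
# Table names, the S3 path, and the role ARN are hypothetical placeholders.
# No escaping is done, so do not pass untrusted input to these helpers.

SIMPLE_FORMATS = {"CSV", "PARQUET", "ORC"}  # formats that take no extra argument


def copy_from_s3(table, s3_path, iam_role_arn, file_format="CSV"):
    """COPY statement that loads files under an S3 prefix into a table."""
    fmt = file_format.upper()
    if fmt not in SIMPLE_FORMATS:
        raise ValueError(f"unsupported format in this sketch: {file_format}")
    return (
        f"COPY {table}\n"
        f"FROM '{s3_path}'\n"
        f"IAM_ROLE '{iam_role_arn}'\n"
        f"FORMAT AS {fmt};"
    )


def copy_from_dynamodb(table, dynamodb_table, iam_role_arn, read_ratio=50):
    """COPY statement that loads a DynamoDB table.

    READRATIO is the percentage of the DynamoDB table's provisioned
    throughput that the load is allowed to consume.
    """
    return (
        f"COPY {table}\n"
        f"FROM 'dynamodb://{dynamodb_table}'\n"
        f"IAM_ROLE '{iam_role_arn}'\n"
        f"READRATIO {read_ratio};"
    )


role = "arn:aws:iam::123456789012:role/ExampleRedshiftRole"  # placeholder ARN
s3_sql = copy_from_s3("sales", "s3://example-bucket/sales/", role)
ddb_sql = copy_from_dynamodb("events", "ExampleEvents", role)
print(s3_sql)
print(ddb_sql)
```

COPY is generally preferred over row-by-row DML for bulk loads because the cluster can read many files in parallel.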

How is it different from Athena?
Athena | Redshift
Storage on S3 | Storage on attached SSD disks
Automatically scales | Must add more instances/change instance size
Massive parallelism | Only as parallel as you configure
Data can be stored in multiple formats per table | Data can be loaded from files in multiple formats

Demo!
1. Create Schemas for Redshift tables
2. Load data in multiple formats from S3
3. Create Redshift Spectrum Schemas
4. Load data (really, meta-data)
5. Execute queries
6. Tableau visualization

Next Month’s Topic: ????
We need speakers!
Chipe: Cheesy, Sarcastic
