SlideShare una empresa de Scribd logo
1 de 7
Descargar para leer sin conexión
PySpark with EMR
Patrick Pierson, DevOps Engineer
Ion Channel
What is Data?
What is Data?
Big Data is when you apply large distributed
computing platforms to a massive dataset.
EMR
Amazon Elastic MapReduce (EMR)
is an Amazon Web Services (AWS)
tool for big data processing and
analysis. Amazon EMR offers the
expandable low-configuration
service as an easier alternative to
running in-house cluster computing.
Apache Spark
Apache® Spark™ is a powerful open
source processing engine built
around speed, ease of use, and
sophisticated analytics.
DEMO
https://github.com/python-frederick/pyspark

Más contenido relacionado

La actualidad más candente

AWS EMR (Elastic Map Reduce) explained
AWS EMR (Elastic Map Reduce) explainedAWS EMR (Elastic Map Reduce) explained
AWS EMR (Elastic Map Reduce) explainedHarsha KM
 
High Performance Computing on AWS
High Performance Computing on AWSHigh Performance Computing on AWS
High Performance Computing on AWSAmazon Web Services
 
High Performance Computing on AWS: Accelerating Innovation with virtually unl...
High Performance Computing on AWS: Accelerating Innovation with virtually unl...High Performance Computing on AWS: Accelerating Innovation with virtually unl...
High Performance Computing on AWS: Accelerating Innovation with virtually unl...Amazon Web Services
 
Hadoop in the cloud with AWS' EMR
Hadoop in the cloud with AWS' EMRHadoop in the cloud with AWS' EMR
Hadoop in the cloud with AWS' EMRrICh morrow
 
AWS September Webinar Series - Building Your First Big Data Application on AWS
AWS September Webinar Series - Building Your First Big Data Application on AWS AWS September Webinar Series - Building Your First Big Data Application on AWS
AWS September Webinar Series - Building Your First Big Data Application on AWS Amazon Web Services
 
AWS Summit Berlin 2013 - Optimizing your AWS applications and usage to reduce...
AWS Summit Berlin 2013 - Optimizing your AWS applications and usage to reduce...AWS Summit Berlin 2013 - Optimizing your AWS applications and usage to reduce...
AWS Summit Berlin 2013 - Optimizing your AWS applications and usage to reduce...AWS Germany
 
AWS May Webinar Series - Getting Started with Amazon EMR
AWS May Webinar Series - Getting Started with Amazon EMRAWS May Webinar Series - Getting Started with Amazon EMR
AWS May Webinar Series - Getting Started with Amazon EMRAmazon Web Services
 
AWS-Enabled Disaster Recovery and Business Continuity for SIFIs
AWS-Enabled Disaster Recovery and Business Continuity for SIFIsAWS-Enabled Disaster Recovery and Business Continuity for SIFIs
AWS-Enabled Disaster Recovery and Business Continuity for SIFIsAmazon Web Services
 
Advanced Strategies for Leveraging AWS for Disaster Recovery
Advanced Strategies for Leveraging AWS for Disaster Recovery   Advanced Strategies for Leveraging AWS for Disaster Recovery
Advanced Strategies for Leveraging AWS for Disaster Recovery Amazon Web Services
 
Amazon Elastic Map Reduce: the concepts
Amazon Elastic Map Reduce: the conceptsAmazon Elastic Map Reduce: the concepts
Amazon Elastic Map Reduce: the concepts Julien SIMON
 
AWS Summit 2018 Summary
AWS Summit 2018 SummaryAWS Summit 2018 Summary
AWS Summit 2018 SummaryAshish Mrig
 
Big data and Analytics on AWS
Big data and Analytics on AWSBig data and Analytics on AWS
Big data and Analytics on AWS2nd Watch
 
Disaster Recovery, Continuity of Operations, Backup, and Archive on AWS | AWS...
Disaster Recovery, Continuity of Operations, Backup, and Archive on AWS | AWS...Disaster Recovery, Continuity of Operations, Backup, and Archive on AWS | AWS...
Disaster Recovery, Continuity of Operations, Backup, and Archive on AWS | AWS...Amazon Web Services
 
CTX case study
CTX case studyCTX case study
CTX case studyPowerup
 
AWS Office Hours: Disaster Recovery
AWS Office Hours: Disaster RecoveryAWS Office Hours: Disaster Recovery
AWS Office Hours: Disaster RecoveryAmazon Web Services
 

La actualidad más candente (20)

Earth Observation in the Cloud
Earth Observation in the CloudEarth Observation in the Cloud
Earth Observation in the Cloud
 
AWS EMR (Elastic Map Reduce) explained
AWS EMR (Elastic Map Reduce) explainedAWS EMR (Elastic Map Reduce) explained
AWS EMR (Elastic Map Reduce) explained
 
High Performance Computing on AWS
High Performance Computing on AWSHigh Performance Computing on AWS
High Performance Computing on AWS
 
High Performance Computing on AWS: Accelerating Innovation with virtually unl...
High Performance Computing on AWS: Accelerating Innovation with virtually unl...High Performance Computing on AWS: Accelerating Innovation with virtually unl...
High Performance Computing on AWS: Accelerating Innovation with virtually unl...
 
Hadoop in the cloud with AWS' EMR
Hadoop in the cloud with AWS' EMRHadoop in the cloud with AWS' EMR
Hadoop in the cloud with AWS' EMR
 
AWS September Webinar Series - Building Your First Big Data Application on AWS
AWS September Webinar Series - Building Your First Big Data Application on AWS AWS September Webinar Series - Building Your First Big Data Application on AWS
AWS September Webinar Series - Building Your First Big Data Application on AWS
 
AWS Summit Berlin 2013 - Optimizing your AWS applications and usage to reduce...
AWS Summit Berlin 2013 - Optimizing your AWS applications and usage to reduce...AWS Summit Berlin 2013 - Optimizing your AWS applications and usage to reduce...
AWS Summit Berlin 2013 - Optimizing your AWS applications and usage to reduce...
 
AWS May Webinar Series - Getting Started with Amazon EMR
AWS May Webinar Series - Getting Started with Amazon EMRAWS May Webinar Series - Getting Started with Amazon EMR
AWS May Webinar Series - Getting Started with Amazon EMR
 
AWS-Enabled Disaster Recovery and Business Continuity for SIFIs
AWS-Enabled Disaster Recovery and Business Continuity for SIFIsAWS-Enabled Disaster Recovery and Business Continuity for SIFIs
AWS-Enabled Disaster Recovery and Business Continuity for SIFIs
 
Advanced Strategies for Leveraging AWS for Disaster Recovery
Advanced Strategies for Leveraging AWS for Disaster Recovery   Advanced Strategies for Leveraging AWS for Disaster Recovery
Advanced Strategies for Leveraging AWS for Disaster Recovery
 
Amazon Elastic Map Reduce: the concepts
Amazon Elastic Map Reduce: the conceptsAmazon Elastic Map Reduce: the concepts
Amazon Elastic Map Reduce: the concepts
 
Amazon rds
Amazon rdsAmazon rds
Amazon rds
 
AWS Summit 2018 Summary
AWS Summit 2018 SummaryAWS Summit 2018 Summary
AWS Summit 2018 Summary
 
AWS Perú Meetup - Arquitecting for HA by Raul Hugo
AWS Perú Meetup - Arquitecting for HA by Raul HugoAWS Perú Meetup - Arquitecting for HA by Raul Hugo
AWS Perú Meetup - Arquitecting for HA by Raul Hugo
 
AWS Peru Meetup - recap reinvent 2016 (by Carlos Cortez)
AWS Peru Meetup - recap reinvent 2016 (by Carlos Cortez)AWS Peru Meetup - recap reinvent 2016 (by Carlos Cortez)
AWS Peru Meetup - recap reinvent 2016 (by Carlos Cortez)
 
Big data and Analytics on AWS
Big data and Analytics on AWSBig data and Analytics on AWS
Big data and Analytics on AWS
 
Disaster Recovery, Continuity of Operations, Backup, and Archive on AWS | AWS...
Disaster Recovery, Continuity of Operations, Backup, and Archive on AWS | AWS...Disaster Recovery, Continuity of Operations, Backup, and Archive on AWS | AWS...
Disaster Recovery, Continuity of Operations, Backup, and Archive on AWS | AWS...
 
CTX case study
CTX case studyCTX case study
CTX case study
 
Eventually Everything Connects
Eventually Everything ConnectsEventually Everything Connects
Eventually Everything Connects
 
AWS Office Hours: Disaster Recovery
AWS Office Hours: Disaster RecoveryAWS Office Hours: Disaster Recovery
AWS Office Hours: Disaster Recovery
 

Más de Patrick Pierson

Más de Patrick Pierson (9)

Python + Software Defined Radios
Python + Software Defined RadiosPython + Software Defined Radios
Python + Software Defined Radios
 
Cloud comparison - AWS vs Azure vs Google
Cloud comparison - AWS vs Azure vs GoogleCloud comparison - AWS vs Azure vs Google
Cloud comparison - AWS vs Azure vs Google
 
What is IAM?
What is IAM?What is IAM?
What is IAM?
 
Kong API
Kong APIKong API
Kong API
 
Boto3
Boto3Boto3
Boto3
 
SaltStack
SaltStackSaltStack
SaltStack
 
Virtual machines and containers
Virtual machines and containersVirtual machines and containers
Virtual machines and containers
 
Ignite talks
Ignite talksIgnite talks
Ignite talks
 
Aws 101
Aws 101Aws 101
Aws 101
 

Último

Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Paola De la Torre
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsMaria Levchenko
 
Understanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitectureUnderstanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitecturePixlogix Infotech
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024Rafal Los
 
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...gurkirankumar98700
 
Maximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxMaximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxOnBoard
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking MenDelhi Call girls
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsEnterprise Knowledge
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking MenDelhi Call girls
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking MenDelhi Call girls
 
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...HostedbyConfluent
 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesSinan KOZAK
 
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...shyamraj55
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Igalia
 
Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Allon Mureinik
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonetsnaman860154
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Drew Madelung
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slidespraypatel2
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘RTylerCroy
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountPuma Security, LLC
 

Último (20)

Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed texts
 
Understanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitectureUnderstanding the Laravel MVC Architecture
Understanding the Laravel MVC Architecture
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
 
Maximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxMaximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptx
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
 
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen Frames
 
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
 
Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonets
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slides
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path Mount
 

Pyspark