ShareThis on AWS

                               Paco Nathan, Data Insights
                                     ShareThis.com




AWS Start-Up Tour 2009-06-16
What Does ShareThis Do?

• “Make it simple to share any online content”
• Social content sharing platform
• ESPN, FOX, CS Monitor, HuffPost, CBS Marketwatch,
    Wired, TechCrunch, ThinkGeek, etc.

• When a news story goes viral on a major publisher,
    our sharing services must scale-out to keep pace




Why Our Company Uses AWS

• >10^6 publishers, >10^9 users, >10^10 URLs
• Early stage start-up, < 25 people, “wearing lots of hats”,
    ultra fast-paced R&D

• Spikes in popular stories impose demands throughout
    the architecture: API services, loggers, DW, BI, etc.

• How can this level of service be built 100% in the cloud?


http://shar.es/1B7


System Architecture

• Each service designed for cost-effective, horizontal scale-out
• API served by a cluster of LAMP stacks + a cluster of Nginx servers
• AsterData: nCluster infrastructure “hub-and-spoke” pattern
• Cascading: abstraction layer for tying together components
• Batch jobs on Elastic MapReduce, AsterData SQL/MR
• SQS, EBS, SimpleDB, MTurk, plus other AWS services

Key Learnings

• Capability to scale-out horizontally without having to
    recode, rebuild, etc. — add new EC2 nodes to clusters

• Authoritative data + backups in S3, great approach for DR
• Wide range of use cases implemented: widget API,
    log clean-up, vertical search, business intelligence, etc.

• Developers launch their own sandbox instances —
    makes dev/test/debug cycles more efficient

• Staff enabled to “wear even more hats” with less risk
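
The first learning above (scale out by adding nodes, with no recoding) can be illustrated with a minimal sketch. It is not ShareThis code: threads stand in for EC2 nodes, an in-process queue stands in for a service such as SQS, and `run_workers` and `clean_log_line` are hypothetical names chosen for the example.

```python
import queue
import threading

def run_workers(jobs, handler, n_workers):
    """Drain jobs through n_workers identical workers and return the results."""
    pending = queue.Queue()
    for job in jobs:
        pending.put(job)

    results = []
    lock = threading.Lock()

    def worker():
        # Each worker pulls jobs until the shared queue is empty.
        while True:
            try:
                job = pending.get_nowait()
            except queue.Empty:
                return
            out = handler(job)
            with lock:
                results.append(out)

    threads = [threading.Thread(target=worker) for _ in range(n_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results

def clean_log_line(line):
    """Hypothetical log clean-up step: trim whitespace and lowercase."""
    return line.strip().lower()

lines = ["  GET /share?u=A ", "GET /share?u=B", " get /share?u=c"]

# Same handler, same jobs; only the worker count changes.
small = run_workers(lines, clean_log_line, n_workers=1)
large = run_workers(lines, clean_log_line, n_workers=4)
```

Because the workers share nothing except the queue, changing `n_workers` alters throughput but not the handler code or the set of results (their order may differ).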
Cascading + Elastic MapReduce





• “Syntax is for humans, APIs are for software”
• Defines apps as set operations applied to data flows
• Engineers & data scientists don’t think in terms of
    MapReduce primitives, key/value pairs, etc.

• Integrates Hadoop API + other APIs (S3, SQS, JDBC)
• Expresses end-points as Java design patterns,
    compiled code — not just a scramble of scripts
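
The idea of an app defined as operations over a data flow, rather than as hand-written map and reduce functions, can be sketched in a few lines. This is plain Python, not the Cascading API (which is Java). The record fields and the helper names `keep`, `project`, `count_by_key`, and `flow` are assumptions made for illustration.

```python
from collections import Counter
from functools import reduce

# Hypothetical share-log records; field names are illustrative only.
LOG = [
    {"url": "http://example.com/a", "channel": "email"},
    {"url": "http://example.com/b", "channel": "twitter"},
    {"url": "http://example.com/a", "channel": "twitter"},
    {"url": "", "channel": "email"},  # malformed record, dropped by the filter
]

def keep(predicate):
    """Filter step: pass through only the records matching the predicate."""
    return lambda records: (r for r in records if predicate(r))

def project(*fields):
    """Projection step: keep only the named fields, as a tuple."""
    return lambda records: (tuple(r[f] for f in fields) for r in records)

def count_by_key(records):
    """Grouping step: count occurrences of each distinct tuple."""
    return Counter(records)

def flow(*steps):
    """Compose steps left to right into a single callable data flow."""
    return lambda records: reduce(lambda data, step: step(data), steps, records)

# The app is declared as a sequence of operations on the flow.
shares_per_url = flow(
    keep(lambda r: r["url"]),
    project("url"),
    count_by_key,
)

result = shares_per_url(LOG)
```

The author of `shares_per_url` never writes a mapper, a reducer, or a key/value pair. In a framework of this kind, translating such a declaration into MapReduce jobs is left to the planner.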



Cascading + Elastic MapReduce

• Highly scalable, fault-tolerant framework for batch jobs
• Dramatically reduced need for Ops overhead
• Excellent command line tools make the dev/test/debug
    cycle very efficient with “Big Data”

• Highly expert staff, very responsive and helpful in forums
• Cascading example code in developer resources:
    “LogAnalyzer for CloudFront” and “Multitool”



Hadoop Book / Case Study

  ShareThis case study, "Cascading"
  by Chris K Wensel, in…




Contacts


                               http://sharethis.com
                                @pacoid on Twitter




