Going Big: Scalability
Who am I?
• Chris Miller

• Huffington Post - Senior Developer
• CMS platform and API

• Started in systems/network admin before code
What is Huffington Post?
• #87 most popular site in the world (Alexa)
• #3 most popular news site in the world (Alexa)
• #19 most popular US site (Alexa)

• More traffic than nytimes.com
Our Platform: Today
• Everything! No, really.

•   Perl: CMS core
•   PHP “layer” integrated on top of Perl code
•   MySQL data storage
•   MongoDB for comments storage
•   Hadoop for internal statistical analysis
•   Memcache for lightweight caching
•   Redis for more structured data types
•   Varnish for caching!
Our Platform: Tomorrow
• Re-think tools and platform from ground up
• Building new API
   – Yes, OAuth 2.0!
   – Complete REST approach
   – Will be public!
• We can’t re-write everything at once, so the API build has 4
  phases:
   –   Build “bridge” middleware to allow access to existing functionality
   –   Refactor backend edit/admin tools
   –   Refactor frontend to use API
   –   Transparently, and calmly, refactor old code while maintaining API
       interfaces
So what about CI?
• New API is built on CodeIgniter
  – Using Phil’s REST library as a starting point
     • Thanks Phil!


• Backend editorial tools are being built on CI

• We love CI
  – But it isn’t our only framework
  – Different tools work better for different teams
  – We use what works. You should too.
How we scale
• CDN: Akamai
     • 80%+ hit rate
     • Amazon S3 for origin of static files
• Basic page layout/content is generated to a flat file
     • These files still contain some dynamic content, in PHP
     • Serving the basic page from a flat file means less overhead
       per load
     • It also means that for certain changes we have to "regenerate"
       the page. Ugh.
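A minimal sketch of the flat-file idea, written in Python rather than the Perl/PHP the platform actually uses. The function name `publish_page`, the file layout, and the tiny template are all hypothetical. The point is only that the expensive render happens once at publish time, and that the new file is swapped in atomically so a reader never sees a half-written page.

```python
import html
import os
import tempfile


def publish_page(output_dir, slug, title, body):
    """Render a page once and write it to a flat file (hypothetical layout)."""
    markup = "<html><head><title>{0}</title></head><body>{1}</body></html>".format(
        html.escape(title), body
    )
    final_path = os.path.join(output_dir, slug + ".html")
    # Write to a temp file in the same directory, then rename it over the old
    # file. os.replace is atomic when both paths are on the same filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=output_dir, suffix=".tmp")
    with os.fdopen(fd, "w", encoding="utf-8") as handle:
        handle.write(markup)
    os.replace(tmp_path, final_path)
    return final_path
```

"Regenerating" a page is then just calling `publish_page` again with the updated content.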
Varnish
• HTTP caching reverse proxy (“HTTP Accelerator”)
• Caching layer in front of your web server

• Stores complete responses in memory
• If a response for the request is already cached, serves it from memory
   – Otherwise, forwards to web server, and then caches the response

• Works nicely with the Linux kernel: delegates memory
  allocation and management to the OS, where it
  belongs
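The hit/miss behaviour described above can be modelled in a few lines of Python. This is a toy model of the control flow, not how Varnish is implemented; `MiniCache`, `fetch_from_backend`, and the injectable clock are made up for the illustration.

```python
import time


class MiniCache:
    """Toy in-memory response cache keyed by URL (illustration only)."""

    def __init__(self, fetch_from_backend, default_ttl=30, clock=time.monotonic):
        self._fetch = fetch_from_backend
        self._default_ttl = default_ttl
        self._clock = clock
        self._store = {}  # url -> (expires_at, response)

    def get(self, url):
        """Return (response, "HIT" or "MISS") for the given URL."""
        now = self._clock()
        entry = self._store.get(url)
        if entry is not None and entry[0] > now:
            return entry[1], "HIT"
        # Not cached, or expired: ask the web server and remember the answer.
        response = self._fetch(url)
        self._store[url] = (now + self._default_ttl, response)
        return response, "MISS"
```

Only the first request for a URL within each TTL window reaches the backend; every other request is answered from memory.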
Controlling Varnish
• Set custom TTLs for content:
if (beresp.http.X-HP-Cache-Control ~ "s-maxage") {

    set beresp.http.X-HP-Cache-Control = regsub(beresp.http.X-HP-Cache-Control, "^.*s-maxage=([0-9]+).*", "\1");

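    // set the ttl. The "\023" prefix is the header-name length (19) in octal,
    // which VRT_GetHdr expects in front of the name.
    C{
        char *ttl;
        ttl = VRT_GetHdr(sp, HDR_BERESP, "\023X-HP-Cache-Control:");
        VRT_l_beresp_ttl(sp, atoi(ttl));
    }C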
    set beresp.http.X-Cacheable = "CUSTOM: " + beresp.ttl ;

} elsif (beresp.http.X-HP-Cache-Control ~ "(no-cache|private)" || beresp.http.pragma ~ "no-cache") {

    set beresp.ttl = 0s;
    set beresp.http.X-Cacheable = "NO-CACHE";

} else {

    set beresp.http.X-Cacheable = "DEFAULT: 30s";
    set beresp.ttl = 30s;

}
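For readers who don't know VCL, the same decision can be written as a small Python function. This is a sketch of the logic only: `choose_ttl` is a made-up name, and it differs from the VCL in one edge case, which is that an `s-maxage` with no number falls through to the later branches instead of producing a TTL of 0.

```python
import re

DEFAULT_TTL = 30  # seconds, matching the 30s fallback in the VCL


def choose_ttl(hp_cache_control=None, pragma=None):
    """Return (ttl_seconds, label) for a backend response's headers."""
    hp_cache_control = hp_cache_control or ""
    pragma = pragma or ""
    # 1. An explicit s-maxage on X-HP-Cache-Control wins.
    match = re.search(r"s-maxage=([0-9]+)", hp_cache_control)
    if match:
        return int(match.group(1)), "CUSTOM"
    # 2. The backend asked for no caching at all.
    if re.search(r"no-cache|private", hp_cache_control) or "no-cache" in pragma:
        return 0, "NO-CACHE"
    # 3. Otherwise cache briefly.
    return DEFAULT_TTL, "DEFAULT"
```

The backend controls caching by setting `X-HP-Cache-Control: s-maxage=300` (for example) on a response; anything it leaves unmarked is still cached for 30 seconds.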
Controlling Varnish
• Refreshing content

sub process_refresh_requests {

    if (req.request == "REFRESH") {
        set req.request = "GET";
        set req.hash_always_miss = true;
    }

}


• This is invoked early in the vcl_recv subroutine
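One way a publishing tool could trigger this is to request the page's URL from Varnish using the custom method. A sketch in Python, assuming Varnish is reachable at the hypothetical host `cache.example.com`; the talk doesn't say what actually sends these requests.

```python
import urllib.request


def build_refresh_request(url):
    """Build (but do not send) a request that uses the custom REFRESH method."""
    return urllib.request.Request(url, method="REFRESH")


request = build_refresh_request("http://cache.example.com/2012/some-story.html")
# urllib.request.urlopen(request) would send it; that call is left out so the
# sketch has no network side effects.
```

Because the VCL rewrites the method to GET and forces a miss, the response to this request is fetched fresh from the backend and replaces the cached copy.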
Edge Side Includes
• Include cached content blocks into pages

<html>
<body>

<esi:include
   src="http://example.com/my_page1.html"
   alt="http://example.com/my_page2.html"
   onerror="continue"
/>

</body>
</html>
Edge Side Includes
• How to use ESI:
  – Make complicated blocks available at independently
    accessible URIs
  – Create a “template” file with ESI includes to bring
    the page together
• Why this is powerful
  – If multiple pages use different combinations of
    page components, some may already be cached
  – Reduces the number of times the entire page must be
    served; serve only the components needed
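To make the template-plus-fragments idea concrete, here is a toy assembler in Python. It is not how Varnish processes ESI; `assemble` and the `fragments` dictionary (standing in for the cache) are invented for the example. It honours `alt` and `onerror="continue"` as the ESI language describes them, so check which attributes your Varnish version actually supports.

```python
import re

# Matches <esi:include src="..." [alt="..."] ... /> including multi-line tags.
ESI_INCLUDE = re.compile(r'<esi:include\s+src="([^"]+)"(?:\s+alt="([^"]+)")?[^>]*/>')


def assemble(template, fragments):
    """Replace each esi:include with the fragment stored for its src (or alt)."""

    def resolve(match):
        src, alt = match.group(1), match.group(2)
        if src in fragments:
            return fragments[src]
        if alt is not None and alt in fragments:
            return fragments[alt]
        return ""  # like onerror="continue": silently drop the missing block

    return ESI_INCLUDE.sub(resolve, template)
```

Two pages whose templates both include the same `src` share one cached fragment, which is where the saving comes from.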
Varnish Tricks
• Intelligently purge the cache when your
  content changes
  – Allows you to increase TTL without fear of caching
    outdated content
     if (req.request == "PURGE") {
         if (!client.ip ~ purgers) {
             error 405 "Method not allowed";
         }
         return (lookup);
     }
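The snippet above only shows the access check in vcl_recv. As a rough Python model of the whole purge flow: `PURGERS` plays the role of the `purgers` ACL (its networks are hypothetical), a plain dict stands in for the cache, and the 200/404 replies are modelled on the usual hit/miss purge handlers rather than taken from the slides.

```python
import ipaddress

# Hypothetical stand-in for the "purgers" ACL: who may purge.
PURGERS = [
    ipaddress.ip_network("127.0.0.1/32"),
    ipaddress.ip_network("10.0.0.0/8"),
]


def handle_purge(cache, client_ip, url):
    """Return an HTTP-style (status, message) for a PURGE request."""
    address = ipaddress.ip_address(client_ip)
    if not any(address in network for network in PURGERS):
        return 405, "Method not allowed"
    if cache.pop(url, None) is None:
        return 404, "Not in cache"
    return 200, "Purged"
```

With this in place the CMS can purge a URL the moment an editor saves a change, which is what makes long TTLs safe.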
Other Scaling Tips
• Hardware SSL offloading is your friend
• Consider mod_php
  – CGI has huge overhead
  – CGI/SuExec has huge security advantages
  – FastCGI is a happy medium for some
Other Scaling Tips
• Don’t try to do everything on one
  server/cluster
  – Splitting your application is ok
  – 1 cluster for frontend, 1 server/cluster for backend, etc.


• Keep an open mind about technologies,
  platforms, and tools
One More Thing…
   (sorry, I couldn’t resist)
Guilds!
• What a guild is:
   – Groups of people around a topic
   – Membership/participation is encouraged, but not
     required
   – Think of it as an internal Meetup

• Join to learn new things
• Join to talk about things you are interested in

• Examples: PHP, Front End, Python, Ruby,
  Management, Platform/Architecture, Big Data,
  etc…
Guilds!
• Experts to solve technology-specific problems
  – Example: Front-end SWAT team to improve page load
    time when there is slow or too much JS


• Collectively give back to the community around
  your technology

• Help others learn, and learn from others

• Meet people on other teams
Guilds!
• Try it out
Questions?
Chris Miller

chris.miller@huffingtonpost.com

           @ee99ee



   (P.S. – We’re hiring in NYC)

CI_CONF 2012: Scaling
