SlideShare una empresa de Scribd logo
1 de 41
Descargar para leer sin conexión
Flickr and PHP
   Cal Henderson
What’s Flickr

• Photo sharing
• Open APIs
Logical Architecture

     Photo Storage                 Database          Node Service


                     Application Logic


               Page Logic                     API


                Templates                Endpoints


                                 3rd Party Apps      Flickr Apps
              Flickr.com
  Email




                               Users
Physical Architecture

   Static Servers   Database Servers   Node Servers




                     Web Servers



                        Users
Where is PHP?

     Photo Storage                Database           Node Service


                     Application Logic


               Page Logic                    API


                Templates                Endpoints


                                 3rd Party Apps      Flickr Apps
              Flickr.com
  Email




                               Users
Other than PHP?

•   Smarty for templating
•   PEAR for XML and Email parsing
•   Perl for controlling…
•   ImageMagick, for image processing
•   MySQL (4.0 / InnoDb)
•   Java, for the node service
•   Apache 2, Redhat, etc. etc.
Big Application?

•   One programmer, one designer, etc.
•   ~60,000 lines of PHP code
•   ~60,000 lines of templates
•   ~70 custom smarty functions/modifiers
•   ~25,000 DB transactions/second at peak
•   ~1000 pages per second at peak
Thinking outside the web app

• Services
   – Atom/RSS/RDF Feeds
   – APIs
      •   SOAP
      •   XML-RPC
      •   REST
      •   PEAR::XML::Tree
More cool stuff

• Email interface
    – Postfix
    – PHP
    – PEAR::Mail::mimeDecode
•   FTP
•   Uploading API
•   Authentication API
•   Unicode
Even more stuff

• Real time application
• Cool flash apps
• Blogging
   –   Blogger API (1 & 2)
   –   Metaweblog API
   –   Atom
   –   LiveJournal
APIs are simple!

•   Modeled on XML-RPC (Sort of)
•   Method calls with XML responses
•   SOAP, XML-RPC and REST are just transports
•   PHP endpoints mean we can use the same application
    logic as the website
XML isn’t simple :(

• PHP 4 doesn’t have good a XML parser
• Expat is cool though (PEAR::XML::Parser)
• Why doesn’t PEAR have XPath?
   – Because PEAR is stupid!
   – PHP 4 sucks!
I love XPath

if ($tree->root->name == 'methodResponse'){
      if (($tree->root->children[0]->name == 'params')
       && ($tree->root->children[0]->children[0]->name == 'param')
       && ($tree->root->children[0]->children[0]->children[0]->name == 'value')
       && ($tree->root->children[0]->children[0]->children[0]->children[0]->name == 'array')
       && ($tree->root->children[0]->children[0]->children[0]->children[0]->children[0]->name == 'data')){
               $rsp = $tree->root->children[0]->children[0]->children[0]->children[0]->children[0];
      }
      if ($tree->root->children[0]->name == 'fault'){
               $fault = $tree->root->children[0];
               return $fault;
      }
}




$nodes = $tree->select_nodes('/methodResponse/params/param[1]/value[1]/array[1]/data[1]/text()');

if (count($nodes)){
      $rsp = array_pop($nodes);
}else{
      list($fault) = $tree->select_nodes('/methodResponse/fault');
      return $fault;
}
Creating API methods

• Stateless method-call APIs are easy to extend
• Adding a method requires no knowledge of the transport
• Adding a method once makes it available to all the
  interfaces
• Self documenting
Red Hot Unicode Action

• UTF-8 pages
• CJKV support
• It’s really cool
Unicode for all

• It’s really easy
   –   Don’t need PHP support
   –   Don’t need MySQL support
   –   Just need the right headers
   –   UTF-8 is 7-bit transparent
   –   (Just don’t mess with high characters)
        • Don’t use HtmlEntities()!
• But bear in mind…
        • JavaScript has patchy Unicode support
        • People using your APIs might be stupid
Scaling the beast

•   Why PHP is great
•   MySQL scaling
•   Search scaling
•   Horizontal scaling
Why PHP is great

• Stateless
   –   We can bounce people around servers
   –   Everything is stored in the database
   –   Even the smarty cache
   –   “Shared nothing”
   –   (so long as we avoid PHP sessions)
MySQL Scaling

• Our database server started to slow
• Load of 200
• Replication!
MySQL Replication

• But it only gives you more SELECT’s
• Else you need to partition vertically
• Re-architecting sucks :(
Looking at usage

• Snapshot of db1.flickr.com
   –   SELECT’s 44,220,588
   –   INSERT’s 1,349,234
   –   UPDATE’s 1,755,503
   –   DELETE’s 318,439
   –   13 SELECT’s per I/U/D
Replication is really cool

• A bunch of slave servers handle all the SELECT’s
• A single master just handles I/U/D’s
• It can scale horizontally, at least for a while.
Searching

•   A simple text search
•   We were using RLIKE
•   Then switched to LIKE
•   Then disabled it all together
FULLTEXT Indexes

•   MySQL saves the day!
•   But they’re only supported my MyISAM tables
•   We use InnoDb, because it’s a lot faster
•   We’re doomed
But wait!

• Partial replication saves the day
• Replicate the portion of the database we want to search.
• But change the table types on the slave to MyISAM
• It can keep up because it’s only handling I/U/D’s on a
  couple of tables
• And we can reduce the I/U/D’s with a little bit of vertical
  partitioning
JOIN’s are slow

•   Normalised data is for sissies
•   Keep multiple copies of data around
•   Makes searching faster
•   Have to ensure consistency in the application logic
Our current setup


                         DB1
                                  I/U/D’s
                         Master




          SELECT’s
                         DB2
                     Main Slave




                                            DB3
                                       Main Search
            Slave Farm
                                          slave
                                                      Search
                                                     SELECT’s


                                     Search Slave
                                        Farm
Horizontal scaling

•   At the core of our design
•   Just add hardware!
•   Inexpensive
•   Not exponential
•   Avoid redesigns
Talking to the Node Service

• Everyone speaks XML (badly)
• Just TCP/IP - fsockopen()
• We’re issuing commands, not requesting data, so we
  don’t bother to parse the response
   – Just substring search for state=“ok”
• Don’t rely on it!
RSS / Atom / RDF

•   Different formats
•   All quite bad
•   We’re generating a lot of different feeds
•   Abstract the difference away using templates
•   No good way to do private feeds. Why is nobody working
    on this? (WSSE maybe?)
Receiving email

•   Want users to be able to email photos to Flickr
•   Just get postfix to pipe each mail to a PHP script
•   Parse the mail and find any photos
•   Cellular phone companies hate you
•   Lots of mailers are retarded
    – Photos as text/plain attachments :/
Upload via FTP

• PHP isn’t so great at being a daemon
• Leaks memory like a sieve
• No threads
• Java to the rescue
• Java just acts as an FTPd and passes all uploaded files
  to PHP for processing
• (This isn’t actually public)
• Bricolage does this I think. Maybe Zope?
Blogs

• Why does everyone loves blogs so much?
• Only a few APIs really
   –   Blogger
   –   Metaweblog
   –   Blogger2
   –   Movable Type
   –   Atom
   –   Live Journal
It’s all broken

•   Lots of blog software has broken interfaces
•   It’s a support nightmare
•   Manila is tricky
•   But it all works, more or less
•   Abstracted in the application logic
•   We just call blogs_post_message();
Back to those APIs

• We opened up the Flickr APIs a few weeks ago
• Programmers mainly build tools for other programmers
• We have Perl, python, PHP, ActionScript, XMLHTTP and
  .NET interface libraries
• But also a few actual applications
Flickr Rainbow
Tag Wallpaper
So what next?

•   Much more scaling
•   PHP 5?
•   MySQL 5?
•   Taking over the world
Flickr and PHP
   Cal Henderson
Any Questions?

Más contenido relacionado

La actualidad más candente

Travelling Salesman
Travelling SalesmanTravelling Salesman
Travelling Salesman
Shuvojit Kar
 

La actualidad más candente (20)

Terminology Machine Learning
Terminology Machine LearningTerminology Machine Learning
Terminology Machine Learning
 
Lecture13 - Association Rules
Lecture13 - Association RulesLecture13 - Association Rules
Lecture13 - Association Rules
 
Dependency preserving
Dependency preservingDependency preserving
Dependency preserving
 
Perceptron (neural network)
Perceptron (neural network)Perceptron (neural network)
Perceptron (neural network)
 
Bias and variance trade off
Bias and variance trade offBias and variance trade off
Bias and variance trade off
 
Theory of Computation Unit 3
Theory of Computation Unit 3Theory of Computation Unit 3
Theory of Computation Unit 3
 
String matching algorithms
String matching algorithmsString matching algorithms
String matching algorithms
 
Semantic Analysis.pptx
Semantic Analysis.pptxSemantic Analysis.pptx
Semantic Analysis.pptx
 
Time complexity
Time complexityTime complexity
Time complexity
 
B+ trees and height balance tree
B+ trees and height balance treeB+ trees and height balance tree
B+ trees and height balance tree
 
Feature Engineering
Feature EngineeringFeature Engineering
Feature Engineering
 
Introduction to D3.js
Introduction to D3.jsIntroduction to D3.js
Introduction to D3.js
 
Travelling Salesman
Travelling SalesmanTravelling Salesman
Travelling Salesman
 
直交領域探索
直交領域探索直交領域探索
直交領域探索
 
強力なグラフィック機能を備えた組版処理システムTwightの開発
強力なグラフィック機能を備えた組版処理システムTwightの開発強力なグラフィック機能を備えた組版処理システムTwightの開発
強力なグラフィック機能を備えた組版処理システムTwightの開発
 
Lecture 04 syntax analysis
Lecture 04 syntax analysisLecture 04 syntax analysis
Lecture 04 syntax analysis
 
Bayesian Networks - A Brief Introduction
Bayesian Networks - A Brief IntroductionBayesian Networks - A Brief Introduction
Bayesian Networks - A Brief Introduction
 
1.Role lexical Analyzer
1.Role lexical Analyzer1.Role lexical Analyzer
1.Role lexical Analyzer
 
Compiler Optimization Presentation
Compiler Optimization PresentationCompiler Optimization Presentation
Compiler Optimization Presentation
 
Regular Expression to Finite Automata
Regular Expression to Finite AutomataRegular Expression to Finite Automata
Regular Expression to Finite Automata
 

Similar a Flickr Architecture Presentation

Flickr Architecture Presentation
Flickr Architecture PresentationFlickr Architecture Presentation
Flickr Architecture Presentation
web25
 
Facebook architecture
Facebook architectureFacebook architecture
Facebook architecture
drewz lin
 
Facebook的架构
Facebook的架构Facebook的架构
Facebook的架构
yiditushe
 
Qcon 090408233824-phpapp01
Qcon 090408233824-phpapp01Qcon 090408233824-phpapp01
Qcon 090408233824-phpapp01
jgregory1234
 
From One to a Cluster
From One to a ClusterFrom One to a Cluster
From One to a Cluster
guestd34230
 
WordPress Performance & Scalability
WordPress Performance & ScalabilityWordPress Performance & Scalability
WordPress Performance & Scalability
Joseph Scott
 
Ajax Tutorial
Ajax TutorialAjax Tutorial
Ajax Tutorial
oscon2007
 

Similar a Flickr Architecture Presentation (20)

Flickr Architecture Presentation
Flickr Architecture PresentationFlickr Architecture Presentation
Flickr Architecture Presentation
 
flickr's architecture & php
flickr's architecture & php flickr's architecture & php
flickr's architecture & php
 
Flickr Php
Flickr PhpFlickr Php
Flickr Php
 
Flickr Services
Flickr ServicesFlickr Services
Flickr Services
 
Flickr Services
Flickr ServicesFlickr Services
Flickr Services
 
Mashups with Drupal and QueryPath
Mashups with Drupal and QueryPathMashups with Drupal and QueryPath
Mashups with Drupal and QueryPath
 
Facebook architecture
Facebook architectureFacebook architecture
Facebook architecture
 
Facebook的架构
Facebook的架构Facebook的架构
Facebook的架构
 
Facebook architecture
Facebook architectureFacebook architecture
Facebook architecture
 
Qcon 090408233824-phpapp01
Qcon 090408233824-phpapp01Qcon 090408233824-phpapp01
Qcon 090408233824-phpapp01
 
Top ten-list
Top ten-listTop ten-list
Top ten-list
 
Qcon
QconQcon
Qcon
 
Profiling php applications
Profiling php applicationsProfiling php applications
Profiling php applications
 
Introduction to php
Introduction to phpIntroduction to php
Introduction to php
 
From One to a Cluster
From One to a ClusterFrom One to a Cluster
From One to a Cluster
 
WordPress Performance & Scalability
WordPress Performance & ScalabilityWordPress Performance & Scalability
WordPress Performance & Scalability
 
Ajax Tutorial
Ajax TutorialAjax Tutorial
Ajax Tutorial
 
Lennart Regebro What Zope Did Wrong (And What To Do Instead)
Lennart Regebro   What Zope Did Wrong (And What To Do Instead)Lennart Regebro   What Zope Did Wrong (And What To Do Instead)
Lennart Regebro What Zope Did Wrong (And What To Do Instead)
 
Lennart Regebro What Zope Did Wrong (And What To Do Instead)
Lennart Regebro   What Zope Did Wrong (And What To Do Instead)Lennart Regebro   What Zope Did Wrong (And What To Do Instead)
Lennart Regebro What Zope Did Wrong (And What To Do Instead)
 
Merb For The Enterprise
Merb For The EnterpriseMerb For The Enterprise
Merb For The Enterprise
 

Más de eraz

Mustaches
MustachesMustaches
Mustaches
eraz
 
Most Unusual Haircut
Most Unusual HaircutMost Unusual Haircut
Most Unusual Haircut
eraz
 
Cool Unusual Sculptures
Cool Unusual SculpturesCool Unusual Sculptures
Cool Unusual Sculptures
eraz
 
Funny
FunnyFunny
Funny
eraz
 
Wal Mart Presentation Citigroup 021407
Wal  Mart  Presentation  Citigroup 021407Wal  Mart  Presentation  Citigroup 021407
Wal Mart Presentation Citigroup 021407
eraz
 
Top 5 Dos And Don Ts For Measuring Web 2 0
Top 5  Dos And  Don Ts For  Measuring  Web 2 0Top 5  Dos And  Don Ts For  Measuring  Web 2 0
Top 5 Dos And Don Ts For Measuring Web 2 0
eraz
 
Wal Mart Presentation Citigroup 021407
Wal Mart Presentation Citigroup 021407Wal Mart Presentation Citigroup 021407
Wal Mart Presentation Citigroup 021407
eraz
 
Are Agile Projects Doomed To Halfbaked Design
Are Agile Projects Doomed To Halfbaked DesignAre Agile Projects Doomed To Halfbaked Design
Are Agile Projects Doomed To Halfbaked Design
eraz
 
Web Applications Are Getting Interesting!
Web Applications Are Getting Interesting!Web Applications Are Getting Interesting!
Web Applications Are Getting Interesting!
eraz
 
Web20 Expo 2007 Mobile Experience
Web20 Expo 2007 Mobile ExperienceWeb20 Expo 2007 Mobile Experience
Web20 Expo 2007 Mobile Experience
eraz
 
Sxsw2007 Mobile
Sxsw2007 MobileSxsw2007 Mobile
Sxsw2007 Mobile
eraz
 
Form A Wall Presentation Short
Form A Wall Presentation   ShortForm A Wall Presentation   Short
Form A Wall Presentation Short
eraz
 
Momentum Infocare Corporate Presentation
Momentum Infocare   Corporate PresentationMomentum Infocare   Corporate Presentation
Momentum Infocare Corporate Presentation
eraz
 
Mc Kinney Presentation
Mc Kinney PresentationMc Kinney Presentation
Mc Kinney Presentation
eraz
 
Srx Product Introduction Power Point Presentation ©Palmetto Equipment
Srx Product Introduction Power Point Presentation ©Palmetto EquipmentSrx Product Introduction Power Point Presentation ©Palmetto Equipment
Srx Product Introduction Power Point Presentation ©Palmetto Equipment
eraz
 
Pitney Bowes 2006 Presentation Martin
Pitney Bowes 2006 Presentation   MartinPitney Bowes 2006 Presentation   Martin
Pitney Bowes 2006 Presentation Martin
eraz
 
Dot Net Tips And Tricks
Dot Net Tips And TricksDot Net Tips And Tricks
Dot Net Tips And Tricks
eraz
 
Dot Mobi Mobile Web Developers Guide
Dot Mobi Mobile Web Developers GuideDot Mobi Mobile Web Developers Guide
Dot Mobi Mobile Web Developers Guide
eraz
 
Clearspring Widgetsphere
Clearspring WidgetsphereClearspring Widgetsphere
Clearspring Widgetsphere
eraz
 
Babson Glavin Presentation Onsite Videos 2007
Babson Glavin Presentation Onsite Videos 2007Babson Glavin Presentation Onsite Videos 2007
Babson Glavin Presentation Onsite Videos 2007
eraz
 

Más de eraz (20)

Mustaches
MustachesMustaches
Mustaches
 
Most Unusual Haircut
Most Unusual HaircutMost Unusual Haircut
Most Unusual Haircut
 
Cool Unusual Sculptures
Cool Unusual SculpturesCool Unusual Sculptures
Cool Unusual Sculptures
 
Funny
FunnyFunny
Funny
 
Wal Mart Presentation Citigroup 021407
Wal  Mart  Presentation  Citigroup 021407Wal  Mart  Presentation  Citigroup 021407
Wal Mart Presentation Citigroup 021407
 
Top 5 Dos And Don Ts For Measuring Web 2 0
Top 5  Dos And  Don Ts For  Measuring  Web 2 0Top 5  Dos And  Don Ts For  Measuring  Web 2 0
Top 5 Dos And Don Ts For Measuring Web 2 0
 
Wal Mart Presentation Citigroup 021407
Wal Mart Presentation Citigroup 021407Wal Mart Presentation Citigroup 021407
Wal Mart Presentation Citigroup 021407
 
Are Agile Projects Doomed To Halfbaked Design
Are Agile Projects Doomed To Halfbaked DesignAre Agile Projects Doomed To Halfbaked Design
Are Agile Projects Doomed To Halfbaked Design
 
Web Applications Are Getting Interesting!
Web Applications Are Getting Interesting!Web Applications Are Getting Interesting!
Web Applications Are Getting Interesting!
 
Web20 Expo 2007 Mobile Experience
Web20 Expo 2007 Mobile ExperienceWeb20 Expo 2007 Mobile Experience
Web20 Expo 2007 Mobile Experience
 
Sxsw2007 Mobile
Sxsw2007 MobileSxsw2007 Mobile
Sxsw2007 Mobile
 
Form A Wall Presentation Short
Form A Wall Presentation   ShortForm A Wall Presentation   Short
Form A Wall Presentation Short
 
Momentum Infocare Corporate Presentation
Momentum Infocare   Corporate PresentationMomentum Infocare   Corporate Presentation
Momentum Infocare Corporate Presentation
 
Mc Kinney Presentation
Mc Kinney PresentationMc Kinney Presentation
Mc Kinney Presentation
 
Srx Product Introduction Power Point Presentation ©Palmetto Equipment
Srx Product Introduction Power Point Presentation ©Palmetto EquipmentSrx Product Introduction Power Point Presentation ©Palmetto Equipment
Srx Product Introduction Power Point Presentation ©Palmetto Equipment
 
Pitney Bowes 2006 Presentation Martin
Pitney Bowes 2006 Presentation   MartinPitney Bowes 2006 Presentation   Martin
Pitney Bowes 2006 Presentation Martin
 
Dot Net Tips And Tricks
Dot Net Tips And TricksDot Net Tips And Tricks
Dot Net Tips And Tricks
 
Dot Mobi Mobile Web Developers Guide
Dot Mobi Mobile Web Developers GuideDot Mobi Mobile Web Developers Guide
Dot Mobi Mobile Web Developers Guide
 
Clearspring Widgetsphere
Clearspring WidgetsphereClearspring Widgetsphere
Clearspring Widgetsphere
 
Babson Glavin Presentation Onsite Videos 2007
Babson Glavin Presentation Onsite Videos 2007Babson Glavin Presentation Onsite Videos 2007
Babson Glavin Presentation Onsite Videos 2007
 

Último

Architecting Cloud Native Applications
Architecting Cloud Native ApplicationsArchitecting Cloud Native Applications
Architecting Cloud Native Applications
WSO2
 
Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Victor Rentea
 
Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire business
panagenda
 

Último (20)

WSO2's API Vision: Unifying Control, Empowering Developers
WSO2's API Vision: Unifying Control, Empowering DevelopersWSO2's API Vision: Unifying Control, Empowering Developers
WSO2's API Vision: Unifying Control, Empowering Developers
 
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
 
MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
Introduction to Multilingual Retrieval Augmented Generation (RAG)
Introduction to Multilingual Retrieval Augmented Generation (RAG)Introduction to Multilingual Retrieval Augmented Generation (RAG)
Introduction to Multilingual Retrieval Augmented Generation (RAG)
 
Understanding the FAA Part 107 License ..
Understanding the FAA Part 107 License ..Understanding the FAA Part 107 License ..
Understanding the FAA Part 107 License ..
 
Corporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptxCorporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptx
 
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
 
FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024
 
Architecting Cloud Native Applications
Architecting Cloud Native ApplicationsArchitecting Cloud Native Applications
Architecting Cloud Native Applications
 
ICT role in 21st century education and its challenges
ICT role in 21st century education and its challengesICT role in 21st century education and its challenges
ICT role in 21st century education and its challenges
 
Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of Terraform
 
Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire business
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century education
 
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWEREMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
 
Six Myths about Ontologies: The Basics of Formal Ontology
Six Myths about Ontologies: The Basics of Formal OntologySix Myths about Ontologies: The Basics of Formal Ontology
Six Myths about Ontologies: The Basics of Formal Ontology
 
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
 
[BuildWithAI] Introduction to Gemini.pdf
[BuildWithAI] Introduction to Gemini.pdf[BuildWithAI] Introduction to Gemini.pdf
[BuildWithAI] Introduction to Gemini.pdf
 

Flickr Architecture Presentation

  • 1. Flickr and PHP Cal Henderson
  • 2. What’s Flickr • Photo sharing • Open APIs
  • 3. Logical Architecture Photo Storage Database Node Service Application Logic Page Logic API Templates Endpoints 3rd Party Apps Flickr Apps Flickr.com Email Users
  • 4. Physical Architecture Static Servers Database Servers Node Servers Web Servers Users
  • 5. Where is PHP? Photo Storage Database Node Service Application Logic Page Logic API Templates Endpoints 3rd Party Apps Flickr Apps Flickr.com Email Users
  • 6. Other than PHP? • Smarty for templating • PEAR for XML and Email parsing • Perl for controlling… • ImageMagick, for image processing • MySQL (4.0 / InnoDb) • Java, for the node service • Apache 2, Redhat, etc. etc.
  • 7. Big Application? • One programmer, one designer, etc. • ~60,000 lines of PHP code • ~60,000 lines of templates • ~70 custom smarty functions/modifiers • ~25,000 DB transactions/second at peak • ~1000 pages per second at peak
  • 8. Thinking outside the web app • Services – Atom/RSS/RDF Feeds – APIs • SOAP • XML-RPC • REST • PEAR::XML::Tree
  • 9. More cool stuff • Email interface – Postfix – PHP – PEAR::Mail::mimeDecode • FTP • Uploading API • Authentication API • Unicode
  • 10. Even more stuff • Real time application • Cool flash apps • Blogging – Blogger API (1 & 2) – Metaweblog API – Atom – LiveJournal
  • 11. APIs are simple! • Modeled on XML-RPC (Sort of) • Method calls with XML responses • SOAP, XML-RPC and REST are just transports • PHP endpoints mean we can use the same application logic as the website
  • 12. XML isn’t simple :( • PHP 4 doesn’t have good a XML parser • Expat is cool though (PEAR::XML::Parser) • Why doesn’t PEAR have XPath? – Because PEAR is stupid! – PHP 4 sucks!
  • 13. I love XPath if ($tree->root->name == 'methodResponse'){ if (($tree->root->children[0]->name == 'params') && ($tree->root->children[0]->children[0]->name == 'param') && ($tree->root->children[0]->children[0]->children[0]->name == 'value') && ($tree->root->children[0]->children[0]->children[0]->children[0]->name == 'array') && ($tree->root->children[0]->children[0]->children[0]->children[0]->children[0]->name == 'data')){ $rsp = $tree->root->children[0]->children[0]->children[0]->children[0]->children[0]; } if ($tree->root->children[0]->name == 'fault'){ $fault = $tree->root->children[0]; return $fault; } } $nodes = $tree->select_nodes('/methodResponse/params/param[1]/value[1]/array[1]/data[1]/text()'); if (count($nodes)){ $rsp = array_pop($nodes); }else{ list($fault) = $tree->select_nodes('/methodResponse/fault'); return $fault; }
  • 14. Creating API methods • Stateless method-call APIs are easy to extend • Adding a method requires no knowledge of the transport • Adding a method once makes it available to all the interfaces • Self documenting
  • 15. Red Hot Unicode Action • UTF-8 pages • CJKV support • It’s really cool
  • 16.
  • 17. Unicode for all • It’s really easy – Don’t need PHP support – Don’t need MySQL support – Just need the right headers – UTF-8 is 7-bit transparent – (Just don’t mess with high characters) • Don’t use HtmlEntities()! • But bear in mind… • JavaScript has patchy Unicode support • People using your APIs might be stupid
  • 18. Scaling the beast • Why PHP is great • MySQL scaling • Search scaling • Horizontal scaling
  • 19. Why PHP is great • Stateless – We can bounce people around servers – Everything is stored in the database – Even the smarty cache – “Shared nothing” – (so long as we avoid PHP sessions)
  • 20. MySQL Scaling • Our database server started to slow • Load of 200 • Replication!
  • 21. MySQL Replication • But it only gives you more SELECT’s • Else you need to partition vertically • Re-architecting sucks :(
  • 22. Looking at usage • Snapshot of db1.flickr.com – SELECT’s 44,220,588 – INSERT’s 1,349,234 – UPDATE’s 1,755,503 – DELETE’s 318,439 – 13 SELECT’s per I/U/D
  • 23. Replication is really cool • A bunch of slave servers handle all the SELECT’s • A single master just handles I/U/D’s • It can scale horizontally, at least for a while.
  • 24. Searching • A simple text search • We were using RLIKE • Then switched to LIKE • Then disabled it all together
  • 25. FULLTEXT Indexes • MySQL saves the day! • But they’re only supported my MyISAM tables • We use InnoDb, because it’s a lot faster • We’re doomed
  • 26. But wait! • Partial replication saves the day • Replicate the portion of the database we want to search. • But change the table types on the slave to MyISAM • It can keep up because it’s only handling I/U/D’s on a couple of tables • And we can reduce the I/U/D’s with a little bit of vertical partitioning
  • 27. JOIN’s are slow • Normalised data is for sissies • Keep multiple copies of data around • Makes searching faster • Have to ensure consistency in the application logic
  • 28. Our current setup DB1 I/U/D’s Master SELECT’s DB2 Main Slave DB3 Main Search Slave Farm slave Search SELECT’s Search Slave Farm
  • 29. Horizontal scaling • At the core of our design • Just add hardware! • Inexpensive • Not exponential • Avoid redesigns
  • 30. Talking to the Node Service • Everyone speaks XML (badly) • Just TCP/IP - fsockopen() • We’re issuing commands, not requesting data, so we don’t bother to parse the response – Just substring search for state=“ok” • Don’t rely on it!
  • 31. RSS / Atom / RDF • Different formats • All quite bad • We’re generating a lot of different feeds • Abstract the difference away using templates • No good way to do private feeds. Why is nobody working on this? (WSSE maybe?)
  • 32. Receiving email • Want users to be able to email photos to Flickr • Just get postfix to pipe each mail to a PHP script • Parse the mail and find any photos • Cellular phone companies hate you • Lots of mailers are retarded – Photos as text/plain attachments :/
  • 33. Upload via FTP • PHP isn’t so great at being a daemon • Leaks memory like a sieve • No threads • Java to the rescue • Java just acts as an FTPd and passes all uploaded files to PHP for processing • (This isn’t actually public) • Bricolage does this I think. Maybe Zope?
  • 34. Blogs • Why does everyone loves blogs so much? • Only a few APIs really – Blogger – Metaweblog – Blogger2 – Movable Type – Atom – Live Journal
  • 35. It’s all broken • Lots of blog software has broken interfaces • It’s a support nightmare • Manila is tricky • But it all works, more or less • Abstracted in the application logic • We just call blogs_post_message();
  • 36. Back to those APIs • We opened up the Flickr APIs a few weeks ago • Programmers mainly build tools for other programmers • We have Perl, python, PHP, ActionScript, XMLHTTP and .NET interface libraries • But also a few actual applications
  • 39. So what next? • Much more scaling • PHP 5? • MySQL 5? • Taking over the world
  • 40. Flickr and PHP Cal Henderson