Lessons learned while
 building omroep.nl
    Bart Zonneveld (@bartzon)
    Sjoerd Tieleman (@tieleman)
Nederlandse Publieke Omroep
Dutch Public broadcasting Company


  AVRO    Joodse Omroep   NMO   Teleac

   BNN        KRO         NOS   TROS

   BOS       LLiNK        NPS   VARA

   EO         MAX         OHM   VPRO

  HUMAN      NCRV         RKK   ZvK

   IKON       NIO         RVU
Rails sites
• Beetlejuice       • Radio 1
• Centrale          • Z@PP
  navigatie
• Omroep.nl         • Z@ppelin
• Nederland 1       • Zelda
• Nederland 3       • Various tools
Team
• 2 coders
• 1 designer
• 1 editor
• 1 project manager
  6 months, CMS built from scratch
Requirements

• Handle 30,000-40,000 pageviews per day
• Handle traffic spikes
• Flexible, multi user CMS
• Loads of external data
Daily spread
Some numbers
+----------------------+-------+-------+---------+---------+-----+-------+
| Name                 | Lines |   LOC | Classes | Methods | M/C | LOC/M |
+----------------------+-------+-------+---------+---------+-----+-------+
| Controllers          |  1864 |  1535 |      41 |     185 |   4 |     6 |
| Helpers              |   797 |   631 |       1 |      75 |  75 |     6 |
| Models               |  1303 |  1055 |      40 |     153 |   3 |     4 |
| Libraries            |   814 |   620 |      15 |      79 |   5 |     5 |
| Integration tests    |     0 |     0 |       0 |       0 |   0 |     0 |
| Functional tests     |     0 |     0 |       0 |       0 |   0 |     0 |
| Unit tests           |     0 |     0 |       0 |       0 |   0 |     0 |
| Model specs          |  1932 |  1573 |       0 |       9 |   0 |   172 |
| View specs           |  7322 |  5950 |       0 |     153 |   0 |    36 |
| Controller specs     |  7292 |  5846 |       0 |     175 |   0 |    31 |
| Helper specs         |   900 |   676 |       0 |       2 |   0 |   336 |
| Library specs        |    56 |    45 |       2 |      12 |   6 |     1 |
+----------------------+-------+-------+---------+---------+-----+-------+
| Total                | 22280 | 17931 |      99 |     843 |   8 |    19 |
+----------------------+-------+-------+---------+---------+-----+-------+
       Code LOC: 3841    Test LOC: 14090     Code to Test Ratio: 1:3.7
Moar numbers


• 410 Cucumber scenarios, 600 step definitions
• 2235 RSpec specifications


              so it must be bug-free, right? ;-)
Tools
Ruby on Rails 2.3.4      RSpec + Webrat + Cucumber

   Apache 2.2                     Paperclip

       SVN              App monitoring: RPM, Hoptoad

    MySQL 5               Service monitoring: Nagios

    Memcache
Tools: app monitoring




 Hoptoad     New Relic RPM
Architecture

• Apache 2.2 with mod_proxy
• Rails 2.3.4 running on Phusion Passenger
  2.2.5 with REE
• 4 hosts, each running 4 instances (per app)
  Apdex: 1.0, avg. response time 40ms, 130 rpm, db load 0.6%
Servers


• Quad-core Intel Xeon E542, 32 GB RAM
• Fedora 8
• Other mumbojumbo
Architecture
               Front proxy    Front proxy




Application     Application    Application   Application
  server          server         server        server




                Database      memcache
Workflow

• BDD
• Shared behaviours
• Performance testing
• Staging and production environment
BDD

• RSpec
• Cucumber
• (Webrat)
3 slide intro to BDD:
           RSpec
describe Article do
  it_should_behave_like "all objects with userstamps"
  it_should_behave_like "all objects that can be published"
  it_should_behave_like "all objects that have an url"
  it_should_behave_like "all objects that can be searched"
  it_should_behave_like "all objects with related articles"

  # @article and @valid_attributes are assumed to be set up in a
  # before block that the slide does not show.
  it "should not be valid without a name" do
    @article.attributes = @valid_attributes.except(:name)
    @article.should_not be_valid
  end

  it "should not be valid without contents" do
    @article.attributes = @valid_attributes.except(:contents)
    @article.should_not be_valid
  end
end
3 slide intro to BDD:
     Cucumber features
Feature: Articles on the homepage
  As a visitor
  I want to view articles on the homepage
  So that I can see the latest content

  Scenario: 5 most recent articles
    Given there are 8 articles
    When I visit the homepage
    Then I should see the 5 last published articles
3 slide intro to BDD:
     Cucumber steps
Given "there are $num articles" do |num|
  num.to_i.times { create_article }
end

When "I visit the homepage" do
  visit root_path
end

Then "I should see the $num last published articles" do |num|
  Article.last_published(num).each do |article|
    response.should contain(article.title)
  end
end
Shared behaviours

• Tags
• User stamps (created by, updated by)
• Searching
• “Related” articles
• Publication timestamps (on/offline at)
Shared behaviours
module UserStamps
  def self.included(klass)
    klass.instance_eval do
      include InstanceMethods
    end
  end

  module InstanceMethods
    def created_by
      User.find_by_id(creator_id)
    end

    def updated_by
      User.find_by_id(updater_id)
    end
  end
end

class Article < ActiveRecord::Base
  include Shared::UserStamps
  include Shared::Published
  include Shared::Url
  include Shared::Search
  include Shared::RelatedArticles

  # stuff
end
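The `included` hook pattern can be tried outside Rails. This is a minimal sketch, not the site's actual code: the in-memory `User` registry and the `Struct`-based `Article` are hypothetical stand-ins for the ActiveRecord models, and only the hook and the two lookup methods follow the slide.

```ruby
# Hypothetical stand-in for the ActiveRecord User model.
class User
  REGISTRY = {}
  attr_reader :id, :name

  def initialize(id, name)
    @id = id
    @name = name
    REGISTRY[id] = self
  end

  # Like find_by_id: returns nil instead of raising when nothing matches.
  def self.find_by_id(id)
    REGISTRY[id]
  end
end

module Shared
  module UserStamps
    # Ruby calls this hook whenever the module is included in a class.
    def self.included(klass)
      klass.instance_eval do
        include InstanceMethods
      end
    end

    module InstanceMethods
      def created_by
        User.find_by_id(creator_id)
      end

      def updated_by
        User.find_by_id(updater_id)
      end
    end
  end
end

# Hypothetical model: any class exposing creator_id and updater_id will do.
class Article < Struct.new(:creator_id, :updater_id)
  include Shared::UserStamps
end

User.new(1, "bart")
article = Article.new(1, 99)
puts article.created_by.name      # prints "bart"
puts article.updated_by.inspect   # prints "nil" (no user with id 99)
```

Because the lookups go through `find_by_id`, a stamp that points at a deleted user yields `nil` rather than an exception.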
Workflow: performance testing


• ab, httperf, autobench, cURL
• NewRelic RPM
• Safari Web Inspector
• http://railslab.newrelic.com/scaling-rails
Autobench
Challenges


• Content Management System
• Loads and loads and loads of external data
CMS


• Articles, Themas, Specials, Subsites
• Multiple feeds, images, links
• Version control
• Media database
CMS: Articles
          Subsite



Thema      Page     Special



          Article



   Link    Feed     Image
CMS: Version control
Media DB

• Implemented as REST app
• To be used as REST service
• Files, folders, crops
External data
•   RSS feeds

•   EPG data

•   Zelda

•   Babel

•   News/sport/teletekst

•   Twitter

•   Lots of custom XML formats
External data: XML/RSS
• Empty feeds
• Encodings are off (Windows-1252,
  ISO-8859-1, UTF-8)
• “Custom” fields
• Incorrect fields (dates, unescaped HTML)
• Timeouts
• Everything that can go wrong, will go wrong
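One way to cope with feeds whose encoding is off is to normalise every payload to UTF-8 before parsing. This is a sketch under assumptions, not what omroep.nl ran: it relies on the `String` encoding API of Ruby 1.9 and later (the site ran on REE, a Ruby 1.8 build), the helper name `normalize_feed_encoding` is made up, and it ignores whatever encoding the feed declares because, as the slide says, that declaration cannot be trusted.

```ruby
# Returns the raw feed bytes as a valid UTF-8 string.
# Order of attempts: UTF-8 as-is, then Windows-1252, then ISO-8859-1.
def normalize_feed_encoding(raw)
  utf8 = raw.dup.force_encoding("UTF-8")
  return utf8 if utf8.valid_encoding?

  begin
    raw.dup.force_encoding("Windows-1252").encode("UTF-8")
  rescue EncodingError
    # A few bytes are undefined in Windows-1252; ISO-8859-1 maps every byte.
    raw.dup.force_encoding("ISO-8859-1").encode("UTF-8")
  end
end

# "\xE9" is e-acute in Windows-1252 and ISO-8859-1, but invalid as UTF-8.
puts normalize_feed_encoding("caf\xE9".b)
```

Trying Windows-1252 before ISO-8859-1 is a judgment call: it covers typographic quotes in the 0x80-0x9F range, which feeds labelled ISO-8859-1 often contain anyway.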
External data: Twitter
External data: EPG data


   Zelda




                   don’t sue us Nintendo... please? :)
External data

• Empty feeds
• Retrieving the feed while someone is
  updating it
• Required fields that are empty
• DTD?
<!ELEMENT aflevering (
            prid?, tite?, medium?,
            icon?, aankondiging?, inkl?,
            ingl?, infi?, inak?, inds?, inbb?,
            kykw?, orti?, aant?, land?, lcod?,
            psrt?, prem?, inh1?, afle?, atit?,
            inh2?, bron?, prij?, inh3?, mail?,
            webs?, inhk?, gids_tekst?,
            omroepen?, genres?, personen?,
            streams?, fragmenten?, serie?
)>
<!ELEMENT   inkl   (#PCDATA)>
<!ELEMENT   ingl   (#PCDATA)>
<!ELEMENT   intt   (#PCDATA)>
<!ELEMENT   inhh   (#PCDATA)>
<!ELEMENT   omro   (#PCDATA)>
<!ELEMENT   lcod   (#PCDATA)>
<!ELEMENT   herh   (#PCDATA)>
<!ELEMENT   inds   (#PCDATA)>
<!ELEMENT   infi   (#PCDATA)>
<!ELEMENT   inbb   (#PCDATA)>
<!ELEMENT   genr   (#PCDATA)>
<!ELEMENT   kykw   (#PCDATA)>
<!ELEMENT   afle   (#PCDATA)>
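Since the DTD marks every child of `aflevering` as optional, validation cannot catch an empty "required" field; the consumer has to check for itself. A minimal sketch, assuming the record has already been parsed into a Hash keyed by element name. The choice of `prid`, `tite` and `medium` as required fields is an assumption made for illustration.

```ruby
# Fields this consumer treats as required, although the DTD marks them
# optional. The selection is hypothetical.
REQUIRED_FIELDS = %w[prid tite medium].freeze

# Returns the names of required fields that are absent or blank.
def missing_fields(record)
  REQUIRED_FIELDS.select { |name| record[name].to_s.strip.empty? }
end

record = { "prid" => "X1", "tite" => "   " }
p missing_fields(record)   # prints ["tite", "medium"]
```

A record with missing fields can then be skipped and logged instead of breaking the page that renders it.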
Lessons learned

• Cache the crap out of everything
• Rescue everything
• Test everything (frontend and backend)
Caching

• Cache the homepage
• Page cache → Fragment cache
• Don’t cache forms
• Cache as much as possible
Case: article views


• Article is page cached
• Update the number of views in realtime?
Use AJAX!

<% javascript_tag do %>
  <%= remote_function :url => update_views_article_path(@article) %>
<% end %>
Case: banner items
• Fast requests (<10ms)
• ETags (304 Not Modified)
• Static resource → page cache
• Move to front proxy, frees up Rails cluster
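• Rails cluster load: 1100 rpm → 130 rpm
• Average response time: 20 ms → 40 ms
• Average response time going up? Oh nooooes! (The fast requests no longer pull the average down.)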
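The ETag bullet boils down to a comparison between the tag of the current content and the tag the client sends back. A simplified sketch in plain Ruby, not the code that served the banners: the function name is hypothetical, the tag is an MD5 of the body, and real `If-None-Match` handling (lists of tags, weak validators, `*`) is left out.

```ruby
require "digest/md5"

# Returns [status, headers, body] for a cacheable resource.
# if_none_match is the value of the request's If-None-Match header, or nil.
def conditional_response(body, if_none_match)
  etag = %("#{Digest::MD5.hexdigest(body)}")
  headers = { "ETag" => etag }

  if if_none_match == etag
    [304, headers, ""]    # client copy is current: send no body
  else
    [200, headers, body]
  end
end

status, headers, = conditional_response("banner markup", nil)
puts status                                                       # prints 200
puts conditional_response("banner markup", headers["ETag"]).first # prints 304
```

A 304 still costs a request, which is why the slide goes one step further and serves the item as a static page-cached file from the front proxy.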
Caching external data

• Don’t expire cache (preferably)
• Explicitly overwrite
• Update in background (feeeeeeeds)
• memcache FTW!
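The first three bullets can be sketched as a cache that readers only read and that a background job overwrites. This is an illustration under assumptions: `FeedCache` is a hypothetical class and a plain Hash stands in for the memcache client.

```ruby
# Hypothetical in-memory stand-in for a memcache-backed feed cache.
class FeedCache
  def initialize(store = {})
    @store = store
  end

  # Readers only ever read; they never trigger a fetch themselves.
  def read(key)
    @store[key]
  end

  # Called from a background job. The block fetches fresh data.
  # On an error or an empty result the previous value stays in place.
  def refresh(key)
    fresh = yield
    @store[key] = fresh unless fresh.nil? || fresh.empty?
    @store[key]
  rescue StandardError
    @store[key]
  end
end

cache = FeedCache.new
cache.refresh("news") { ["item 1", "item 2"] }
cache.refresh("news") { raise "feed timed out" }   # keeps the old items
cache.refresh("news") { [] }                       # empty feed: keeps them too
p cache.read("news")   # prints ["item 1", "item 2"]
```

Because nothing expires, a broken feed means visitors see slightly old data rather than an empty block or a slow request.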
memcache

• Escape your keys using CGI::escape
• Max keylength is 250
• Max value size is 1MB
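A small key builder can enforce the first two limits in one place. A sketch under assumptions: the `cache_key` helper and the namespace argument are hypothetical, and falling back to a SHA-1 digest for over-long keys is one possible choice, not something the slides prescribe.

```ruby
require "cgi"
require "digest/sha1"

MAX_KEY_LENGTH = 250 # key limit as stated on the slide

# Builds a memcache-safe key from arbitrary input such as a feed URL.
def cache_key(namespace, raw)
  key = "#{namespace}:#{CGI.escape(raw)}"
  return key if key.length <= MAX_KEY_LENGTH

  # Assumption: switch to a fixed-length digest when the escaped key is too long.
  "#{namespace}:sha1:#{Digest::SHA1.hexdigest(raw)}"
end

puts cache_key("feed", "http://example.com/a b")
# prints feed:http%3A%2F%2Fexample.com%2Fa+b
puts cache_key("feed", "x" * 300).length   # prints 50
```

The digest fallback keeps long URLs usable as keys, at the cost of keys that can no longer be read back by eye.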
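Rescuing
require 'open-uri'
require 'rss'
require 'timeout'

def self.get_feed_contents(url)
  content = ""
  open(url) { |s| content = s.read }
  RSS::Parser.parse(content, false).items
rescue Timeout::Error => e
  # Listed first and by name: on Ruby 1.8 a bare rescue does not catch it
  logger.warn "Feed #{url} timed out: #{e.message}"
  []
rescue => e
  logger.warn "Feed #{url} raised error: #{e.message}"
  []
end
          Timeout::Error is an Interrupt on Ruby 1.8, so a bare rescue (StandardError only) misses it...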
Testing
• rcov
• Refactor your tests
• Peer reviews, external audits
• Run specs/features
  (continuously) in parallel
  (your Cucumber is too slooooow!)
Cucumber salad
num_of_processes.times do |count|
  pids << Process.fork do
    setup_database(conn, count)
    Cucumber::Cli::Main.execute(
      ["-f", "progress", "-l", "nl", "-r", "features"] +
      feature_sets[count]
    )
  end
end
     “Regular”    MacBook Pro (4)    Mac Pro (8)

       12:12           4:34             2:12
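The forking loop needs the features divided into one set per process. A minimal sketch of that split and of waiting for the children, assuming a Unix-like system where `Process.fork` is available; `split_into_sets` and `run_in_parallel` are hypothetical helper names, and how the original `feature_sets` were built is not shown on the slides.

```ruby
# Deals the feature files round-robin into n sets of near-equal size.
def split_into_sets(files, n)
  sets = Array.new(n) { [] }
  files.each_with_index { |file, i| sets[i % n] << file }
  sets
end

# Forks one child per set, runs the block in it, and waits for all children.
# The block receives the set and its index and should return true on success.
# Returns true only when every child exited successfully.
def run_in_parallel(sets)
  pids = sets.each_with_index.map do |set, index|
    Process.fork { exit(yield(set, index) ? 0 : 1) }
  end
  pids.map { |pid| Process.wait2(pid).last.success? }.all?
end

p split_into_sets(%w[a.feature b.feature c.feature d.feature e.feature], 2)
# prints [["a.feature", "c.feature", "e.feature"], ["b.feature", "d.feature"]]
```

The index passed to the block is what lets each child pick its own database, as `setup_database(conn, count)` does in the slide; without that, parallel scenarios would trample each other's data.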
Conclusions

• Test
• Optimize
• Monitor
What’s next for us?

• Building a high-performance backend
• Uitzending Gemist statistics API
• 250+ reqs/s at minimum
@questions.any?

TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
Artificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyArtificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : Uncertainty
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CV
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 

Lessons learned while building Omroep.nl

  • 1. Lessons learned while building omroep.nl Bart Zonneveld (@bartzon) Sjoerd Tieleman (@tieleman)
  • 2. Nederlandse Publieke Omroep Dutch Public broadcasting Company AVRO Joodse Omroep NMO Teleac BNN KRO NOS TROS BOS LLiNK NPS VARA EO MAX OHM VPRO HUMAN NCRV RKK ZvK IKON NIO RVU
  • 3. Rails sites • Beetlejuice • Centrale navigatie • Omroep.nl • Nederland 1 • Nederland 3 • Radio 1 • Z@PP • Z@ppelin • Zelda • Various tools
  • 4.
  • 5. Team • 2 coders • 1 designer • 1 editor • 1 project manager 6 months, CMS built from scratch
  • 6. Requirements • Handle 30.000-40.000 pageviews per day • Handle traffic spikes • Flexible, multi user CMS • Loads of external data
  • 8. Some numbers
      +----------------------+-------+-------+---------+---------+-----+-------+
      | Name                 | Lines |   LOC | Classes | Methods | M/C | LOC/M |
      +----------------------+-------+-------+---------+---------+-----+-------+
      | Controllers          |  1864 |  1535 |      41 |     185 |   4 |     6 |
      | Helpers              |   797 |   631 |       1 |      75 |  75 |     6 |
      | Models               |  1303 |  1055 |      40 |     153 |   3 |     4 |
      | Libraries            |   814 |   620 |      15 |      79 |   5 |     5 |
      | Integration tests    |     0 |     0 |       0 |       0 |   0 |     0 |
      | Functional tests     |     0 |     0 |       0 |       0 |   0 |     0 |
      | Unit tests           |     0 |     0 |       0 |       0 |   0 |     0 |
      | Model specs          |  1932 |  1573 |       0 |       9 |   0 |   172 |
      | View specs           |  7322 |  5950 |       0 |     153 |   0 |    36 |
      | Controller specs     |  7292 |  5846 |       0 |     175 |   0 |    31 |
      | Helper specs         |   900 |   676 |       0 |       2 |   0 |   336 |
      | Library specs        |    56 |    45 |       2 |      12 |   6 |     1 |
      +----------------------+-------+-------+---------+---------+-----+-------+
      | Total                | 22280 | 17931 |      99 |     843 |   8 |    19 |
      +----------------------+-------+-------+---------+---------+-----+-------+
      Code LOC: 3841     Test LOC: 14090     Code to Test Ratio: 1:3.7
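The summary figures under the table follow directly from its LOC column. As a quick check, the sketch below re-adds those rows in plain Ruby; the constant and variable names are illustrative and not part of the original output.

```ruby
# LOC per row, copied from the table above.
CODE_LOC = { controllers: 1535, helpers: 631, models: 1055, libraries: 620 }.freeze
TEST_LOC = { model_specs: 1573, view_specs: 5950, controller_specs: 5846,
             helper_specs: 676, library_specs: 45 }.freeze

code_total = CODE_LOC.values.sum
test_total = TEST_LOC.values.sum

# Test lines per line of application code, rounded to one decimal.
ratio = (test_total.to_f / code_total).round(1)

puts "Code LOC: #{code_total}"
puts "Test LOC: #{test_total}"
puts "Code to Test Ratio: 1:#{ratio}"
```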
  • 9. Moar numbers • 410 Cucumber scenarios, 600 step definitions • 2235 RSpec specifications so it must be bug-free, right? ;-)
  • 10. Tools • Ruby on Rails 2.3.4 • RSpec + Webrat + Cucumber • Apache 2.2 • Paperclip • SVN • App monitoring: RPM, Hoptoad • MySQL 5 • Service monitoring: Nagios • Memcache
  • 11. Tools: app monitoring • Hoptoad • New Relic RPM
  • 12. Architecture • Apache 2.2 with mod_proxy • Rails 2.3.4 running on Phusion Passenger 2.2.5 with REE • 4 hosts, each running 4 instances (per app) • Apdex: 1.0, avg. response time 40ms, 130 rpm, db load 0.6%
  • 13. Servers • Quad-core Intel Xeon E542, 32 GB RAM • Fedora 8 • Other mumbojumbo
  • 14. Architecture (diagram) • 2 front proxies • 4 application servers • Database • memcache
  • 15. Workflow • BDD • Shared behaviours • Performance testing • Staging and production environment
  • 17. 3 slide intro to BDD: RSpec
      describe Article do
        it_should_behave_like "all objects with userstamps"
        it_should_behave_like "all objects than can be published"
        it_should_behave_like "all objects that have an url"
        it_should_behave_like "all objects that can be searched"
        it_should_behave_like "all objects with related articles"

        it "should not be valid without a name" do
          @article.attributes = @valid_attributes.except(:name)
          @article.should_not be_valid
        end

        it "should not be valid without contents" do
          @article.attributes = @valid_attributes.except(:contents)
          @article.should_not be_valid
        end
      end
  • 18. 3 slide intro to BDD: Cucumber features
      Feature: Articles on the homepage
        As a visitor
        I want to view articles on the homepage
        So that I can see the latest content

        Scenario: 5 most recent articles
          Given there are 8 articles
          When I visit the homepage
          Then I should see the 5 last published articles
  • 19. 3 slide intro to BDD: Cucumber steps
      Given "there are $num articles" do |num|
        num.to_i.times { create_article }
      end

      When "I visit the homepage" do
        visit root_path
      end

      Then "I should see the $num last published articles" do |num|
        Article.last_published(num).each do |article|
          response.should contain(article.title)
        end
      end
  • 20. Shared behaviours • Tags • User stamps (created by, updated by) • Searching • “Related” articles • Publication timestamps (on/offline at)
  • 21. Shared behaviours
      module UserStamps
        def self.included(klass)
          klass.instance_eval do
            include InstanceMethods
          end
        end

        module InstanceMethods
          def created_by
            User.find_by_id(creator_id)
          end

          def updated_by
            User.find_by_id(updater_id)
          end
        end
      end

      class Article < ActiveRecord::Base
        include Shared::UserStamps
        include Shared::Published
        include Shared::Url
        include Shared::Search
        include Shared::RelatedArticles
        # stuff
      end
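To try the `self.included` hook from this slide without a Rails app, here is a minimal sketch in plain Ruby. The in-memory `User` class and the `attr_accessor` columns are assumptions standing in for the ActiveRecord models; only the mixin pattern itself comes from the slide.

```ruby
# Hypothetical in-memory stand-in for the ActiveRecord User model.
class User
  attr_reader :id, :name

  @registry = {}

  class << self
    # Stores a user so find_by_id can return it later.
    def create(id, name)
      @registry[id] = new(id, name)
    end

    # Returns nil for unknown ids, like ActiveRecord's find_by_id.
    def find_by_id(id)
      @registry[id]
    end
  end

  def initialize(id, name)
    @id = id
    @name = name
  end
end

module UserStamps
  # Runs once for every class that includes UserStamps.
  def self.included(klass)
    klass.include(InstanceMethods)
  end

  module InstanceMethods
    def created_by
      User.find_by_id(creator_id)
    end

    def updated_by
      User.find_by_id(updater_id)
    end
  end
end

class Article
  include UserStamps
  # In the real app these would be database columns.
  attr_accessor :creator_id, :updater_id
end

User.create(1, "bart")
article = Article.new
article.creator_id = 1
puts article.created_by.name
```

Keeping each behaviour in its own module is what lets the specs reuse one shared example group per module across every model that includes it.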
  • 22. Workflow: performance testing • ab, httperf, autobench, cURL • NewRelic RPM • Safari Web Inspector • http://railslab.newrelic.com/scaling-rails
  • 24. Challenges • Content Management System • Loads and loads and loads of external data
  • 25. CMS • Articles, Themas, Specials, Subsites • Multiple feeds, images, links • Version control • Media database
  • 26. CMS: Articles (diagram) • Subsite • Thema • Page • Special • Article • Link • Feed • Image
  • 28. Media DB • Implemented as REST app • To be used as REST service • Files, folders, crops
  • 29. External data • RSS feeds • EPG data • Zelda • Babel • News/sport/teletekst • Twitter • Lots of custom XML formats
  • 30. External data: XML/RSS • Empty feeds • Encodings are off (Windows-1252, ISO-8859-1, UTF-8) • “Custom” fields • Incorrect fields (dates, unescaped HTML) • Timeouts • Everything that can go wrong, will go wrong
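One way to cope with the mixed encodings listed on this slide is to normalise every feed body to UTF-8 before parsing. The sketch below is an assumption about how that could look on a current Ruby (1.9 or later, where strings carry an encoding); the `to_utf8` helper is hypothetical, and treating every non-UTF-8 body as Windows-1252 is a simplification.

```ruby
# Returns a UTF-8 copy of raw feed bytes.
# Bytes that are already valid UTF-8 are kept as they are; anything else
# is interpreted as Windows-1252, which covers most ISO-8859-1 text too.
def to_utf8(raw)
  utf8 = raw.dup.force_encoding(Encoding::UTF_8)
  return utf8 if utf8.valid_encoding?

  raw.dup
     .force_encoding(Encoding::Windows_1252)
     .encode(Encoding::UTF_8, invalid: :replace, undef: :replace)
end

latin = "caf\xE9".b   # "café" as sent by a Windows-1252 feed
puts to_utf8(latin)
```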
  • 32. External data: EPG data Zelda don’t sue us Nintendo... please? :)
  • 33. External data • Empty feeds • Retrieving the feed while someone is updating it • Required fields that are empty • DTD?
  • 34. <!ELEMENT aflevering ( prid?, tite?, medium?, icon?, aankondiging?, inkl?, ingl?, infi?, inak?, inds?, inbb?, kykw?, orti?, aant?, land?, lcod?, psrt?, prem?, inh1?, afle?, atit?, inh2?, bron?, prij?, inh3?, mail?, webs?, inhk?, gids_tekst?, omroepen?, genres?, personen?, streams?, fragmenten?, serie? )>
  • 35. <!ELEMENT inkl (#PCDATA)> <!ELEMENT ingl (#PCDATA)> <!ELEMENT intt (#PCDATA)> <!ELEMENT inhh (#PCDATA)> <!ELEMENT omro (#PCDATA)> <!ELEMENT lcod (#PCDATA)> <!ELEMENT herh (#PCDATA)> <!ELEMENT inds (#PCDATA)> <!ELEMENT infi (#PCDATA)> <!ELEMENT inbb (#PCDATA)> <!ELEMENT genr (#PCDATA)> <!ELEMENT kykw (#PCDATA)> <!ELEMENT afle (#PCDATA)>
  • 36. Lessons learned • Cache the crap out of everything • Rescue everything • Test everything (frontend and backend)
  • 37. Caching • Cache the homepage • Page cache → Fragment cache • Don’t cache forms • Cache as much as possible
  • 38. Case: article views • Article is page cached • Update the number of views in realtime?
  • 39. Use AJAX!
      <% javascript_tag do %>
        <%= remote_function :url => update_views_article_path(@article) %>
      <% end %>
  • 41. Case: banner items • Fast requests (<10ms) • ETags (304 Not Modified) • Static resource → page cache • Move to front proxy, frees up Rails cluster • 1100rpm → 130rpm • 20ms → 40ms • Average response time going up? Oh nooooes!
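The ETag step in this case boils down to a conditional GET: hash the body, compare it with the `If-None-Match` value the client sent, and answer 304 with an empty body on a match. A minimal sketch in plain Ruby, assuming a Rack-style `[status, headers, body]` triple; the `conditional_response` helper is hypothetical and not taken from the omroep.nl code.

```ruby
require "digest/md5"

# Builds a response for a body, honouring the client's If-None-Match value.
def conditional_response(body, if_none_match = nil)
  etag = %("#{Digest::MD5.hexdigest(body)}")

  if if_none_match == etag
    # The client already has this version: no body needed.
    [304, { "ETag" => etag }, ""]
  else
    [200, { "ETag" => etag }, body]
  end
end

first = conditional_response("<div>banner</div>")
second = conditional_response("<div>banner</div>", first[1]["ETag"])
puts first[0]
puts second[0]
```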
  • 42. Caching external data • Don’t expire cache (preferably) • Explicitly overwrite • Update in background (feeeeeeeds) • memcache FTW!
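The "don't expire, explicitly overwrite" rule can be sketched as a cache that only replaces an entry when a refresh actually produced data. The `FeedCache` class below is hypothetical and uses a Hash where the real setup would talk to memcache; a background job would be the caller of `refresh`.

```ruby
# Keeps the last good copy of each feed; entries never expire on their own.
class FeedCache
  def initialize(store = {})
    @store = store # assumption: a Hash stands in for memcache here
  end

  # Readers only ever see cached data, never a live fetch.
  def read(key)
    @store.fetch(key, [])
  end

  # Runs the block to fetch fresh items and overwrites the entry only
  # when the fetch succeeded and returned something.
  def refresh(key)
    fresh = yield
    @store[key] = fresh unless fresh.nil? || fresh.empty?
    read(key)
  rescue StandardError
    read(key) # keep serving the previous copy
  end
end

cache = FeedCache.new
cache.refresh("news") { ["item 1", "item 2"] }
cache.refresh("news") { raise IOError, "feed is down" }
puts cache.read("news").inspect
```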
  • 43. memcache • Escape your keys using CGI::escape • Max keylength is 250 • Max value size is 1MB
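A small helper can apply both key rules from this slide in one place. This is a sketch: `memcache_key` is a hypothetical name, and shortening over-long keys with a SHA1 digest is an assumption added here, not something the slide prescribes.

```ruby
require "cgi"
require "digest/sha1"

MAX_KEY_LENGTH = 250 # memcache key limit mentioned on the slide

# Joins the parts into one key, escaping each with CGI.escape so the key
# contains no spaces or control characters.
def memcache_key(*parts)
  key = parts.map { |part| CGI.escape(part.to_s) }.join(":")
  return key if key.length <= MAX_KEY_LENGTH

  # Too long: keep a readable prefix and append a digest of the full key.
  digest = Digest::SHA1.hexdigest(key)
  "#{key[0, MAX_KEY_LENGTH - digest.length - 1]}:#{digest}"
end

puts memcache_key("feed", "http://example.com/rss.xml?q=a b")
```

The 1MB value limit still has to be handled separately, for example by storing only the parsed items the page needs rather than the raw feed.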
  • 44. Rescuing
      def self.get_feed_contents(url)
        content = ""
        open(url) { |s| content = s.read }
        RSS::Parser.parse(content, false).items
      rescue => e
        logger.warn "Feed #{url} raised error: #{e.message}"
        []
      rescue Timeout::Error => e
        logger.warn "Feed #{url} timed out: #{e.message}"
        []
      end
      Timeout::Error is an interrupt (on Ruby 1.8 it descends from Interrupt, so a bare rescue does not catch it)...
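On Ruby 1.9 and later, `Timeout::Error` descends from `RuntimeError`, so a bare `rescue` does catch it and the more specific clause has to come first to be reached. The sketch below shows that ordering. The `fetch_with_fallback` helper and its injectable `fetcher` are assumptions made so the example runs without network access.

```ruby
require "timeout"

# Calls fetcher with the URL and returns its result, or the fallback when
# the call times out or raises any other StandardError.
def fetch_with_fallback(url, fetcher:, seconds: 5, fallback: [])
  Timeout.timeout(seconds) { fetcher.call(url) }
rescue Timeout::Error => e
  # Must precede StandardError: Timeout::Error is one on modern Ruby.
  warn "Feed #{url} timed out: #{e.message}"
  fallback
rescue StandardError => e
  warn "Feed #{url} raised error: #{e.message}"
  fallback
end

slow_feed = ->(_url) { sleep 1; ["never returned"] }
puts fetch_with_fallback("http://example.com/rss", fetcher: slow_feed, seconds: 0.05).inspect
```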
  • 45. Testing • rcov • Refactor your tests • Peer reviews, external audits • Run specs/features (continuously) in parallel (your Cucumber is too slooooow!)
  • 46. Cucumber salad
      num_of_processes.times do |count|
        pids << Process.fork do
          setup_database(conn, count)
          Cucumber::Cli::Main.execute(
            ["-f", "progress", "-l", "nl", "-r", "features"] + feature_sets[count]
          )
        end
      end
      Run times: “Regular” 12:12 • MacBook Pro (4) 4:34 • Mac Pro (8) 2:12
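The slide uses `feature_sets[count]` without showing how the sets are built. One simple option is a round-robin split of the feature files over the number of processes; the `partition_round_robin` helper and the file names below are hypothetical.

```ruby
# Deals items into the given number of buckets, one at a time, so every
# process gets a roughly equal share of the feature files.
def partition_round_robin(items, buckets)
  sets = Array.new(buckets) { [] }
  items.each_with_index { |item, index| sets[index % buckets] << item }
  sets
end

features = %w[articles.feature feeds.feature media.feature search.feature users.feature]
feature_sets = partition_round_robin(features, 2)
puts feature_sets.inspect
```

A split by file count ignores how long each feature takes, so sorting by previous run time before dealing them out would balance the processes better.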
  • 48. What’s next for us? • Building a high-performance backend • Uitzending Gemist statistics API • 250+ reqs/s at minimum