James Cox
                  Chief squirrel, smokeclouds
                   james@smokeclouds.com




Scalability without
going nuts

what this is ( just an overview )

This is an overview of some of the areas I've focused on when investigating scalability.

There are no easy answers, but hopefully these ideas will give you some directions for your own apps.

From something small comes something big (I just made that up). We're going to have fun making our apps work when there is more than one user, by looking at code, ops and more.

In particular, we'll wade through some language improvements and tips, some infrastructure planning tips, ways to make MySQL better, and so on.

We'll also touch on proxy/app servers and fileshares, with some questions at the end if we get that far.

I hope you're all comfortable and your mobile phones are off, as I know we're all busy people. Let's begin.
The Language Performance Race

Rails isn't fastest ( assembler is )

Rails isn't the fastest, and that's OK. Life is about tradeoff and compromise.

We pick Rails because of its ease and efficiency to code, and we can refactor, scale and improve later. Or just buy more servers.

Refer to recent rants on Ruby perf etc...

Planning Trumps All ( even donald )

A bit of planning and process mapping will do more for your ability to scale than any later improvements, usually ruling out a rewrite if you've got the core of the project in the right direction.

Analyze ( don't guess )

Once you have your planning arranged, don't guess as to where performance is struggling: actually try to get some numbers to benchmark against. To learn more, go watch the excellent peepcast httperf tutorial, which I almost played instead of doing this talk!
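
As a minimal sketch of "getting some numbers", Ruby's standard Benchmark module can time a block of code. The `slow_lookup` method and the `average_seconds` helper are hypothetical stand-ins, not part of the talk; httperf measures the whole HTTP stack, whereas this only times one piece of Ruby.

```ruby
require 'benchmark'

# Hypothetical stand-in for a suspect piece of application code.
def slow_lookup
  (1..50_000).map { |i| i * i }.sum
end

# Run the given block several times and return the average wall-clock seconds.
def average_seconds(runs = 5)
  total = Benchmark.realtime { runs.times { yield } }
  total / runs
end

baseline = average_seconds { slow_lookup }
puts format('slow_lookup: %.6f s per call', baseline)
```

Record the baseline before changing anything, then re-run after each change so you compare against a number rather than a hunch.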

Speed Perceived ( the easiest way )

There is always the "glamour" of making a high performance app which can handle all the requests you can possibly imagine.

Not everyone can be a LiveJournal and actually make their servers push 98 MB/s on their 100MB network cards.

Find the areas of the app which the userbase perceives to be the slowest: it may be that you can make your app appear 'faster' by improving the UI/UX.

Work on these areas and then radiate outwards: it's easier to refactor in chunks than as a whole.

(Tangent: SOA architecture is not a bad idea....)

Focus on your app ( it's usually cheaper )

So how can we make our app faster?

There are a number of techniques we can employ to make our apps better.

Now to discuss some of them.....

Improving ActiveRecord: :select, :limit, :offset ( take what you need )

- You don't always need all that data. This problem hides itself when you are first building, but as you add data, no limit/offset means you often end up grabbing too many rows.
- This is particularly important when using TCP connections to your database.
- Oftentimes an app is waiting for the data to transfer, so limit it to just the stuff you need.
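
A small sketch of how the options fit together, assuming a page-number style of pagination. `page_options` is a hypothetical helper (not a Rails method), and the `Article` model and its columns in the comment are illustrative only.

```ruby
# Hypothetical helper: turn a page number into :limit/:offset finder options.
def page_options(page, per_page = 20)
  page = [page.to_i, 1].max # treat nil, 0 or negative pages as page 1
  { :limit => per_page, :offset => (page - 1) * per_page }
end

# With a finder of this era the options might be used like so
# (Article and its columns are assumptions for illustration):
#   Article.find(:all, { :select => 'id, title' }.merge(page_options(3)))

puts page_options(3).inspect
```

Only the columns named in :select and the rows inside the limit/offset window cross the wire, which is the point of the slide.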

Improving ActiveRecord: :include => :association ( keep it eager )

OK, so eager loading changes your query count from N+1 (where N is the number of rows multiplied by associations) to one query.

Under the hood, this works by causing a LEFT OUTER JOIN, the SQL for joining the tables together. Outer joins work by including rows even when one half of the join is NULL.

High query counts are bad because they cause queueing for read/write on the table.
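
To make the N+1 count concrete, here is a toy simulation in plain Ruby. `FakeDb` is a hypothetical in-memory stand-in that only counts "queries"; it is not ActiveRecord. In a Rails app of this era the eager version would be something like `Post.find(:all, :include => :author)`, with `Post` and `author` as assumed names.

```ruby
# A toy in-memory "database" that counts the queries it receives.
class FakeDb
  attr_reader :query_count

  def initialize(posts, authors)
    @posts = posts
    @authors = authors
    @query_count = 0
  end

  # One query: SELECT * FROM posts
  def posts
    @query_count += 1
    @posts
  end

  # One query per call: SELECT * FROM authors WHERE id = ?
  def author(id)
    @query_count += 1
    @authors[id]
  end

  # One query: posts LEFT OUTER JOIN authors
  def posts_with_authors
    @query_count += 1
    @posts.map { |post| post.merge(:author => @authors[post[:author_id]]) }
  end
end

AUTHORS = { 1 => 'ann', 2 => 'bob' }
POSTS = [
  { :title => 'a', :author_id => 1 },
  { :title => 'b', :author_id => 2 },
  { :title => 'c', :author_id => 1 }
]

lazy = FakeDb.new(POSTS, AUTHORS)
lazy_names = lazy.posts.map { |post| lazy.author(post[:author_id]) }

eager = FakeDb.new(POSTS, AUTHORS)
eager_names = eager.posts_with_authors.map { |post| post[:author] }

puts "lazy: #{lazy.query_count} queries, eager: #{eager.query_count} query"
```

Both versions return the same names; only the number of round trips differs, and that gap grows with every extra row.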

Improving ActiveRecord: Model < CachedModel ( cache first, ask questions later )

So you've limited your query to the least amount of data necessary, or you're just looking up a single row. What next?

Cache your data in a fast retrieval store such as memcached. CachedModel is a nice ActiveRecord extension for this (even if it is a bit hairy).

N.B. this only works with simple ID based lookups. For anything complex you need to use Cache.set and Cache.get.
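
The get/set pattern for anything beyond an ID lookup can be sketched as a read-through cache. Everything here is an assumption for illustration: `TinyCache` is a Hash-backed stand-in for a memcached client, and `load_user_from_db` stands in for a real query.

```ruby
# Hash-backed stand-in for a memcached client (hypothetical; a real client
# talks to a memcached server over the network).
class TinyCache
  def initialize
    @store = {}
  end

  def get(key)
    @store[key]
  end

  def set(key, value)
    @store[key] = value
  end
end

CACHE = TinyCache.new
DB_CALLS = Hash.new(0)

# Hypothetical slow lookup standing in for a database query.
def load_user_from_db(id)
  DB_CALLS[:user] += 1
  { :id => id, :name => "user#{id}" }
end

# Read-through lookup: try the cache, fall back to the database, then store.
def find_user(id)
  key = "user:#{id}"
  cached = CACHE.get(key)
  return cached if cached

  user = load_user_from_db(id)
  CACHE.set(key, user)
  user
end

find_user(42)
find_user(42)
puts "database calls: #{DB_CALLS[:user]}"
```

The second lookup never reaches the database. A real setup also needs an expiry or delete on update, which this sketch leaves out.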

Improving ActiveRecord: acts_as_cached ( built from experience )

A better alternative to CachedModel, but you have to add this as a method to a Model.

This is a bit more structured than CachedModel. Built by CNET's Chow/Chowhound team.

Improving ActiveRecord: cache_fu ( in incubation )

A better alternative to CachedModel, but you have to add this as a method to a Model.

This is a bit more structured than CachedModel. Built by CNET's Chow/Chowhound team.

Improving ActiveRecord: @var ||= Model.find(...) ( keep your code dry )

Ever do a lookup (current_user, current_page, or some other check) that happens more than once in a request?

The ||= operator says: use the instance variable if it is already set, otherwise define it via the query.
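
A minimal sketch of the idiom in plain Ruby. `RequestContext` and its private `find_user` are hypothetical; `find_user` stands in for `Model.find(...)` and counts how often it runs.

```ruby
# Hypothetical controller-like class; find_user stands in for Model.find.
class RequestContext
  attr_reader :lookups

  def initialize(user_id)
    @user_id = user_id
    @lookups = 0
  end

  # The first call runs the lookup; later calls reuse @current_user.
  def current_user
    @current_user ||= find_user(@user_id)
  end

  private

  def find_user(id)
    @lookups += 1
    { :id => id }
  end
end

context = RequestContext.new(7)
context.current_user
context.current_user
puts context.lookups
```

One caveat: ||= re-runs the right-hand side whenever the stored value is nil or false, so a lookup that legitimately finds nothing will be repeated on every call.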

Improving ActiveRecord: @@modulo ||= (52 % 100) ( run once, save forever )

@@ is a class variable: a quick way to store a value for the lifetime of the app (that is, of each app server process)...
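
The same idea as a runnable sketch. The `Pricing` class, `tax_rate` and the counter are hypothetical names; the `52 % 100` expression is the one from the slide and stands in for any expensive calculation.

```ruby
class Pricing
  @@rate_computations = 0

  # Hypothetical expensive calculation, run at most once per process.
  def self.tax_rate
    @@tax_rate ||= begin
      @@rate_computations += 1
      52 % 100
    end
  end

  def self.rate_computations
    @@rate_computations
  end
end

Pricing.tax_rate
Pricing.tax_rate
puts Pricing.rate_computations
```

Because the value lives in the class, it survives across requests within one process, but each app server process computes its own copy.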

Improving ActionView: template optimizer ( non-lazy views )

If you use semantic views (Markaby, Builder) or lots of helpers, you spend way too much time parsing the file just to get some HTML in the end...

link_to, image_tag, form_tags: all helpers for HTML functions which are, honestly, for people who've gotten bored writing HTML.

During each request the view rhtml has to be parsed and delivered, and this is expensive. It's so expensive to do this parsing that in other languages (e.g. PHP) all optimizers focus on serving up byte-compiled scripts. This goes back to our first comment that assembler is faster.

So get your views back to the 'compiled' form and ditch those helpers early by optimizing your templates.

This should bring down the 'Render' part of the query log.

Improving ActionView: Publish Once ( caching always wins )

You're going to have gotten your page load time to a somewhat optimal level by now, by improving your database queries and then pre-compiling your templates.

Now consider if you can cache your pages.

Is this a highly trafficked content website? (Caching is a must.) Can you get away with profile etc. pages being cached till updated? (Social networking site.)

Improving ActionView: caches_page: bad ( nightmare to cleanup )

caches_page is the trick used to simply write out the entire page to disk... It can be tricky to keep up to date, and it is also hard work for a slow disk.

This also falls down if you have a loose URL schema: a site I've hacked on had about 500MB of content, but caches_page had generated 30GB of content. Why? Spiders will pervert your URL schema and cause it to generate waaaay too much content.
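
One way to picture the URL problem: two addresses that differ only in junk parameters produce two cache entries for the same page. The sketch below is an assumption about how you might tighten things up, not something caches_page does for you; `ALLOWED_PARAMS` and `cache_key_for` are hypothetical names.

```ruby
require 'uri'

# Hypothetical whitelist of query parameters that actually change the page.
ALLOWED_PARAMS = %w[page].freeze

# Reduce a URL to a canonical cache key: drop unknown parameters and
# trailing slashes so spider-mangled variants collapse into one entry.
def cache_key_for(url)
  uri = URI.parse(url)
  pairs = URI.decode_www_form(uri.query || '')
  kept = pairs.select { |name, _value| ALLOWED_PARAMS.include?(name) }.sort
  path = uri.path.sub(%r{/+\z}, '')
  path = '/' if path.empty?
  kept.empty? ? path : "#{path}?#{URI.encode_www_form(kept)}"
end

puts cache_key_for('http://example.com/articles/?utm_source=x&page=2')
puts cache_key_for('http://example.com/articles?sessionid=abc123')
```

With a rule like this, every variant a crawler invents for the same article maps to one cached file instead of thousands.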

Improving ActionView:

    <% cache(:action => 'feature', :part => 'most_read') do %>
      <%= render :partial => 'article/most_read' %>
    <% end %>

Drop a fragment cache into your view and save repetitive tasks.

It doesn't yet work with robot-coop's memcache-client as a fast store for fragments, but there is a memcache backed fragment store gem, e.g. extended fragment cache.
Improving Sanity:

                                       Follow Edge
                                          ( DHH Breaks Stuff )




Tuning Up

Avoid Shared Hosting ( there's only so much to go around )

When I was living at my family home, my brothers always used to share my stuff: clothes, shower gel, aftershave, you name it.

The same is true for server resources: everyone's gotta share.

Not all users play nice. That crazy crawler on your box is taking up all the RAM and the spammer is getting you blacklisted.

There are too many variables you can't control. VPS software is pretty harsh about setting process limits to save the box as a whole.

Underconfigured software: it is all packaged to work for everyone. Low performance: designed to encourage upgrades.

New Players ( always one )

SOME VPS providers are getting it right (Engine Yard, Rails Machine) with high-performance focused servers.

They expect trusted users and won't cater for the low-end user.

Expensive to buy into, low availability, but often a worthwhile investment.

Multiple Servers? ( work them hard )

One server or more?

It's great if you have the infrastructure.... but do you know how to split them up?

Setup Hot ( universe is infinite )

There's also performance in productivity: it makes sense to mirror setups on each machine for hot-backup as well as for predictability.

Capistrano will help you with this.

8 Server Gem

    Proxy/Web Static (2)
    Application Servers (4)
    Database Layer (2)

It's great if you have the infrastructure.... but do you know how to split them up?

Think of the shape of a ruby. The top is a bit of a plateau, and that's where you put static and proxy servers. You'll want to load balance these for high availability, but generally these scale very well as they don't do much but route traffic and serve files.

The widest part: those are your application servers, and you can grow these out to as many as you can imagine. This is your workhorse layer, where everything interesting happens. Be careful you don't have too many of these for the proxy servers: if there are too many choices for each proxy, some of these can sit idle.

The bottom, hidden part is the best bit: the database layer. This is a somewhat sacred layer: not many servers can play this part at once. Ensure you put your best machines at this level. You're going to want to see high RAM, good I/O throughput, lots of CPU power and plentiful disk space.

Playing Well Together ( there is only one sandpit )

So you've gotten your servers tagged up. How do you assign them tasks?

With one of our clients, we had a situation where a mega busy ad-server and a busy CMS were sharing the same database. It made sense to break them apart onto two servers: the query stats made sense.

... but we could put the admin and the front end app and proxy servers on the same machines.

Why? Front end and admin work well together. Databases are heavy read/write, so two busy databases will fight/queue for file system access.


MySQL Tuning ( feed the beast )

OK, let's cover some tips for getting MySQL to play nice.

Why MySQL over others? Mostly for business reasons rather than tech: it has a nice pathway to move on to a fully supported contract when you need it.

MySQL is also on the cusp of launching a really awesome NDB cluster. This is basically a high availability memory store database which retains integrity via the standard server.

    mysql> \s

    mysql Ver 14.7 Distrib 4.1.19, for pc-linux-gnu (i686) using readline 4.3

    Uptime: 10 hours 11 min 47 sec

    Threads: 3 Questions: 10,171,505 Slow queries: 334 Opens: 224 Flush tables: 1 Open tables: 106 Queries per second avg: 277.100

This is a single machine: dual 2.4GHz Xeon processors, hyperthreaded, 2GB RAM, Linux.

Yes, it is possible to get some really high performance MySQL going. You just need to get the settings right, and this is (mostly) trial and error.

I had over a billion queries on an uptime of 60 days, but some 'technician' at the datacenter rebooted the wrong box. So I can't show that off. Shame!
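
As a quick sanity check on the status output, the queries-per-second average is just Questions divided by uptime in seconds. The figures below are copied from the output above; nothing else is assumed.

```ruby
# Figures copied from the status output above.
questions = 10_171_505
uptime_seconds = (10 * 3600) + (11 * 60) + 47

queries_per_second = questions.to_f / uptime_seconds
puts format('%d s uptime, %.3f queries/sec', uptime_seconds, queries_per_second)
```

The result agrees with the 277.100 average that MySQL reports, so the headline number is a mean over the whole uptime, not a peak.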

    # query cache considered harmful
    query_cache_size=0

    # key_buffer_size is the size of the buffer used for index blocks.
    key_buffer_size=100M

    # The maximum size of one packet.
    max_allowed_packet=1M

    # the length of time (in seconds) that we want to log against.
    #long-query-time=3
    log-slow-queries=/var/log/mysql_slow_queries

Some key variables I always have set...

Query cache: it is not always as useful as it seems. It's OK for truly unoptimized, badly indexed stuff, and not so good when you need to manage the stack. Think of a logging table or a user table in a social network: when the data changes more quickly than the time it takes to create and query the cache, you're in trouble.

It was also quickly written to make MySQL 4 less slow in response to a customer request.

Buffer size: set it to as much spare RAM as you have. This is the amount of memory it'll allocate to fit in the buffer. If it has to keep allocating, then it'll do the sort in chunks, which takes FOREVER.

Packet size: the message buffer is initialised to net_buffer_length bytes, but can grow up to max_allowed_packet bytes when needed. Good if you're passing around large objects such as images, articles, and so on. Set it high and forget about it (as long as your network can cope).

ALWAYS log slow queries, and check the log regularly. This is your first port of call for optimizing your DB!!!

    # if you use network (tcp) based connections
    wait_timeout=90
    net_write_timeout=180
    net_read_timeout=60
    max_connections=500

    mysql> SHOW FULL PROCESSLIST;    (for more info)

If your DB server is different to your app server, it's important to set these. Oftentimes I've seen servers where appservers are queuing due to long laggy timeouts and no available connections.

It's OK to ditch AR ( DHH won't get upset )

Sometimes it's just simpler to drop out and craft a very focused query, use a stored procedure or function, MySQL variables.... force an index.

Just because you can't do it in a #find doesn't mean you shouldn't do it (i.e. don't sacrifice ultimate performance for manageability every time).

A good example, and not easy using standard AR: INSERT DELAYED is great for when you don't need to know the id of the row inserted. Good for things like logs, stats etc.
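
A rough sketch of dropping down to hand-written SQL for a log row. `quote_value` and `insert_delayed_sql` are hypothetical helpers with deliberately naive quoting, and the `hits` table is made up; in a Rails app you would more likely lean on the connection's own quoting and pass the string to something like `ActiveRecord::Base.connection.execute`.

```ruby
# Hypothetical helper with naive quoting, for illustration only.
def quote_value(value)
  case value
  when Integer then value.to_s
  when nil     then 'NULL'
  else "'" + value.to_s.gsub('\\') { '\\\\' }.gsub("'") { "''" } + "'"
  end
end

# Build an INSERT DELAYED statement from a table name and a hash of columns.
def insert_delayed_sql(table, attrs)
  columns = attrs.keys.map(&:to_s).join(', ')
  values  = attrs.values.map { |value| quote_value(value) }.join(', ')
  "INSERT DELAYED INTO #{table} (#{columns}) VALUES (#{values})"
end

puts insert_delayed_sql('hits', :page_id => 7, :referrer => "o'reilly")
```

The server queues the row and returns immediately, which is why this suits logs and stats, where nothing downstream needs the new id.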


Proxy > App ( warm up the pack, the engine's running )

Best advice right now is to use nginx as a front end to a mongrel cluster (or two).

It's very fast and scalable. nginx is lightweight, and can handle upstream clusters with ease, as well as use fast onboard PCRE style regex for handling different paths based on their needs.

Mongrel, while not being the fastest in the pack, lets you scale out easily. Plus Zed is pretty clever, and he'll fix stuff quickly.

Why use them? Lots of these 'new' HTTP servers are more focused towards a smaller goalset: they are designed to achieve one or two things. Apache HTTPD lets you embed almost any module imaginable in the chainset. It's clear who's going to be faster.

Event Driven? ( don't presume your traffic )

You can use swiftiply and evented mongrel to move away from the high cost of threads. This is useful because Rails sits in one big loop for each request, so tying up expensive threads waiting for your app to get done is not necessarily efficient. Perhaps try running it in an event loop.

I haven't tried this yet in any kind of real-world example, but I'm really keen to see if it can scale (and stand up).

Req/sec (mean), stats courtesy of http://blog.kovyrin.net/

    nginx             187
    litespeed         207
    lighttpd(fcgi)    220
    apache(fcgi)      234

Clear alternatives if you aren't scaling past one appserver. These numbers are sort of indicative.

Litespeed (a paid product) has some nice numbers and an apparently easy-to-use interface: a live tool for adding new lsapis on the fly.

lighttpd + apache: yes, straight FastCGI is good, but you can't scale past four FCGI processes. Mongrel can.

KeepAlive ( no point if you're dead )

KeepAlive almost never works. 99% of the time, you're going to benefit from just making your appserver/webserver ignore it. Most browsers now work around this to help improve perceived performance.

You can get the same kind of benefit by parallelizing your asset requests, i.e. randomizing between server1/server2 etc.

Edge Rails supports this natively.
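
A sketch of spreading asset requests over several hostnames. The hostnames and the `asset_url` helper are hypothetical. Instead of picking at random, this variation derives the host from a checksum of the path, so a given asset always comes from the same host and stays cacheable in the browser.

```ruby
require 'zlib'

# Hypothetical asset hosts, standing in for server1/server2.
ASSET_HOSTS = %w[assets1.example.com assets2.example.com].freeze

# Pick a host from the path's checksum so a given asset always maps to the
# same host (a stable variation on picking at random).
def asset_url(path)
  host = ASSET_HOSTS[Zlib.crc32(path) % ASSET_HOSTS.size]
  "http://#{host}#{path}"
end

puts asset_url('/images/logo.png')
puts asset_url('/stylesheets/site.css')
```

Browsers limit parallel connections per hostname, so splitting assets across hostnames lets more of them download at once.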

Hostname Lookup ( do not do this. ever. )

Anything that interferes with the business of serving your webpage to the client is going to hurt your performance.

Turn off hostname lookup, excessive logs, unused modules: anything you really, really don't need.

Make sure your apps are compiled to perform the best with your setup (except for MySQL, where you should always use their compiled versions).

Do you use stats packages? Make sure the JS calls are right before the closing </body> tag. You may get lucky and browsers will deal with complicated stuff like styles and so on, or render the page to the screen whilst waiting, but these calls typically block and the browser can't do much till they return.

So be sure your stats package can handle your traffic before you stick it up there. (Hint: self-installable stuff like Mint can't handle millions of hits per day without lots of hardware to support it.)

Really bad stats? Perhaps use an async XMLHttpRequest to fire it, an IFrame or the onload handler....

NFS and Beyond ( sharing is good )

Are you pre-caching on every server? Then use a shared file store!

It's also easier to expire one store than many.

Be warned: NFS traditionally hasn't been known to scale as well as it could. More recent versions are more performant.

There are some NFS options you can turn off (you don't always need to write, for example), and staying in sync is not always important: a small share you can just remount if it gets crazy.

Write over NFS ( be super efficient )

Zed pointed out this really brain-dead simple efficiency. If you use NFS, use it to write to your asset servers. Disk is cheap but the network tear down / start up is expensive. Don't saturate your net card just passing data around again and again.

Always look for the simplest path.

MogileFS, NFS Clusters ( brainy sharing )

If you're struggling under the load of lots of static assets (think YouTube or Flickr) and you can't quite afford a network attached storage device with a petabyte of disk space, consider using up the many multi-gigabyte disks you have in your servers!

Set up NFS clusters (tricky but not impossible), where you can create a pseudo RAID over machines via software. Google for it.

Or use MogileFS and its HTTP DAV style API for grabbing your data chunks. RobotCOOP have a working library.

Tuning Recap ( were you listening? )

1. Check for bottlenecks. Focus on perceived areas of slowness.
2. Improve by making users happy.
3. Look at your layout: are your servers fighting for CPU/RAM time?
4. Are you on a shared host and being kept in strict limits?
5. Is your code optimal, especially templates?
6. Can you get more servers?
7. Tuning your apps: is the MySQL processlist showing lots of waiting queries?
8. Are you running the most optimal HTTP setup?
9. Is your cache causing you problems on the disk?
10. Attend one of our scalability talks, starting in May. Ask the skillsmatter team here for more info.
11. Hire me.... or someone like me :)

Any Questions?




Resources -
   talk: smokeclouds.com/scalability.pdf
    me: smokeclouds.com :: imaj.es
  blogs: brainspl.at :: blog.kovyrin.net : caboo.se
   app: mongrel.net :: litespeed.com
   web: lighttpd.net :: nginx.net :: swiftcore.org
  hosts: railsmachina.com :: engineyard.com




                                                      45



                                                           45

Más contenido relacionado

Similar a Scalability without going nuts

The Lean Startup at Web 2.0 Expo
The Lean Startup at Web 2.0 ExpoThe Lean Startup at Web 2.0 Expo
The Lean Startup at Web 2.0 ExpoVenture Hacks
 
2009 05 01 How To Build A Lean Startup Step By Step
2009 05 01 How To Build A Lean Startup Step By Step2009 05 01 How To Build A Lean Startup Step By Step
2009 05 01 How To Build A Lean Startup Step By StepEric Ries
 
UW ADC - Course 3 - Class 1 - User Stories And Acceptance Testing
UW ADC - Course 3 - Class 1 - User Stories And Acceptance TestingUW ADC - Course 3 - Class 1 - User Stories And Acceptance Testing
UW ADC - Course 3 - Class 1 - User Stories And Acceptance TestingChris Sterling
 
HA+DRBD+Postgres - PostgresWest '08
HA+DRBD+Postgres - PostgresWest '08HA+DRBD+Postgres - PostgresWest '08
HA+DRBD+Postgres - PostgresWest '08Jesse Young
 
Agilebuddy Users Guide
Agilebuddy Users GuideAgilebuddy Users Guide
Agilebuddy Users Guideagilebuddy
 
Yakov Fain - Design Patterns a Deep Dive
Yakov Fain - Design Patterns a Deep DiveYakov Fain - Design Patterns a Deep Dive
Yakov Fain - Design Patterns a Deep Dive360|Conferences
 
Class5 Scaling And Strategic Planning
Class5 Scaling And Strategic PlanningClass5 Scaling And Strategic Planning
Class5 Scaling And Strategic PlanningChris Sterling
 
Sapo BUS Hands-On
Sapo BUS Hands-OnSapo BUS Hands-On
Sapo BUS Hands-Oncodebits
 
Fedora App Slide 2009 Hastac
Fedora App Slide 2009 HastacFedora App Slide 2009 Hastac
Fedora App Slide 2009 HastacLoretta Auvil
 
High-Octane Dev Teams: Three Things You Can Do To Improve Code Quality
High-Octane Dev Teams: Three Things You Can Do To Improve Code QualityHigh-Octane Dev Teams: Three Things You Can Do To Improve Code Quality
High-Octane Dev Teams: Three Things You Can Do To Improve Code QualityAtlassian
 
Tesi Laurea Specialistica
Tesi Laurea SpecialisticaTesi Laurea Specialistica
Tesi Laurea Specialisticalando84
 
The New Face of Learning? (full version)
The New Face of Learning? (full version)The New Face of Learning? (full version)
The New Face of Learning? (full version)Judith Christian-Carter
 
Continuous Improvement 101
Continuous Improvement 101Continuous Improvement 101
Continuous Improvement 101flarco
 
Roll-out of the NYU HSL Website and Drupal CMS
Roll-out of the NYU HSL Website and Drupal CMSRoll-out of the NYU HSL Website and Drupal CMS
Roll-out of the NYU HSL Website and Drupal CMSChris Evjy
 

Scalability without going nuts

  • 1. Scalability without going nuts - James Cox, Chief squirrel, smokeclouds, james@smokeclouds.com
  • 2. what this is ( just an overview )
    This is an overview of some of the areas I’ve focused on when investigating scalability. There are no easy answers - but hopefully these ideas will give you some directions for your own apps.
    From something small comes something big - I just made that up. We’re going to have fun with making our apps work when there is more than one user by looking at code, ops and more.
    In particular we’ll try to wade through some language improvements/tips, some infrastructure planning tips, stuff to make MySQL better, and so on.
    We’ll also touch on proxy/app servers, fileshares and some questions at the end, if we get that far.
    I hope you’re all comfortable and mobile phones are all off, as I know we’re all busy people.
    Right. Let’s begin.
  • 4. Rails isn’t fastest ( assembler is )
    Rails isn’t fastest - that’s OK. Life is about tradeoff and compromise.
    We pick Rails because of its ease and efficiency to code - and we can refactor, scale and improve later. Or just buy more servers.
    Refer to recent rants on Ruby perf etc...
  • 5. Planning Trumps All ( even donald )
    A bit of planning and process mapping will do more for your ability to scale than any later improvements, and usually rules out a rewrite if you’ve got the core of the project pointed in the right direction.
  • 6. Analyze ( don’t guess )
    Once you have your planning arranged, don’t guess as to where performance is struggling - actually try and get some numbers to benchmark against. To learn more go watch the excellent peepcast httperf tutorial - which I almost played instead of doing this talk!
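A minimal sketch of "get some numbers" from inside Ruby itself. httperf measures at the HTTP level; this only times one code path in-process, so treat it as a complement rather than a replacement. `build_report` and `mean_seconds` are hypothetical names made up for this example.

```ruby
# Hypothetical stand-in for a slow code path (for example, building a page).
def build_report(rows)
  rows.map { |n| n.to_s.rjust(8, "0") }.join(",")
end

# Run the given block `runs` times and return the mean wall-clock seconds per run.
def mean_seconds(runs)
  started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  runs.times { yield }
  (Process.clock_gettime(Process::CLOCK_MONOTONIC) - started) / runs
end

rows = (1..10_000).to_a
baseline = mean_seconds(20) { build_report(rows) }
puts format("build_report: %.6f s per call (mean of 20 runs)", baseline)
```

Keep the baseline figure around: after each change, rerun the same measurement and compare against it instead of guessing.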
  • 7. Speed Perceived ( the easiest way )
    There is always the “glamour” of making a high performance app which can handle all the requests you can possibly imagine. Not everyone can be a livejournal and actually make their servers push 98 MB/s on their 100MB network cards.
    Find the areas of the app which the userbase perceives to be the slowest: it may be that you can make your app appear ‘faster’ by improving the UI/UX.
    Work on these areas and then radiate outwards: it’s easier to refactor in chunks than as a whole (tangent: SOA architecture is not a bad idea....)
  • 8. Focus on your app ( it’s usually cheaper )
    So how can we make our app faster? There are a number of techniques we can employ to make our apps better. Now to discuss some of them.....
  • 9. Improving ActiveRecord: :select, :limit, :offset ( take what you need )
    - You don’t always need that data - this problem hides itself when you are first building - but as you add data, no limit/offset means you often end up grabbing too many rows.
    - This is particularly important when using TCP connections to your database.
    - Oftentimes an app is waiting for the data to transfer, so limit it to just the stuff you need.
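A rough sketch of what these three options do to the SQL that reaches the database. `build_select` is a hypothetical helper written only for illustration, not ActiveRecord’s own code, and the `articles` table and its columns are assumptions.

```ruby
# Hypothetical helper: builds the kind of SELECT that a finder call such as
#   Article.find(:all, :select => "id, title", :limit => 20, :offset => 40)
# roughly corresponds to. This is not ActiveRecord's implementation.
def build_select(table, select: "*", limit: nil, offset: nil)
  sql = "SELECT #{select} FROM #{table}"
  if limit
    sql += " LIMIT #{Integer(limit)}"
    sql += " OFFSET #{Integer(offset)}" if offset
  end
  sql
end

# No options: every column of every row crosses the wire.
puts build_select("articles")
# Only what a listing page needs: two columns, 20 rows, third page.
puts build_select("articles", select: "id, title", limit: 20, offset: 40)
```

The second statement moves far less data over the connection, which is the point the notes make about waiting on transfer.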
  • 10. Improving ActiveRecord: :include => :association ( keep it eager )
    OK, so eager loading changes your query count from N+1 (where N is the number of rows multiplied by associations) to one query.
    Under the hood, this works by causing a LEFT OUTER JOIN - SQL for joining the tables together. Outer joins work by including rows even when one half of the join is NULL.
    High query counts are bad because they cause queueing for read/write on the table.
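To make the N+1 count concrete, here is a small sketch with a fake database that simply counts queries. `FakeDb`, its data and its method names are all made up for the example; the joined method stands in for the single LEFT OUTER JOIN described in the notes.

```ruby
# Fake database that counts queries, standing in for MySQL.
class FakeDb
  attr_reader :query_count

  def initialize
    @query_count = 0
    @posts = [{ id: 1, author_id: 10 }, { id: 2, author_id: 11 }, { id: 3, author_id: 10 }]
    @authors = { 10 => "ann", 11 => "bob" }
  end

  # One query for the list of posts.
  def posts
    @query_count += 1
    @posts
  end

  # One query per author lookup.
  def author(id)
    @query_count += 1
    @authors[id]
  end

  # One joined query returning posts with their author already attached.
  def posts_with_authors
    @query_count += 1
    @posts.map { |post| post.merge(author: @authors[post[:author_id]]) }
  end
end

lazy = FakeDb.new
lazy_names = lazy.posts.map { |post| lazy.author(post[:author_id]) }

eager = FakeDb.new
eager_names = eager.posts_with_authors.map { |post| post[:author] }

puts "lazy:  #{lazy.query_count} queries for #{lazy_names.size} posts"
puts "eager: #{eager.query_count} query for #{eager_names.size} posts"
```

With three posts the lazy path issues four queries and the eager path one; the gap grows with every extra row.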
  • 11. Improving ActiveRecord: Model < CachedModel ( cache first, ask questions later )
    So you’ve limited your query to the least amount of data necessary - or you’re just looking up a single row. What next?
    Cache your data in a fast retrieval store such as memcached. Nice ActiveRecord extension for this (even if it is a bit hairy).
    N.B. this only works with simple ID based lookups - for anything complex you need to use Cache.set and Cache.get.
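The ID-based lookup pattern can be sketched without memcached at all. Below, `FakeCache` is an in-memory stand-in with `get`/`set` methods and `UserFinder` is a hypothetical class; neither is CachedModel’s real API, they only show the read-through idea.

```ruby
# In-memory stand-in for memcached; a real client would talk to a server.
class FakeCache
  def initialize
    @store = {}
  end

  def get(key)
    @store[key]
  end

  def set(key, value)
    @store[key] = value
  end
end

# Read-through lookup: try the cache first, fall back to the database,
# then store the result for the next caller.
class UserFinder
  attr_reader :db_hits

  def initialize(cache)
    @cache = cache
    @db_hits = 0
  end

  def find(id)
    key = "user:#{id}"
    cached = @cache.get(key)
    return cached if cached

    user = load_from_db(id)
    @cache.set(key, user)
    user
  end

  private

  # Hypothetical stand-in for a database query by primary key.
  def load_from_db(id)
    @db_hits += 1
    { id: id, name: "user#{id}" }
  end
end

finder = UserFinder.new(FakeCache.new)
2.times { finder.find(7) }
puts "database hits for user 7: #{finder.db_hits}"
```

The sketch leaves out expiry: whenever the row changes, the cached copy has to be deleted or overwritten, which is where most of the hairiness lives.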
  • 12. Improving ActiveRecord: acts_as_cached ( built from experience )
    Better alternative to CachedModel, but you have to add this as a method to a Model. This is a bit more structured than CachedModel. Built from CNET’s chow/chowhound team.
  • 13. Improving ActiveRecord: cache_fu ( in incubation )
    Better alternative to CachedModel, but you have to add this as a method to a Model. This is a bit more structured than CachedModel. Built from CNET’s chow/chowhound team.
  • 14. Improving ActiveRecord: @var ||= Model.find(...) ( keep your code dry )
    Ever do a lookup - current_user, current_page, or some other check that happens more than once in a request? The ||= operator says: use the instance variable if it is already set, otherwise define it via the query.
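A small sketch of the `||=` idiom outside Rails. `RequestContext` and `find_user` are hypothetical; `find_user` stands in for the `Model.find(...)` call on the slide and counts how often it runs.

```ruby
class RequestContext
  attr_reader :lookups

  def initialize
    @lookups = 0
  end

  # Memoized: the expensive lookup runs on the first call only.
  def current_user
    @current_user ||= find_user
  end

  private

  # Hypothetical stand-in for Model.find(...).
  def find_user
    @lookups += 1
    { id: 1, name: "james" }
  end
end

ctx = RequestContext.new
3.times { ctx.current_user }
puts "lookups: #{ctx.lookups}"
```

One thing to watch: `||=` assigns again whenever the stored value is nil or false, so a lookup that legitimately returns nil will be repeated on every call.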
  • 15. Improving ActiveRecord: @@modulo ||= (52 % 100) ( run once, save forever )
    @@ is a class variable - a quick way to store a variable for the lifetime of the app...
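The same idiom with a class variable, as a sketch. `Sharder` and `compute_modulo` are hypothetical names; the counter is only there to show that the computation runs once no matter how many instances ask for it.

```ruby
class Sharder
  @@computations = 0

  # Computed on first use, then shared by every instance in this process.
  def modulo
    @@modulo ||= compute_modulo
  end

  def self.computations
    @@computations
  end

  private

  # Stand-in for something worth computing only once.
  def compute_modulo
    @@computations += 1
    52 % 100
  end
end

a = Sharder.new
b = Sharder.new
puts a.modulo
puts b.modulo
puts "computed #{Sharder.computations} time(s)"
```

"Lifetime of the app" here means the lifetime of one Ruby process: each app server process keeps its own copy, and the value is gone on restart.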
  • 16. Improving ActionView: template optimizer ( non-lazy views )
    If you use semantic views - markaby, builder - or lots of helpers, you have to spend way too much time parsing the file to get some HTML in the end...
    link_to, image_tag, form_tags - all helpers for HTML functions which are, honestly, for people who’ve gotten bored writing HTML.
    During each request the view rhtml has to be parsed and delivered - this is expensive. It’s so expensive to do this parsing that, in other languages, e.g. PHP, all optimizers focus on serving up byte-compiled scripts - and this goes back to our first comment that assembler is faster. So get your views back to the ‘compiled’ form and ditch those helpers early by optimizing your templates.
    This should bring down the ‘Render’ part of the query log.
  • 17. Improving ActionView: Publish Once ( caching always wins )
    You’re going to have gotten your page load time to somewhat of an optimal level by now - improving your database queries, and then pre-compiling your templates.
    Now consider if you can cache your pages. Is this a highly trafficked content website? (caching is a must) Can you get away with profile etc pages being cached till updated? (social networking site)
  • 18. Improving ActionView: caches_page: bad ( nightmare to cleanup )
    caches_page is the trick used to simply write out the entire page to disk... can be tricky to keep up to date, and also hard work for a slow disk.
    This also falls down if you have a loose URL schema: a site I’ve hacked on had about 500MB of content, but caches_page had generated 30GB of content. Why? Spiders will pervert your URL schema and cause it to generate waaaay too much content.
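One way to blunt the spider problem is to reduce every requested URL to a canonical cache path before anything is written to disk. This is a sketch of that idea, not something caches_page does for you; `cache_path` and the `ALLOWED_PARAMS` whitelist are assumptions made for the example.

```ruby
require "uri"

# Hypothetical whitelist: only these query parameters change the page content.
ALLOWED_PARAMS = %w[page].freeze

# Reduce a requested URL to a canonical cache path so that junk parameters
# added by crawlers do not each produce their own file on disk.
def cache_path(url)
  uri = URI.parse(url)
  params = URI.decode_www_form(uri.query.to_s)
  kept = params.select { |name, _value| ALLOWED_PARAMS.include?(name) }.sort
  path = uri.path.chomp("/")
  path = "/index" if path.empty?
  kept.empty? ? path : "#{path}?#{URI.encode_www_form(kept)}"
end

puts cache_path("http://example.com/articles/?sid=abc&page=2&ref=spider")
puts cache_path("http://example.com/articles?sid=xyz")
```

With a tight whitelist, a crawler hammering the same page with a thousand session IDs still yields a single cached file.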
  • 19. Improving ActionView:
      <%= cache(:action => 'feature', :part => 'most_read') do
            render :partial => 'article/most_read'
          end -%>
    Drop a fragment cache into your view and save repetitive tasks.
    Doesn’t yet work with robot-coop’s memcache-client as a fast store for fragments - but there is a memcache backed fragment store gem, e.g. extended fragment cache.
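The mechanics behind a fragment cache are small enough to sketch in plain Ruby. `FragmentCache` below is a hypothetical in-memory store, not the Rails one: it runs the block the first time a key is seen and hands back the stored string afterwards.

```ruby
class FragmentCache
  def initialize
    @fragments = {}
  end

  # Return the stored fragment for key, or run the block once and store its result.
  def cache(key)
    @fragments.fetch(key) { @fragments[key] = yield }
  end

  # Drop a fragment so the next request renders it again.
  def expire(key)
    @fragments.delete(key)
  end
end

renders = 0
fragments = FragmentCache.new
2.times do
  fragments.cache("feature/most_read") do
    renders += 1
    "<ul><li>Most read article</li></ul>"
  end
end
puts "renders: #{renders}"
```

Swapping the Hash for a memcached client is what lets several app servers share the same fragments.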
  • 20. Improving Sanity: Follow Edge ( DHH Breaks Stuff )
    @@ is a class variable - a quick way to store a variable for the lifetime of the app...
  • 21. Tuning Up
  • 22. Tuning Up
  • 23. Tuning Up
  • 24. Avoid Shared Hosting ( there’s only so much to go around )
    When I was living at my family home, my brothers always used to share my stuff - clothes, shower gel, aftershave - you name it. Same is true for server resources - everyone’s gotta share.
    Not all users play nice - that crazy crawler on your box is taking up all the RAM and the spammer is getting you blacklisted.
    Too many variables you can’t control - VPS software is pretty harsh for setting process limits to save the box as a whole.
    Underconfigured software - all packages are set up to work for everyone. Low performance: designed to encourage upgrades.
  • 25. New Players ( always one )
    SOME VPS are getting it right - Engine Yard, Rails Machine - high-performance focused servers.
    Expects trusted users - won’t cater for the low-end user.
    Expensive to buy into, low availability - but often a worthwhile investment.
  • 26. Multiple Servers? ( work them hard )
    One server or more? It’s great if you have the infrastructure.... but do you know how to split them up?
  • 27. Setup Hot ( universe is infinite )
    There’s also performance in productivity - it makes sense to mirror setups on each machine for hot-backup as well as for predictability. capistrano will help you with this.
  • 28. 8 Server Gem
      Proxy/Web Static (2)
      Application Servers (4)
      Database Layer (2)
    It’s great if you have the infrastructure.... but do you know how to split them up?
    Think of the shape of a ruby - the top is a bit of a plateau, and that’s where you put static and proxy servers. You’ll want to load balance these for high availability - but generally these scale very well as they don’t do much but route traffic and serve files.
    The widest part - those are your application servers, and you can grow these out to as many as you can imagine. This is your workhorse layer - everything interesting happens here. Careful you don’t have too many of these for the proxy servers - if there are so many choices for each proxy some of these can sit idle.
    The bottom, hidden part is the best bit - the database layer. This is a somewhat sacred layer: not many servers can play this part at once. Ensure you put your best machines at this level. You’re going to want to see high RAM, good I/O throughput, lots of CPU power and plentiful disk space.
  • 29. Playing Well Together ( there is only one sandpit )
    So you’ve gotten your servers tagged up - how do you assign them tasks?
    With one of our clients, we had a situation where we had a mega busy ad-server and a busy CMS sharing the same database. It made sense to break them apart onto two servers - the query stats backed this up.
    ... but we could put the admin and the front end app and proxy servers on the same machines. Why? Front end/admin work well together. Databases are heavy read/write so two busy databases will fight/queue for file system access.
  • 30. MySQL Tuning ( feed the beast )
    OK, let’s cover some tips for getting MySQL to play nice.
    Why MySQL over others? Mostly business reasons rather than tech - it has a nice pathway to move on to a fully supported contract when you need it.
    MySQL is also on the cusp of launching a really awesome NDB cluster - this is basically a high availability memory store database which retains integrity via the standard server.
  • 31. mysql> \s
      mysql Ver 14.7 Distrib 4.1.19, for pc-linux-gnu (i686) using readline 4.3
      Uptime: 10 hours 11 min 47 sec
      Threads: 3  Questions: 10,171,505  Slow queries: 334  Opens: 224
      Flush tables: 1  Open tables: 106  Queries per second avg: 277.100
    This is a single machine, dual 2.4GHz Xeon processor, hyperthreaded. 2GB RAM. Linux.
    Yes, it is possible to get some really high performance MySQL going - you just need to get the settings right - this is trial and error (mostly).
    Had over a billion queries on an uptime of 60 days, but some ‘technician’ at the datacenter rebooted the wrong box. So I can’t show that off. Shame!
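The "Queries per second avg" figure is just Questions divided by Uptime, which is easy to check. The sketch below redoes that arithmetic with the numbers from the slide; `queries_per_second` is a helper name made up here, and the 60-day run is taken as exactly one billion queries, so it is a lower bound on what the notes describe.

```ruby
# Average queries per second over an uptime window.
def queries_per_second(questions, uptime_seconds)
  questions.to_f / uptime_seconds
end

# Figures copied from the status output on the slide.
uptime = 10 * 3600 + 11 * 60 + 47
puts format("%.3f queries per second", queries_per_second(10_171_505, uptime))

# The longer run from the notes, assuming exactly one billion queries in 60 days.
sixty_days = 60 * 24 * 3600
puts format("%.1f queries per second", queries_per_second(1_000_000_000, sixty_days))
```

So the ten-hour snapshot on the slide shows a higher average rate than the 60-day run would have needed.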
  • 32.
      # query cache considered harmful
      query_cache_size=0
      # key_buffer_size is the size of the buffer used for index blocks.
      key_buffer_size=100M
      # The maximum size of one packet.
      max_allowed_packet=1M
      # the length of time (in seconds) that we want to log against.
      #long-query-time=3
      log-slow-queries=/var/log/mysql_slow_queries
    Some key variables I always have set...
    Query cache is not always as useful as it seems - OK for truly unoptimized badly indexed stuff, not so good for when you need to manage the stack - think of a logging table or a user table in a social network - when the data changes more quickly than the time it takes to create and query the cache - you’re in trouble. It was also quickly written to make MySQL 4 less slow in response to a customer request.
    Buffer size - set to be as much spare RAM as you have - this is the amount of memory it’ll allocate to fit in the buffer. If it has to keep allocating, then it’ll do the sort in chunks which takes FOREVER.
    The message buffer is initialised to net_buffer_length bytes, but can grow up to max_allowed_packet bytes when needed. Good if you’re passing around large objects such as images, articles, and so on - set it high and forget about it (as long as your network can cope).
    ALWAYS log slow queries - and regularly check. This is your first port of call for optimizing your DB!!!
  • 33.
      # if you use network (tcp) based connections
      wait_timeout=90
      net_write_timeout=180
      net_read_timeout=60
      max_connections=500

      mysql> SHOW FULL PROCESSLIST; (for more info)
    If your DB server is different to your app server, it’s important to set these. Oftentimes I’ve seen servers where appservers are queuing due to long laggy timeouts and no available connections.
  • 34. It’s OK to ditch AR ( DHH won’t get upset )
    Sometimes it’s just simpler to drop out and craft a very focused query, use a stored procedure or function, MySQL variables.... force an index.
    Just because you can’t do it in a #find doesn’t mean you shouldn’t do it (i.e. don’t sacrifice ultimate performance for manageability every time).
    A good example, and not easy using standard AR: INSERT DELAYED is great for when you don’t need to know the id of the row inserted. Good for things like logs, stats etc.
  • 35. Proxy > App ( warm up the pack, the engine’s running )
    Best advice right now is to use nginx as a front end to a mongrel cluster (or two).
    It’s very fast and scalable - nginx is lightweight, and can handle upstream clusters with ease, as well as use fast onboard PCRE style regex for handling different paths based on their needs.
    mongrel, while not being the fastest in the pack, lets you scale out easily. Plus Zed is pretty clever, and he’ll fix stuff quickly.
    Why use them? Lots of these ‘new’ http servers are more focused towards a smaller goalset - they are designed to achieve one or two things. Apache HTTPD lets you embed almost any module imaginable in the chainset. It’s clear who’s going to be faster.
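What the proxy does for an upstream cluster can be sketched in a few lines. nginx’s default balancing across upstream servers is round-robin; the `RoundRobin` class and the three mongrel ports below are hypothetical and only illustrate that rotation, not nginx’s actual code.

```ruby
# Minimal round-robin picker over a fixed list of backends.
class RoundRobin
  def initialize(backends)
    raise ArgumentError, "need at least one backend" if backends.empty?

    @backends = backends
    @next = 0
  end

  # Return the next backend in turn, wrapping around at the end of the list.
  def pick
    backend = @backends[@next]
    @next = (@next + 1) % @backends.size
    backend
  end
end

# Hypothetical mongrel ports on one machine.
upstream = RoundRobin.new(%w[127.0.0.1:8000 127.0.0.1:8001 127.0.0.1:8002])
puts 4.times.map { upstream.pick }.inspect
```

The sketch is not thread-safe and knows nothing about dead backends; a real proxy handles both, which is a good reason to let nginx do this job.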
  • 36. Event Driven? ( don’t presume your traffic )
    You can use swiftiply and evented mongrel to move away from the high cost of threads. This is useful because rails sits in one big loop for each request - so tying up expensive threads waiting for your app to get done is not necessarily efficient. Perhaps try running it in an event loop.
    Haven’t tried this yet in any kind of real-world example - but really keen to see if it can scale (and stand up).
  • 37. Req/sec (mean) - Stats courtesy of http://blog.kovyrin.net/
    [Bar chart. Bars, in order: nginx, litespeed, lighttpd(fcgi), apache(fcgi). Values, in order: 234, 220, 207, 187. Axis runs from 125.00 to 250.00.]
    Clear alternatives if you aren’t scaling past one appserver - these numbers are sort of indicative.
    litespeed (pay for product) has some nice numbers and an apparently easy-to-use interface - live tool for adding new lsapis on the fly.
    lighttpd + apache: yes, straight fastcgi is good but you can’t scale past four FCGI processes; mongrel can.
  • 38. KeepAlive ( no point if you’re dead )
    KeepAlive almost never works. 99% of the time, you’re going to benefit just making your appserver/webserver ignore it. Most browsers now work around this to help improve perceived performance.
    You can get the same kind of benefit by parallelizing your asset requests - i.e. randomize from server1/server2 etc. Edge rails supports this natively.
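A sketch of spreading asset requests over several hostnames. Instead of picking a host at random, as the notes suggest, it derives the host from a checksum of the path, so a given asset always gets the same URL and stays cacheable in the browser. `asset_url` and the `ASSET_HOSTS` names are assumptions for this example, not the Rails API.

```ruby
require "zlib"

# Hypothetical asset hostnames that all serve the same files.
ASSET_HOSTS = %w[
  assets0.example.com
  assets1.example.com
  assets2.example.com
  assets3.example.com
].freeze

# Pick a host from a checksum of the path: the same asset always maps to the
# same host, while different assets spread across all of them.
def asset_url(path)
  host = ASSET_HOSTS[Zlib.crc32(path) % ASSET_HOSTS.size]
  "http://#{host}#{path}"
end

puts asset_url("/images/logo.png")
puts asset_url("/stylesheets/site.css")
```

Because browsers limit connections per hostname, splitting assets over a few names lets more of them download in parallel.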
  • 39. Hostname Lookup ( do not do this. ever. )
    Anything that interferes with the business of serving your webpage to the client is going to hurt your performance.
    Turn off hostname lookup, excessive logs, unused modules - anything you really really don’t need.
    Make sure your apps are compiled to perform the best with your setup (except for MySQL where you should always use their compiled versions).
    Do you use stats packages? Make sure the JS calls are right before the end </body> tag - you may get lucky and browsers will deal with complicated stuff like styles and so on, or render the page to the screen whilst waiting - these calls typically block and the browser can’t do much till they return.
    So be sure your stats package can handle your traffic before you stick it up there. (Hint: self-installable stuff like mint can’t handle millions of hits per day without lots of hardware to support it.)
    Really bad stats? Perhaps use an async XMLHttpRequest to fire it, an IFrame or the onload handler....
  • 40. NFS and Beyond ( sharing is good )
    Are you pre-caching on every server? Then use a shared file store! It’s also easier to expire one store than many.
    Be warned - NFS traditionally hasn’t been known to scale as well as it could - more recent versions are more performant.
    Some NFS options you can turn off (you don’t always need to write, for example), and staying in sync is not always important for a small share you can just remount if it gets crazy.
  • 41. Write over NFS ( be super efficient )
    Zed pointed out this really brain-dead simple efficiency. If you use NFS - use it to write to your asset servers - disk is cheap but the network tear down / start up is expensive. Don’t saturate your net card just passing data around again and again.
    Always look for the simplest path.
  • 42. MogileFS, NFS Clusters ( brainy sharing )
    If you’re struggling under the load of lots of static assets (think youtube or flickr) and you can’t quite afford a network attached storage device with a petabyte of disk space, consider using up the many multi gigabyte disks you have in your servers!
    Cluster up for NFS clusters (tricky but not impossible) where you can create a pseudo RAID over machines via software. Google for it.
    Or use MogileFS and its HTTP DAV style API for grabbing your data chunks. RobotCOOP have a working library.
  • 43. Tuning Recap ( were you listening? )
    1. Check for bottlenecks. Focus on perceived areas of slowness
    2. Improve by making users happy
    3. Look at your layout - are your servers fighting for CPU/RAM time?
    4. Are you on a shared host and being kept in strict limits?
    5. Is your code optimal - especially templates?
    6. Can you get more servers?
    7. Tuning your apps - is the MySQL processlist showing lots of waiting queries?
    8. Are you running the most optimal HTTP setup?
    9. Is your cache causing you problems on the disk?
    10. Attend one of our scalability talks - starting in May. Ask the skillsmatter team here for more info.
    11. Hire me.... or someone like me :)
  • 45. Resources
    talk: smokeclouds.com/scalability.pdf
    me: smokeclouds.com :: imaj.es
    blogs: brainspl.at :: blog.kovyrin.net :: caboo.se
    app: mongrel.net :: litespeed.com
    web: lighttpd.net :: nginx.net :: swiftcore.org
    hosts: railsmachina.com :: engineyard.com