Scalability without going nuts
1. James Cox
Chief squirrel, smokeclouds
james@smokeclouds.com
2. What this is
( just an overview )
This is an overview of some of the areas I've focused on when investigating scalability.
There are no easy answers - but hopefully these ideas will give you some directions for your own apps.
From something small comes something big - I just made that up. We're going to have fun with making our apps work when there is more than one user by looking at code, ops and more.
In particular we'll wade through some language improvements and tips, some infrastructure planning tips, stuff to make MySQL better, and so on.
We'll also touch on proxy/app servers, fileshares and some questions at the end, if we get that far.
I hope you're all comfortable and mobile phones are all off, as I know we're all busy people.
Right, let's begin.
4. Rails isn’t fastest ( assembler is )
Rails isn't fastest - that's OK.
Life is about tradeoffs and compromise.
We pick Rails for its ease and efficiency of coding - and we can refactor, scale and improve later.
Or just buy more servers.
Refer to recent rants on Ruby performance, etc...
5. Planning Trumps All ( even donald )
A bit of planning and process mapping will do more for your ability to scale than any later improvements - if you get the core of the project pointed in the right direction, you usually rule out the need for a rewrite.
6. Analyze
( don’t guess )
Once you have your planning arranged, don't guess at where performance is struggling - actually get some numbers to benchmark against. To learn more, go watch the excellent PeepCode httperf screencast - which I almost played instead of doing this talk!
7. Speed Perceived ( the easiest way )
There is always the "glamour" of making a high-performance app which can handle all the requests you can possibly imagine.
Not everyone can be a LiveJournal and actually make their servers push 98 Mbit/s through their 100 Mbit network cards.
Find the areas of the app which the userbase perceives to be the slowest: it may be that you can make your app appear 'faster' by improving the UI/UX.
Work on these areas and then radiate outwards: it's easier to refactor in chunks than as a whole.
(Tangent: an SOA-style architecture is not a bad idea....)
8. Focus on your app ( it’s usually cheaper )
So how can we make our app faster?
There are a number of techniques we can employ to make our apps better.
Now to discuss some of them.....
9. Improving ActiveRecord:
:select, :limit, :offset
( take what you need )
- You don't always need all that data - this problem hides itself when you are first building, but as data grows, no limit/offset means you often end up grabbing far too many rows.
- This is particularly important when using TCP connections to your database.
- Oftentimes an app is just waiting for the data to transfer, so limit it to only the stuff you need.
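A quick sketch of the idea, using the Rails 1.x-era finder options. The model and columns are hypothetical, and a tiny stub stands in for ActiveRecord here so the call shape can be shown without a database:

```ruby
# Hypothetical model - a stub stands in for ActiveRecord so the
# old-style finder options can be shown without a database.
class Article
  def self.find(scope, options = {})
    # A real model would build SQL along the lines of:
    #   SELECT id, title FROM articles LIMIT 10 OFFSET 20
    options
  end
end

# Pull only the columns the view needs, one page of rows at a time,
# instead of an unbounded SELECT *:
opts = Article.find(:all, :select => 'id, title', :limit => 10, :offset => 20)
```

The same options move straight onto a real model; the point is that every column and row you don't select is data that never crosses the wire.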
10. Improving ActiveRecord:
:include => :association
( keep it eager )
OK, so eager loading turns your N+1 queries (one for the parent rows, plus one per row per association) into a single query.
Under the hood this works via a LEFT OUTER JOIN - SQL for joining the tables together. Outer joins include rows even when one half of the join is NULL.
High query counts are bad because they cause queueing for reads/writes on the table.
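The query-count difference is easy to see even without a database. In this toy simulation a log and Hash lookups stand in for MySQL, and the table names are illustrative:

```ruby
# No database here - a query log stands in for MySQL. The point is only
# the number of queries issued in each style.
QUERY_LOG = []

def query(sql)
  QUERY_LOG << sql
end

posts = [{ :id => 1, :author_id => 10 }, { :id => 2, :author_id => 11 }]

# N+1 pattern: one query for the posts, then one per row for its author.
query('SELECT * FROM posts')
posts.each { |p| query("SELECT * FROM authors WHERE id = #{p[:author_id]}") }
n_plus_one_count = QUERY_LOG.size   # 1 + one per row = 3 queries

QUERY_LOG.clear

# Eager loading (:include => :author): one joined query, however many rows.
query('SELECT * FROM posts LEFT OUTER JOIN authors ' \
      'ON authors.id = posts.author_id')
eager_count = QUERY_LOG.size        # always 1
```

With two rows the difference is 3 queries versus 1; with two thousand rows it is 2001 versus 1.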
11. Improving ActiveRecord:
Model < CachedModel
( cache first, ask questions later )
So you've limited your query to the least data necessary - or you're just looking up a single row. What next?
Cache your data in a fast retrieval store such as memcached. CachedModel is a nice ActiveRecord extension for this (even if it is a bit hairy).
N.B. this only works with simple ID-based lookups - for anything complex you need to use Cache.set and Cache.get.
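The Cache.set/Cache.get pattern is just a read-through cache. In this sketch a Hash stands in for memcached (with a real memcache client the get/set calls look much the same), and the lookup is a made-up stand-in for a database round trip:

```ruby
# Hash-backed stand-in for memcached. Cache.set returns the stored
# value, so it can terminate the || chain directly.
class Cache
  @@store = {}
  def self.get(key);        @@store[key];         end
  def self.set(key, value); @@store[key] = value; end
end

$db_hits = 0

def expensive_lookup(id)
  $db_hits += 1           # pretend this round-trips to MySQL
  "row-#{id}"
end

# Read-through: serve from cache, else hit the DB and fill the cache.
def find_cached(id)
  key = "model:#{id}"
  Cache.get(key) || Cache.set(key, expensive_lookup(id))
end

first  = find_cached(42)  # misses, hits the "database" once
second = find_cached(42)  # served straight from the cache
```
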
12. Improving ActiveRecord:
acts_as_cached
( built from experience )
A better alternative to CachedModel - you declare it with a method call inside each Model.
This is a bit more structured than CachedModel.
Built by CNET's Chow/Chowhound team.
13. Improving ActiveRecord:
cache_fu
( in incubation )
cache_fu is the successor to acts_as_cached - the same memcached-backed approach from the Chow/Chowhound team, renamed and cleaned up, and still in incubation at the time of this talk.
14. Improving ActiveRecord:
@var ||= Model.find(...)
( keep your code dry )
Ever do a lookup - current_user, current_page, or some other check that happens more than once in a request?
The ||= idiom says: use the instance variable if it's already set, otherwise define it via the query.
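current_user is the classic case: the same lookup can fire several times in one request. In this sketch the counter and the returned value are stand-ins for a real session-based User lookup:

```ruby
# Per-request memoization with ||=. The lookup runs once; later calls
# in the same request reuse the instance variable.
class ApplicationController
  attr_reader :lookups

  def initialize
    @lookups = 0
  end

  def current_user
    @current_user ||= begin
      @lookups += 1       # stands in for User.find(session[:user_id])
      'james'
    end
  end
end

c = ApplicationController.new
c.current_user
c.current_user            # reuses @current_user; no second lookup
```

One caveat: ||= re-runs the lookup whenever the memoized value is nil or false, so guard those cases explicitly.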
15. Improving ActiveRecord:
@@modulo ||= (52 % 100)
( run once, save forever )
@@ is a class variable - a quick way to store a value for the lifetime of the app...
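Where ||= on an instance variable memoizes per request, ||= on a class variable memoizes per process. A small sketch (the counter is only there to show the computation runs once; the class name is made up):

```ruby
# Run-once, save-forever: a class variable memoized with ||= computes
# the value the first time and reuses it for the life of the process.
class AppConfig
  @@computations = 0

  def self.modulo
    @@modulo ||= begin
      @@computations += 1   # stands in for expensive one-off setup
      52 % 100
    end
  end

  def self.computations
    @@computations
  end
end

a = AppConfig.modulo   # computes and stores 52
b = AppConfig.modulo   # reuses @@modulo
```

Bear in mind class variables are shared across subclasses and aren't thread-safe, so this suits read-mostly constants.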
16. Improving ActionView:
template optimizer
( non-lazy views )
If you use semantic views - Markaby, Builder - or lots of helpers, you spend way too much time parsing the file to get some HTML out the other end...
link_to, image_tag, form_tag - all helpers for HTML functions which are, honestly, for people who've gotten bored of writing HTML.
During each request the view's RHTML has to be parsed and rendered - this is expensive. It's so expensive that in other languages - e.g. PHP - optimizers focus on serving byte-compiled scripts, which goes back to our first comment that assembler is faster.
So get your views back to a 'compiled' form and ditch those helpers early by optimizing your templates.
This should bring down the 'Render' part of the request log.
17. Improving ActionView:
Publish Once
( caching always wins )
You'll have gotten your page load time to a somewhat optimal level by now - improving your database queries, then pre-compiling your templates.
Now consider whether you can cache whole pages.
Is this a highly trafficked content website? (Caching is a must.)
Can you get away with profile pages etc. being cached until updated? (Social networking sites.)
18. Improving ActionView:
caches_page: bad
( nightmare to cleanup )
caches_page simply writes the entire page out to disk... it can be tricky to keep up to date, and it's hard work for a slow disk.
It also falls down if you have a loose URL schema: a site I've hacked on had about 500MB of content, but caches_page had generated 30GB of cache - why? Spiders will pervert your URL schema and cause it to generate waaaay too much content.
19. Improving ActionView:
<% cache(:action => 'feature', :part => 'most_read') do %>
  <%= render :partial => 'article/most_read' %>
<% end %>
Drop a fragment cache into your view and save repeating work.
The default fragment store doesn't yet work with robot-coop's memcache-client as a fast backend - but there is a memcached-backed fragment store plugin, e.g. extended_fragment_cache.
20. Improving Sanity:
Follow Edge
( DHH Breaks Stuff )
24. Avoid Shared Hosting
( there’s only so much to go around )
When I was living at my family home, my brothers always used to share my stuff - clothes, shower gel, aftershave - you name it.
The same is true for server resources - everyone's got to share.
Not all users play nice - that crazy crawler on your box is taking up all the RAM, and the spammer is getting you blacklisted.
There are too many variables you can't control - VPS software is pretty harsh about setting process limits to save the box as a whole.
Underconfigured software - packages set up to work for everyone means low performance: designed to encourage upgrades.
25. New Players
( always one )
SOME VPS hosts are getting it right - Engine Yard, Rails Machine - high-performance-focused servers.
They expect trusted users and won't cater to the low-end user.
Expensive to buy into, with low availability - but often a worthwhile investment.
26. Multiple Servers?
( work them hard )
One server or more?
It’s great if you have the infrastructure.... but do you know how to split them up?
27. Setup Hot
( universe is infinite )
There's also performance in productivity - it makes sense to mirror setups on each machine, for hot backup as well as for predictability.
Capistrano will help you with this.
28. 8 Server Gem
Proxy/Web Static (2)
Application Servers (4)
Database Layer (2)
It’s great if you have the infrastructure.... but do you know how to split them up?
Think of the shape of a ruby ‐ the top is a bit of a plateau, and that’s where you put static and proxy
servers. You’ll want to load balance these for high availability ‐ but generally these scale very well as
they don’t do much but route traffic and serve files.
The widest part - those are your application servers, and you can grow these out to as many as you can imagine. This is your workhorse layer - everything interesting happens here. Be careful you don't have too many of these per proxy server - if each proxy has too many upstream choices, some app servers can sit idle.
The bottom, hidden part is the best bit ‐ the database layer. This is a somewhat sacred layer: not many
servers can play this part at once. Ensure you put your best machines at this level. You’re going to
want to see high ram, good I/O throughput, lots of CPU power and plentiful disk space.
29. Playing Well Together
( there is only one sandpit )
So you've got your servers tagged up - how do you assign them tasks?
With one of our clients we had a mega-busy ad server and a busy CMS sharing the same database. It made sense to break them apart onto two servers - the query stats made that clear.
...but we could put the admin, the front-end app and the proxy servers on the same machines.
Why? Front end and admin work well together. Databases are heavy read/write, so two busy databases will fight and queue for file-system access.
30. MySQL Tuning
( feed the beast )
OK, let's cover some tips for getting MySQL to play nice.
Why MySQL over others? Mostly business reasons rather than tech - it has a nice pathway to a fully supported contract when you need it.
MySQL is also on the cusp of launching a really awesome NDB cluster - basically a high-availability in-memory store database which retains integrity via the standard server.
31. mysql> \s
mysql Ver 14.7 Distrib 4.1.19, for pc-linux-gnu (i686) using readline 4.3
Uptime: 10 hours 11 min 47 sec
Threads: 3  Questions: 10,171,505  Slow queries: 334  Opens: 224  Flush tables: 1  Open tables: 106  Queries per second avg: 277.100
This is a single machine: dual 2.4GHz Xeon, hyperthreaded, 2GB RAM, Linux.
Yes, it is possible to get some really high-performance MySQL going - you just need to get the settings right, and that's (mostly) trial and error.
It had over a billion queries on an uptime of 60 days, but some 'technician' at the datacenter rebooted the wrong box, so I can't show that off. Shame!
32. # query cache considered harmful
query_cache_size=0
# key_buffer_size is the size of the buffer used for index blocks.
key_buffer_size=100M
# The maximum size of one packet.
max_allowed_packet=1M
# the length of time (in seconds) that we want to log against.
#long-query-time=3
log-slow-queries=/var/log/mysql_slow_queries
Some key variables I always have set...
The query cache is not always as useful as it seems - OK for truly unoptimized, badly indexed stuff, but not so good when the data changes quickly. Think of a logging table, or a user table in a social network: when the data changes faster than the time it takes to build and query the cache, you're in trouble. It was also quickly written to make MySQL 4 less slow in response to a customer request.
key_buffer_size - set it to as much spare RAM as you have; this is the memory allocated for index blocks. If MySQL has to keep allocating, it'll do sorts in chunks, which takes FOREVER.
The message buffer is initialised to net_buffer_length bytes but can grow up to max_allowed_packet bytes when needed. Good if you're passing around large objects such as images or articles - set it high and forget about it (as long as your network can cope).
ALWAYS log slow queries - and check the log regularly. This is your first port of call for optimizing your DB!
33. # if you use network (tcp) based connections
wait_timeout=90
net_write_timeout=180
net_read_timeout=60
max_connections=500
mysql > SHOW FULL PROCESSLIST; (for more info)
If your DB server is separate from your app server, it's important to set these. Oftentimes I've seen setups where app servers queue up due to long, laggy timeouts and no available connections.
34. It’s OK to ditch AR
( DHH won’t get upset )
Sometimes it's just simpler to drop out of ActiveRecord and craft a very focused query, use a stored procedure or function, MySQL variables... or force an index.
Just because you can't do it in a #find doesn't mean you shouldn't do it (i.e., don't sacrifice ultimate performance for manageability every time).
A good example that's not easy in standard AR: INSERT DELAYED is great when you don't need to know the id of the inserted row. Good for things like logs, stats, etc.
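A sketch of what dropping out of AR looks like for that INSERT DELAYED case. The connection is stubbed here so the SQL can be shown standalone; in an app the same string would go through ActiveRecord::Base.connection.execute. Table and column names are made up:

```ruby
# Stub connection standing in for ActiveRecord::Base.connection - it
# records executed SQL and quotes values, nothing more.
class StubConnection
  attr_reader :executed

  def initialize
    @executed = []
  end

  def quote(value)
    "'#{value.to_s.gsub("'", "''")}'"
  end

  def execute(sql)
    @executed << sql
  end
end

conn = StubConnection.new
path = '/articles/1'

# Fire-and-forget logging: we never need the inserted row's id back.
conn.execute("INSERT DELAYED INTO page_hits (path, hit_at) " \
             "VALUES (#{conn.quote(path)}, NOW())")
```
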
35. Proxy > App
( warm up the pack, the engine’s running )
The best advice right now is to use nginx as a front end to a mongrel cluster (or two).
nginx is lightweight, very fast and scalable; it handles upstream clusters with ease and uses fast onboard PCRE-style regexes for routing different paths based on their needs.
Mongrel, while not the fastest in the pack, lets you scale out easily. Plus Zed is pretty clever, and he'll fix stuff quickly.
Why use them? Lots of these 'new' HTTP servers are focused on a smaller goalset - they're designed to do one or two things well. Apache HTTPD lets you embed almost any module imaginable in the chain. It's clear who's going to be faster.
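A minimal sketch of that nginx-in-front-of-mongrels layout. Ports, hostnames and paths are invented for illustration:

```nginx
# Two mongrels behind one upstream block; nginx round-robins between them.
upstream mongrels {
  server 127.0.0.1:8000;
  server 127.0.0.1:8001;
}

server {
  listen 80;
  server_name example.com;
  root /var/www/app/public;

  location / {
    # serve static files directly; hand everything else to the cluster
    if (-f $request_filename) { break; }
    proxy_pass http://mongrels;
  }
}
```

Adding capacity is then just another `server` line in the upstream block.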
36. Event Driven?
( don’t presume your traffic )
You can use Swiftiply and evented mongrel to move away from the high cost of threads. This is useful because Rails sits in one big loop for each request - so tying up expensive threads waiting for your app to finish is not necessarily efficient. Perhaps try running it in an event loop.
I haven't tried this yet in any kind of real-world example - but I'm really keen to see if it can scale (and stand up).
37. Req/sec (mean)
Stats courtesy of http://blog.kovyrin.net/
nginx: 234
litespeed: 220
lighttpd (fcgi): 207
apache (fcgi): 187
These are clear alternatives if you aren't scaling past one app server - the numbers are only indicative.
Litespeed (a paid product) has some nice numbers and an apparently easy-to-use interface, with a live tool for adding new LSAPIs on the fly.
lighttpd and Apache with straight FastCGI are good, but you can't scale past four FCGI processes; mongrel can.
38. KeepAlive
( no point if you’re dead )
KeepAlive almost never works. 99% of the time you're going to benefit from just making your app server/web server ignore it. Most browsers now work around this to help improve perceived performance.
You can get the same kind of benefit by parallelizing your asset requests - i.e. randomize between server1/server2 etc.
Edge Rails supports this natively.
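A standalone sketch of that asset-spreading idea; edge Rails does this for you when asset_host contains %d (picking hosts 0-3), and the hostnames here are made up. Hashing the path keeps each asset pinned to one host, so browser caches stay warm while requests still spread out:

```ruby
require 'zlib'

# Map an asset path to one of four hypothetical asset hostnames.
# CRC32 is used (rather than String#hash) so the mapping is stable
# across processes - the same asset always goes to the same host.
def asset_host_for(source)
  "http://assets#{Zlib.crc32(source) % 4}.example.com"
end

url1 = asset_host_for('/images/logo.png')
url2 = asset_host_for('/images/logo.png')   # same asset, same host
```

Browsers limit parallel connections per hostname, so four hostnames pointing at the same servers lets the page fetch roughly four times as many assets at once.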
39. Hostname Lookup
( do not do this. ever. )
Anything that interferes with the business of serving your webpage to the client is going to hurt your performance.
Turn off hostname lookups, excessive logs, unused modules - anything you really, really don't need.
Make sure your apps are compiled to perform best on your setup (except for MySQL, where you should always use their compiled versions).
Do you use stats packages? Make sure the JS calls are right before the closing </body> tag - you may get lucky and browsers will deal with complicated stuff like styles, or render the page to the screen while waiting, but these calls typically block, and the browser can't do much till they return.
So be sure your stats package can handle your traffic before you stick it up there. (Hint: self-installable stuff like Mint can't handle millions of hits per day without lots of hardware behind it.)
Really bad stats? Perhaps fire them via an async XMLHttpRequest, an iframe or the onload handler....
40. NFS and Beyond
( sharing is good )
Are you pre-caching on every server? Then use a shared file store!
It's also easier to expire one store than many.
Be warned - NFS traditionally hasn't been known to scale as well as it could; more recent versions are more performant.
There are NFS options you can turn off (you don't always need to write, for example), and staying perfectly in sync is not always important - for a small share you can just remount if it gets crazy.
41. Write over NFS
( be super efficient )
Zed pointed out this really brain-dead simple efficiency: if you use NFS, use it to write to your asset servers - disk is cheap, but network teardown/setup is expensive. Don't saturate your network card passing the same data around again and again.
Always look for the simplest path.
42. MogileFS, NFS Clusters
( brainy sharing )
If you're struggling under the load of lots of static assets (think YouTube or Flickr) and you can't quite afford a network-attached storage device with a petabyte of disk space, consider using the many multi-gigabyte disks you already have in your servers!
Cluster up with NFS clusters (tricky but not impossible), where you can create a pseudo-RAID across machines via software - Google for it.
Or use MogileFS and its HTTP/DAV-style API for grabbing your data chunks. RobotCoop have a working library.
43. Tuning Recap
( were you listening? )
1. Check for bottlenecks - focus on perceived areas of slowness.
2. Improve by making users happy.
3. Look at your layout - are your servers fighting for CPU/RAM time?
4. Are you on a shared host being kept within strict limits?
5. Is your code optimal - especially templates?
6. Can you get more servers?
7. Tune your apps - is the MySQL processlist showing lots of waiting queries?
8. Are you running the most optimal HTTP setup?
9. Is your cache causing you problems on disk?
10. Attend one of our scalability talks - starting in May. Ask the skillsmatter team here for more info.
11. Hire me.... or someone like me :)