2. Alexander Shopov
By day: Software Engineer at Cisco
By night: OSS contributor
Coordinator of Bulgarian Gnome TP
Contacts:
E-mail: ash@kambanaria.org
Jabber: al_shopov@jabber.minus273.org
LinkedIn: http://www.linkedin.com/in/alshopov
Google: Just search “al_shopov”
5. Why Cache At All?
● Lowers number of requests, improves latency,
provides scaling
● AJAX caching leads to lively applications
● Lowers server load for all kinds of content, but
especially important (and hard) for dynamic content
6. MOST IMPORTANT RESOURCE!
● RFC 2616 http://tools.ietf.org/html/rfc2616
● HTTP caching:
– http://tools.ietf.org/html/rfc2616#section-13
7. Purpose of caching
● Eliminate the need for requests
– No server round trip at all – fastest way
– Expiration – received data is fine
● Eliminate the need for full answers
– Lower traffic, narrow bandwidth
– Validation – received data probably fine, check it
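The two models above can be sketched as a single decision. This is a minimal illustration, not a complete cache: the function name `cache_decision` and its parameters are hypothetical, and it assumes the age and freshness lifetime of the stored response are already known in seconds.

```python
def cache_decision(age_s, freshness_lifetime_s, has_validator):
    """Decide what a cache can do with a stored response (hypothetical helper)."""
    if age_s < freshness_lifetime_s:
        # Expiration model: stored data is still fresh, no request needed.
        return "serve"
    if has_validator:
        # Validation model: ask the server whether the stored data is still good.
        return "revalidate"
    # Stale and nothing to validate with: a full request is required.
    return "refetch"
```

A stale entry with an ETag or Last-Modified value ends up in the "revalidate" branch, which is the case the conditional requests later in the deck address.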
8. HTTP participants
● All of them in the protocol from day 1 – not an
afterthought!
– Origin server
– Gateway/reverse proxy (shared cache)
– Proxy (shared cache)
– Client (can have internal cache – non shared cache)
● Gateway is similar to Proxy
– Proxies – chosen by client (or clients)
– Gateways – chosen by server
9. Client ↔ Intermediaries ↔ Server
● Easy/safe upgrade of protocol during conversation
● Caching principles:
– Semantically transparent
– Explicit permits for non transparent actions
– Intermediaries can add warnings
– Caching headers/directives can be one way
● Different behaviour for requests:
– Safe requests: GET/HEAD. Breaking this is server's fault, not
clients'. All other requests must reach origin server
– Idempotent requests – repeating ≡ doing them once:
GET/HEAD + PUT/DELETE/OPTIONS/TRACE
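The method classes above can be written down as two sets. This is only a sketch using the methods listed on the slide; the names `SAFE`, `IDEMPOTENT`, `may_answer_from_cache` and `may_repeat` are hypothetical.

```python
# Safe methods: only these may be answered without reaching the origin server.
SAFE = frozenset({"GET", "HEAD"})

# Idempotent methods: repeating the request has the same effect as sending it once.
IDEMPOTENT = SAFE | {"PUT", "DELETE", "OPTIONS", "TRACE"}


def may_answer_from_cache(method):
    """True if a cache is allowed to consider answering this method itself."""
    return method.upper() in SAFE


def may_repeat(method):
    """True if the request can be repeated without changing the outcome."""
    return method.upper() in IDEMPOTENT
```

POST is in neither set, so it must reach the origin server and should not be repeated blindly.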
10. HTML: Meta tags
● Widely used and as widely ineffective:
– The only caching control available to HTML designers
– Not read/used by intermediaries
– Not all browser caches honour them
● Do not rely on them! No real reason to use them.
(actually the real reason is that habit is second
nature).
11. HTTP 1.0
● Pragma: no-cache
– Pragmas are problematic – not all participants honour
them.
● Proper equivalent in HTTP 1.1:
– Cache-Control: no-cache
– Take from server even if available from cache
12. HTTP 1.1
● Expires – response stays fresh until this date
● ETag – (do) you have this version
● Cache-Control – fine grained tuning
13. Expires
● Expires: absolute_date
● To mark a response as already expired, send an
Expires value equal to the Date header value
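A minimal sketch of generating these headers with the Python standard library. The function name `expires_headers` and the `lifetime_s` parameter are hypothetical; the code assumes `now` is a timezone-aware UTC datetime.

```python
from datetime import datetime, timedelta, timezone
from email.utils import format_datetime


def expires_headers(now, lifetime_s):
    """Build Date and Expires headers; lifetime_s=0 marks the response already expired."""
    return {
        # usegmt=True produces the HTTP date format ending in "GMT".
        "Date": format_datetime(now, usegmt=True),
        "Expires": format_datetime(now + timedelta(seconds=lifetime_s), usegmt=True),
    }


now = datetime(2024, 1, 1, 12, 0, 0, tzinfo=timezone.utc)
already_expired = expires_headers(now, 0)   # Expires equals Date
one_hour = expires_headers(now, 3600)       # fresh for one hour
```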
14. ETag – 1
● No ordering, just value – either matches (single
value or a value from set) or does not.
● Per URI – no sense in comparing tags from different
URIs, E = entity
● ETag: resource tag
– ETag: "xyzzy" – strong, bit by bit equivalence
– ETag: W/"xyzzy" – weak, semantic equivalence
● Different matches
– Strong – tags are identical and neither one is weak.
– Weak – tags are identical apart from the W/ prefix;
either or both may be weak.
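The two comparison functions can be sketched as follows. The helper names `parse_etag`, `strong_match` and `weak_match` are hypothetical, and the code assumes well-formed tags such as "xyzzy" or W/"xyzzy".

```python
def parse_etag(value):
    """Split an entity tag into (is_weak, opaque_value)."""
    value = value.strip()
    weak = value.startswith("W/")
    return weak, (value[2:] if weak else value)


def strong_match(a, b):
    """Strong comparison: same opaque value and neither tag is weak."""
    weak_a, opaque_a = parse_etag(a)
    weak_b, opaque_b = parse_etag(b)
    return not weak_a and not weak_b and opaque_a == opaque_b


def weak_match(a, b):
    """Weak comparison: same opaque value, weakness is ignored."""
    return parse_etag(a)[1] == parse_etag(b)[1]
```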
15. ETag – 2
● Conditional requests: if matching – just
confirmation, otherwise – data itself
– If-Modified-Since
– If-Unmodified-Since
– If-Match
– If-None-Match
– If-Range
● Strong tags allow for caching of partial answers
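A server-side sketch of the If-None-Match case for a plain GET without ranges. The function `respond` and its return shape (status, headers, body) are hypothetical, and splitting the header on commas is a simplification that does not handle tags containing commas.

```python
def _opaque(tag):
    """Strip the optional W/ weakness prefix from an entity tag."""
    tag = tag.strip()
    return tag[2:] if tag.startswith("W/") else tag


def respond(request_headers, current_etag, body):
    """Answer a GET: 304 without body if the client's tag still matches."""
    if_none_match = request_headers.get("If-None-Match")
    if if_none_match is not None:
        # Simplification: naive comma split of the tag list.
        candidates = [t.strip() for t in if_none_match.split(",")]
        if "*" in candidates or any(
            _opaque(t) == _opaque(current_etag) for t in candidates
        ):
            # Just a confirmation: the client's copy is still good.
            return 304, {"ETag": current_etag}, b""
    # No match (or no condition): send the data itself.
    return 200, {"ETag": current_etag}, body
```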
16. Cache-Control
● All HTTP 1.1 participants MUST obey it (otherwise
they are broken).
● MUST reach all participants
● Cannot target a particular intermediary
18. Cache-Control Categories
● What is cacheable – only imposed by server
● What can be stored in cache – imposed by server
and client
● Modifications on expiration – imposed by server
and client
● Control over cache revalidation and reload – only
imposed by client
● Control over transformation of entities
● Extensions to the caching system
19. Cache-Control – Requests 1
● no-cache – cache should revalidate with server
● no-store – do not store on durable media
● max-age=sec – client wants info no older than
this
● max-stale[=sec] – client accepts stale information
but no more stale than this
20. Cache-Control – Requests 2
● min-fresh=sec – client wants info that will stay
fresh for at least this long
● no-transform – no transformation by intermediaries
– Example: a medical X-ray image must not be
converted from PNG to JPEG
● only-if-cached – when connection to server is bad.
Better to get 504 (Gateway Timeout) than wait
● cache-extension – extensions
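A minimal sketch of reading such directives from a header value. The function name `parse_cache_control` is hypothetical, and the comma split is a simplification: it does not handle quoted values that themselves contain commas.

```python
def parse_cache_control(value):
    """Parse a Cache-Control value into {directive: int, str or None}."""
    directives = {}
    for part in value.split(","):
        part = part.strip()
        if not part:
            continue
        name, sep, arg = part.partition("=")
        arg = arg.strip().strip('"')
        if arg.isdigit():
            parsed = int(arg)      # delta-seconds, e.g. max-age=60
        elif sep:
            parsed = arg           # non-numeric argument, kept as text
        else:
            parsed = None          # bare directive, e.g. no-cache
        directives[name.strip().lower()] = parsed
    return directives


request_directives = parse_cache_control("max-age=60, no-transform, max-stale")
```

Here `max-stale` is given without a value, which is allowed for that directive and means the client accepts a stale response of any age.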
21. Cache-Control – Responses 1
● public – may be cached by any cache
● private – must not be cached by shared cache
● no-cache – cache should revalidate with server
● no-store – do not store on durable media
● no-transform – no transformation by intermediaries
● must-revalidate – server requested revalidation of
stale data
● proxy-revalidate – same as above but not for user
agent cache
22. Cache-Control – Responses 2
● max-age=sec – for any cache, priority over Expires
● s-maxage=sec – for shared caches only, priority
over max-age and Expires.
● cache-extension – extensions
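The priority order above can be sketched as a freshness-lifetime calculation. The function name and parameters are hypothetical; `cache_control` is assumed to be an already parsed dict of directive to seconds, and no heuristic freshness is attempted when the response carries no expiration information.

```python
from email.utils import parsedate_to_datetime


def freshness_lifetime(cache_control, headers, shared):
    """Return how many seconds a response stays fresh for this cache."""
    if shared and "s-maxage" in cache_control:
        return cache_control["s-maxage"]      # shared caches only
    if "max-age" in cache_control:
        return cache_control["max-age"]       # overrides Expires
    if "Expires" in headers and "Date" in headers:
        delta = (parsedate_to_datetime(headers["Expires"])
                 - parsedate_to_datetime(headers["Date"]))
        return max(0, int(delta.total_seconds()))
    return 0  # assumption of this sketch: treat as already stale


example_headers = {
    "Date": "Mon, 01 Jan 2024 12:00:00 GMT",
    "Expires": "Mon, 01 Jan 2024 13:00:00 GMT",
}
```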
23. Status Codes 1
● 201 Created – can contain ETag, resource created
– (contrast with 202)
● 203 Non-Authoritative Information
– metadata comes from a local or third-party copy,
not from the origin server
● 206 Partial Content – answer to a range (partial GET)
request
– Contains ETag, plus Expires, Cache-Control and
Vary if they may have changed. Also the answer to
an If-Range request. If either ETag or
Last-Modified does not match, the cache must not
combine the part with previously stored parts. A
cache without range support must not cache 206
responses.
24. Status Codes 2
● 302 Found – redirect that can change. Cacheable
only if Cache-Control or Expires says so
● 304 Not Modified – answer to a conditional GET when
the resource has not changed. No response body.
Carries ETag and/or Content-Location, plus Expires,
Cache-Control or Vary if they may have changed
● 305 Use Proxy – per request, generated by server
● 307 Temporary Redirect – similar to 302
25. Conditional requests/responses
● Origin servers
– Should provide both ETag (preferably strong unless
not feasible) and Last-Modified
– Must avoid reusing specific strong ETag for different
entities
● Clients
– Must use the ETag, and should use Last-Modified,
when the server provided them, in conditional
requests
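On the client side this amounts to echoing the stored validators back to the server. A minimal sketch, assuming the cached response headers are kept in a plain dict; the function name `conditional_headers` is hypothetical.

```python
def conditional_headers(cached_headers):
    """Build conditional request headers from a cached response's validators."""
    out = {}
    if "ETag" in cached_headers:
        # The entity tag is the preferred validator.
        out["If-None-Match"] = cached_headers["ETag"]
    if "Last-Modified" in cached_headers:
        # The modification date is sent as well when available.
        out["If-Modified-Since"] = cached_headers["Last-Modified"]
    return out
```

If the server answers 304, the client keeps using its stored body; any other answer replaces it.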
26. AJAX
● Use cache directives in AJAX
● Try to make your AJAX responses cacheable (you
will have to think!)
● POSTs are mostly uncacheable, prefer GETs to
fetch information
● Generate Content-Length response headers and
reuse TCP/IP connection
27. Tools 1
● Firefox addons:
– Firebug
– LiveHTTPHeaders
– Modify Headers
● Chrome, Opera, Internet Explorer dev tools (F12)
28. Tools 2
● Mark Nottingham: Caching tutorial
● Redbot: Check cacheability
● Old, but gold: Cacheability