1. Caching in the Distributed Environment
Abhijit Gadkari
Based on the article published in the Microsoft Architecture Journal : Issue 17
Available on-line at http://www.msarchitecturejournal.com/pdf/Journal17.pdf
3. Agenda
Background info and basics
Different types of cache: temporal, spatial, primed, and demand
Some Examples
Caching in the ORM world!
Transactional cache and Shared cache
Managing the interaction
Size of a cache and its impact on application performance
Five-minute introduction to “Velocity”, Microsoft's distributed caching platform
Open Forum!
4. Basics
[Figure: storage hierarchy (on-board cache, RAM, hard disk, cloud) compared by storage size, cost per byte, latency, and persistence]
Data held in fast memory close to the processor (L1, L2, L3, etc.) is known as cache. This concept is used extensively in the von Neumann architecture.
Memory performance is measured in access time: given an address, the memory presents the data some time later.
Memory Access Time = Latency + Transfer Size / Transfer Rate [2]
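The access-time formula above can be worked through with a short Python sketch. The latency and transfer-rate figures below are illustrative assumptions chosen only to show the arithmetic; they are not measurements and do not come from the original slides.

```python
def access_time(latency_s, transfer_size_bytes, transfer_rate_bytes_per_s):
    """Memory access time = latency + transfer size / transfer rate."""
    return latency_s + transfer_size_bytes / transfer_rate_bytes_per_s


# Assumed, illustrative figures for fetching a 64 KB block.
BLOCK = 64 * 1024
ram = access_time(100e-9, BLOCK, 10e9)    # assumed: 100 ns latency, 10 GB/s
disk = access_time(10e-3, BLOCK, 100e6)   # assumed: 10 ms latency, 100 MB/s

print(f"RAM : {ram * 1e6:.3f} microseconds")
print(f"Disk: {disk * 1e3:.3f} milliseconds")
```

With these assumed numbers the latency term dominates the disk access, which is why keeping frequently used data in a faster tier pays off.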
5. Types of Data
[Figure: the three types of data: reference data, activity data, and resource data]
Understanding the different types of data and their semantics helps in understanding the
different caching needs that come with each data type. [1]
6. Why ? – For Performance and Availability
Data Type [1] | Caching Strategy [1]
Reference Data | Practically immutable, non-volatile, and long-lasting in nature; an ideal candidate for caching. Can be shared across processes and applications. For example, zip codes, state lists, department lists.
Activity Data | Generated by the currently executing activity as part of a business transaction. Only good for the life of the transaction; short-lived in nature. For example, a shopping cart on an e-commerce web site.
Resource Data | Highly dependent on domain logic and volatile in nature. Cache only when absolutely required. Commonly associated keywords: concurrency, locking, ACID, dirty read, corrupt cache, business logic. For example, quantity information in an inventory application.
Unknown | DO NOT CACHE [ME]
“Keep a data item in electronic memory if its access frequency is five minutes or
higher, otherwise keep it in magnetic memory”[2]
Wikipedia defines cache as “a temporary area where frequently accessed data can
be stored for rapid access”[3]
7. Principle of Locality
Based on work done in 1959 on the Atlas system's virtual memory [4]
Temporal Cache
Good for frequently accessed, relatively nonvolatile data. For example, a drop-down list on
a web page
Spatial Cache
Data adjacent to recently referenced data is likely to be requested in the near future.
For example, GridView paging
8. Temporal Cache
In .NET, see the Cache class in the System.Web.Caching namespace:
public sealed class Cache : IEnumerable
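The temporal idea (recently used data is likely to be used again) can be sketched in a few lines of Python. This is a language-neutral illustration, not the .NET Cache API; the class name `TemporalCache`, its `capacity` parameter, and the `loader` callback are assumptions made for the example, and least-recently-used eviction is one possible policy among several.

```python
from collections import OrderedDict


class TemporalCache:
    """Keeps recently used entries and evicts the least recently used one."""

    def __init__(self, capacity):
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.capacity = capacity
        self._items = OrderedDict()

    def get(self, key, loader):
        """Return the cached value, calling loader(key) only on a miss."""
        if key in self._items:
            self._items.move_to_end(key)      # mark as most recently used
            return self._items[key]
        value = loader(key)                   # miss: go to the slow source
        self._items[key] = value
        if len(self._items) > self.capacity:
            self._items.popitem(last=False)   # evict least recently used
        return value

    def __contains__(self, key):
        return key in self._items
```

A drop-down list is a natural fit: the first request pays for the load, and every later request for the same list is served from memory until the entry is evicted.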
9. Spatial Cache
In .NET, a cache can be kept synchronized with the database using SqlCacheDependency
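The spatial idea behind the paging example (neighbouring data is likely to be requested next) can be sketched as a cache that prefetches the following page. This Python sketch is an illustration only; `PagedCache`, `fetch_page`, and the `prefetch` depth are hypothetical names, and the sketch deliberately omits eviction and database synchronization.

```python
class PagedCache:
    """Caches pages of rows and prefetches the pages that follow a request."""

    def __init__(self, fetch_page, prefetch=1):
        self._fetch_page = fetch_page   # callable: page number -> list of rows
        self._prefetch = prefetch       # how many following pages to preload
        self._pages = {}
        self.backend_calls = 0          # counts trips to the slow source

    def _load(self, page_no):
        if page_no not in self._pages:
            self._pages[page_no] = self._fetch_page(page_no)
            self.backend_calls += 1

    def get_page(self, page_no):
        # Load the requested page plus its neighbours in one go.
        for p in range(page_no, page_no + self._prefetch + 1):
            self._load(p)
        return self._pages[page_no]


# Hypothetical data source: 100 rows shown 10 per page.
ROWS = list(range(100))
grid = PagedCache(lambda n: ROWS[n * 10:(n + 1) * 10])
first = grid.get_page(0)    # loads pages 0 and 1
second = grid.get_page(1)   # page 1 is already cached; page 2 is prefetched
```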
10. Primed and Demand Cache [5,6]
Primed and demand caches are based on the future use of the data.
Predicting the future is not easy and should be based on sound
engineering principles
The primed cache pattern is applicable when the cache, or part
of the cache, can be predicted in advance. For example, a web
browser cache
The demand cache pattern is useful when the cache cannot be
predicted in advance. For example, a cached copy of user
credentials
The primed cache is populated at the beginning of the application,
whereas the demand cache is populated during the execution of
the application
11. Primed Cache
In .NET, the ICachedReport interface can be used to store
pre-populated reports. A primed cache results in a cache
structure of almost constant size
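A primed cache can be sketched as a structure that is filled once at start-up and only read afterwards. The Python below is a minimal illustration under that assumption; `PrimedCache` and `load_reference_data` are hypothetical names, and the hard-coded state list stands in for a real query.

```python
from types import MappingProxyType


def load_reference_data():
    """Hypothetical stand-in for a start-up query against the database."""
    return {"CA": "California", "NY": "New York", "TX": "Texas"}


class PrimedCache:
    """Populated once when the application starts, then read-only."""

    def __init__(self, loader):
        # Copy the data and expose it through a read-only view, so the
        # cache keeps the size it had after priming.
        self._data = MappingProxyType(dict(loader()))

    def get(self, key):
        return self._data[key]

    def __len__(self):
        return len(self._data)


states = PrimedCache(load_reference_data)
print(states.get("CA"), len(states))
```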
12. Demand Cache
One user can have many roles
One role can have many permissions
Managing a demand cache:
Minimize memory leaks
Maximize the hit ratio
Use an effective eviction policy
In a dynamic environment, adaptive caching strategies can be very effective
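The user, role, and permission example can be sketched as a demand cache that fills itself on first use. This Python sketch leans on the standard library's `functools.lru_cache`, which bounds the cache size, evicts the least recently used entries, and reports hits and misses. The lookup tables and the `permissions_for` function are hypothetical stand-ins for real database queries.

```python
from functools import lru_cache

# Hypothetical data: one user has many roles, one role has many permissions.
USER_ROLES = {"alice": ["admin", "author"], "bob": ["reader"]}
ROLE_PERMISSIONS = {
    "admin": {"delete", "publish"},
    "author": {"write", "publish"},
    "reader": {"read"},
}


@lru_cache(maxsize=128)   # bounded size guards against unbounded growth
def permissions_for(user):
    """Resolve a user's permissions; computed only on a cache miss."""
    perms = set()
    for role in USER_ROLES.get(user, []):
        perms |= ROLE_PERMISSIONS.get(role, set())
    return frozenset(perms)


def hit_ratio():
    """Fraction of lookups served from the cache so far."""
    info = permissions_for.cache_info()
    total = info.hits + info.misses
    return info.hits / total if total else 0.0
```

Watching the hit ratio over time is one simple input for tuning `maxsize` or choosing a different eviction policy.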
13. Caching in the ORM World!
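[Figure: impedance mismatch between the RDBMS (persistent storage) and the in-memory object graph. A Customer table with columns cust_id, type, credit_allowed and rows (3456, gold, 1) and (7890, bronze, 0) is shown next to a Customer class with Gold, Silver, and Bronze subtypes.]
Examples: Microsoft Entity Framework / LINQ, JDO, TopLink, Hibernate, NHibernate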
The ORM manager loads data from persistent storage, such as a
database, into an object graph. An object graph is a
good caching candidate
15. Layered Cache Architecture
The layering principle is based on the explicit separation of responsibilities
Cache layering is prevalent in many ORM solutions. For example, Velocity and Hibernate
The first layer represents the transactional cache, and the second layer is the shared
cache, designed as a process or clustered cache
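The two layers can be sketched in Python as a per-transaction cache in front of a shared one. This is a simplified illustration, not how any particular ORM implements it. The class names are hypothetical, the sketch assumes changes reach the shared layer only on commit, and writing the committed changes back to the database is left out.

```python
_MISSING = object()


class SharedCache:
    """Second layer: shared by all transactions in the process."""

    def __init__(self):
        self._items = {}

    def get(self, key, default=None):
        return self._items.get(key, default)

    def put(self, key, value):
        self._items[key] = value


class TransactionalCache:
    """First layer: private to one transaction, discarded when it ends."""

    def __init__(self, shared, load_from_db):
        self._shared = shared
        self._load = load_from_db   # callable: key -> value
        self._local = {}
        self._dirty = set()         # keys changed inside this transaction

    def get(self, key):
        if key in self._local:
            return self._local[key]
        value = self._shared.get(key, _MISSING)
        if value is _MISSING:       # miss in both layers: go to the database
            value = self._load(key)
            self._shared.put(key, value)
        self._local[key] = value
        return value

    def put(self, key, value):
        self._local[key] = value    # visible only inside this transaction
        self._dirty.add(key)

    def commit(self):
        for key in self._dirty:     # publish changes to the shared layer
            self._shared.put(key, self._local[key])
        self.rollback()

    def rollback(self):
        self._local.clear()
        self._dirty.clear()
```

Keeping uncommitted changes in the first layer is what lets a rollback discard them without touching what other transactions see.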
16. Transactional Cache
Objects formed in a valid state and participating in
a transaction can be stored in the transactional cache
Strictly bound by the ACID rules
A transactional cache is small and short-lived
Thrashing, cache corruption, and caching conflicts should
be strictly avoided
Many caching frameworks offer an out-of-the-box,
prepackaged transactional cache solution
17. Shared Cache
Can be implemented as a process cache or a clustered
cache. The clustered cache introduces resource replication
overhead
The shared cache is a read-only cache
Distributed caching solutions typically implement a
shared cache
Can be implemented as an identity map. For example,
caching read-only, static reports using ICachedReport
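An identity map keeps at most one in-memory object per database identity, so repeated lookups return the same instance. The Python below is a minimal sketch; `Customer`, `IdentityMap`, and the `load_row` callback are hypothetical names, and the sample rows reuse the customer ids from the ORM slide.

```python
class Customer:
    def __init__(self, cust_id, tier):
        self.cust_id = cust_id
        self.tier = tier


class IdentityMap:
    """Holds one object per primary key; loads a row only the first time."""

    def __init__(self, load_row):
        self._load_row = load_row   # callable: cust_id -> dict with a "type" key
        self._objects = {}

    def get(self, cust_id):
        obj = self._objects.get(cust_id)
        if obj is None:
            row = self._load_row(cust_id)
            obj = Customer(cust_id, row["type"])
            self._objects[cust_id] = obj
        return obj


# Hypothetical rows standing in for the Customer table.
ROWS = {3456: {"type": "gold"}, 7890: {"type": "bronze"}}
customers = IdentityMap(ROWS.__getitem__)
a = customers.get(3456)
b = customers.get(3456)
print(a is b)
```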
19. Chasing the Right Size Cache
Remember the 80-20 rule (a.k.a. the Pareto principle) and the bell-shaped
graph
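One way to explore cache sizing is to replay a skewed workload against caches of different sizes and compare hit ratios. The Python sketch below assumes an 80-20 workload (20% of the keys receive 80% of the accesses) and least-recently-used eviction. The key counts, seed, and sizes are arbitrary choices for illustration, not recommendations.

```python
import random
from collections import OrderedDict


def make_workload(n_accesses=10_000, n_keys=100, hot_fraction=0.2,
                  hot_share=0.8, seed=17):
    """Generate key accesses where a small hot set gets most of the traffic."""
    rng = random.Random(seed)
    n_hot = int(n_keys * hot_fraction)
    workload = []
    for _ in range(n_accesses):
        if rng.random() < hot_share:
            key = int(rng.random() * n_hot)                      # hot key
        else:
            key = n_hot + int(rng.random() * (n_keys - n_hot))   # cold key
        workload.append(key)
    return workload


def lru_hit_ratio(workload, capacity):
    """Replay the workload against an LRU cache of the given capacity."""
    cache = OrderedDict()
    hits = 0
    for key in workload:
        if key in cache:
            hits += 1
            cache.move_to_end(key)
        else:
            cache[key] = True
            if len(cache) > capacity:
                cache.popitem(last=False)
    return hits / len(workload)


workload = make_workload()
for size in (5, 10, 20, 40, 80, 100):
    print(f"cache size {size:3d}: hit ratio {lru_hit_ratio(workload, size):.3f}")
```

Plotting hit ratio against size for a real access trace shows where adding memory stops paying for itself.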
20. Microsoft Project Code Named “Velocity” [1]
http://msdn.microsoft.com/fi-fi/library/cc645013(en-us).aspx
Distributed in-memory application cache platform
Can store any serializable CLR object
Allows clustering and provides an ASP.NET session provider, so that ASP.NET session
objects can be stored in the distributed cache without having to write to a database
21. Conventional Stack vs. Stack with Distributed Cache
[Figure: a conventional stack (applications, web servers / app servers, database) next to a stack that adds a distributed cache layer between the web servers / app servers and the database. “Velocity” presents one logical view over a physical implementation made up of named caches, each containing regions.]
22. Features [1]
Machine -> Cache Host -> Named Cache -> Regions -> Cache Items -> Objects
Cache operations:
Get [select] - Returns the object or the entire cache item
Add [insert] - Creates a new entry; raises an exception if the entry already exists
Put [update] - Replaces an existing entry or creates a new one
Remove [delete] - Removes an existing entry
Expiration and eviction policy is based on time-to-live [TTL] logic
The concurrency model supports optimistic version-based updates and pessimistic
locking
“Velocity” can be deployed as a service or embedded within the application.
For example, the host application can be an ASP.NET or .NET application
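The semantics of the four operations, TTL expiration, and optimistic version checks can be illustrated with a small in-memory Python stand-in. This is not the “Velocity” API: the class name `NamedCache`, the method signatures, and the injected `clock` are assumptions made for the sketch, and pessimistic locking is left out.

```python
import time


class NamedCache:
    """In-memory stand-in showing Get/Add/Put/Remove with TTL and versions."""

    def __init__(self, default_ttl=60.0, clock=time.monotonic):
        self._items = {}        # key -> (value, version, expires_at)
        self._ttl = default_ttl
        self._clock = clock     # injectable so expiry can be tested

    def _live(self, key):
        entry = self._items.get(key)
        if entry is not None and entry[2] <= self._clock():
            del self._items[key]            # expired: drop lazily on access
            return None
        return entry

    def get(self, key):
        entry = self._live(key)
        return None if entry is None else entry[0]

    def add(self, key, value):
        """Create a new entry; fail if a live entry already exists."""
        if self._live(key) is not None:
            raise KeyError(f"{key!r} already exists")
        self._items[key] = (value, 1, self._clock() + self._ttl)

    def put(self, key, value, expected_version=None):
        """Replace or create; optionally fail if the version has moved on."""
        entry = self._live(key)
        current = 0 if entry is None else entry[1]
        if expected_version is not None and expected_version != current:
            raise ValueError("version conflict")
        self._items[key] = (value, current + 1, self._clock() + self._ttl)
        return current + 1

    def remove(self, key):
        if self._live(key) is None:
            return False
        del self._items[key]
        return True
```

Passing `expected_version` makes an update fail when another client has changed the entry in the meantime, which is the essence of the optimistic model.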
23. Example [1]
// Create an instance of CacheFactory (reads app config)
CacheFactory fac = new CacheFactory();
// Get a named cache from the factory
Cache catalog = fac.GetCache("catalogcache");
// Simple Get/Put
catalog.Put("toy-101", new Toy("thomas", .,.));
// From the same or a different client
Toy toyObj = (Toy)catalog.Get("toy-101");
// Region-based Get/Put
catalog.CreateRegion("toyRegion");
// Both toy and toy parts are put in the same region
catalog.Put("toyRegion", "toy-101", new Toy( .,.));
catalog.Put("toyRegion", "toypart-100", new ToyParts(…));
// Reuse toyObj here; declaring it a second time would not compile
toyObj = (Toy)catalog.Get("toyRegion", "toy-101");
24. Resources
Based on the paper “Caching in the Distributed Environment”
published in the Microsoft Architecture Journal : Issue 17
1. Microsoft Project Code Named “Velocity”, by N. Sampathkumar, M.
Krishnaprasad, and A. Nori
2. Transaction Processing: Concepts and Techniques, by Jim Gray and
Andreas Reuter [ISBN: 1558601902]
3. http://en.wikipedia.org/wiki/Cache
4. “The Locality Principle”, by Peter J. Denning,
Communications of the ACM, July 2005, Vol 48, No 7
5. “Caching Patterns and Implementation”, by Octavian Paul Rotaru,
Leonardo Journal of Sciences LJS: 5:8, January-June 2006
6. Data Access Patterns: Database Interactions in
Object-Oriented Applications, by Clifton Nock, Addison-Wesley
25. Open Forum!
Abhijit Gadkari
Abhijit.Gadkari@gmail.com
Blog : http://soaas.blogspot.com/