Shared personalization service. How to scale to 15 k rps (Patrice Pelland)
1. SPS – Scale to 15k RPS
Patrice Pelland
Microsoft
2. Overview and Goals of SPS
• SPS (Shared Personalization Service)
• It is a backend storage and service
• Enables following scenarios:
• Explicit personalization
• Implicit content optimization
• Geo based customization
3. Scenario #1
Scenario #1 – WL Anonymous ID and Machine Anonymous ID based Explicit Personalization
Examples: locations for weather, news, and events; favorite sports team; personal shopping list; customized page settings; etc.
5. Scenario #2
Scenario #2 – WL Anonymous ID and Machine Anonymous ID based Implicit Content Optimization
Examples: content optimization and/or personalization based on user demographics and behavior (e.g. personal recommendations)
7. Scenario #3
Scenario #3 – Geo-based customization
SPS provides a Geolookup service that lets partners enable IP-based customizations (e.g. default location, location-based content, geo-fencing, etc.)
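As an illustration of IP-based customization, here is a minimal Python sketch of a geo lookup that picks a default location from the caller's IP. The table, the function names, and the fallback city are all hypothetical; the slides do not describe how the real Geolookup service stores or matches its data.

```python
import ipaddress

# Hypothetical lookup table. The ranges are documentation-only networks
# (RFC 5737), not real geo data.
GEO_TABLE = [
    (ipaddress.ip_network("192.0.2.0/24"), {"country": "US", "city": "Seattle"}),
    (ipaddress.ip_network("198.51.100.0/24"), {"country": "FR", "city": "Paris"}),
]


def geo_lookup(ip):
    """Return the location dict for an IP, or None if no range matches."""
    addr = ipaddress.ip_address(ip)
    for network, location in GEO_TABLE:
        if addr in network:
            return location
    return None


def default_weather_location(ip, fallback="Redmond"):
    """Pick a default weather city from the IP, with an assumed fallback."""
    location = geo_lookup(ip)
    return location["city"] if location else fallback
```

A partner page could call `default_weather_location(request_ip)` for a user who has not set a location yet.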
9. Scaling? Availability? Perf?
• Why? 150 million users visit the US Home Page per month, with peaks of 15,000 RPS, plus up to 75 million users on other home pages.
• Latency goals: Read < 25 ms – update < 50 ms
• Pages have to be up - $$$ loss if not
• Need to be stateless
10. Overall Architecture
[Diagram: overall architecture. Components shown: Partner web server (CMS Rendering System with SPSAdapter, the SPS CMS service wrapper); Load Balancer; SPS FE Cluster (WCF Service, Cache Access, Database Access Logic, Webstore DB Access); AppFabric Cache Cluster (Cache Boxes); MSN Geo Lookup Service (Cached Data); Lookup System; Webstore Config Server; SPS Database Partitions; SPS Configuration Data; SPS Deployment Data.]
11. How?
• Everything is Stateless
• Windows AppFabric Caching service with many nodes – reliable and redundant
– Similar to memcached
– 240 GB of memory cache in the US
• SQL Server DB partitioning with a lookup system; master/backup at each level
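The partitioning idea can be sketched in Python as follows: a lookup table maps each user to a data partition, and each partition has a primary and a backup server. The class names, the server names, and the round-robin assignment of new users are assumptions made for illustration; the slides only state that a lookup system exists and that each level has a master and a backup.

```python
from dataclasses import dataclass


@dataclass
class Partition:
    primary: str
    backup: str
    primary_up: bool = True

    def server(self):
        # Fall back to the backup server when the primary is down.
        return self.primary if self.primary_up else self.backup


class PartitionLookup:
    """Hypothetical lookup system: user id -> partition index."""

    def __init__(self, partitions):
        self.partitions = partitions
        self.assignments = {}
        self._next = 0

    def partition_for(self, user_id):
        # Assumption: new users are assigned round-robin and then stay put.
        if user_id not in self.assignments:
            self.assignments[user_id] = self._next % len(self.partitions)
            self._next += 1
        return self.assignments[user_id]

    def server_for(self, user_id):
        return self.partitions[self.partition_for(user_id)].server()
```

Because the mapping is stored rather than computed from a hash, adding a partition does not move existing users; only new assignments land on it.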
12. Facts
• Availability
– Designed with no single point of failure
– Web: multiple web servers behind a load balancer (LB).
– DB
• Each DB partition has a primary and a secondary DB set up in a multi-master topology.
• SQL Server uses transactional replication to sync the primary and secondary. If a primary DB server goes down, requests are handled by the secondary DB server.
– File share: WAN Sync replicates critical files across the primary and secondary file servers. A VIP ensures automatic read availability for the SPS service when the primary goes down. Write availability for backend services is ensured by manual failover.
– Throttling prevents outages from abnormal traffic. It is configurable at both the server level and the partner level. Partner-level throttling is set at around 200% of normal peak traffic.
– The load balancer also has a secondary backup.
• Scalability
– Web & AppFabric cache: scalability is achieved by adding new nodes. Everything is stateless.
– DB: databases are hosted as a Webstore application. Scalability is achieved by partitioning; adding a data partition is very easy.
• Live site metrics
– Latency: 10 ms read, 30 ms update, 12 ms async update
– US: 39 web servers, 15 AppFabric caching servers, 10 SQL lookup servers, and 12 SQL backend (data) servers
– Asia: 17 web servers, 8 AppFabric caching servers, 8 SQL lookup servers, and 10 SQL backend (data) servers
– Europe: 16 web servers, 8 AppFabric caching servers, 8 SQL lookup servers, and 10 SQL backend (data) servers
– Current peak RPS per web box in the US is 375 (14.7K RPS across the US) at 40% peak CPU. Server capacity is around 600 RPS at 70% CPU.
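The US numbers above can be checked with a little arithmetic, sketched here in Python. The server counts and per-box rates come from the slide; the helper names and the 1,000 RPS partner figure are hypothetical. Note that 39 x 375 gives 14,625, slightly under the quoted 14.7K, which is presumably a rounded or measured figure.

```python
US_WEB_SERVERS = 39
PEAK_RPS_PER_BOX = 375      # observed peak, at about 40% CPU
CAPACITY_RPS_PER_BOX = 600  # estimated capacity, at about 70% CPU


def fleet_rps(servers, rps_per_box):
    """Total requests per second across identical web boxes."""
    return servers * rps_per_box


def partner_throttle_limit(normal_peak_rps, factor=2.0):
    """Partner-level throttle at around 200% of normal peak traffic."""
    return normal_peak_rps * factor


peak = fleet_rps(US_WEB_SERVERS, PEAK_RPS_PER_BOX)          # 14625
capacity = fleet_rps(US_WEB_SERVERS, CAPACITY_RPS_PER_BOX)  # 23400
headroom = capacity / peak                                  # 1.6
```

With these figures the US web tier has about 1.6x headroom over its current peak. A hypothetical partner whose normal peak is 1,000 RPS would be throttled at about 2,000 RPS.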
13. High-level Features
• Support shared namespace definitions – reduces the number of calls
• Support multiple levels of access control for shared namespaces
– Behind corp firewall
• Plug-in smart defaults per namespace
– Smart defaults return faster when the user doesn’t have customizations yet.
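A plug-in smart default can be pictured as a function registered per namespace that is consulted only when the user has nothing stored. The Python sketch below assumes that shape; the registry, the function names, and the "weather" namespace are hypothetical and not taken from the SPS API.

```python
_defaults_providers = {}


def register_defaults(namespace, provider):
    """Register a hypothetical smart-defaults plug-in for a namespace."""
    _defaults_providers[namespace] = provider


def read_namespace(namespace, stored, context):
    """Return stored settings, or smart defaults when nothing is stored."""
    if stored is not None:
        return stored
    provider = _defaults_providers.get(namespace)
    return provider(context) if provider else {}


# Example plug-in: default the weather location to the city from geo lookup.
register_defaults("weather", lambda ctx: {"locations": [ctx["city"]]})
```

For a new visitor, `read_namespace("weather", None, {"city": "Seattle"})` returns the default without any stored customizations being required.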
14. High-level Features
• Plug-in smart data validation per namespace
– Small pre-compiled DLLs perform the validation on the server
• Bulk upload of implicit user preference or clustering info
• Geolookup service – one-stop shop – reduces calls
• Supports both netTCP calls and WCF calls – within the same DC, netTCP is 35% faster than normal TCP
• Service is available globally (US, Europe, and Asia) – closer to the user.
15. High-level Features
• Introduction of an API for async update
– Designed to support implicit updates or storing session data. In this case the user makes no explicit effort to update his/her settings. Instead, just browsing a page or clicking a link stores the corresponding settings on SPS.
• Examples: recent stock list from looking up stock quotes on the MSN Money site, search history, list of articles where the user clicked thumbs up/down, etc.
– Two-stage updates: 1) data from the client request is first saved in cache; then 2) updates are written to the DB in batches, allowing a faster response time to the client. Optimized for writes.
– Data is in memory for a short period of time before being written to the DB. We use AppFabric high availability mode (i.e. dual cache copy) to minimize potential data loss. Data loss may occur only if both cache servers are down at the same time.
– Async update can be turned on/off at the attribute level via the admin UI. E.g. a user’s preferred locations do not use async update, but Money Recent Quotes may.
16. Anatomy of a Get API Call
[Diagram: steps (0) to (12) between the Partner, the SPS Endpoint (WCF, CF), the AppFabric Cache, Partition Lookup, the Core data partitions, the MSN Geo Lookup Service, and the Smart Defaults Providers. Labeled flows: Partner Request; Lookup Data in Cache; (Cache miss); UserId for lookup; Return Core Partition Information for User Record; User IP; User Location/Connection Info; Query for records; User Records; User RevIPInfo and Data missing from DB; Defaults for Missing Data; Write to Cache; Response.]
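The Get call on this slide reads as a cache-aside lookup: try the cache, and on a miss find the user's partition, read the records, fill gaps with smart defaults, and write the result back to the cache. A minimal Python sketch under that reading follows. Plain dicts stand in for the AppFabric cache, the lookup system, and the data partitions, and every name here is hypothetical.

```python
def get_settings(user_id, ip, cache, partition_of, partitions, defaults_for):
    """Cache-aside read; all arguments are in-memory stand-ins."""
    record = cache.get(user_id)
    if record is not None:
        return record  # cache hit: no lookup or DB work needed

    # Cache miss: ask the lookup system which partition holds the user.
    partition = partition_of.get(user_id)
    stored = {}
    if partition is not None:
        stored = partitions[partition].get(user_id, {})

    # Fill in anything missing from the DB with IP-based smart defaults.
    record = {**defaults_for(ip), **stored}

    cache[user_id] = record  # write back so the next read is a hit
    return record
```

Stored values win over defaults because they are merged last, so a user's explicit choices are never overwritten by a geo-derived default.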
17. Anatomy of an Update API Call
[Diagram: steps (0) to (10) between the Partner, the SPS Endpoint (WCF, CF), the Smart Validator and Smart Defaults Providers, Partition Lookup, the AppFabric Cache, and the Core data partitions. Labeled flows: Partner Request; Validate Request; Success/Fail; UserId for lookup; User Not Found; Create lookup record; Core Partition Information for User Record; Invalidate Cache; Write records; Success/Fail; Response.]
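The Update call on this slide can be read as: validate, find or create the user's lookup record, invalidate the cached copy, then write to the partition. The Python sketch below follows that order with dicts as stand-ins. The function name, the validator signature, and the way a new user is assigned to a partition are assumptions.

```python
def update_settings(user_id, settings, validate, cache, partition_of, partitions):
    """Synchronous update; returns True on success, False if validation fails."""
    # Validate the request with the namespace's validator plug-in.
    if not validate(settings):
        return False

    # Look up the user's partition; create a lookup record for a new user.
    if user_id not in partition_of:
        # Assumption: spread new users evenly across partitions.
        partition_of[user_id] = len(partition_of) % len(partitions)
    partition = partitions[partition_of[user_id]]

    # Invalidate the cached copy so the next read reloads from the DB.
    cache.pop(user_id, None)

    # Write the records to the user's partition.
    partition.setdefault(user_id, {}).update(settings)
    return True
```

Invalidating rather than updating the cache keeps the write path simple: the next Get call repopulates the entry from the partition.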
18. Anatomy of an Async Write Call
[Diagram: SPS Endpoint (WCF, CF), Main Cache, Async Cache, CacheSweeper, and Core data partitions.]
Request path:
1. Async Write Request
2. Invalidate Main cache
3. Write to Async cache
4. Return (success)
5. Response
CacheSweeper path:
a. Batch Read for DB Loading
b. DB Load
c. Invalidate Async cache
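The async write on this slide is a write-behind pattern: the request only touches the caches, and a background sweeper later moves pending data into the DB in batches. A minimal single-threaded Python sketch follows; the class and method names are hypothetical, and dicts stand in for the two caches and the database.

```python
class AsyncStore:
    """Write-behind sketch with in-memory stand-ins for caches and DB."""

    def __init__(self):
        self.main_cache = {}
        self.async_cache = {}
        self.db = {}

    def async_write(self, user_id, settings):
        # Invalidate the main cache so stale data is not served.
        self.main_cache.pop(user_id, None)
        # Park the update in the async cache; the DB is not touched yet.
        self.async_cache.setdefault(user_id, {}).update(settings)
        return True  # success is returned before any DB write

    def sweep(self):
        """CacheSweeper pass: batch read, load into DB, clear async cache."""
        batch = dict(self.async_cache)
        for user_id, settings in batch.items():
            self.db.setdefault(user_id, {}).update(settings)
            del self.async_cache[user_id]
        return len(batch)
```

Until `sweep()` runs, the pending data lives only in the async cache, which is why the deck relies on the dual-copy high availability mode to limit data loss.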
19. Anatomy of an Async Read Call
[Diagram: SPS Endpoint (WCF, CF), Main Cache, Async Cache, Partition Lookup, and Core data partitions.]
1. Read Request
2. Read from Async cache
3. Cache miss from Async cache
4. Read from Main Cache
5. Cache miss from Main cache
6. UserId for lookup (Partition Lookup, on cache miss)
7. Core Partition Information for User Record
8. Query for records
9. User Records
10. Write to Main Cache
11. Response
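The async read order on this slide checks the async cache first, so data that has not reached the DB yet is still visible, then the main cache, then the DB. The Python sketch below follows that order. The names are hypothetical, and returning the pending data directly on an async-cache hit is an assumption, since the slide only shows the miss path.

```python
def async_read(user_id, async_cache, main_cache, partition_of, partitions):
    """Read path for attributes that use async update."""
    # Pending writes that have not been swept to the DB yet.
    pending = async_cache.get(user_id)
    if pending is not None:
        return pending  # assumption: a hit is returned as-is

    # Then the main cache.
    cached = main_cache.get(user_id)
    if cached is not None:
        return cached

    # Both caches missed: partition lookup, then query the partition.
    partition = partition_of.get(user_id)
    record = {}
    if partition is not None:
        record = partitions[partition].get(user_id, {})

    main_cache[user_id] = record  # populate the main cache for next time
    return record
```

Checking the async cache first is what keeps reads consistent with recent async writes, since those writes invalidate the main cache and have not yet reached the DB.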
SPS stands for Shared Personalization Service. SPS is a service created by MSN to stop the proliferation of profiles. It is used by many teams at Microsoft, mostly in MSN. It provides backend storage of user customizations and optimization keys, and a service backbone that offers different entry points to the data.
Anonymous – there is no way to identify the person from this ID.