This document discusses Viki's approach to breaking up its monolithic architecture into a distributed system of microservices. It describes how Viki used Redis and custom data structures like sets and sorted sets to build highly performant read-heavy services for features like content filtering and recommendations. It also explains how Viki uses a centralized message queue and event-driven architecture to connect its services and ensure consistency. The goal was to achieve response times around 25ms while supporting Viki's large global user base and content ecosystem.
14. ...viewers start leaving if video
doesn't play in 2 seconds...
...and every additional second of
delay sends about 6% more
viewers jumping ship!
Video Stream Quality Impacts Viewer Behavior: Inferring Causality Using Quasi-Experimental Designs. S. Shunmuga Krishnan, Ramesh K. Sitaraman, 2012
24. Bitmaps
genre:1 -> 010101001
genre:2 -> 010010011
type:music -> 011000001
intersect -> 010000001 (ids: 2 and 9)
Good: speed for intersect & memory efficiency
Bad: you still have to get the ids and the real data,
then handle sorting, paging...
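The intersection on this slide can be reproduced with plain Python integers. This is a minimal sketch for illustration only, not Viki's implementation: the helper names `parse_bitmap` and `ids_from_bitmap` are hypothetical, and ids are assumed to be 1-based positions counted from the left, which matches the "ids: 2 and 9" result above.

```python
def parse_bitmap(bits: str) -> int:
    """Turn a string such as '010101001' into an integer bitmap."""
    return int(bits, 2)


def ids_from_bitmap(bitmap: int, width: int) -> list[int]:
    """Return the 1-based positions (counted from the left) whose bit is set."""
    return [pos for pos in range(1, width + 1)
            if (bitmap >> (width - pos)) & 1]


WIDTH = 9  # assumed bitmap width, taken from the slide's examples
genre_1 = parse_bitmap("010101001")
genre_2 = parse_bitmap("010010011")
type_music = parse_bitmap("011000001")

# A bitwise AND keeps only the positions set in every bitmap.
matches = genre_1 & genre_2 & type_music
print(format(matches, f"0{WIDTH}b"))    # 010000001
print(ids_from_bitmap(matches, WIDTH))  # [2, 9]
```

The AND itself is a single cheap operation; the extra pass in `ids_from_bitmap` is the "get the ids" cost that the slide lists as a drawback.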
25. Everything is a set
genre:1 = [1, 2, 4]
type:1 = [1, 2, 5]
type:2 = [3, 4]
Good: Sparse (too many 0s with bitmaps)
Complexity O(n*m) (m is the number of sets, n the
number of elements in the smallest set)
genre:1 -> 10 elements
videos -> 100K elements
complexity: O(10)
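Why the smallest set drives the cost can be seen in a short sketch. This is an illustration in plain Python, assuming hash-based sets with constant-time membership checks; the function name `intersect` is hypothetical and this is not how Redis is implemented internally.

```python
def intersect(*sets: set[int]) -> set[int]:
    """Intersect sets by scanning the smallest one and probing the others."""
    smallest, *others = sorted(sets, key=len)
    # Work is proportional to len(smallest) * number of other sets.
    return {item for item in smallest
            if all(item in other for other in others)}


genre_1 = {1, 2, 4}
type_1 = {1, 2, 5}
type_2 = {3, 4}

print(sorted(intersect(genre_1, type_1)))  # [1, 2]
print(sorted(intersect(genre_1, type_2)))  # [4]

# The slide's sizing example: 10 elements against 100K elements.
small_genre = set(range(10))
videos = set(range(100_000))
print(len(intersect(small_genre, videos)))  # 10
```

In the last call only the 10 elements of the small set are visited, regardless of how large `videos` is.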
26. Building our own indexes
Data stored
Keeping track of the indexes
How do we find data...
redis.call('sort', my_set, 'BY', 'v:*->created_at', 'desc',
           'LIMIT', offset, count, 'GET', 'v:*->details')
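What the SORT call above does with its BY and GET patterns can be modeled in plain Python. This is a rough sketch with dicts standing in for Redis hashes; the sample ids, timestamps, and `details` payloads are invented, and `sort_by_created_at` is a hypothetical helper rather than part of any Redis client.

```python
# Hashes stored under v:<id>, modeled as a dict of dicts (sample data is made up).
hashes = {
    "v:1": {"created_at": 100, "details": '{"title": "a"}'},
    "v:2": {"created_at": 300, "details": '{"title": "b"}'},
    "v:4": {"created_at": 200, "details": '{"title": "c"}'},
}


def sort_by_created_at(ids, offset, count):
    """Mimic: SORT my_set BY v:*->created_at DESC LIMIT offset count GET v:*->details."""
    # BY: look up the sort weight in each element's hash instead of using the id.
    ordered = sorted(ids, key=lambda i: hashes[f"v:{i}"]["created_at"], reverse=True)
    # LIMIT: keep a single page.
    page = ordered[offset:offset + count]
    # GET: return a field of the hash rather than the id itself.
    return [hashes[f"v:{i}"]["details"] for i in page]


my_set = {1, 2, 4}
print(sort_by_created_at(my_set, 0, 2))  # ['{"title": "b"}', '{"title": "c"}']
```

The `*` in each pattern is replaced by the set member, so one call covers lookup, ordering, paging, and fetching the payload.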
27. Holdbacks
Old system: not in (id, id, id)
First attempt: a hash with a list of rule
permutations (too many countries in the world!)
CAP -> 10GB. meh~
alias matching permutations: 800MB ;)
32-bit Redis, even better!
CAP is just another set! (well, but a DIFF!)
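Treating a restriction as "just another set" applied with a difference might look like the sketch below. It is an assumption-laden illustration: the `holdbacks` mapping, the country codes, and the `visible` helper are hypothetical, and the slides do not show how the real rules are keyed.

```python
genre_1 = {1, 2, 4}
type_1 = {1, 2, 5}

# Hypothetical holdback sets: ids that must NOT be shown in a given country.
holdbacks = {
    "sg": {2},
    "us": set(),
}


def visible(country: str, *filters: set[int]) -> set[int]:
    """Intersect the filter sets, then subtract the country's holdback set."""
    matched = set.intersection(*filters)
    return matched - holdbacks.get(country, set())


print(sorted(visible("sg", genre_1, type_1)))  # [1]
print(sorted(visible("us", genre_1, type_1)))  # [1, 2]
```

This replaces the old "not in (id, id, id)" clause with one set operation at the end of the same pipeline.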
28. Hacking Redis
Vfind: Building our own Redis function
Setlets: Pre-calculated sorted lists
Most requests 18~20ms, some cases 100ms
(depends on the bigger set)
Vfind only gets content to fill 1 page: 15ms
Paging just showing more: 9ms
Serialization of JSON: 5ms
Enough.. for now!
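The slides give no code for Vfind, so the following is only one plausible reading of "gets content to fill 1 page": walk a pre-sorted list (a "setlet"), keep ids that pass every filter and are not held back, and stop as soon as the page is full. The function signature and sample data are assumptions made for illustration.

```python
def vfind(sorted_ids, filters, holdbacks, offset, count):
    """Walk a pre-sorted list and collect only enough ids to fill one page."""
    if count <= 0:
        return []
    page, skipped = [], 0
    for item in sorted_ids:
        # Drop ids that are held back or that fail any filter set.
        if item in holdbacks or not all(item in f for f in filters):
            continue
        # Skip matches that belong to earlier pages.
        if skipped < offset:
            skipped += 1
            continue
        page.append(item)
        if len(page) == count:
            break  # page is full: stop without scanning the rest
    return page


setlet = [9, 7, 5, 4, 2, 1]             # pre-calculated order (made-up data)
filters = [{1, 2, 4, 9}, {1, 2, 5, 9}]  # e.g. a genre set and a type set
held_back = {2}

print(vfind(setlet, filters, held_back, 0, 2))  # [9, 1]
print(vfind(setlet, filters, held_back, 1, 2))  # [1]
```

Stopping early is what keeps the cost tied to the page size rather than to the size of the full result.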
29. Lists
A list is just a sorted set. E.g. a list of
subscriptions, list of featured content...
Is a set!
You can apply holdbacks or any other filter.
31. Many web services
Each vertical is a source of truth
Logical and operational reasons
Everything routed through api.viki.io
Oceanus (Videos)
Activities (Behaviour)
Aphrodite (Community)
Gaia (Users)
32. Queue
Centralized queue for events and messages
Event-driven web services
Messages must be idempotent
Message / Events Queue
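Why messages must be idempotent: a queue may deliver the same event more than once, and a consumer should end up in the same state either way. The sketch below is a minimal illustration, assuming each event carries a unique `id`; the event shape and the `SubscriptionCounter` class are hypothetical and not taken from Viki's services.

```python
class SubscriptionCounter:
    """Consumer that applies each event at most once, keyed by event id."""

    def __init__(self):
        self.seen = set()   # ids of events already applied
        self.counts = {}    # user_id -> number of subscriptions

    def handle(self, event: dict) -> bool:
        """Apply the event; return False if it was a duplicate delivery."""
        if event["id"] in self.seen:
            return False
        self.seen.add(event["id"])
        user = event["user_id"]
        self.counts[user] = self.counts.get(user, 0) + 1
        return True


consumer = SubscriptionCounter()
event = {"id": "evt-1", "user_id": "u42", "type": "subscribed"}

print(consumer.handle(event))  # True
print(consumer.handle(event))  # False (redelivery is ignored)
print(consumer.counts)         # {'u42': 1}
```

In a real deployment the set of seen ids would need to live in shared, durable storage; an in-memory set is used here only to keep the example self-contained.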