REmote DIctionary Server
Shared variables on the network
Shared memory isn’t always bad
Dealing with locks is hard, Redis does that for you
Atomic operations through a single thread, fast as hell
Caching and queues are regular use cases but it’s capable of much more
Relevant now more than ever as microservices/SOA gain in popularity; distributed processes need ways to co-ordinate
This can be done via your regular SQL database but it is churn-heavy
Added load to the database is bad
Store intermediary results elsewhere
You will end up implementing many of Redis’ features
Much better to isolate it into a separate process/service
Also far superior to having an in-memory cache per machine, as values only need to be cached once for the whole system, not once per server/process
What we're going to cover
Redis is a key-value store at heart
Key can have an expiry set
Value can be one of many types
Cached function doesn't need to change
Key has to vary with parameters, namespacing is common
Problem with this is it can grow infinitely, need to think about expiring data
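The parameter-dependent, namespaced key can be sketched as follows. This is an illustration, not a real Redis API: `cache_key` and `cached` are hypothetical helpers, and `FakeRedis` is a dict-backed stand-in for a `redis.Redis` client supporting only GET/SET.

```python
def cache_key(namespace, *args):
    # Colon-separated namespacing is a common Redis convention,
    # e.g. "square:4" -- the key varies with the parameters.
    return ":".join([namespace] + [str(a) for a in args])

def cached(client, namespace, fn, *args):
    """Look up fn(*args) under a namespaced key; compute and store on a miss."""
    key = cache_key(namespace, *args)
    value = client.get(key)
    if value is None:
        value = str(fn(*args))
        client.set(key, value)  # a real call would also set an expiry
    return value

class FakeRedis:
    """Minimal dict-backed stand-in for a Redis client (GET/SET only)."""
    def __init__(self):
        self.store = {}
    def get(self, key):
        return self.store.get(key)
    def set(self, key, value):
        self.store[key] = value
```

Note that every distinct parameter combination creates a new key, which is exactly why the key space can grow without bound.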
Expiry
LRU cache
Always set one when possible
Relative is useful for "rolling windows" as demonstrated or caching "good enough" results
Absolute useful for "valid until" things such as hourly or daily reports
Rarely use absolute, instead build the time period into the key and set a relative expiry
Namespacing of keys (again)
Small overhead but good for your sanity
Named keys are the exception, better to be absolute
Allows for composite reports much more easily
Implicitly moves last_hour forward at the boundary
Can set expiry for the current hour differently in code
For example current hour cache for 1 minute, past hour cache for 1 month
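A sketch of building the time period into the key with a relative expiry chosen in code. The 1-minute/1-month TTL split is the example policy above; the helper names are illustrative.

```python
from datetime import datetime, timedelta

def hourly_key(name, when):
    # Build the hour into the key itself, e.g. "report:2024-06-01-13".
    # "Last hour" then moves forward implicitly at each hour boundary.
    return "%s:%s" % (name, when.strftime("%Y-%m-%d-%H"))

def ttl_for(when, now):
    # Current hour: cache briefly, it's still changing.
    # Finished hours: cache for a long time, they're immutable.
    if when.strftime("%Y-%m-%d-%H") == now.strftime("%Y-%m-%d-%H"):
        return 60                 # current hour: 1 minute
    return 60 * 60 * 24 * 30      # past hours: ~1 month
```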
Base value type is a string
Keys are always a string
Only going to cover the core ones, there's a few more specialised data types
The most simple case
Both the key and the value are a string
Can be treated like an integer in some cases, which we'll cover; highly optimised operations
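The integer treatment is what Redis's INCR command does: the stored value stays a string, but INCR interprets it as an integer and increments it atomically, creating it from 0 if absent. A dict-backed sketch of those semantics (not a real client):

```python
class FakeRedis:
    """Stand-in illustrating INCR semantics on string values."""
    def __init__(self):
        self.store = {}
    def incr(self, key):
        # Missing keys are treated as "0", so the first INCR returns 1.
        value = int(self.store.get(key, "0")) + 1
        self.store[key] = str(value)   # still stored as a string
        return value
    def get(self, key):
        return self.store.get(key)
```

In real Redis the increment is atomic because all commands run through a single thread, so concurrent clients can't lose updates.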
The value is a list of strings
Useful where order is important, building block for queue structures
The value is a unique set of strings
Useful where uniqueness is important, for example click tracking
Similar to a raw cache, but about avoiding work rather than storing the results of work
Can improve performance
High churn, so not well suited to a database
If the value is lost it's not the end of the world
Conditional GET is a special case of "touch"
Value doesn't really matter, the presence of key is what matters
Something has been done recently or is being done so don't duplicate work
Distributed locks are a special case of this; the basic pattern is usually sufficient
Often need short-lived values
Password reset links
Value contains user ID
Expiry set to make link invalid
Delete key on use to make single-use
Access tokens
Value contains details to look up full authorization
Also expire after a time, use refresh token to get a new one
Voucher codes
Similar to password reset links
Could store JSON as value for more complex relationships
No concept of storing an integer, but has specialised methods for treating them as such
More efficient when it comes to looking things up
Granularity matters for different periods of time
24 hours for minute-level
4 weeks for hour-level
Forever for day-level
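One way to sketch that retention policy: build the granularity and time bucket into the key, and pick the TTL from the granularity. The table of formats/TTLs mirrors the periods above; the helper names are hypothetical.

```python
from datetime import datetime

# Retention policy from above: finer granularity, shorter retention.
GRANULARITIES = {
    "minute": ("%Y-%m-%d-%H-%M", 60 * 60 * 24),       # keep 24 hours
    "hour":   ("%Y-%m-%d-%H",    60 * 60 * 24 * 28),  # keep 4 weeks
    "day":    ("%Y-%m-%d",       None),               # keep forever
}

def counter_key(name, granularity, when):
    # e.g. "hits:hour:2024-06-01-13"
    fmt, _ = GRANULARITIES[granularity]
    return "%s:%s:%s" % (name, granularity, when.strftime(fmt))

def counter_ttl(granularity):
    # None means no expiry (keep forever).
    return GRANULARITIES[granularity][1]
```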
Pipelining
Saves multiple round trips, perfect in this case
MGET - multiple GET
GET for a non-existent key (in our case never called INCR) returns nil
Trivial to map this to a single int in code
MGET is essentially inlined GETs, so the number of keys is effectively unlimited; but many O(1) operations still add up, another reason to split counters into sensible granularities and keep the array small
Building block of a rate limiter, though you would probably expire more aggressively
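The MGET-over-a-window pattern and the rate-limiter building block can be sketched together. `FakeRedis` stands in for a real client (INCR/MGET only); the nil-to-0 mapping is the "trivial to map" step above. The bucket labels and limit are illustrative.

```python
class FakeRedis:
    """Dict-backed stand-in for a Redis client (INCR/MGET only)."""
    def __init__(self):
        self.store = {}
    def incr(self, key):
        self.store[key] = int(self.store.get(key, 0)) + 1
        return self.store[key]
    def mget(self, keys):
        # MGET returns nil (None) for keys that were never INCRed.
        return [self.store.get(k) for k in keys]

def window_total(client, name, buckets):
    # Sum counts over the time buckets in one round trip,
    # mapping nil to 0 in code.
    keys = ["%s:%s" % (name, b) for b in buckets]
    return sum(int(v or 0) for v in client.mget(keys))

def allowed(client, name, bucket, buckets, limit):
    # Rate-limiter building block: count this request, then check the
    # rolling window. A real version would set aggressive expiries.
    client.incr("%s:%s" % (name, bucket))
    return window_total(client, name, buckets) <= limit
```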
Along a similar line to the time-series counter, but with uniqueness
Perfect for a Set
S prefix to commands
SADD deals with uniqueness on our behalf
SCARD - set cardinality (count)
Pipelining could be used to get multiple SCARDs like MGET
MGET is just the more common case, so it has a special command
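A sketch of the click-tracking example with SADD/SCARD semantics, using a dict-of-sets stand-in rather than a real client. The `clicks:` key scheme and helper names are illustrative.

```python
class FakeRedis:
    """Stand-in illustrating SADD/SCARD semantics."""
    def __init__(self):
        self.store = {}
    def sadd(self, key, member):
        # SADD handles uniqueness for us; returns 1 only if newly added.
        s = self.store.setdefault(key, set())
        if member in s:
            return 0
        s.add(member)
        return 1
    def scard(self, key):
        # SCARD: set cardinality (number of unique members).
        return len(self.store.get(key, set()))

def track_click(client, link_id, user_id):
    client.sadd("clicks:%s" % link_id, user_id)

def unique_clicks(client, link_id):
    return client.scard("clicks:%s" % link_id)
```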
Lists
Key is name of list, value is a string
LPOP/RPOP return nil if the list is empty
Prioritisation
More efficient to wait for work
Should always have a timeout (seconds)
Can know the process is alive if nothing else
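The blocking-pop worker loop can be sketched as follows. Redis's BLPOP takes multiple keys (checked in order, which gives prioritisation) and a timeout in seconds; the stand-in below mimics that interface without actually blocking, and the queue names are made up.

```python
import collections

class FakeRedis:
    """Stand-in list client; blpop here doesn't actually block."""
    def __init__(self):
        self.lists = collections.defaultdict(collections.deque)
    def rpush(self, key, value):
        self.lists[key].append(value)
    def blpop(self, keys, timeout):
        # Real BLPOP blocks for up to `timeout` seconds and checks the
        # keys in order -- checking "urgent" before "normal" is how
        # list-based prioritisation works.
        for key in keys:
            if self.lists[key]:
                return key, self.lists[key].popleft()
        return None  # timed out

def worker_step(client):
    item = client.blpop(["jobs:urgent", "jobs:normal"], timeout=5)
    if item is None:
        # Timed out: no work, but we now know the process is alive,
        # so this is a natural place for a heartbeat or log line.
        return None
    queue, payload = item
    return payload
```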
JSON
Can expire complex data structures too
Cached Fibonacci
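A minimal version of the cached Fibonacci demo, assuming a client with GET/SET (a dict-backed stand-in here) and the colon-namespaced key scheme from earlier:

```python
class FakeRedis:
    """Dict-backed stand-in for a Redis client (GET/SET only)."""
    def __init__(self):
        self.store = {}
    def get(self, key):
        return self.store.get(key)
    def set(self, key, value):
        self.store[key] = value

def fib(client, n):
    # Namespaced key varies with the parameter, e.g. "fib:10".
    key = "fib:%d" % n
    cached = client.get(key)
    if cached is not None:
        return int(cached)   # values come back as strings
    result = n if n < 2 else fib(client, n - 1) + fib(client, n - 2)
    client.set(key, str(result))  # real code would set an expiry too
    return result
```

With a shared Redis behind it, every process in the system benefits from each sub-result being computed once.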
Distributed processing - workers could simply print
Could send an email or text message
Possibly respond to sending process via response queue