Developing polyglot persistence applications #javaone 2012

DEVELOPING POLYGLOT
PERSISTENCE APPLICATIONS
Chris Richardson

Author of POJOs in Action
Founder of the original CloudFoundry.com

@crichardson
crichardson@vmware.com
http://plainoldobjects.com/

Presentation goal
The benefits and drawbacks of polyglot persistence
and
how to design applications that use this approach

About Chris

http://www.theregister.co.uk/2009/08/19/springsource_cloud_foundry/

vmc push About-Chris

Developer Advocate for
CloudFoundry.com

Signup at http://cloudfoundry.com
promo code: cfjavaone

Agenda

• Why polyglot persistence?

• Using Redis as a cache

• Optimizing queries using Redis materialized views

• Synchronizing MySQL and Redis

• Tracking changes to entities

• Using a modular asynchronous architecture

Food to Go

• Take-out food delivery
service

• “Launched” in 2006

Food To Go Architecture

[Diagram: the CONSUMER uses Order taking and the RESTAURANT OWNER uses Restaurant Management; both modules use the MySQL Database]

Success ⇨ Growth challenges

• Increasing traffic

• Increasing data volume

• Distribute across a few data centers

• Increasing domain model complexity

Limitations of relational
databases

• Scalability

• Distribution

• Schema updates

• O/R impedance mismatch

• Handling semi-structured data

Solution: Spend Money

http://upload.wikimedia.org/wikipedia/commons/e/e5/Rising_Sun_Yacht.JPG

OR

http://www.trekbikes.com/us/en/bikes/road/race_performance/madone_5_series/madone_5_2/#

Solution: Use NoSQL

Benefits
• Higher performance
• Higher scalability
• Richer data-model
• Schema-less

Drawbacks
• Limited transactions
• Limited querying
• Relaxed consistency
• Unconstrained data

Example NoSQL Databases

Database    Key features
Cassandra   Extensible column store, very scalable, distributed
Neo4j       Graph database
MongoDB     Document-oriented, fast, scalable
Redis       Key-value store, very fast

http://nosql-database.org/ lists 122+ NoSQL
databases

Redis

• Advanced key-value store (K1 ⇨ V1, K2 ⇨ V2, ...)

• Very fast, e.g. 100K reqs/sec

• Optional persistence

• Transactions with optimistic locking

• Master-slave replication

• Sharding using client-side consistent hashing
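The last bullet can be illustrated with a minimal sketch of a client-side consistent-hash ring. Everything here is an assumption made for illustration: the class name `ConsistentHashRing`, the server names, the use of CRC32 as the hash, and the choice of 100 virtual nodes per server. Real Redis clients ship their own sharding implementations, which may differ.

```java
import java.nio.charset.StandardCharsets;
import java.util.List;
import java.util.SortedMap;
import java.util.TreeMap;
import java.util.zip.CRC32;

/** Hypothetical client-side shard selector; not taken from any Redis client library. */
class ConsistentHashRing {
    // Several ring positions per server even out the key distribution (assumed value).
    private static final int VIRTUAL_NODES = 100;
    private final TreeMap<Long, String> ring = new TreeMap<>();

    ConsistentHashRing(List<String> servers) {
        for (String server : servers) {
            addServer(server);
        }
    }

    void addServer(String server) {
        for (int i = 0; i < VIRTUAL_NODES; i++) {
            ring.put(hash(server + "#" + i), server);
        }
    }

    void removeServer(String server) {
        for (int i = 0; i < VIRTUAL_NODES; i++) {
            // Two-argument remove only deletes the position if this server owns it.
            ring.remove(hash(server + "#" + i), server);
        }
    }

    /** Returns the server that owns the first ring position at or after the key's hash. */
    String serverFor(String key) {
        if (ring.isEmpty()) {
            throw new IllegalStateException("no servers configured");
        }
        SortedMap<Long, String> tail = ring.tailMap(hash(key));
        return tail.isEmpty() ? ring.firstEntry().getValue() : tail.get(tail.firstKey());
    }

    private static long hash(String s) {
        CRC32 crc = new CRC32();
        crc.update(s.getBytes(StandardCharsets.UTF_8));
        return crc.getValue();
    }
}
```

The point of the ring is that removing one server only remaps the keys that server owned; keys on the remaining servers stay where they were.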

Sorted sets

Key     Value (members are sorted by score)
myset   a (score 5.0), b (score 10.0)

Adding members to a sorted set

zadd <key> <score> <value>

Command                    Sorted set in the Redis server afterwards
zadd myset 5.0 a           myset = a (5.0)
zadd myset 10.0 b          myset = a (5.0), b (10.0)
zadd myset 1.0 c           myset = c (1.0), a (5.0), b (10.0)

Retrieving members by index range

zrange <key> <start index> <end index>

myset = c (1.0), a (5.0), b (10.0)

zrange myset 0 1  ⇨  c, a

Retrieving members by score

zrangebyscore <key> <min value> <max value>

myset = c (1.0), a (5.0), b (10.0)

zrangebyscore myset 1 6  ⇨  c, a
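The three commands above can be mimicked in plain Java to make their semantics concrete. This is only an in-memory sketch of one sorted set, not a Redis client: the class name `SortedSetSketch` and its method names are made up for illustration, and only non-negative indexes are handled.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeSet;

/** In-memory stand-in for a single Redis sorted set; illustrates semantics only. */
class SortedSetSketch {
    private static final class Entry implements Comparable<Entry> {
        final double score;
        final String member;

        Entry(double score, String member) {
            this.score = score;
            this.member = member;
        }

        @Override
        public int compareTo(Entry other) {
            // Order by score first, then by member to break ties.
            int byScore = Double.compare(score, other.score);
            return byScore != 0 ? byScore : member.compareTo(other.member);
        }
    }

    private final Map<String, Double> scores = new HashMap<>();
    private final TreeSet<Entry> ordered = new TreeSet<>();

    /** Like ZADD: inserts the member, or moves it if it already has a score. */
    void zadd(double score, String member) {
        Double old = scores.put(member, score);
        if (old != null) {
            ordered.remove(new Entry(old, member));
        }
        ordered.add(new Entry(score, member));
    }

    /** Like ZRANGE with non-negative, inclusive start and stop indexes. */
    List<String> zrange(int start, int stop) {
        List<String> result = new ArrayList<>();
        int index = 0;
        for (Entry e : ordered) {
            if (index >= start && index <= stop) {
                result.add(e.member);
            }
            index++;
        }
        return result;
    }

    /** Like ZRANGEBYSCORE with inclusive min and max. */
    List<String> zrangeByScore(double min, double max) {
        List<String> result = new ArrayList<>();
        for (Entry e : ordered) {
            if (e.score >= min && e.score <= max) {
                result.add(e.member);
            }
        }
        return result;
    }
}
```

Adding a (5.0), b (10.0) and c (1.0) and then calling `zrange(0, 1)` or `zrangeByScore(1, 6)` reproduces the `c, a` results shown on the slides.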

Redis use cases

• Replacement for Memcached
• Session state
• Cache of data retrieved from system of record (SOR)
• Replica of SOR for queries needing high-performance

• Handling tasks that overload an RDBMS
• Hit counts - INCR
• Most recent N items - LPUSH and LTRIM
• Randomly selecting an item – SRANDMEMBER
• Queuing – Lists with LPOP, RPUSH, ….
• High score tables – Sorted sets and ZINCRBY
• …
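As an illustration of the "most recent N items" use case, here is a small in-memory sketch of the LPUSH-then-LTRIM idiom. The class `RecentItems` is hypothetical and uses a Java deque in place of a Redis list; with Redis the two steps would be an LPUSH followed by an LTRIM of the list to the first N elements.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

/** In-memory analogue of keeping the N newest items with LPUSH + LTRIM. */
class RecentItems {
    private final int capacity;
    private final Deque<String> items = new ArrayDeque<>();

    RecentItems(int capacity) {
        this.capacity = capacity;
    }

    void push(String item) {
        items.addFirst(item);              // corresponds to LPUSH: newest item goes to the head
        while (items.size() > capacity) {  // corresponds to LTRIM: drop everything past the first N
            items.removeLast();
        }
    }

    /** Returns the retained items, newest first. */
    List<String> newestFirst() {
        return new ArrayList<>(items);
    }
}
```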

Redis is great but there are
tradeoffs
• Low-level query language: PK-based access only

• Limited transaction model:

• Read first and then execute updates as batch

• Difficult to compose code

• Data must fit in memory

• Single-threaded server: run multiple with client-side sharding

• Missing features such as access control, ...

And don’t forget:

An RDBMS is fine for many applications

The future is polyglot

e.g. Netflix
• RDBMS
• SimpleDB
• Cassandra
• Hadoop/Hbase

IEEE Software Sept/October 2010 - Debasish Ghosh / Twitter @debasishg

Increase scalability by caching

[Diagram: the CONSUMER uses Order taking and the RESTAURANT OWNER uses Restaurant Management; both modules use a Cache and the MySQL Database]

Caching Options
• Where:

• Hibernate 2nd level cache

• Explicit calls from application code

• Caching aspect

• Cache technologies: Ehcache, Memcached, Infinispan, ...

Redis is also an option

Using Redis as a cache
• Spring 3.1 cache abstraction

• Annotations specify which methods to cache

• CacheManager - pluggable back-end cache

• Spring Data for Redis

• Simplifies the development of Redis applications

• Provides RedisTemplate (analogous to JdbcTemplate)

• Provides RedisCacheManager

Using Spring 3.1 Caching
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.cache.annotation.CacheEvict;
import org.springframework.cache.annotation.Cacheable;
import org.springframework.stereotype.Service;

@Service
public class RestaurantManagementServiceImpl implements RestaurantManagementService {

    private final RestaurantRepository restaurantRepository;

    @Autowired
    public RestaurantManagementServiceImpl(RestaurantRepository restaurantRepository) {
        this.restaurantRepository = restaurantRepository;
    }

    @Override
    public void add(Restaurant restaurant) {
        restaurantRepository.add(restaurant);
    }

    // Cache result
    @Override
    @Cacheable(value = "Restaurant")
    public Restaurant findById(int id) {
        return restaurantRepository.findRestaurant(id);
    }

    // Evict from cache
    @Override
    @CacheEvict(value = "Restaurant", key = "#restaurant.id")
    public void update(Restaurant restaurant) {
        restaurantRepository.update(restaurant);
    }
}
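What the two annotations do can be sketched by hand without Spring. The following is an assumption-laden illustration, not how Spring implements its cache abstraction: `CachingFinder`, its `loader` function and the `loads` counter are hypothetical names, and a `HashMap` stands in for the Redis-backed cache.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.IntFunction;

/** Hand-written equivalent of "cache the result of findById, evict on update". */
class CachingFinder<V> {
    private final Map<Integer, V> cache = new HashMap<>();
    private final IntFunction<V> loader;  // stands in for the repository lookup
    int loads = 0;                        // counts how often the loader was actually called

    CachingFinder(IntFunction<V> loader) {
        this.loader = loader;
    }

    /** Plays the role of a cached findById: consult the cache, fall back to the loader. */
    V findById(int id) {
        V cached = cache.get(id);
        if (cached != null) {
            return cached;
        }
        loads++;
        V loaded = loader.apply(id);
        if (loaded != null) {
            cache.put(id, loaded);
        }
        return loaded;
    }

    /** Plays the role of the eviction on update: drop the stale entry. */
    void evict(int id) {
        cache.remove(id);
    }
}
```

A second `findById` for the same id is served from the cache; after `evict`, the next call goes back to the loader.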

Configuring the Redis Cache Manager

<!-- Enables caching -->
<cache:annotation-driven />

<!-- Specifies the CacheManager implementation -->
<bean id="cacheManager"
      class="org.springframework.data.redis.cache.RedisCacheManager" >
    <!-- The RedisTemplate used to access Redis -->
    <constructor-arg ref="restaurantTemplate"/>
</bean>

Domain object to key-value mapping?

[Diagram: a Restaurant object with its TimeRanges, MenuItems and ServiceArea on one side; Redis key-value pairs (K1 ⇨ V1, K2 ⇨ V2, ...) on the other]

RedisTemplate

• Analogous to JdbcTemplate

• Encapsulates boilerplate code, e.g. connection management

• Maps Java objects ⇔ Redis byte[]'s

Serializers: object ⇔ byte[]

• RedisTemplate has multiple serializers

• DefaultSerializer - defaults to JdkSerializationRedisSerializer

• KeySerializer

• ValueSerializer

• HashKeySerializer

• HashValueSerializer

Serializing a Restaurant as JSON
@Configuration
public class RestaurantManagementRedisConfiguration {

    @Autowired
    private RestaurantObjectMapperFactory restaurantObjectMapperFactory;

    private JacksonJsonRedisSerializer<Restaurant> makeRestaurantJsonSerializer() {
        JacksonJsonRedisSerializer<Restaurant> serializer =
            new JacksonJsonRedisSerializer<Restaurant>(Restaurant.class);
        ...
        return serializer;
    }

    @Bean
    @Qualifier("Restaurant")
    public RedisTemplate<String, Restaurant> restaurantTemplate(RedisConnectionFactory factory) {
        RedisTemplate<String, Restaurant> template = new RedisTemplate<String, Restaurant>();
        template.setConnectionFactory(factory);
        // Serialize restaurants using Jackson JSON
        JacksonJsonRedisSerializer<Restaurant> jsonSerializer = makeRestaurantJsonSerializer();
        template.setValueSerializer(jsonSerializer);
        return template;
    }
}

Caching with Redis

[Diagram: the CONSUMER uses Order taking and the RESTAURANT OWNER uses Restaurant Management; both modules go to the Redis Cache first and the MySQL Database second]

Finding available restaurants
Available restaurants =
those that serve the zip code of the delivery address
AND
are open at the delivery time

public interface AvailableRestaurantRepository {

    List<AvailableRestaurant> findAvailableRestaurants(Address deliveryAddress, Date deliveryTime);
    ...
}

Food to Go – Domain model (partial)
class Restaurant {
    long id;
    String name;
    Set<String> serviceArea;
    Set<TimeRange> openingHours;
    List<MenuItem> menuItems;
}

class TimeRange {
    long id;
    int dayOfWeek;
    int openTime;
    int closeTime;
}

class MenuItem {
    String name;
    double price;
}
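Before looking at SQL or Redis, the availability rule itself can be written as a plain in-memory check over this domain model. The sketch below is a reference for the rule only and makes several assumptions: the trimmed-down classes, the encoding of Monday as `1`, and the sample opening hours for restaurant 2 (taken from the denormalized table later in the deck) are illustrative.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** In-memory reference for "serves the zip code AND is open at the delivery time". */
class AvailabilitySketch {
    static final int MONDAY = 1; // assumed encoding of dayOfWeek

    static final class TimeRange {
        final int dayOfWeek;
        final int openTime;
        final int closeTime;

        TimeRange(int dayOfWeek, int openTime, int closeTime) {
            this.dayOfWeek = dayOfWeek;
            this.openTime = openTime;
            this.closeTime = closeTime;
        }
    }

    static final class Restaurant {
        final long id;
        final Set<String> serviceArea;
        final List<TimeRange> openingHours;

        Restaurant(long id, Set<String> serviceArea, List<TimeRange> openingHours) {
            this.id = id;
            this.serviceArea = serviceArea;
            this.openingHours = openingHours;
        }
    }

    /** Returns the ids of restaurants that serve the zip code and are open at timeOfDay (HHmm). */
    static List<Long> findAvailable(List<Restaurant> all, String zip, int dayOfWeek, int timeOfDay) {
        List<Long> ids = new ArrayList<>();
        for (Restaurant r : all) {
            if (!r.serviceArea.contains(zip)) {
                continue;
            }
            for (TimeRange tr : r.openingHours) {
                if (tr.dayOfWeek == dayOfWeek && tr.openTime <= timeOfDay && timeOfDay <= tr.closeTime) {
                    ids.add(r.id);
                    break; // one matching time range is enough
                }
            }
        }
        return ids;
    }

    /** Sample data: restaurant 1 is Ajanta, restaurant 2 is Montclair Eggshop (hours assumed). */
    static List<Restaurant> sampleData() {
        Restaurant ajanta = new Restaurant(1,
            new HashSet<>(Arrays.asList("94707", "94619")),
            Arrays.asList(new TimeRange(MONDAY, 1130, 1430), new TimeRange(MONDAY, 1730, 2130)));
        Restaurant eggshop = new Restaurant(2,
            new HashSet<>(Arrays.asList("94611", "94619")),
            Arrays.asList(new TimeRange(MONDAY, 700, 1430)));
        return Arrays.asList(ajanta, eggshop);
    }
}
```

For zip code 94619 on Monday at 1815 only restaurant 1 qualifies; at 1200 both do. The rest of the deck is about answering the same question without scanning every restaurant.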

Database schema
RESTAURANT table
ID   Name                …
1    Ajanta
2    Montclair Eggshop

RESTAURANT_ZIPCODE table
Restaurant_id   zipcode
1               94707
1               94619
2               94611
2               94619

RESTAURANT_TIME_RANGE table
Restaurant_id   dayOfWeek   openTime   closeTime
1               Monday      1130       1430
1               Monday      1730       2130
2               Tuesday     1130       …

Finding available restaurants on Monday, 6.15pm
for 94619 zipcode
Straightforward three-way join

select r.*
from restaurant r
  inner join restaurant_time_range tr
    on r.id = tr.restaurant_id
  inner join restaurant_zipcode sa
    on r.id = sa.restaurant_id
where '94619' = sa.zip_code
  and tr.day_of_week = 'monday'
  and tr.openingtime <= 1815
  and 1815 <= tr.closingtime

Option #1: Query caching

• [ZipCode, DeliveryTime] ⇨ list of available restaurants

BUT

• Long tail queries

• Update restaurant ⇨ Flush entire cache

Ineffective

Option #2: Master/Slave replication

[Diagram: writes and consistent reads go to the MySQL Master; queries (inconsistent reads) go to MySQL Slave 1, Slave 2, ... Slave N]

Master/Slave replication

• Mostly straightforward

BUT

• Assumes that SQL query is efficient

• Complexity of administration of slaves

• Doesn’t scale writes

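Option #3: Redis materialized views

[Diagram: the CONSUMER uses Order taking, which calls findAvailable() on the Redis Cache; the RESTAURANT OWNER uses Restaurant Management, which calls update() on the MySQL Database (the System of Record); updates are copied to Redis]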

BUT how to implement findAvailableRestaurants() with Redis?!

select r.*
from restaurant r
  inner join restaurant_time_range tr
    on r.id = tr.restaurant_id
  inner join restaurant_zipcode sa
    on r.id = sa.restaurant_id
where '94619' = sa.zip_code
  and tr.day_of_week = 'monday'
  and tr.openingtime <= 1815
  and 1815 <= tr.closingtime

⇨ ? ⇨

Redis key-value pairs: K1 ⇨ V1, K2 ⇨ V2, ...

Where we need to be

ZRANGEBYSCORE myset 1 6

=

select value, score
from sorted_set
where key = 'myset'
  and score >= 1
  and score <= 6

(sorted_set columns: key, value, score)

We need to denormalize

Think materialized view

Simplification #1: Denormalization

Restaurant_id   Day_of_week   Open_time   Close_time   Zip_code
1               Monday        1130        1430         94707
1               Monday        1130        1430         94619
1               Monday        1730        2130         94707
1               Monday        1730        2130         94619
2               Monday        0700        1430         94619
…

SELECT restaurant_id
FROM time_range_zip_code
WHERE day_of_week = 'Monday'
AND zip_code = 94619
AND 1815 < close_time
AND open_time < 1815

Simpler query:
• No joins
• Two = and two <

Simplification #2: Application filtering

SELECT restaurant_id, open_time
WHERE day_of_week = 'Monday'
AND zip_code = 94619
AND 1815 < close_time

Even simpler query:
• No joins
• Two = and one <
• The application filters the returned rows on open_time

Simplification #3: Eliminate multiple ='s with concatenation

Restaurant_id   Zip_dow        Open_time   Close_time
1               94707:Monday   1130        1430
1               94619:Monday   1130        1430
1               94707:Monday   1730        2130
1               94619:Monday   1730        2130
2               94619:Monday   0700        1430
…

SELECT restaurant_id, open_time
WHERE zip_code_day_of_week = '94619:Monday'

[Callouts: key, range]

Simplification #4: Eliminate multiple RETURN VALUES with concatenation

zip_dow        open_time_restaurant_id   close_time
94707:Monday   1130_1                    1430
94619:Monday   1130_1                    1430
94707:Monday   1730_1                    2130
94619:Monday   1730_1                    2130
94619:Monday   0700_2                    1430
...

SELECT open_time_restaurant_id
WHERE zip_code_day_of_week = '94619:Monday'
✔

Using a Redis sorted set as an index
zip_dow open_time_restaurant_id close_time
94707:Monday 1130_1 1430
94619:Monday 1130_1 1430
94707:Monday 1730_1 2130
94619:Monday 1730_1 2130
94619:Monday 0700_2 1430
...

Key Sorted Set [ Entry:Score, …]

94619:Monday [0700_2:1430, 1130_1:1430, 1730_1:2130]

94707:Monday [1130_1:1430, 1730_1:2130]
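The mapping from denormalized rows to sorted sets can be sketched in memory as follows. This is an illustration under assumptions, not the talk's code: the `Row` class, the builder name and the nested `TreeMap`/`TreeSet` layout (key, then score, then members) are hypothetical stand-ins for Redis sorted sets.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.TreeSet;

/** Builds "zip:day" -> (close_time score -> members "open_restaurantId") from denormalized rows. */
class AvailabilityIndexBuilder {
    static final class Row {
        final long restaurantId;
        final String dayOfWeek;
        final int openTime;
        final int closeTime;
        final String zipCode;

        Row(long restaurantId, String dayOfWeek, int openTime, int closeTime, String zipCode) {
            this.restaurantId = restaurantId;
            this.dayOfWeek = dayOfWeek;
            this.openTime = openTime;
            this.closeTime = closeTime;
            this.zipCode = zipCode;
        }
    }

    static Map<String, TreeMap<Integer, TreeSet<String>>> build(List<Row> rows) {
        Map<String, TreeMap<Integer, TreeSet<String>>> index = new TreeMap<>();
        for (Row row : rows) {
            String key = row.zipCode + ":" + row.dayOfWeek;                           // Simplification #3
            String member = String.format("%04d_%d", row.openTime, row.restaurantId); // Simplification #4
            index.computeIfAbsent(key, k -> new TreeMap<>())
                 .computeIfAbsent(row.closeTime, score -> new TreeSet<>())            // score = close_time
                 .add(member);
        }
        return index;
    }

    /** The five rows shown in the table above. */
    static List<Row> sampleRows() {
        return Arrays.asList(
            new Row(1, "Monday", 1130, 1430, "94707"),
            new Row(1, "Monday", 1130, 1430, "94619"),
            new Row(1, "Monday", 1730, 2130, "94707"),
            new Row(1, "Monday", 1730, 2130, "94619"),
            new Row(2, "Monday", 700, 1430, "94619"));
    }
}
```

Grouping the sample rows yields the same two sorted sets as on the slide: `94619:Monday` holds `0700_2` and `1130_1` at score 1430 plus `1730_1` at score 2130, and `94707:Monday` holds `1130_1` and `1730_1`.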

Querying with ZRANGEBYSCORE
Key Sorted Set [ Entry:Score, …]

94619:Monday [0700_2:1430, 1130_1:1430, 1730_1:2130]

94707:Monday [1130_1:1430, 1730_1:2130]

ZRANGEBYSCORE <delivery zip and day> <delivery time> 2359

ZRANGEBYSCORE 94619:Monday 1815 2359

⇨ {1730_1}

1730 is before 1815 ⇨ Ajanta is open
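The query plus the application-side filter can be traced with a small in-memory sketch. It assumes one sorted set is represented as a `NavigableMap` from score (closing time) to members; the class and method names are hypothetical, and `subMap` plays the role of ZRANGEBYSCORE with inclusive bounds.

```java
import java.util.List;
import java.util.NavigableMap;
import java.util.Set;
import java.util.TreeSet;

/** Range query on closing time, then application-side filtering on opening time. */
class AvailabilityQuerySketch {
    /**
     * @param closingTimes one sorted set: score (close time) -> members "openTime_restaurantId"
     * @param timeOfDay    delivery time as HHmm, e.g. 1815; assumed to be at most 2359
     * @return ids of restaurants that are open at timeOfDay
     */
    static Set<String> openRestaurantIds(NavigableMap<Integer, List<String>> closingTimes, int timeOfDay) {
        Set<String> ids = new TreeSet<>();
        // Equivalent of: ZRANGEBYSCORE key timeOfDay 2359
        for (List<String> members : closingTimes.subMap(timeOfDay, true, 2359, true).values()) {
            for (String member : members) {
                String[] parts = member.split("_");
                // Keep only time ranges that have already opened.
                if (Integer.parseInt(parts[0]) <= timeOfDay) {
                    ids.add(parts[1]);
                }
            }
        }
        return ids;
    }
}
```

With the `94619:Monday` set from the slide, a delivery time of 1815 selects only the entry with score 2130 (`1730_1`), and since 1730 <= 1815 the result is restaurant 1. A delivery time of 1500 also selects `1730_1`, but the filter rejects it because the restaurant has not opened yet.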

Adding a Restaurant
@Component
public class AvailableRestaurantRepositoryImpl implements AvailableRestaurantRepository {

    @Override
    public void add(Restaurant restaurant) {
        addRestaurantDetails(restaurant);
        addAvailabilityIndexEntries(restaurant);
    }

    // Store as JSON
    private void addRestaurantDetails(Restaurant restaurant) {
        restaurantTemplate.opsForValue().set(keyFormatter.key(restaurant.getId()), restaurant);
    }

    private void addAvailabilityIndexEntries(Restaurant restaurant) {
        for (TimeRange tr : restaurant.getOpeningHours()) {
            String indexValue = formatTrId(restaurant, tr);
            int dayOfWeek = tr.getDayOfWeek();
            int closingTime = tr.getClosingTime();
            for (String zipCode : restaurant.getServiceArea()) {
                // key = closingTimesKey(...), member = indexValue, score = closingTime
                redisTemplate.opsForZSet().add(closingTimesKey(zipCode, dayOfWeek), indexValue,
                                               closingTime);
            }
        }
    }
}

Finding available Restaurants
@Component
public class AvailableRestaurantRepositoryImpl implements AvailableRestaurantRepository {

    @Override
    public List<AvailableRestaurant> findAvailableRestaurants(Address deliveryAddress, Date deliveryTime) {
        String zipCode = deliveryAddress.getZip();
        int dayOfWeek = DateTimeUtil.dayOfWeek(deliveryTime);
        int timeOfDay = DateTimeUtil.timeOfDay(deliveryTime);
        String closingTimesKey = closingTimesKey(zipCode, dayOfWeek);

        // Find those that close after the delivery time
        Set<String> trsClosingAfter =
            redisTemplate.opsForZSet().rangeByScore(closingTimesKey, timeOfDay, 2359);

        // Filter out those that open after the delivery time
        Set<String> restaurantIds = new HashSet<String>();
        for (String tr : trsClosingAfter) {
            String[] values = tr.split("_");
            if (Integer.parseInt(values[0]) <= timeOfDay)
                restaurantIds.add(values[1]);
        }

        // Retrieve open restaurants
        Collection<String> keys = keyFormatter.keys(restaurantIds);
        return availableRestaurantTemplate.opsForValue().multiGet(keys);
    }
}

Sorry Ted!

http://en.wikipedia.org/wiki/Edgar_F._Codd

MySQL & Redis
need to be consistent

Two-Phase commit is not an
option

• Redis does not support it

• Even if it did, 2PC is best avoided http://www.infoq.com/articles/ebay-scalability-best-practices

ACID: Atomic, Consistent, Isolated, Durable

BASE: Basically Available, Soft state, Eventually consistent

BASE: An Acid Alternative http://queue.acm.org/detail.cfm?id=1394128

Updating Redis #FAIL

Failure 1: Redis has update, MySQL does not
begin MySQL transaction
update MySQL
update Redis
rollback MySQL transaction

Failure 2: MySQL has update, Redis does not
update MySQL
commit MySQL transaction
<<system crashes>>
update Redis

Updating Redis reliably
Step 1 of 2

In one ACID transaction:
  update MySQL
  queue CRUD event in MySQL
  commit transaction

CRUD event contents:
• Event Id
• Operation: Create, Update, Delete
• New entity state, e.g. JSON

Updating Redis reliably
Step 2 of 2

for each CRUD event in MySQL queue
    get next CRUD event from MySQL queue
    if CRUD event is not duplicate then
        update Redis (incl. eventId)
    end if
    mark CRUD event as processed
    commit transaction
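The two steps can be traced end to end with an in-memory sketch. Everything here is a stand-in chosen for illustration: maps replace MySQL and Redis, a list replaces the CRUD event table, and the class and field names are hypothetical. In the real design Step 1 relies on a MySQL transaction to make the entity update and the queued event atomic, which this single-threaded sketch does not model.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** In-memory walk-through of "queue a CRUD event, then apply it to Redis idempotently". */
class ReliableSyncSketch {
    static final class CrudEvent {
        final long eventId;
        final String operation;
        final long restaurantId;
        final String json;
        boolean processed;

        CrudEvent(long eventId, String operation, long restaurantId, String json) {
            this.eventId = eventId;
            this.operation = operation;
            this.restaurantId = restaurantId;
            this.json = json;
        }
    }

    final Map<Long, String> mysqlRestaurants = new HashMap<>();   // stand-in for the system of record
    final List<CrudEvent> crudEventTable = new ArrayList<>();     // stand-in for the event queue table
    final Map<Long, String> redisRestaurants = new HashMap<>();   // stand-in for the Redis copy
    final Map<Long, Long> redisLastSeenEventId = new HashMap<>(); // per-restaurant duplicate marker
    private long nextEventId = 1;

    /** Step 1: update the entity and queue the event together. */
    void updateRestaurant(long restaurantId, String json) {
        mysqlRestaurants.put(restaurantId, json);
        crudEventTable.add(new CrudEvent(nextEventId++, "UPDATE", restaurantId, json));
    }

    /** Step 2: apply unprocessed events in order, skipping duplicates. Returns the number applied. */
    int processPendingEvents() {
        int applied = 0;
        for (CrudEvent event : crudEventTable) {
            if (event.processed) {
                continue;
            }
            long lastSeen = redisLastSeenEventId.getOrDefault(event.restaurantId, 0L);
            if (event.eventId > lastSeen) {
                redisRestaurants.put(event.restaurantId, event.json);
                redisLastSeenEventId.put(event.restaurantId, event.eventId);
                applied++;
            }
            event.processed = true;
        }
        return applied;
    }
}
```

If the process crashes after updating Redis but before marking an event as processed, the event is picked up again on the next run; the stored event id makes that second application a no-op.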

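[Diagram: Step 1: EntityCrudEventRepository writes to the ENTITY_CRUD_EVENT table (columns: ID, JSON, processed?) with INSERT INTO ... Step 2: a Timer triggers EntityCrudEventProcessor, which reads the table with SELECT ... FROM ... and calls apply(event) on RedisUpdater, which updates Redis]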

Updating Redis

Optimistic locking:
WATCH restaurant:lastSeenEventId:≪restaurantId≫

Duplicate detection:
lastSeenEventId = GET restaurant:lastSeenEventId:≪restaurantId≫
if (lastSeenEventId >= eventId) return;

Transaction:
MULTI
SET restaurant:lastSeenEventId:≪restaurantId≫ eventId
... update the restaurant data...
EXEC
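The WATCH / check / MULTI / EXEC pattern is a compare-and-set loop, and that shape can be shown with an `AtomicReference`. This is an analogy in plain Java, not Redis client code: the class `OptimisticUpdateSketch` and its `State` holder are hypothetical, and `compareAndSet` stands in for an EXEC that only succeeds if the watched key was not modified.

```java
import java.util.concurrent.atomic.AtomicReference;

/** Compare-and-set analogue of WATCH + duplicate check + MULTI/EXEC for one restaurant. */
class OptimisticUpdateSketch {
    static final class State {
        final long lastSeenEventId;
        final String restaurantJson;

        State(long lastSeenEventId, String restaurantJson) {
            this.lastSeenEventId = lastSeenEventId;
            this.restaurantJson = restaurantJson;
        }
    }

    private final AtomicReference<State> state = new AtomicReference<>(new State(0, null));

    /** Returns true if the event was applied, false if it was detected as a duplicate. */
    boolean apply(long eventId, String json) {
        while (true) {
            State current = state.get();                  // WATCH + GET lastSeenEventId
            if (current.lastSeenEventId >= eventId) {
                return false;                             // duplicate detection
            }
            State next = new State(eventId, json);        // MULTI: SET lastSeenEventId, update data
            if (state.compareAndSet(current, next)) {     // EXEC: succeeds only if nothing changed
                return true;
            }
            // Another writer got in first: re-read and try again.
        }
    }

    String currentJson() {
        return state.get().restaurantJson;
    }
}
```

Writing the event id and the restaurant data in the same atomic step is what makes a replayed event harmless.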

How do we generate CRUD
events?

Change tracking options

• Explicit code

• Hibernate event listener

• Service-layer aspect

• CQRS/Event-sourcing
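To give a feel for the service-layer aspect option, here is a sketch that uses a JDK dynamic proxy instead of a real AOP framework. All names are hypothetical (`ChangeTrackingProxySketch`, the tiny `RestaurantService` interface, the string-based events), and the rule "methods named add or update produce events" is an assumption made for the example.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;
import java.util.ArrayList;
import java.util.List;

/** Records a CRUD event whenever a mutating service method completes. */
class ChangeTrackingProxySketch {
    interface RestaurantService {
        void add(String restaurant);
        void update(String restaurant);
        boolean exists(String restaurant);
    }

    /** Trivial implementation used to exercise the proxy. */
    static final class InMemoryRestaurantService implements RestaurantService {
        private final List<String> restaurants = new ArrayList<>();

        public void add(String restaurant) { restaurants.add(restaurant); }
        public void update(String restaurant) { /* nothing to change in this toy model */ }
        public boolean exists(String restaurant) { return restaurants.contains(restaurant); }
    }

    /** Wraps target so that add/update calls append "CREATE:x" / "UPDATE:x" to events. */
    @SuppressWarnings("unchecked")
    static <T> T track(Class<T> type, T target, List<String> events) {
        InvocationHandler handler = (proxy, method, args) -> {
            // Call through first, so an event is only recorded if the call did not throw.
            Object result = method.invoke(target, args);
            if (method.getName().equals("add")) {
                events.add("CREATE:" + args[0]);
            } else if (method.getName().equals("update")) {
                events.add("UPDATE:" + args[0]);
            }
            return result;
        };
        return (T) Proxy.newProxyInstance(type.getClassLoader(), new Class<?>[] {type}, handler);
    }
}
```

Read-only calls such as `exists` pass through without producing an event. In the deck's design the event would be written to the CRUD event table in the same transaction as the change rather than appended to a list.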

[Diagram: HibernateEventListener ⇨ EntityCrudEventRepository ⇨ ENTITY_CRUD_EVENT table (columns: ID, JSON, processed?)]

Hibernate event listener
public class ChangeTrackingListener
        implements PostInsertEventListener, PostDeleteEventListener, PostUpdateEventListener {

    @Autowired
    private EntityCrudEventRepository entityCrudEventRepository;

    // Queue a CRUD event only for entity types that are replicated to Redis
    private void maybeTrackChange(Object entity, EntityCrudEventType eventType) {
        if (isTrackedEntity(entity)) {
            entityCrudEventRepository.add(new EntityCrudEvent(eventType, entity));
        }
    }

    @Override
    public void onPostInsert(PostInsertEvent event) {
        Object entity = event.getEntity();
        maybeTrackChange(entity, EntityCrudEventType.CREATE);
    }

    @Override
    public void onPostUpdate(PostUpdateEvent event) {
        Object entity = event.getEntity();
        maybeTrackChange(entity, EntityCrudEventType.UPDATE);
    }

    @Override
    public void onPostDelete(PostDeleteEvent event) {
        Object entity = event.getEntity();
        maybeTrackChange(entity, EntityCrudEventType.DELETE);
    }
}

Original architecture

[Diagram: a single WAR containing Restaurant Management, ...]

Drawbacks of this monolithic architecture

[Diagram: a single WAR containing Restaurant Management, ...]

• Obstacle to frequent deployments

• Overloads IDE and web container

• Obstacle to scaling development

• Technology lock-in

Need a more modular
architecture

Using a message broker

Asynchronous is preferred

JSON is fashionable but binary
format is more efficient

Modular architecture

[Diagram: components shown are Order taking (used by the CONSUMER), Restaurant Management (used by the RESTAURANT OWNER), a Timer-driven Event Publisher, the MySQL Database, Redis, the Redis Cache and RabbitMQ]

Benefits of a modular asynchronous architecture

• Scales development: develop, deploy and scale each service independently

• Redeploy UI frequently/independently

• Improves fault isolation

• Eliminates long-term commitment to a single technology stack

• Message broker decouples producers and consumers

Step 2 of 2

for each CRUD event in MySQL queue
    get next CRUD event from MySQL queue
    publish persistent message to RabbitMQ
    mark CRUD event as processed
    commit transaction
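One consequence of this loop is that a message can be published more than once if a crash happens between publishing and committing, so the consumer has to tolerate duplicates. The sketch below illustrates that with an in-memory queue in place of RabbitMQ; the class names, the `Consumer` helper and its set of seen event ids are assumptions made for the example.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Queue;
import java.util.Set;

/** Publish-then-mark loop with an in-memory queue standing in for the message broker. */
class EventPublisherSketch {
    static final class CrudEvent {
        final long eventId;
        final String json;
        boolean processed;

        CrudEvent(long eventId, String json) {
            this.eventId = eventId;
            this.json = json;
        }
    }

    final List<CrudEvent> crudEventTable = new ArrayList<>(); // stand-in for the MySQL event queue
    final Queue<CrudEvent> broker = new ArrayDeque<>();       // stand-in for a RabbitMQ queue

    /** Publishes every unprocessed event, then marks it processed. Returns the number published. */
    int publishPending() {
        int published = 0;
        for (CrudEvent event : crudEventTable) {
            if (event.processed) {
                continue;
            }
            broker.add(event);       // publish first ...
            event.processed = true;  // ... then mark; a crash in between causes a re-publish
            published++;
        }
        return published;
    }

    /** Consumer that applies each event id at most once. */
    static final class Consumer {
        final Set<Long> seenEventIds = new HashSet<>();
        final List<String> applied = new ArrayList<>();

        void drain(Queue<CrudEvent> queue) {
            CrudEvent event;
            while ((event = queue.poll()) != null) {
                if (seenEventIds.add(event.eventId)) { // add() returns false for a duplicate id
                    applied.add(event.json);
                }
            }
        }
    }
}
```

This is at-least-once delivery on the producer side combined with idempotent handling on the consumer side, the same idea as the event id check in the Redis update shown earlier.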

Message flow

[Diagram: EntityCrudEventProcessor ⇨ RedisUpdater ⇨ Spring Integration glue code ⇨ RABBITMQ ⇨ Spring Integration glue code ⇨ AvailableRestaurantManagementService ⇨ REDIS]

RedisUpdater ⇨ AMQP

<beans>
    <!-- Creates proxy -->
    <int:gateway id="redisUpdaterGateway"
        service-interface="net...RedisUpdater"
        default-request-channel="eventChannel"
        />

    <int:channel id="eventChannel"/>

    <int:object-to-json-transformer
        input-channel="eventChannel" output-channel="amqpOut"/>

    <int:channel id="amqpOut"/>

    <amqp:outbound-channel-adapter
        channel="amqpOut"
        amqp-template="rabbitTemplate"
        routing-key="crudEvents"
        exchange-name="crudEvents"
        />

</beans>

AMQP ⇨ Available...Service

<beans>
    <amqp:inbound-channel-adapter
        channel="inboundJsonEventsChannel"
        connection-factory="rabbitConnectionFactory"
        queue-names="crudEvents"/>

    <int:channel id="inboundJsonEventsChannel"/>

    <int:json-to-object-transformer
        input-channel="inboundJsonEventsChannel"
        type="net.chrisrichardson.foodToGo.common.JsonEntityCrudEvent"
        output-channel="inboundEventsChannel"/>

    <int:channel id="inboundEventsChannel"/>

    <!-- Invokes service -->
    <int:service-activator
        input-channel="inboundEventsChannel"
        ref="availableRestaurantManagementServiceImpl"
        method="processEvent"/>
</beans>

Summary

• Each SQL/NoSQL database = set of tradeoffs

• Polyglot persistence: leverage the strengths of SQL and NoSQL databases

• Use Redis as a distributed cache

• Store denormalized data in Redis for fast querying

• Reliable database synchronization required

@crichardson crichardson@vmware.com
http://slideshare.net/chris.e.richardson/

Questions?
Sign up for CloudFoundry.com using promo code cfjavaone
