Blending MongoDB
   with RDBMS
    for e-commerce
My name is
Steve Francia

    @spf13
• 15+ years building e-commerce
• Long time open source contributor
• Entrepreneur
• Hacker, father, husband, skate punk
• VP Engineering @ OpenSky
My name is
Justin Hileman

    @bobthecow
• 10+ years making the Internet
  awesomer

• Open Source contributor
• Vespa rider, swing dancer,
  coder, standardista

• Software Engineer @ OpenSky
We work for OpenSky

     http://opensky.com
OpenSky is
  a new way to shop

OpenSky connects you with innovators,
trendsetters and tastemakers. You choose
the ones you like and each week they invite
you to their private online sales.
OpenSky Loves
             Open Source
•   PHP 5.3
•   Apache2
•   Symfony2
•   Doctrine2
•   jQuery
•   Mule
•   HornetQ
•   MongoDB
•   nginx
•   varnish
We contribute to many
open source projects and
   pioneer innovative
  solutions using them
OpenSky was the first
e-commerce site built
    on MongoDB
... also the first e-commerce site built on Symfony2
Why NoSQL for
  e-commerce?


Using the right solution for each situation
Data dilemma of
      e-commerce
                 Pick One


• Stick to one vertical (Sane schema)
• Flexibility (Insane schema)
Sane schema

• Works ... for a while
• Fine for a few types of products
• Not possible when more product types
  introduced
Let’s Use an Example
   How about we start with books
Book Product Schema
Product {

// General product attributes
id:
sku:
product dimensions:
shipping weight:
MSRP:
price:
description:
...

// Book-specific attributes
author:           Orson Scott Card
title:            Ender's Game
binding:          Hardcover
publication date: July 15, 1994
publisher name:   Tor Science Fiction
number of pages:  352
ISBN:             0812550706
language:         English
...
}
Seems simple enough

What happens when we add another vertical...
            say music albums
Album Product Schema
Product {

// General product attributes stay the same
id:
sku:
product dimensions:
shipping weight:
MSRP:
price:
description:
...

// Album-specific attributes are different
artist:         MxPx
title:          Panic
release date:   June 7, 2005
label:          Side One Dummy
track listing:  [ The Darkest ...
language:       English
format:         CD
...
}
Okay, it’s getting hairy but
is still manageable, right?

    Now the business wants to sell jeans
Jeans Product Schema
Product {

// General product attributes stay the same
id:
sku:
product dimensions:
shipping weight:
MSRP:
price:
description:
...

// Jeans-specific attributes are totally different
// ... and not consistent across brands & make
brand:         Lucky
gender:        Mens
make:          Vintage
style:         Straight Cut
length:        34
width:         34
color:         Hipster
material:      Cotton Blend
...
}
Now we’re screwed
We need a flexible
schema in RDBMS


    We got this ... right?
Many approaches
dealing with unknown
unknowns in RDBMS


      None work well
EAV
             as popularized by Magento
“For purposes of flexibility, the Magento database heavily utilizes
an Entity-Attribute-Value (EAV) data model.

As is often the case, the cost of flexibility is complexity -
Magento is no exception.

The process of manipulating data in Magento is often more
“involved” than that typically experienced using traditional
relational tables.”
                         - Varien
EAV

•   Crazy SQL queries

•   Hundreds of joins in a query...
    or

•   Hundreds of queries joined in
    the application

•   No database enforced integrity
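
To make that cost concrete, here is a minimal sketch in plain JavaScript with no database involved. The row layout and the `hydrate` helper are illustrative assumptions, not Magento's actual schema. It shows what reading one product means under EAV: every attribute lives in its own row, so something has to pivot the rows back into an entity.

```javascript
// Hypothetical EAV rows: one row per (entity, attribute, value) triple.
// Every value is a string here, which hints at the lost type enforcement.
const eavRows = [
  { entityId: 1, attribute: "sku", value: "00e8da9c" },
  { entityId: 1, attribute: "title", value: "Hoss" },
  { entityId: 1, attribute: "format", value: "CD" },
  { entityId: 2, attribute: "sku", value: "00e8da9d" },
  { entityId: 2, attribute: "title", value: "The Matrix" },
];

// Pivot all rows for one entity back into a single object.
// In SQL this work typically shows up as a join (or a query) per attribute.
function hydrate(rows, entityId) {
  const entity = { id: entityId };
  for (const row of rows) {
    if (row.entityId === entityId) {
      entity[row.attribute] = row.value;
    }
  }
  return entity;
}

const album = hydrate(eavRows, 1);
console.log(album);
```

Nothing in this layout stops a row from carrying a misspelled attribute name or a value of the wrong type, which is the integrity problem the last bullet refers to.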
Did I say crazy SQL
(this is a single query)




You may have trouble reading this in the back
Selecting a single product
Single Table Inheritance
            (insanely wide tables)

•   No data integrity enforcement

•   Can only use FKs for common
    elements

•   Very wasteful (but disk is cheap!)

•   Can’t effectively index
Generic Columns
•   No data integrity enforcement

•   No data type enforcement

•   Can only use FKs for common
    elements

•   Wasteful (but disk is cheap!)

•   Can’t index
Serialized in Blob
•   Not searchable

•   No integrity

•   All the disadvantages of a document
    store, but none of the advantages

•   Should never be used

•   One exception is Oracle XML,
    which operates similarly to a
    document store
Concrete Table Inheritance
    (a table for each product attribute set)

•   Allows for data integrity

•   Querying across attribute
    sets quite hard to do (lots of
    joins, OR statements and full
    table scanning)

•   New table needs to be
    created for each new
    attribute set
Class table inheritance
                  (single product table,
             each attribute set in own table)
•   Likely best solution within the
    constraint of SQL

•   Supports data type enforcement

•   No data integrity enforcement

•   Easy querying across categories (for
    browse pages) since common data
    in single table

•   Every set needs a new table

•   Requires a ton of foresight, as
    changes are very complicated
MongoDB to the
        Rescue
• Flexible (and sane) Schema
• Easily searchable
• Easily accessible
• Fast
Flexible schema
{
    sku: "00e8da9c",
    type: "Audio Album",
    title: "Hoss",
    description: "by Lagwagon",
    asin: "B0000007QG",

    shipping: {
       weight: 6,
       dimensions: {
          width: 10,
          height: 10,
          depth: 1
       },
    },

    pricing: {
       list: 1000,
       retail: 800,
       savings: 200,
       pct_savings: 20
    },

    details: {
      title: "Hoss",
      artist: "Lagwagon",
      genre: [ "Punk", "Hardcore", "Indie Rock" ],
      label: "Fat Wreck Chords",
      number_of_discs: 1,
      issue_date: "November 21, 1995",
      format: "CD",
      alternate_formats: [ 'Vinyl', 'MP3' ],
      tracks: [
         "Kids Don't Like To Share",
         "Violins",
         "Name Dropping",
         "Bombs Away",
         "Move The Car",
         "Sleep",
         "Sick",
         "Rifle",
         "Weak",
         "Black Eye",
         "Bro Dependent",
         "Razor Burn",
         "Shaving Your Head",
         "Ride The Snake",
      ],
    }
}

{
    sku: "00e8da9d",
    type: "Film",
    title: "The Matrix",
    description: "Set in the 22nd century, The Matrix...",
    asin: "B000P0J0AQ",

    shipping: {
       weight: 6,
       dimensions: {
          width: 10,
          height: 10,
          depth: 1
       },
    },

    pricing: {
       list: 1200,
       retail: 1100,
       savings: 100,
       pct_savings: 8.5
    },

    details: {
      title: "The Matrix",
      director: [ "Andy Wachowski", "Larry Wa...
      writer: [ "Andy Wachowski", "Larry Wach...
      actor: [ "Keanu Reeves" , "Lawrence Fis...
      genre: [ "Science Fiction", "Action" ],
      number_of_discs: 1,
      issue_date: "May 15 2007",
      original_release_date: "1999",
      disc_format: "DVD",
      rating: "R",
      alternate_formats: [ 'VHS', 'Bluray' ],
      run_time: "136",
      studio: "Warner Bros",
      language: "English",
      format: [ "AC-3", "Closed-captioned", "...
      aspect_ratio: "1.66:1"
    },
}
Queries
db.products.find( { 'name': "The Matrix" } );


 {
     "_id": ObjectId("4d8ad78b46b731a22943d3d3"),
     "sku": "00e8da9d",
     "type": "Film",
     "name": "The Matrix",
     "description": "Set in the 22nd century, The Matrix...",
     "asin": "B000P0J0AQ",
     "shipping": {
         "weight": 6,
         "dimensions": {
             "width": 10,
             "height": 10,
             "depth": 1
         }
     },
     "pricing": {
db.products.find( { 'details.actor': "Groucho Marx" } );


 },
 "pricing": {
     "list": 1000,
     "retail": 800,
     "savings": 200,
     "pct_savings": 20
 },
 "details": {
     "title": "A Night at the Opera",
     "director": "Sam Wood",
     "actor": ["Groucho Marx", "Chico Marx", "Harpo Marx"],
     "genre": "Comedy",
     "number_of_discs": 1,
     "issue_date": "May 4 2004",
     "original_release_date": "1935",
     "disc_format": "DVD",
db.products.find( {
     'details.genre': "Jazz", 'details.format': "CD"
} );


     "list": 1200,
     "retail": 1100,
     "savings": 100,
     "pct_savings": 8
 },
 "details": {
     "title": "A Love Supreme [Original Recording Reissued]",
     "artist": "John Coltrane",
     "genre": ["Jazz", "General"],
     "format": "CD",
     "label": "Impulse Records",
     "number_of_discs": 1,
     "issue_date": "December 9, 1964",
     "alternate_formats": ["Vinyl", "MP3"],
     "tracks": [
     "A Love Supreme Part I: Acknowledgement",
db.products.find( { 'details.actor':
     { $all: ['James Stewart', 'Donna Reed'] }
} );


 },
 "details": {
     "title": "It's a Wonderful Life",
     "director": "Frank Capra",
     "actor": ["James Stewart", "Donna Reed", "Lionel Barrymore"],
     "writer": [
     "Frank Capra",
     "Albert Hackett",
     "Frances Goodrich",
     "Jo Swerling",
     "Michael Wilson"
     ],
     "genre": "Drama",
     "number_of_discs": 1,
     "issue_date": "Oct 31 2006",
     "original_release_date": "1947",
Wanna Play?

•   grab products.js from
    http://github.com/spf13/mongoProducts
•   mongo --shell products.js

•   > use mongoProducts
Embedded documents
 are great for orders
• Ordered items need to be fixed at the time
  of purchase
• Embed them right in the order
db.order.find( { 'items.sku': '00e8da9f' } );
db.order.find( {
    'items.details.actor': 'James Stewart'
} ).count();
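
A minimal sketch of why embedding fixes the ordered item at the time of purchase, written in plain JavaScript with no database. The `placeOrder` helper, the sample product, and its field values are made up for illustration. The point is that the order holds a copy of the product, not a reference to it.

```javascript
// Hypothetical helper: copy the product into the order at purchase time.
function placeOrder(userId, product, qty) {
  return {
    userId: userId,
    // A JSON round-trip makes a deep copy, so later catalog edits
    // cannot change what the customer actually bought.
    items: [{ ...JSON.parse(JSON.stringify(product)), qty: qty }],
  };
}

// Made-up catalog document, shaped like the product examples in this deck.
const product = {
  sku: "00e8da9f",
  pricing: { retail: 800 },
  details: { title: "Sample Film", actor: ["James Stewart"] },
};

const order = placeOrder("user-1", product, 1);

// The catalog price changes after the sale...
product.pricing.retail = 950;

// ...but the order still records the price that was paid.
console.log(order.items[0].pricing.retail); // 800
```

Because the whole product is embedded, dotted queries such as 'items.sku' or 'items.details.actor' can be answered from the order collection alone.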
Why not NoSQL?


Using the right solution for each situation
Data (like people) are
really sensitive when it
   comes to money
Stricter data
   requirements for $$

• For financial systems any data inconsistency
  is unacceptable
• Perhaps you’ve heard of ACID?
What about ACID?


Q: Is MongoDB ACID?
A: Kinda
Atomicity

• MongoDB does atomic writes
    ... for single document changesets


•   $set, $unset, $inc, $push,
    $pushAll, $pull, $pullAll, $bit
Consistency

• MongoDB can enforce unique keys
  ... but only on keys shared by every
  document in the collection
• MongoDB can't enforce referential integrity
Isolation
•   // Pseudo-isolated updates
    db.foo.update( { x : 1 } , { $inc : { y : 1 } } , false , true );

•   // Isolated updates
    db.foo.update( { x : 1 , $atomic : 1 } , { $inc : { y : 1 } } , false ,
    true );

•   But there are caveats...

     •    Despite the $atomic keyword, this is not an atomic update,
          since atomicity implies “all or nothing”

     •    An isolated update can only act on a single collection.
          Multi-collection updates are not transactional, thus not isolatable.
Durability


• Mongo has this one covered
What does
MongoDB Support?
•   Atomic single document writes
    •   If you need atomic writes across multi-document
        transactions don't use Mongo
    •   Many e-commerce transactions could be
        accomplished within a single document write
•   Unique indexes
    •   This only works on keys used by the entire collection
•   Isolated (not atomic) single collection updates.
    •   Mongo does not support locking
    •   There are ways to work around this
•   It’s durable
There are ways to
guarantee ACID properties
 in inconsistent databases
 (or, as we call them, consistency impaired databases)
Optimistic concurrency
• Read the current state of a product
• Make your changes with the assertion that
  your product has the same state as it did
  when you last read it
Optimistic concurrency
    in MongoDB
    We’ll use an update-if-current strategy.
    This example is straight from the documentation:

>   t = db.inventory
>   p = t.findOne({sku:'abc'})
>   t.update({_id:p._id, qty:p.qty}, {'$inc': {qty: -1}});
>   db.$cmd.findOne({getlasterror:1});

{"err" : null , "updatedExisting" : true , "n" : 1 , "ok" : 1}
// it worked



    ... If that didn't work, try again until it does.
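
The "try again until it does" loop can be sketched as follows. This is plain JavaScript against an in-memory Map standing in for the inventory collection, so `updateIfCurrent`, `decrementWithRetry`, and the `beforeWrite` hook are all assumptions made for illustration rather than MongoDB APIs. The conditional write mirrors the update-if-current query above: it only applies if qty still equals the value that was read.

```javascript
// In-memory stand-in for the inventory collection (an assumption for this sketch).
const inventory = new Map([["abc", { sku: "abc", qty: 2 }]]);

// Mimics update({sku, qty: expectedQty}, {$inc: {qty: -1}}):
// the write only applies if qty still equals the value we read.
function updateIfCurrent(sku, expectedQty) {
  const doc = inventory.get(sku);
  if (!doc || doc.qty !== expectedQty) {
    return { updatedExisting: false, n: 0 };
  }
  doc.qty -= 1;
  return { updatedExisting: true, n: 1 };
}

// Read, attempt the conditional write, and retry on conflict.
function decrementWithRetry(sku, maxAttempts, beforeWrite) {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const current = inventory.get(sku);
    if (!current || current.qty <= 0) {
      return { ok: false, reason: "out of stock", attempts: attempt };
    }
    const readQty = current.qty;
    if (beforeWrite) beforeWrite(attempt); // hook to simulate a competing writer
    if (updateIfCurrent(sku, readQty).updatedExisting) {
      return { ok: true, attempts: attempt };
    }
  }
  return { ok: false, reason: "too much contention", attempts: maxAttempts };
}

// Simulate another shopper buying one unit between our read and our write.
const result = decrementWithRetry("abc", 5, (attempt) => {
  if (attempt === 1) inventory.get("abc").qty -= 1;
});
console.log(result, inventory.get("abc").qty);
```

The first attempt fails because the quantity changed underneath it, and the second succeeds. The retry cap matters: under heavy contention most attempts fail, so an unbounded loop would spin.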
Optimistic concurrency
• Read the current state of a product.
• Make your changes with the assertion that
    your product has the same state as it did
    when you last read it.
•   It's possible to use OCC to bootstrap
    pessimistic concurrency and fake row level
    locking
    ... ask me about this some time
Optimistic concurrency
 control assumes an
environment with low
   data contention
OCC works great for
companies like Amazon

• Amazon has a long-tail catalog
• A long tail catalog lends itself well to
  optimistic concurrency, because it has low
  data contention
OCC fails miserably for
• eBay
• Gilt
• Groupon
• OpenSky
• Living Social
• InsertFlashSaleSiteOfTheMinute
Flash sales and auctions
are defined by high data
       contention

• The model doesn't work otherwise
• They can't afford to be optimistic
Can we use pessimistic
 concurrency with a
 distributed NoSQL
       database?
Yep.
Blending
NoSQL & RDBMS


Using the right solution for each situation
Our goal is to put as much
  in Mongo as possible

• What makes more sense in RDBMS?
 • Inventory
 • Orders
Inventory requires


• Row level locking (or table level locking)
Orders require

• Row level locking (or table level locking)
• Atomic writes (inventory decremented)
• Transactions (3rd party processing)
Inventory & checkout
     transactions
Commerce is ACID
   In Real Life
1. I go to Barneys and see a pair of shoes I just have to
   buy.
2. I call “dibs” (by grabbing them off the shelf).
3. I take them up to the cash register and purchase them:
    •   Store inventory has been manually decremented.
    •   I pay for them with my trusty AmEx.
4. If all goes according to plan, I walk out of the store.
5. If my card was declined, the shoes are “rolled back”
   ... out onto the shelves and sold to the next customer
   who wants them.
We follow the same
model for e-commerce
1. Select a product.

2. Lock the row or table and confirm inventory.

3. Purchase the product:

  •   Decrement product inventory

  •   Process payment

4. Commit the transaction.

5. Roll back if anything went wrong.
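
The five steps above can be sketched in plain JavaScript. This is an in-memory illustration only: the `stock` object stands in for the SQL inventory table, and `chargeCard` is a hypothetical payment gateway. A real implementation would rely on the database's own transactions and row locks rather than a saved snapshot.

```javascript
// In-memory stand-in for the SQL inventory table (an assumption for this sketch).
const stock = { "sku-1": 1 };

// Hypothetical payment gateway: throws for any card in the declined list.
function chargeCard(card, declined) {
  if (declined.includes(card)) throw new Error("card declined");
}

function checkout(sku, card, declined = []) {
  // Steps 1-2: select the product and confirm inventory.
  // A real system would hold a row lock here (e.g. SELECT ... FOR UPDATE).
  if ((stock[sku] || 0) < 1) return "sold out";
  const snapshot = stock[sku];      // remembered so we can roll back
  try {
    stock[sku] -= 1;                // Step 3a: decrement product inventory
    chargeCard(card, declined);     // Step 3b: process payment
    return "committed";             // Step 4: commit the transaction
  } catch (err) {
    stock[sku] = snapshot;          // Step 5: roll back if anything went wrong
    return "rolled back";
  }
}

const first = checkout("sku-1", "bad-card", ["bad-card"]);
const second = checkout("sku-1", "good-card");
const third = checkout("sku-1", "good-card");
console.log(first, second, third);
```

The declined purchase puts the unit back, so the next customer can still buy it, and only then does the item sell out.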
Doctrine (ORM/ODM)
    to the rescue
   It would be possible without them,
      but we're not that masochistic
Data we store in SQL

• Order
• Order/Shipment
• Order/Transaction
• Inventory
Data we store in
  MongoDB

•   User
•   Product
•   Product/Sellable
•   Address
•   Cart
•   CreditCard
•   Event
•   TaxRate
•   ... and then I got tired of typing them in
•   Just imagine this list has 40 more classes
•   ...
We have the
most boring SQL
  schema ever
CREATE TABLE `product_inventory` (
   `product_id` char(32) NOT NULL,
   `inventory` int(11) NOT NULL DEFAULT '0',
   PRIMARY KEY (`product_id`)
);

CREATE TABLE `sellable_inventory` (
   `sellable_id` char(32) NOT NULL,
   `inventory` int(11) NOT NULL DEFAULT '0',
   PRIMARY KEY (`sellable_id`)
);

CREATE TABLE `orders` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `userId` char(32) NOT NULL,
  `shippingName` varchar(255) DEFAULT NULL,
  `shippingAddress1` varchar(255) DEFAULT NULL,
  `shippingAddress2` varchar(255) DEFAULT NULL,
  `shippingCity` varchar(255) DEFAULT NULL,
  `shippingState` varchar(2) DEFAULT NULL,
  `shippingZip` varchar(255) DEFAULT NULL,
  `billingName` varchar(255) DEFAULT NULL,
  `billingAddress1` varchar(255) DEFAULT NULL,
  `billingAddress2` varchar(255) DEFAULT NULL,
  `billingCity` varchar(255) DEFAULT NULL,
Wait. How does
 inventory live in SQL?
Isn’t that a property in one of your Mongo collections?
I thought you’d
   never ask!
Inventory is transient
•   Product::$inventory is effectively a
    transient property
• Note how I said “effectively”? ... we cheat
    and persist our transient property to
    MongoDB as well
• We can do this because we never really
    trust the value stored in Mongo
Accuracy is only important
 when there’s contention
• For display, sorting and alerts, we can use
  the value stashed in MongoDB
  • It’s faster
  • It’s accurate enough
• For financial transactions, we want the
  security and comfort of our RDBMS.
We keep inventory in
 sync with listeners
• Every time a new product is created, its
  inventory is inserted in SQL
• Every time an order is placed, inventory is
  verified and decremented
• Whenever the SQL inventory changes, it is
  saved to MongoDB as well
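
A rough sketch of that listener flow in plain JavaScript. Both stores are in-memory Maps, and the listener registry and function names are hypothetical. The actual system uses Doctrine event listeners, which are not shown here.

```javascript
// In-memory stand-ins for the two stores (assumptions for this sketch).
const sqlInventory = new Map();   // source of truth
const mongoProducts = new Map();  // fast, "accurate enough" copy

// Tiny listener registry; the names are hypothetical.
const listeners = [];
function onInventoryChanged(fn) {
  listeners.push(fn);
}
function setSqlInventory(productId, qty) {
  sqlInventory.set(productId, qty);
  listeners.forEach((fn) => fn(productId, qty));
}

// Listener: whenever the SQL inventory changes, save it to the Mongo copy.
onInventoryChanged((productId, qty) => {
  const doc = mongoProducts.get(productId) || { _id: productId };
  doc.inventory = qty;
  mongoProducts.set(productId, doc);
});

// Product created: its inventory is inserted in SQL.
function createProduct(productId, title, qty) {
  mongoProducts.set(productId, { _id: productId, title: title });
  setSqlInventory(productId, qty);
}

// Order placed: inventory is verified and decremented in SQL.
function placeOrder(productId) {
  const qty = sqlInventory.get(productId) || 0;
  if (qty < 1) return false;
  setSqlInventory(productId, qty - 1);
  return true;
}

createProduct("p1", "Test Product", 2);
placeOrder("p1");
console.log(sqlInventory.get("p1"), mongoProducts.get("p1").inventory);
```

Writes only ever go through the SQL side. The Mongo copy is updated as a side effect, which is why it can be read for display but is never trusted for a sale.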
Be careful what you lock
1. Acquire inventory row lock and begin transaction
2. Check current product inventory
3. Decrement product inventory
4. Write the Order to SQL
5. Update affected MongoDB documents
6. Commit the transaction
7. Release product inventory lock
Making MongoDB
and RDBMS relations
      play nice
Products are
documents stored
  in MongoDB
/** @mongodb:Document(collection="products") */
class Product
{
    /** @mongodb:Id */
    private $id;

    /** @mongodb:String */
    private $title;

    public function getId()
    {
        return $this->id;
    }

    public function getTitle()
    {
        return $this->title;
    }

    public function setTitle($title)
    {
        $this->title = $title;
    }
}
Orders are entities
stored in an RDBMS
/**
 * @orm:Entity
 * @orm:Table(name="orders")
 * @orm:HasLifecycleCallbacks
 */
class Order
{
    /**
     * @orm:Id @orm:Column(type="integer")
     * @orm:GeneratedValue(strategy="AUTO")
     */
    private $id;

    /**
     * @orm:Column(type="string")
     */
    private $productId;

    /**
     * @var Documents\Product
     */
    private $product;

    // ...
}
So how does an
     RDBMS have a
reference to something
 outside the database?
Setting the Product
class Order {

    // ...

    public function setProduct(Product $product)
    {
        $this->productId = $product->getId();
        $this->product = $product;
    }
}
•   $productId is mapped and persisted

•   $product which stores the Product
    instance is not a persistent entity property
Retrieving our
product later
OrderPostLoadListener
use Doctrine\ODM\MongoDB\DocumentManager;
use Doctrine\ORM\Event\LifecycleEventArgs;

class OrderPostLoadListener
{
    /** @var DocumentManager */
    private $dm;

    // The listener needs the ODM document manager to build product references
    public function __construct(DocumentManager $dm)
    {
        $this->dm = $dm;
    }

    public function postLoad(LifecycleEventArgs $eventArgs)
    {
        // get the order entity
        $order = $eventArgs->getEntity();

        // get odm reference to order.product_id
        $productId = $order->getProductId();
        $product = $this->dm->getReference('MyBundle:Document\Product', $productId);

        // set the product on the order
        $em = $eventArgs->getEntityManager();
        $productReflProp = $em->getClassMetadata('MyBundle:Entity\Order')
            ->reflClass->getProperty('product');
        $productReflProp->setAccessible(true);
        $productReflProp->setValue($order, $product);
    }
}
All Together Now
// Create a new product and order
$product = new Product();
$product->setTitle('Test Product');
$dm->persist($product);
$dm->flush();

$order = new Order();
$order->setProduct($product);
$em->persist($order);
$em->flush();

// Find the order later
$order = $em->find('Order', $order->getId());

// Instance of an uninitialized product proxy
$product = $order->getProduct();

// Initializes proxy and queries the mongodb database
echo "Order Title: " . $product->getTitle();
print_r($order);
Read more about
       this technique
Jon Wage, one of OpenSky’s engineers, first
wrote about this technique on his personal
blog: http://jwage.com

You can read the full article here:
http://jwage.com/2010/08/25/blending-the-doctrine-orm-and-mongodb-odm/
Questions?
             http://spf13.com
                @spf13

             http://justinhileman.com
                @bobthecow

             http://opensky.com


PS: We’re hiring!! Contact us at jobs@opensky.com
Blending MongoDB and RDBMS for ecommerce

More Related Content

What's hot

Hive + Tez: A Performance Deep Dive
Hive + Tez: A Performance Deep DiveHive + Tez: A Performance Deep Dive
Hive + Tez: A Performance Deep Dive
DataWorks Summit
 
Understanding Presto - Presto meetup @ Tokyo #1
Understanding Presto - Presto meetup @ Tokyo #1Understanding Presto - Presto meetup @ Tokyo #1
Understanding Presto - Presto meetup @ Tokyo #1
Sadayuki Furuhashi
 

What's hot (20)

Migrating to MongoDB: Best Practices
Migrating to MongoDB: Best PracticesMigrating to MongoDB: Best Practices
Blending MongoDB and RDBMS for ecommerce

  • 1. Blending MongoDB with RDBMS for e-commerce
  • 2. My name is Steve Francia @spf13
  • 3. • 15+ years building e-commerce • Long time open source contributor • Entrepreneur • Hacker, father, husband, skate punk • VP Engineering @ OpenSky
  • 4. My name is Justin Hileman @bobthecow
  • 5. • 10+ years making the Internet awesomer • Open Source contributor • Vespa rider, swing dancer, coder, standardista • Software Engineer @ OpenSky
  • 6. We work for OpenSky http://opensky.com
  • 7. OpenSky is a new way to shop OpenSky connects you with innovators, trendsetters and tastemakers. You choose the ones you like and each week they invite you to their private online sales.
  • 8. OpenSky Loves Open Source • PHP 5.3 • Apache2 • Symfony2 • Doctrine2 • jQuery • Mule • HornetQ • MongoDB • nginx • varnish
  • 9. We contribute to many open source projects and pioneer innovative solutions using them
  • 10. OpenSky was the first e-commerce site built on MongoDB ... also the first e-commerce site built on Symfony2
  • 11. Why NoSQL for e-commerce? Using the right solution for each situation
  • 14. Data dilemma of e-commerce Pick One • Stick to one vertical (Sane schema) • Flexibility (Insane schema)
  • 18. Sane schema • Works ... for a while • Fine for a few types of products • Not possible when more product types are introduced
  • 20. Let’s Use an Example How about we start with books
  • 21. Book Product Schema
      Product {
        // General product attributes
        id:
        sku:
        product dimensions:
        shipping weight:
        MSRP:
        price:
        description:
        ...
        // Book-specific attributes
        author: Orson Scott Card
        title: Ender's Game
        binding: Hardcover
        publication date: July 15, 1994
        publisher name: Tor Science Fiction
        number of pages: 352
        ISBN: 0812550706
        language: English
        ...
  • 23. Seems simple enough What happens when we add another vertical... say music albums
  • 24. Album Product Schema
      Product {
        // General product attributes stay the same
        id:
        sku:
        product dimensions:
        shipping weight:
        MSRP:
        price:
        description:
        ...
        // Album-specific attributes are different
        artist: MxPx
        title: Panic
        release date: June 7, 2005
        label: Side One Dummy
        track listing: [ The Darkest ...
        language: English
        format: CD
        ...
  • 26. Okay, it’s getting hairy but is still manageable, right? Now the business wants to sell jeans
  • 27. Jeans Product Schema
      Product {
        // General product attributes stay the same
        id:
        sku:
        product dimensions:
        shipping weight:
        MSRP:
        price:
        description:
        ...
        // Jeans-specific attributes are totally different
        // ... and not consistent across brands & make
        brand: Lucky
        gender: Mens
        make: Vintage
        style: Straight Cut
        length: 34
        width: 34
        color: Hipster
        material: Cotton Blend
        ...
  • 30. We need a flexible schema in RDBMS We got this ... right?
  • 32. There are many approaches to dealing with unknown unknowns in an RDBMS. None work well.
  • 33. EAV as popularized by Magento “For purposes of flexibility, the Magento database heavily utilizes an Entity-Attribute-Value (EAV) data model. As is often the case, the cost of flexibility is complexity - Magento is no exception. The process of manipulating data in Magento is often more “involved” than that typically experienced using traditional relational tables.” - Varien
  • 34. EAV • Crazy SQL queries • Hundreds of joins in a query... or • Hundreds of queries joined in the application • No database enforced integrity
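To make the EAV complaints above concrete, here is a minimal in-memory sketch in JavaScript. It is not Magento's schema; the row layout and the helper names `hydrate` and `findEntityIds` are assumptions for illustration. It shows why every attribute filter costs another pass (the analogue of another self-join) and why nothing enforces types.

```javascript
// Hypothetical EAV table: one row per (entity, attribute, value) triple.
// Every value is stored as a string, so the database cannot enforce types.
const eavRows = [
  { entityId: 1, attribute: "title", value: "Ender's Game" },
  { entityId: 1, attribute: "author", value: "Orson Scott Card" },
  { entityId: 1, attribute: "binding", value: "Hardcover" },
  { entityId: 2, attribute: "title", value: "Panic" },
  { entityId: 2, attribute: "artist", value: "MxPx" },
  { entityId: 2, attribute: "format", value: "CD" },
  { entityId: 3, attribute: "brand", value: "Lucky" },
  { entityId: 3, attribute: "length", value: "34" },
];

// Rebuilding one product means collecting and pivoting all of its rows.
function hydrate(rows, entityId) {
  const product = { id: entityId };
  for (const row of rows) {
    if (row.entityId === entityId) product[row.attribute] = row.value;
  }
  return product;
}

// Filtering on N attributes needs N passes over the table,
// the in-memory analogue of N self-joins in SQL.
function findEntityIds(rows, criteria) {
  let ids = null;
  for (const [attribute, value] of Object.entries(criteria)) {
    const matching = new Set(
      rows
        .filter((r) => r.attribute === attribute && r.value === value)
        .map((r) => r.entityId)
    );
    ids = ids === null ? matching : new Set([...ids].filter((id) => matching.has(id)));
  }
  return ids === null ? [] : [...ids];
}

console.log(hydrate(eavRows, 1));
console.log(findEntityIds(eavRows, { binding: "Hardcover", author: "Orson Scott Card" }));
```

Note that the jeans length comes back as the string "34": with a single generic value column, type checking is left entirely to the application.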
  • 36. Did I say crazy SQL (this is a single query) You may have trouble reading this in the back
  • 38. Single Table Inheritance (insanely wide tables) • No data integrity enforcement • Can only use FKs for common elements • Very wasteful (but disk is cheap!) • Can’t effectively index
  • 39. Generic Columns • No data integrity enforcement • No data type enforcement • Can only use FKs for common elements • Wasteful (but disk is cheap!) • Can’t index
  • 40. Serialized in Blob • Not searchable • No integrity • All the disadvantages of a document store, but none of the advantages • Should never be used • One exception is Oracle XML, which operates similarly to a document store
  • 41. Concrete Table Inheritance (a table for each product attribute set) • Allows for data integrity • Querying across attribute sets quite hard to do (lots of joins, OR statements and full table scanning) • New table needs to be created for each new attribute set
  • 42. Class table inheritance (single product table, each attribute set in its own table) • Likely the best solution within the constraints of SQL • Supports data type enforcement • No data integrity enforcement • Easy querying across categories (for browse pages) since common data is in a single table • Every set needs a new table • Requires a ton of foresight, as changes are very complicated
  • 47. MongoDB to the Rescue • Flexible (and sane) Schema • Easily searchable • Easily accessible • Fast
  • 49-50. Two sample product documents shown side by side (values ending in … are cut off on the slide)
      // Audio album
      {
        sku: "00e8da9c",
        type: "Audio Album",
        title: "Hoss",
        description: "by Lagwagon",
        asin: "B0000007QG",
        shipping: {
          weight: 6,
          dimensions: {
            width: 10,
            height: 10,
            depth: 1
          },
        },
        pricing: {
          list: 1000,
          retail: 800,
          savings: 200,
          pct_savings: 20
        },
        details: {
          title: "Hoss",
          artist: "Lagwagon",
          genre: [ "Punk", "Hardcore", "Indie Rock" ],
          label: "Fat Wreck Chords",
          number_of_discs: 1,
          issue_date: "November 21, 1995",
          format: "CD",
          alternate_formats: [ 'Vinyl', 'MP3' ],
          tracks: [
            "Kids Don't Like To Share",
            "Violins",
            "Name Dropping",
            "Bombs Away",
            "Move The Car",
            "Sleep",
            "Sick",
            "Rifle",
            "Weak",
            "Black Eye",
            "Bro Dependent",
            "Razor Burn",
            "Shaving Your Head",
            "Ride The Snake",
          ],

      // Film
      {
        sku: "00e8da9d",
        type: "Film",
        title: "The Matrix",
        description: "Set in the 22nd century, Th…
        asin: "B000P0J0AQ",
        shipping: {
          weight: 6,
          dimensions: {
            width: 10,
            height: 10,
            depth: 1
          },
        },
        pricing: {
          list: 1200,
          retail: 1100,
          savings: 100,
          pct_savings: 8.5
        },
        details: {
          title: "The Matrix",
          director: [ "Andy Wachowski", "Larry Wa…
          writer: [ "Andy Wachowski", "Larry Wach…
          actor: [ "Keanu Reeves" , "Lawrence Fis…
          genre: [ "Science Fiction", "Action" ],
          number_of_discs: 1,
          issue_date: "May 15 2007",
          original_release_date: "1999",
          disc_format: "DVD",
          rating: "R",
          alternate_formats: [ 'VHS', 'Bluray' ],
          run_time: "136",
          studio: "Warner Bros",
          language: "English",
          format: [ "AC-3", "Closed-captioned", "…
          aspect_ratio: "1.66:1"
        },
      }
  • 53. db.products.find( { 'name': "The Matrix" } );
      {
        "_id": ObjectId("4d8ad78b46b731a22943d3d3"),
        "sku": "00e8da9d",
        "type": "Film",
        "name": "The Matrix",
        "description": "Set in the 22nd century, The Matrix...",
        "asin": "B000P0J0AQ",
        "shipping": {
          "weight": 6,
          "dimensions": {
            "width": 10,
            "height": 10,
            "depth": 1
          }
        },
        "pricing": {
        ...
  • 55. db.products.find( { 'details.actor': "Groucho Marx" } );
        ...
        },
        "pricing": {
          "list": 1000,
          "retail": 800,
          "savings": 200,
          "pct_savings": 20
        },
        "details": {
          "title": "A Night at the Opera",
          "director": "Sam Wood",
          "actor": ["Groucho Marx", "Chico Marx", "Harpo Marx"],
          "genre": "Comedy",
          "number_of_discs": 1,
          "issue_date": "May 4 2004",
          "original_release_date": "1935",
          "disc_format": "DVD",
          ...
  • 57. db.products.find( { 'details.genre': "Jazz", 'details.format': "CD" } );
          ...
          "list": 1200,
          "retail": 1100,
          "savings": 100,
          "pct_savings": 8
        },
        "details": {
          "title": "A Love Supreme [Original Recording Reissued]",
          "artist": "John Coltrane",
          "genre": ["Jazz", "General"],
          "format": "CD",
          "label": "Impulse Records",
          "number_of_discs": 1,
          "issue_date": "December 9, 1964",
          "alternate_formats": ["Vinyl", "MP3"],
          "tracks": [
            "A Love Supreme Part I: Acknowledgement",
            ...
  • 59. db.products.find( { 'details.actor': { $all: ['James Stewart', 'Donna Reed'] } } );
        ...
        },
        "details": {
          "title": "It's a Wonderful Life",
          "director": "Frank Capra",
          "actor": ["James Stewart", "Donna Reed", "Lionel Barrymore"],
          "writer": [
            "Frank Capra",
            "Albert Hackett",
            "Frances Goodrich",
            "Jo Swerling",
            "Michael Wilson"
          ],
          "genre": "Drama",
          "number_of_discs": 1,
          "issue_date": "Oct 31 2006",
          "original_release_date": "1947",
          ...
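The three query shapes on these slides (dotted paths into embedded documents, matching a value inside an array, and `$all`) can be mimicked with a small in-memory matcher. This is a deliberately simplified sketch, not MongoDB's query engine; `getPath`, `matches` and `find` are hypothetical helpers, and the sample documents are trimmed versions of the ones on the slides.

```javascript
// Trimmed copies of the documents shown on the slides.
const products = [
  { type: "Film", details: { title: "A Night at the Opera",
      actor: ["Groucho Marx", "Chico Marx", "Harpo Marx"], genre: "Comedy" } },
  { type: "Audio Album", details: { title: "A Love Supreme [Original Recording Reissued]",
      artist: "John Coltrane", genre: ["Jazz", "General"], format: "CD" } },
  { type: "Film", details: { title: "It's a Wonderful Life",
      actor: ["James Stewart", "Donna Reed", "Lionel Barrymore"], genre: "Drama" } },
];

// Resolve a dotted path such as "details.actor" against a document.
function getPath(doc, path) {
  return path.split(".").reduce((value, key) => (value == null ? undefined : value[key]), doc);
}

// A scalar condition matches a scalar field by equality and an array field
// by membership; { $all: [...] } requires every listed value to be present.
function matches(doc, query) {
  return Object.entries(query).every(([path, condition]) => {
    const value = getPath(doc, path);
    const values = Array.isArray(value) ? value : [value];
    if (condition !== null && typeof condition === "object" && Array.isArray(condition.$all)) {
      return condition.$all.every((wanted) => values.includes(wanted));
    }
    return values.includes(condition);
  });
}

function find(collection, query) {
  return collection.filter((doc) => matches(doc, query));
}

console.log(find(products, { "details.actor": "Groucho Marx" }).map((p) => p.details.title));
console.log(find(products, { "details.genre": "Jazz", "details.format": "CD" }).map((p) => p.details.title));
console.log(find(products, { "details.actor": { $all: ["James Stewart", "Donna Reed"] } }).map((p) => p.details.title));
```

The point of the sketch is that the same query works whether `genre` is a single string or an array, which is what lets very different product types share one collection.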
  • 60. Wanna Play? • grab products.js from http://github.com/spf13/mongoProducts • mongo --shell products.js • > use mongoProducts
  • 61. Embedded documents are great for orders
      • Ordered items need to be fixed at the time of purchase
      • Embed them right in the order
      db.order.find( { 'items.sku': '00e8da9f' } );
      db.order.find( { 'items.details.actor': 'James Stewart' } ).count();
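A small sketch of what "fixed at the time of purchase" means in practice: the order stores a copy of each product, so later catalog edits do not rewrite order history. The `createOrder` helper and the order shape are assumptions for illustration, not OpenSky's actual code.

```javascript
// Stand-in for the products collection (values taken from the album slide).
const catalog = {
  "00e8da9c": {
    sku: "00e8da9c",
    type: "Audio Album",
    title: "Hoss",
    pricing: { list: 1000, retail: 800 },
  },
};

// Deep-copy the product so the order owns its own snapshot.
function snapshot(product) {
  return JSON.parse(JSON.stringify(product));
}

// Hypothetical order factory: embeds full product snapshots as items.
function createOrder(userId, skus) {
  return {
    userId,
    placedAt: new Date().toISOString(),
    items: skus.map((sku) => snapshot(catalog[sku])),
  };
}

const order = createOrder("user-1", ["00e8da9c"]);

// A later price change in the catalog...
catalog["00e8da9c"].pricing.retail = 900;

// ...does not affect what the customer actually bought.
console.log(order.items[0].pricing.retail);
console.log(catalog["00e8da9c"].pricing.retail);
```

Because the items are embedded whole, queries such as `'items.details.actor'` can reach into the order without any join back to the catalog.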
  • 62. Why not NoSQL? Using the right solution for each situation
  • 63. Data (like people) are really sensitive when it comes to money
  • 66. Stricter data requirements for $$ • For financial systems any data inconsistency is unacceptable • Perhaps you’ve heard of ACID?
  • 69. What about ACID? Q: Is MongoDB ACID? A: Kinda
  • 73. Atomicity • MongoDB does atomic writes ... for single document changesets • $set, $unset, $inc, $push, $pushAll, $pull, $pullAll, $bit
  • 77. Consistency • MongoDB can enforce unique keys ... but only on keys shared by every document in the collection • MongoDB can't enforce referential integrity
  • 83. Isolation
      • // Pseudo-isolated updates
        db.foo.update( { x : 1 } , { $inc : { y : 1 } } , false , true );
      • // Isolated updates
        db.foo.update( { x : 1 , $atomic : 1 } , { $inc : { y : 1 } } , false , true );
      • But there are caveats...
      • Despite the $atomic keyword, this is not an atomic update, since atomicity implies “all or nothing”
      • An isolated update can only act on a single collection. Multi-collection updates are not transactional, thus not isolatable.
  • 85. Durability • Mongo has this one covered
  • 91.
      • Atomic single document writes
          • If you need atomic writes across multi-document transactions, don't use Mongo
          • Many e-commerce transactions could be accomplished within a single document write
      • Unique indexes
          • This only works on keys used by the entire collection
      • Isolated (not atomic) single collection updates
          • Mongo does not support locking
          • There are ways to work around this
      • It’s durable
  • 93. There are ways to guarantee ACID properties in inconsistent databases (or, as we call them, consistency impaired databases)
  • 96. Optimistic concurrency • Read the current state of a product • Make your changes with the assertion that your product has the same state as it did when you last read it
  • 101. Optimistic concurrency in MongoDB We’ll use an update-if-current strategy. This example is straight from the documentation:
      > t = db.inventory
      > p = t.findOne({sku:'abc'})
      > t.update({_id:p._id, qty:p.qty}, {'$inc': {qty: -1}});
      > db.$cmd.findOne({getlasterror:1});
      {"err" : null , "updatedExisting" : true , "n" : 1 , "ok" : 1} // it worked
      ... If that didn't work, try again until it does.
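The "try again until it does" part is easiest to see with a retry loop. The sketch below replaces the real collection with an in-memory stand-in; the `Inventory` class, `updateIfCurrent`, `decrementWithRetry` and the `beforeWrite` hook are all hypothetical names, and the hook exists only so a competing write can be simulated.

```javascript
// In-memory stand-in for db.inventory.
class Inventory {
  constructor(docs) {
    this.docs = docs;
  }

  // Returns a copy, like reading a document over the wire.
  findOne(sku) {
    const doc = this.docs.find((d) => d.sku === sku);
    return doc ? { ...doc } : null;
  }

  // Mimics update({_id, qty: expected}, {$inc: {qty: delta}}):
  // the write only applies if qty still has the value we read.
  updateIfCurrent(id, expectedQty, delta) {
    const doc = this.docs.find((d) => d._id === id && d.qty === expectedQty);
    if (!doc) return { n: 0 };
    doc.qty += delta;
    return { n: 1 };
  }
}

// Read, then write with the assertion that nothing changed; retry on conflict.
function decrementWithRetry(inventory, sku, maxAttempts = 5, beforeWrite = () => {}) {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const product = inventory.findOne(sku);
    if (!product || product.qty <= 0) return { ok: false, attempts: attempt };
    beforeWrite(attempt); // test hook: lets a competing writer sneak in
    const result = inventory.updateIfCurrent(product._id, product.qty, -1);
    if (result.n === 1) return { ok: true, attempts: attempt };
  }
  return { ok: false, attempts: maxAttempts };
}

const inventory = new Inventory([{ _id: 1, sku: "abc", qty: 3 }]);

// Simulate another shopper buying the same item between our read and our write.
const outcome = decrementWithRetry(inventory, "abc", 5, (attempt) => {
  if (attempt === 1) inventory.updateIfCurrent(1, 3, -1);
});

console.log(outcome, inventory.findOne("abc").qty);
```

Every conflict costs a full extra read and write, which is why this approach is cheap when contention is low and expensive when many buyers hit the same document.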
  • 104. Optimistic concurrency • Read the current state of a product. • Make your changes with the assertion that your product has the same state as it did when you last read it. • It's possible to use OCC to bootstrap pessimistic concurrency and fake row level locking ... ask me about this some time
  • 105. Optimistic concurrency control assumes an environment with low data contention
  • 106. OCC works great for companies like Amazon • Amazon has a long-tail catalog • A long tail catalog lends itself well to optimistic concurrency, because it has low data contention
  • 113. OCC fails miserably for • eBay • Gilt • Groupon • OpenSky • Living Social • InsertFlashSaleSiteOfTheMinute
  • 116. Flash sales and auctions are defined by high data contention • The model doesn't work otherwise • They can't afford to be optimistic
  • 117. Can we use pessimistic concurrency with a distributed NoSQL database?
  • 118. Yep.
  • 119. Blending NoSQL & RDBMS Using the right solution for each situation
  • 120. Our goal is to put as much in Mongo as possible • What makes more sense in RDBMS? • Inventory • Orders
  • 121. Inventory requires • Row level locking (or table level locking)
  • 122. Orders require • Row level locking (or table level locking) • Atomic writes (inventory decremented) • Transactions (3rd party processing)
  • 123. Inventory & checkout transactions
  • 124. Commerce is ACID In Real Life
  • 133.
      1. I go to Barneys and see a pair of shoes I just have to buy.
      2. I call “dibs” (by grabbing them off the shelf).
      3. I take them up to the cash register and purchase them:
          • Store inventory has been manually decremented.
          • I pay for them with my trusty AmEx.
      4. If all goes according to plan, I walk out of the store.
      5. If my card was declined, the shoes are “rolled back” ... out onto the shelves and sold to the next customer who wants them.
  • 134. We follow the same model for e-commerce
  • 142.
      1. Select a product.
      2. Lock the row or table and confirm inventory.
      3. Purchase the product:
          • Decrement product inventory
          • Process payment
      4. Commit the transaction.
      5. Roll back if anything went wrong.
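The five steps above can be sketched as a single function. This is an in-memory illustration only: `InventoryTable`, `checkout` and the `chargeCard` callback are hypothetical, and the `Set` of locked ids merely stands in for what a relational database would do with a locking read such as `SELECT ... FOR UPDATE` inside a transaction.

```javascript
// In-memory stand-in for a SQL inventory table with row locks.
class InventoryTable {
  constructor(rows) {
    this.rows = new Map(Object.entries(rows));
    this.locks = new Set();
  }
  lock(id) {
    if (this.locks.has(id)) throw new Error("row is locked");
    this.locks.add(id);
  }
  unlock(id) {
    this.locks.delete(id);
  }
  get(id) {
    return this.rows.get(id) ?? 0;
  }
  set(id, qty) {
    this.rows.set(id, qty);
  }
}

// chargeCard is a caller-supplied function returning true on success.
function checkout(table, productId, chargeCard) {
  table.lock(productId);                 // 2. lock the row...
  const before = table.get(productId);   //    ...and confirm inventory
  try {
    if (before < 1) throw new Error("out of stock");
    table.set(productId, before - 1);    // 3a. decrement inventory
    if (!chargeCard()) throw new Error("payment declined"); // 3b. process payment
    return { ok: true };                 // 4. commit
  } catch (err) {
    table.set(productId, before);        // 5. roll back
    return { ok: false, reason: err.message };
  } finally {
    table.unlock(productId);             // always release the lock
  }
}

const table = new InventoryTable({ shoes: 1 });
const declined = checkout(table, "shoes", () => false);
const paid = checkout(table, "shoes", () => true);
const soldOut = checkout(table, "shoes", () => true);
console.log(declined, paid, soldOut, table.get("shoes"));
```

The declined payment puts the shoes "back on the shelf", so the next customer can still buy them, which is exactly the in-store behavior the talk describes.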
  • 144. Doctrine (ORM/ODM) to the rescue It would be possible without them, but we're not that masochistic
  • 145. Data we store in SQL • Order • Order/Shipment • Order/Transaction • Inventory
  • 147. Data we store in MongoDB • User • Product • Product/Sellable • Address • Cart • CreditCard • Event • TaxRate • ... and then I got tired of typing them in • Just imagine this list has 40 more classes
  • 148. We have the most boring SQL schema ever
  • 149.
      CREATE TABLE `product_inventory` (
        `product_id` char(32) NOT NULL,
        `inventory` int(11) NOT NULL DEFAULT '0',
        PRIMARY KEY (`product_id`)
      );

      CREATE TABLE `sellable_inventory` (
        `sellable_id` char(32) NOT NULL,
        `inventory` int(11) NOT NULL DEFAULT '0',
        PRIMARY KEY (`sellable_id`)
      );

      CREATE TABLE `orders` (
        `id` int(11) NOT NULL AUTO_INCREMENT,
        `userId` char(32) NOT NULL,
        `shippingName` varchar(255) DEFAULT NULL,
        `shippingAddress1` varchar(255) DEFAULT NULL,
        `shippingAddress2` varchar(255) DEFAULT NULL,
        `shippingCity` varchar(255) DEFAULT NULL,
        `shippingState` varchar(2) DEFAULT NULL,
        `shippingZip` varchar(255) DEFAULT NULL,
        `billingName` varchar(255) DEFAULT NULL,
        `billingAddress1` varchar(255) DEFAULT NULL,
        `billingAddress2` varchar(255) DEFAULT NULL,
        `billingCity` varchar(255) DEFAULT NULL,
        ...
  • 150. Wait. How does inventory live in SQL? Isn’t that a property in one of your Mongo collections?
  • 151. I thought you’d never ask!
  • 153. Inventory is transient • Product::$inventory is effectively a transient property • Note how I said “effectively”? ... we cheat and persist our transient property to MongoDB as well • We can do this because we never really trust the value stored in Mongo
  • 156. Accuracy is only important when there’s contention • For display, sorting and alerts, we can use the value stashed in MongoDB • It’s faster • It’s accurate enough • For financial transactions, we want the security and comfort of our RDBMS.
  • 160. We keep inventory in sync with listeners • Every time a new product is created, its inventory is inserted in SQL • Every time an order is placed, inventory is verified and decremented • Whenever the SQL inventory changes, it is saved to MongoDB as well
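A rough sketch of the listener idea, with both stores replaced by in-memory maps. The names `SqlInventory`, `onChange`, `createProduct` and `mongoProducts` are assumptions for illustration; the real implementation uses Doctrine event listeners in PHP.

```javascript
// Stand-in for the SQL inventory table: the authoritative count.
class SqlInventory {
  constructor() {
    this.rows = new Map();
    this.listeners = [];
  }
  onChange(listener) {
    this.listeners.push(listener);
  }
  set(productId, qty) {
    this.rows.set(productId, qty);
    // Notify listeners every time the SQL value changes.
    this.listeners.forEach((listener) => listener(productId, qty));
  }
  decrement(productId) {
    const qty = this.rows.get(productId) ?? 0;
    if (qty < 1) throw new Error("out of stock"); // verify before decrementing
    this.set(productId, qty - 1);
  }
}

// Stand-in for the MongoDB products collection: holds a convenience copy.
const mongoProducts = new Map();
const sqlInventory = new SqlInventory();

// Listener: mirror every SQL inventory change into the product document.
sqlInventory.onChange((productId, qty) => {
  const doc = mongoProducts.get(productId);
  if (doc) doc.inventory = qty;
});

// Creating a product also inserts its inventory row in SQL.
function createProduct(product, initialInventory) {
  mongoProducts.set(product.id, { ...product });
  sqlInventory.set(product.id, initialInventory);
}

createProduct({ id: "p1", title: "Test Product" }, 2);
sqlInventory.decrement("p1"); // an order is placed

console.log(sqlInventory.rows.get("p1"), mongoProducts.get("p1").inventory);
```

Writes always go through the SQL side; the MongoDB copy is only ever updated as a consequence, which is why it can be used for display but is never trusted at checkout.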
  • 162. Be careful what you lock
      1. Acquire inventory row lock and begin transaction
      2. Check current product inventory
      3. Decrement product inventory
      4. Write the Order to SQL
      5. Update affected MongoDB documents
      6. Commit the transaction
      7. Release product inventory lock
  • 163. Making MongoDB and RDBMS relations play nice
  • 165.
      /** @mongodb:Document(collection="products") */
      class Product
      {
          /** @mongodb:Id */
          private $id;

          /** @mongodb:String */
          private $title;

          public function getId()
          {
              return $this->id;
          }

          public function getTitle()
          {
              return $this->title;
          }

          public function setTitle($title)
          {
              $this->title = $title;
          }
      }
  • 167.
      /**
       * @orm:Entity
       * @orm:Table(name="orders")
       * @orm:HasLifecycleCallbacks
       */
      class Order
      {
          /**
           * @orm:Id @orm:Column(type="integer")
           * @orm:GeneratedValue(strategy="AUTO")
           */
          private $id;

          /**
           * @orm:Column(type="string")
           */
          private $productId;

          /**
           * @var Documents\Product
           */
          private $product;

          // ...
      }
  • 168. So how does an RDBMS have a reference to something outside the database?
  • 169. Setting the Product
      class Order
      {
          // ...

          public function setProduct(Product $product)
          {
              $this->productId = $product->getId();
              $this->product = $product;
          }
      }
  • 170. $productId is mapped and persisted • $product, which stores the Product instance, is not a persistent entity property
  • 172. OrderPostLoadListener
      use Doctrine\ODM\MongoDB\DocumentManager;
      use Doctrine\ORM\Event\LifecycleEventArgs;

      class OrderPostLoadListener
      {
          private $dm;

          // Not shown on the slide: the DocumentManager has to be injected
          // so that $this->dm below is defined.
          public function __construct(DocumentManager $dm)
          {
              $this->dm = $dm;
          }

          public function postLoad(LifecycleEventArgs $eventArgs)
          {
              // get the order entity
              $order = $eventArgs->getEntity();

              // get odm reference to order.product_id
              $productId = $order->getProductId();
              $product = $this->dm->getReference('MyBundle:Document\Product', $productId);

              // set the product on the order
              $em = $eventArgs->getEntityManager();
              $productReflProp = $em->getClassMetadata('MyBundle:Entity\Order')
                  ->reflClass->getProperty('product');
              $productReflProp->setAccessible(true);
              $productReflProp->setValue($order, $product);
          }
      }
  • 173. All Together Now
      // Create a new product and order
      $product = new Product();
      $product->setTitle('Test Product');
      $dm->persist($product);
      $dm->flush();

      $order = new Order();
      $order->setProduct($product);
      $em->persist($order);
      $em->flush();

      // Find the order later
      $order = $em->find('Order', $order->getId());

      // Instance of an uninitialized product proxy
      $product = $order->getProduct();

      // Initializes proxy and queries the MongoDB database
      echo "Order Title: " . $product->getTitle();
      print_r($order);
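To see the post-load trick and the lazy lookup without installing Doctrine, here is a stripped-down sketch in plain PHP. Everything in it is a hypothetical stand-in: `LazyProduct` imitates a Doctrine proxy (a real proxy extends the document class, this one does not), `attachProduct()` plays the role of the postLoad listener, and `$loader` stands in for the MongoDB query.

```php
<?php

final class Product
{
    private string $id;
    private string $title;

    public function __construct(string $id, string $title)
    {
        $this->id = $id;
        $this->title = $title;
    }

    public function getId(): string { return $this->id; }
    public function getTitle(): string { return $this->title; }
}

/** Hypothetical lazy stand-in: looks the product up only on first use. */
final class LazyProduct
{
    private string $id;
    /** @var callable */
    private $loader;
    private ?Product $loaded = null;

    public function __construct(string $id, callable $loader)
    {
        $this->id = $id;
        $this->loader = $loader;
    }

    // The id is already known, so no lookup is needed.
    public function getId(): string { return $this->id; }

    public function getTitle(): string
    {
        if ($this->loaded === null) {
            $this->loaded = ($this->loader)($this->id);
        }
        return $this->loaded->getTitle();
    }
}

final class Order
{
    private string $productId; // persisted
    private $product = null;   // not persisted, filled in after loading

    public function __construct(string $productId)
    {
        $this->productId = $productId;
    }

    public function getProductId(): string { return $this->productId; }
    public function getProduct() { return $this->product; }
}

/** Plays the role of the postLoad listener: sets the private property via reflection. */
function attachProduct(Order $order, callable $loader): void
{
    $prop = new ReflectionProperty(Order::class, 'product');
    if (PHP_VERSION_ID < 80100) {
        $prop->setAccessible(true); // only required before PHP 8.1
    }
    $prop->setValue($order, new LazyProduct($order->getProductId(), $loader));
}

// Stand-in for the MongoDB lookup, with a counter to show when it runs.
$lookups = 0;
$loader = function (string $id) use (&$lookups): Product {
    $lookups++;
    return new Product($id, 'Test Product');
};

$order = new Order('4c6f8b'); // as if just loaded from SQL: only the id is known
attachProduct($order, $loader);

echo $lookups, PHP_EOL;                         // 0
echo $order->getProduct()->getTitle(), PHP_EOL; // Test Product
echo $lookups, PHP_EOL;                         // 1
```

The counter stays at zero until the title is requested, which mirrors the point of the slide: loading an order costs nothing on the MongoDB side unless the product is actually used.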
  • 174. Read more about this technique Jon Wage, one of OpenSky’s engineers, first wrote about this technique on his personal blog: http://jwage.com You can read the full article here: http://jwage.com/2010/08/25/blending-the-doctrine-orm-and-mongodb-odm/
  • 175. Questions? http://spf13.com @spf13 http://justinhileman.com @bobthecow http://opensky.com PS: We’re hiring!! Contact us at jobs@opensky.com

Editor's Notes

  29. Actually, just the first 1/3 of it.
  31. Ironically, this is how Magento solves the performance problems associated with EAV: by caching the data into insanely wide tables.
  35. Can’t create a FK, as each set references a different table. The “key” is really made of the attribute table name id and the attribute table name.
  56. Whenever you use inter-system coordination, you need to implement your own atomic checks in the application... But SOAP does have transactions, so that’s not quite accurate.
      Kyle idea... but we are fairly atomic with authorize.net.
  58. Atomicity, consistency, isolation, durability.
  60. Mongo has a grip of atomic operations: set, unset, etc.
  66. update( { where }, { values }, upsert?, multiple? )
  78. Lemme show you an example.
  86. Imagine what would happen if everyone tried to access the same record at the same time. Just think of all those spinning while loops :)
  97. And I’ll show you how OpenSky does it.
  99. Since we really like MongoDB, we want to keep as much of our data in Mongo as possible.
  103. Mind if I tell you a story?
  113. But this sounds like it could get COMPLICATED...
  133. Given that split, we just happen to have the most boring SQL schema ever.
  134. This is pretty much it. It goes on for a few more lines, with a few other properties flattened onto the order table.
  137. Back to the schema for a second.
       - Product ID here is a fake foreign key.
       - Inventory is a real integer.
       That’s all there is to this table.
  141. And here’s why we like Doctrine so much.
  144. This will look a bit like when I bought those shoes.
  153. The interesting parts here are the annotations. If you don’t speak PHP annotation, this stores a document with two properties (ID and title) in the `products` collection of a Mongo database.
  156. Did you notice the property named `product`? That’s not just a reference to another document, that’s a reference to an entirely different database paradigm. Check out the setter:
  157. This is key: we set both the product id and a reference to the product itself.
  158. When this entity is saved to SQL, the productId will end up in the database, but the product reference will disappear.
  160. This is one of those listeners I was telling you about. At a high level:
       1. Every time an Order is loaded from the database, this listener is called.
       2. The listener gets the Order’s product id, and creates a Doctrine proxy object.
       3. It uses magick (i.e. reflection) to set the product property of the order to this new proxy.
  161. Here’s our inter-db relationship in action. Note that the product is lazily loaded from MongoDB. Because $product is a proxy, we don’t actually query Mongo until we try to access a property of $product (in this case the title).