Traditional disk-oriented database architecture is showing its age.

In-memory databases offer significant gains in performance as all data is freely available. There is no need to page to and from disk.

But three issues have stunted their uptake: address spaces only being large enough for a subset of a typical user’s data, the ‘one more bit’ problem, and durability.

Distributed in-memory databases solve these three problems, but at the price of losing the single address space.

This makes joins a problem. When data must be joined across multiple machines, performance degradation is inevitable.

Snowflake Schemas allow us to mix Partitioning and Replication so joins never hit the wire.

But this model only goes so far. “Connected Replication” takes us a step further, allowing us to make the best possible use of replication.
The lay of the land: the main architectural constructs in the database industry
Shared Disk
[Figure: latency scale running from ms through μs and ns to ps, marking Cross Continental Round Trip, Cross Network Round Trip, 1MB Disk/Network, 1MB Main Memory, Main Memory Ref, L2 Cache Ref and L1 Cache Ref]
* L1 ref is about 2 clock cycles or 0.7ns. This is the time it takes light to travel 20cm.
Distributed Cache
Taken from “OLTP Through the Looking Glass, and What We Found There”, Harizopoulos et al.
[Diagram: the database landscape]
Regular Database: Oracle, Sybase, MySql
Shared Nothing: Teradata, Vertica, Greenplum…
In-Memory Database (Drop Disk): Times Ten, HSQL, KDB
SN In-Memory (Distribute): Exasol, VoltDB, Hana
Distributed Caching: Coherence, Gemfire, Gigaspaces
ODC
Distributed Architecture
Simplify the Contract.
Stick to RAM
450 processes, 2TB of RAM (Oracle Coherence)




Messaging (Topic Based) as a system of record (persistence)
Access Layer: Java client API
Query Layer
Data Layer: Transactions, Mtms, Cashflows
Persistence Layer
Indexing




Partitioning
Replication
But your storage is limited by the memory on a node
Keys Fs-Fz, Keys Xa-Yd
Scalable storage, bandwidth and processing
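The key-range labels on the slide (Fs-Fz, Xa-Yd) suggest each node owns a contiguous range of keys. As a minimal sketch of that idea, assuming range partitioning with invented boundaries and node names (`RangePartitioner`, `assign` and `nodeFor` are hypothetical, not part of any product API):

```java
import java.util.Map;
import java.util.TreeMap;

class RangePartitioner {
    // Lower bound of each key range -> node that owns it (hypothetical layout).
    private final TreeMap<String, String> lowerBounds = new TreeMap<>();

    void assign(String lowerBoundInclusive, String node) {
        lowerBounds.put(lowerBoundInclusive, node);
    }

    // The owner is the node whose range starts at or before the key.
    String nodeFor(String key) {
        Map.Entry<String, String> owner = lowerBounds.floorEntry(key);
        if (owner == null) {
            throw new IllegalArgumentException("No range covers key " + key);
        }
        return owner.getValue();
    }

    public static void main(String[] args) {
        RangePartitioner partitioner = new RangePartitioner();
        partitioner.assign("Aa", "node-1");
        partitioner.assign("Fs", "node-2");
        partitioner.assign("Ga", "node-3");
        partitioner.assign("Xa", "node-4");
        System.out.println("Fz lives on " + partitioner.nodeFor("Fz"));
        System.out.println("Yd lives on " + partitioner.nodeFor("Yd"));
    }
}
```

Each node holds only its own range, so adding nodes adds storage, bandwidth and processing, whereas a replicated copy must fit in one node's memory.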
[Diagram: the Trader, Party and Trade objects shown again for each of Version 1, Version 2, Version 3 and Version 4]
…and you need versioning to do MVCC
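To illustrate why versioning adds to the storage bill, here is a minimal single-threaded sketch of a multi-version store, assuming numeric versions and string values. `VersionedStore` and its methods are hypothetical names, not taken from the slides:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

class VersionedStore {
    // Every key keeps its full history: version number -> value at that version.
    private final Map<String, TreeMap<Long, String>> history = new HashMap<>();

    // Writers never overwrite; each write adds another stored copy.
    void put(String key, long version, String value) {
        history.computeIfAbsent(key, k -> new TreeMap<>()).put(version, value);
    }

    // Readers see the newest version at or before their snapshot.
    String get(String key, long snapshotVersion) {
        TreeMap<Long, String> versions = history.get(key);
        if (versions == null) {
            return null;
        }
        Map.Entry<Long, String> visible = versions.floorEntry(snapshotVersion);
        return visible == null ? null : visible.getValue();
    }

    // Total number of copies held, across all keys and versions.
    int storedCopies() {
        int copies = 0;
        for (TreeMap<Long, String> versions : history.values()) {
            copies += versions.size();
        }
        return copies;
    }

    public static void main(String[] args) {
        VersionedStore store = new VersionedStore();
        store.put("Trade:1", 1, "original");
        store.put("Trade:1", 3, "amended");
        System.out.println("snapshot 2 sees " + store.get("Trade:1", 2));
        System.out.println("copies held: " + store.storedCopies());
    }
}
```

If every version of every object is also replicated to every node, the copies multiply, which is the pressure the slides point at.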
[Diagram: Trade, Trader and Party objects]
So better to use partitioning, spreading data around the cluster.
[Diagram: Trader, Party and Trade objects]
This is what using Snowflake Schemas and the Connected Replication pattern is all about!
Crosscutting Keys
Common Keys
Replicated: Trader, Party
Partitioned: Trade
[Chart: bar chart with one bar per table on an axis from 0 to 150,000,000. Tables, top to bottom: Valuation Legs, Valuations, art Transaction Mapping, Cashflow Mapping, Party Alias, Transaction, Cashflows, Legs, Parties, Ledger Book, Source Book, Cost Centre, Product, Risk Organisation Unit, Business Unit, HCS Entity, Set of Books]
Facts: => Big, common keys
Dimensions: => Small, crosscutting keys
Coherence’s KeyAssociation gives us this
Trades and MTMs: Common Key
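The idea of a common key can be sketched without the product itself. The example below does not use the Coherence API: `Associated` is a hypothetical stand-in interface, and `partitionFor` is an invented hash-based placement rule, used only to show that two different key types land in the same partition when they expose the same associated key.

```java
import java.util.Objects;

class Colocation {
    // Hypothetical stand-in for a key-association contract: a key names the
    // key it wants to be stored alongside.
    interface Associated {
        Object associatedKey();
    }

    static final class TradeKey implements Associated {
        final long tradeId;
        TradeKey(long tradeId) { this.tradeId = tradeId; }
        public Object associatedKey() { return tradeId; }
    }

    static final class MtmKey implements Associated {
        final long mtmId;
        final long tradeId;
        MtmKey(long mtmId, long tradeId) { this.mtmId = mtmId; this.tradeId = tradeId; }
        // An MTM is placed by its trade id, not by its own id.
        public Object associatedKey() { return tradeId; }
    }

    // Placement depends only on the associated key (assumed hashing rule).
    static int partitionFor(Associated key, int partitionCount) {
        return Math.floorMod(Objects.hashCode(key.associatedKey()), partitionCount);
    }

    public static void main(String[] args) {
        int partitions = 13;
        int tradePartition = partitionFor(new TradeKey(42), partitions);
        int mtmPartition = partitionFor(new MtmKey(7, 42), partitions);
        System.out.println("co-located: " + (tradePartition == mtmPartition));
    }
}
```

Because a Trade and its MTMs always share a partition, joining them is a local operation on one node.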
Query Layer: Trader, Party, Trade
Data Layer, Fact Storage (Partitioned): Transactions, Mtms, Cashflows
Dimensions (replicate)
Facts (distribute/partition): Transactions, Mtms, Cashflows
[Chart: bar chart with one bar per table, Valuation Legs down to Set of Books, on an axis from 0 to 150,000,000]
Facts: => Big => Distribute
Dimensions: => Small => Replicate
We use a variant on a Snowflake Schema to partition big stuff that has the same key, and replicate small stuff that has crosscutting keys.
Replicate
Distribute
Select Transaction, MTM, ReferenceData From MTM, Transaction, Ref Where Cost Centre = ‘CC1’
LBs[]=getLedgerBooksFor(CC1)
SBs[]=getSourceBooksFor(LBs[])
So we have all the bottom level dimensions needed to query facts
Get all Transactions and MTMs (cluster side join) for the passed Source Books
Partitioned: Transactions, Mtms, Cashflows
Populate raw facts (Transactions) with dimension data before returning to client.
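The three steps above can be sketched in plain Java. This is an illustration under assumptions, not the system's code: the dimension maps, the `Partition` class and the sample data are all invented, and each "partition" is just an in-process object standing in for a cluster node.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

class NoDistributedJoin {
    // Replicated dimensions: assumed to be available in full wherever the query runs.
    static final Map<String, List<String>> LEDGER_BOOKS = Map.of("CC1", List.of("LB1", "LB2"));
    static final Map<String, List<String>> SOURCE_BOOKS = Map.of(
            "LB1", List.of("SB1"), "LB2", List.of("SB2"), "LB9", List.of("SB9"));

    // One partition of fact storage. Transactions and MTMs share the transaction id.
    static class Partition {
        final Map<Long, String> sourceBookByTransaction = new TreeMap<>();
        final Map<Long, Double> mtmByTransaction = new TreeMap<>();

        void save(long transactionId, String sourceBook, double mtm) {
            sourceBookByTransaction.put(transactionId, sourceBook);
            mtmByTransaction.put(transactionId, mtm);
        }

        // Step 2: the Transaction/MTM join runs inside the partition.
        List<String> joinFor(Set<String> sourceBooks) {
            List<String> rows = new ArrayList<>();
            for (Map.Entry<Long, String> tx : sourceBookByTransaction.entrySet()) {
                if (sourceBooks.contains(tx.getValue())) {
                    rows.add("tx=" + tx.getKey() + " mtm=" + mtmByTransaction.get(tx.getKey())
                            + " sourceBook=" + tx.getValue());
                }
            }
            return rows;
        }
    }

    static List<String> query(String costCentre, List<Partition> partitions) {
        // Step 1: walk the replicated dimensions down to the source books.
        Set<String> sourceBooks = new HashSet<>();
        for (String ledgerBook : LEDGER_BOOKS.getOrDefault(costCentre, List.of())) {
            sourceBooks.addAll(SOURCE_BOOKS.getOrDefault(ledgerBook, List.of()));
        }
        List<String> rows = new ArrayList<>();
        for (Partition partition : partitions) {
            for (String fact : partition.joinFor(sourceBooks)) {
                // Step 3: populate the raw fact with dimension data before returning.
                rows.add(fact + " costCentre=" + costCentre);
            }
        }
        return rows;
    }

    public static void main(String[] args) {
        Partition first = new Partition();
        first.save(1, "SB1", 10.0);
        first.save(2, "SB9", 5.0);
        Partition second = new Partition();
        second.save(3, "SB2", 7.5);
        query("CC1", List.of(first, second)).forEach(System.out::println);
    }
}
```

Only the small set of source book keys is sent to the partitions, and only finished rows come back, so no intermediate join results travel between nodes.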
Java client API
Replicated: Dimensions
Partitioned: Facts
We never have to do a distributed join!
So all the big stuff is held partitioned
And we can join without shipping keys around and having intermediate results
[Chart: bar chart with one bar per table, Valuation Legs down to Set of Books, grouped into Facts and Dimensions, on an axis from 0 to 125,000,000]
This is a dimension
•  It has a different key to the Facts.
•  And it’s BIG
[Chart: bar chart of the dimension tables only (Party Alias, Parties, Ledger Book, Source Book, Cost Centre, Product, Risk Organisation Unit, Business Unit, HCS Entity, Set of Books), axis from 0 to 5,000,000]
[Chart: the same dimension tables, axis from 20 to 5,000,000]
So we only replicate ‘Connected’ or ‘Used’ dimensions
Processing Layer: Dimension Caches (Replicated)
Data Layer, Fact Storage (Partitioned): Transactions, Mtms, Cashflows
As new Facts are added relevant Dimensions that they reference are moved to processing layer caches
Query Layer (With connected dimension Caches)
Data Layer (All Normalised)
[Diagram, built up over three slides: Save Trade passes through the Cache Store into the Partitioned Cache, where a Trigger fires. The Trade is shown linked to Party Alias, Source Book and Ccy, and then on to Party and Ledger Book.]
‘Connected Replication’
A simple pattern which recurses through the foreign keys in the domain model, ensuring only ‘Connected’ dimensions are replicated
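A minimal sketch of that recursion, assuming the domain model can be reduced to a map from each entity id to the ids it references. `ConnectedReplication`, `define` and `onFactSaved` are hypothetical names, and the replicated cache is modelled as a plain in-memory set:

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class ConnectedReplication {
    // Normalised store: entity id -> ids it references through foreign keys.
    private final Map<String, List<String>> foreignKeys = new HashMap<>();
    // Dimensions that have been pushed to the replicated caches so far.
    private final Set<String> replicated = new LinkedHashSet<>();

    void define(String id, String... references) {
        foreignKeys.put(id, Arrays.asList(references));
    }

    // Invoked when a fact is saved; the fact itself stays partitioned.
    void onFactSaved(String factId) {
        for (String reference : foreignKeys.getOrDefault(factId, List.of())) {
            replicateConnected(reference);
        }
    }

    // Replicate a dimension, then recurse into the dimensions it references.
    private void replicateConnected(String dimensionId) {
        if (!replicated.add(dimensionId)) {
            return; // already replicated; this check also stops cycles
        }
        for (String reference : foreignKeys.getOrDefault(dimensionId, List.of())) {
            replicateConnected(reference);
        }
    }

    Set<String> replicated() {
        return replicated;
    }

    public static void main(String[] args) {
        ConnectedReplication model = new ConnectedReplication();
        model.define("Trade:1", "PartyAlias:7", "SourceBook:3", "Ccy:USD");
        model.define("PartyAlias:7", "Party:2");
        model.define("SourceBook:3", "LedgerBook:5");
        model.define("Party:99"); // never referenced by a fact, so never replicated
        model.onFactSaved("Trade:1");
        System.out.println(model.replicated());
    }
}
```

Dimensions that no fact reaches stay out of the replicated caches, which keeps the replicated footprint in step with what queries can actually touch.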
Java client API
Java schema
Java ‘Stored Procedures’ and ‘Triggers’
Partitioned Storage
Balancing Replication and Partitioning in a Distributed Java Database
Balancing Replication and Partitioning in a Distributed Java Database

Más contenido relacionado

La actualidad más candente

Couchbase presentation
Couchbase presentationCouchbase presentation
Couchbase presentationsharonyb
 
Scalability of Amazon Redshift Data Loading and Query Speed
Scalability of Amazon Redshift Data Loading and Query SpeedScalability of Amazon Redshift Data Loading and Query Speed
Scalability of Amazon Redshift Data Loading and Query SpeedFlyData Inc.
 
Dynamodb Presentation
Dynamodb PresentationDynamodb Presentation
Dynamodb Presentationadvaitdeo
 
Getting Maximum Performance from Amazon Redshift: Complex Queries
Getting Maximum Performance from Amazon Redshift: Complex QueriesGetting Maximum Performance from Amazon Redshift: Complex Queries
Getting Maximum Performance from Amazon Redshift: Complex Queriestimonk
 
Stream Application Development with Apache Kafka
Stream Application Development with Apache KafkaStream Application Development with Apache Kafka
Stream Application Development with Apache KafkaMatthias J. Sax
 
Deep Dive Amazon Redshift for Big Data Analytics - September Webinar Series
Deep Dive Amazon Redshift for Big Data Analytics - September Webinar SeriesDeep Dive Amazon Redshift for Big Data Analytics - September Webinar Series
Deep Dive Amazon Redshift for Big Data Analytics - September Webinar SeriesAmazon Web Services
 
AWS June 2016 Webinar Series - Amazon Redshift or Big Data Analytics
AWS June 2016 Webinar Series - Amazon Redshift or Big Data AnalyticsAWS June 2016 Webinar Series - Amazon Redshift or Big Data Analytics
AWS June 2016 Webinar Series - Amazon Redshift or Big Data AnalyticsAmazon Web Services
 
Building Your Data Warehouse with Amazon Redshift
Building Your Data Warehouse with Amazon RedshiftBuilding Your Data Warehouse with Amazon Redshift
Building Your Data Warehouse with Amazon RedshiftAmazon Web Services
 
A tour of Amazon Redshift
A tour of Amazon RedshiftA tour of Amazon Redshift
A tour of Amazon RedshiftKel Graham
 
SmugMug: From MySQL to Amazon DynamoDB (DAT204) | AWS re:Invent 2013
SmugMug: From MySQL to Amazon DynamoDB (DAT204) | AWS re:Invent 2013SmugMug: From MySQL to Amazon DynamoDB (DAT204) | AWS re:Invent 2013
SmugMug: From MySQL to Amazon DynamoDB (DAT204) | AWS re:Invent 2013Amazon Web Services
 
Leveraging Amazon Redshift for Your Data Warehouse
Leveraging Amazon Redshift for Your Data WarehouseLeveraging Amazon Redshift for Your Data Warehouse
Leveraging Amazon Redshift for Your Data WarehouseAmazon Web Services
 
New Database Migration Services & RDS Updates
New Database Migration Services & RDS UpdatesNew Database Migration Services & RDS Updates
New Database Migration Services & RDS UpdatesAmazon Web Services
 
NoSql presentation
NoSql presentationNoSql presentation
NoSql presentationMat Wall
 
Introduction to couchbase
Introduction to couchbaseIntroduction to couchbase
Introduction to couchbaseDipti Borkar
 

La actualidad más candente (20)

Couchbase presentation
Couchbase presentationCouchbase presentation
Couchbase presentation
 
Scalability of Amazon Redshift Data Loading and Query Speed
Scalability of Amazon Redshift Data Loading and Query SpeedScalability of Amazon Redshift Data Loading and Query Speed
Scalability of Amazon Redshift Data Loading and Query Speed
 
Dynamodb Presentation
Dynamodb PresentationDynamodb Presentation
Dynamodb Presentation
 
Getting Maximum Performance from Amazon Redshift: Complex Queries
Getting Maximum Performance from Amazon Redshift: Complex QueriesGetting Maximum Performance from Amazon Redshift: Complex Queries
Getting Maximum Performance from Amazon Redshift: Complex Queries
 
Amazon DynamoDB 深入探討
Amazon DynamoDB 深入探討Amazon DynamoDB 深入探討
Amazon DynamoDB 深入探討
 
Stream Application Development with Apache Kafka
Stream Application Development with Apache KafkaStream Application Development with Apache Kafka
Stream Application Development with Apache Kafka
 
Couchbase 101
Couchbase 101 Couchbase 101
Couchbase 101
 
Deep Dive Amazon Redshift for Big Data Analytics - September Webinar Series
Deep Dive Amazon Redshift for Big Data Analytics - September Webinar SeriesDeep Dive Amazon Redshift for Big Data Analytics - September Webinar Series
Deep Dive Amazon Redshift for Big Data Analytics - September Webinar Series
 
AWS June 2016 Webinar Series - Amazon Redshift or Big Data Analytics
AWS June 2016 Webinar Series - Amazon Redshift or Big Data AnalyticsAWS June 2016 Webinar Series - Amazon Redshift or Big Data Analytics
AWS June 2016 Webinar Series - Amazon Redshift or Big Data Analytics
 
Building Your Data Warehouse with Amazon Redshift
Building Your Data Warehouse with Amazon RedshiftBuilding Your Data Warehouse with Amazon Redshift
Building Your Data Warehouse with Amazon Redshift
 
A tour of Amazon Redshift
A tour of Amazon RedshiftA tour of Amazon Redshift
A tour of Amazon Redshift
 
SmugMug: From MySQL to Amazon DynamoDB (DAT204) | AWS re:Invent 2013
SmugMug: From MySQL to Amazon DynamoDB (DAT204) | AWS re:Invent 2013SmugMug: From MySQL to Amazon DynamoDB (DAT204) | AWS re:Invent 2013
SmugMug: From MySQL to Amazon DynamoDB (DAT204) | AWS re:Invent 2013
 
Redshift deep dive
Redshift deep diveRedshift deep dive
Redshift deep dive
 
Couchbase Day
Couchbase DayCouchbase Day
Couchbase Day
 
NoSQL and Couchbase
NoSQL and CouchbaseNoSQL and Couchbase
NoSQL and Couchbase
 
Deep Dive on Amazon Redshift
Deep Dive on Amazon RedshiftDeep Dive on Amazon Redshift
Deep Dive on Amazon Redshift
 
Leveraging Amazon Redshift for Your Data Warehouse
Leveraging Amazon Redshift for Your Data WarehouseLeveraging Amazon Redshift for Your Data Warehouse
Leveraging Amazon Redshift for Your Data Warehouse
 
New Database Migration Services & RDS Updates
New Database Migration Services & RDS UpdatesNew Database Migration Services & RDS Updates
New Database Migration Services & RDS Updates
 
NoSql presentation
NoSql presentationNoSql presentation
NoSql presentation
 
Introduction to couchbase
Introduction to couchbaseIntroduction to couchbase
Introduction to couchbase
 

Similar a Balancing Replication and Partitioning in a Distributed Java Database

Advanced databases ben stopford
Advanced databases   ben stopfordAdvanced databases   ben stopford
Advanced databases ben stopfordBen Stopford
 
Java Tech & Tools | Beyond the Data Grid: Coherence, Normalisation, Joins and...
Java Tech & Tools | Beyond the Data Grid: Coherence, Normalisation, Joins and...Java Tech & Tools | Beyond the Data Grid: Coherence, Normalisation, Joins and...
Java Tech & Tools | Beyond the Data Grid: Coherence, Normalisation, Joins and...JAX London
 
A Paradigm Shift: The Increasing Dominance of Memory-Oriented Solutions for H...
A Paradigm Shift: The Increasing Dominance of Memory-Oriented Solutions for H...A Paradigm Shift: The Increasing Dominance of Memory-Oriented Solutions for H...
A Paradigm Shift: The Increasing Dominance of Memory-Oriented Solutions for H...Ben Stopford
 
Basho and Riak at GOTO Stockholm: "Don't Use My Database."
Basho and Riak at GOTO Stockholm:  "Don't Use My Database."Basho and Riak at GOTO Stockholm:  "Don't Use My Database."
Basho and Riak at GOTO Stockholm: "Don't Use My Database."Basho Technologies
 
Millions quotes per second in pure java
Millions quotes per second in pure javaMillions quotes per second in pure java
Millions quotes per second in pure javaRoman Elizarov
 
Top Technology Trends
Top Technology Trends Top Technology Trends
Top Technology Trends InnoTech
 
Databases for Storage Engineers
Databases for Storage EngineersDatabases for Storage Engineers
Databases for Storage EngineersThomas Kejser
 
Memory-Based Cloud Architectures
Memory-Based Cloud ArchitecturesMemory-Based Cloud Architectures
Memory-Based Cloud Architectures小新 制造
 
Performance Management in ‘Big Data’ Applications
Performance Management in ‘Big Data’ ApplicationsPerformance Management in ‘Big Data’ Applications
Performance Management in ‘Big Data’ ApplicationsMichael Kopp
 
OpenStack and OpenFlow Demos
OpenStack and OpenFlow DemosOpenStack and OpenFlow Demos
OpenStack and OpenFlow DemosBrent Salisbury
 
NoSQL Data Stores: Introduzione alle Basi di Dati Non Relazionali
NoSQL Data Stores: Introduzione alle Basi di Dati Non RelazionaliNoSQL Data Stores: Introduzione alle Basi di Dati Non Relazionali
NoSQL Data Stores: Introduzione alle Basi di Dati Non RelazionaliSteve Maraspin
 
Seattle Scalability - GigaSpaces / Cassandra
Seattle Scalability - GigaSpaces / CassandraSeattle Scalability - GigaSpaces / Cassandra
Seattle Scalability - GigaSpaces / Cassandraclive boulton
 
Top Technology Trends for Virtualization dallas
Top Technology Trends for Virtualization dallasTop Technology Trends for Virtualization dallas
Top Technology Trends for Virtualization dallasInnoTech
 
ScaleBase Webinar: Methods and Challenges to Scale Out a MySQL Database
ScaleBase Webinar: Methods and Challenges to Scale Out a MySQL DatabaseScaleBase Webinar: Methods and Challenges to Scale Out a MySQL Database
ScaleBase Webinar: Methods and Challenges to Scale Out a MySQL DatabaseScaleBase
 
Re-inventing the Database: What to Keep and What to Throw Away
Re-inventing the Database: What to Keep and What to Throw AwayRe-inventing the Database: What to Keep and What to Throw Away
Re-inventing the Database: What to Keep and What to Throw AwayDATAVERSITY
 
There is NO CLOUD: Geeky Version
There is NO CLOUD: Geeky VersionThere is NO CLOUD: Geeky Version
There is NO CLOUD: Geeky VersionOpen Spectrum Inc
 
Software Defined Data Centers - June 2012
Software Defined Data Centers - June 2012Software Defined Data Centers - June 2012
Software Defined Data Centers - June 2012Brent Salisbury
 
Acunu & OCaml: Experience Report, CUFP
Acunu & OCaml: Experience Report, CUFPAcunu & OCaml: Experience Report, CUFP
Acunu & OCaml: Experience Report, CUFPAcunu
 
Benchmarking MongoDB and CouchBase
Benchmarking MongoDB and CouchBaseBenchmarking MongoDB and CouchBase
Benchmarking MongoDB and CouchBaseChristopher Choi
 

Similar a Balancing Replication and Partitioning in a Distributed Java Database (20)

Advanced databases ben stopford
Advanced databases   ben stopfordAdvanced databases   ben stopford
Advanced databases ben stopford
 
Java Tech & Tools | Beyond the Data Grid: Coherence, Normalisation, Joins and...
Java Tech & Tools | Beyond the Data Grid: Coherence, Normalisation, Joins and...Java Tech & Tools | Beyond the Data Grid: Coherence, Normalisation, Joins and...
Java Tech & Tools | Beyond the Data Grid: Coherence, Normalisation, Joins and...
 
A Paradigm Shift: The Increasing Dominance of Memory-Oriented Solutions for H...
A Paradigm Shift: The Increasing Dominance of Memory-Oriented Solutions for H...A Paradigm Shift: The Increasing Dominance of Memory-Oriented Solutions for H...
A Paradigm Shift: The Increasing Dominance of Memory-Oriented Solutions for H...
 
Basho and Riak at GOTO Stockholm: "Don't Use My Database."
Basho and Riak at GOTO Stockholm:  "Don't Use My Database."Basho and Riak at GOTO Stockholm:  "Don't Use My Database."
Basho and Riak at GOTO Stockholm: "Don't Use My Database."
 
Millions quotes per second in pure java
Millions quotes per second in pure javaMillions quotes per second in pure java
Millions quotes per second in pure java
 
Top Technology Trends
Top Technology Trends Top Technology Trends
Top Technology Trends
 
Databases for Storage Engineers
Databases for Storage EngineersDatabases for Storage Engineers
Databases for Storage Engineers
 
Memory-Based Cloud Architectures
Memory-Based Cloud ArchitecturesMemory-Based Cloud Architectures
Memory-Based Cloud Architectures
 
NoSQL
NoSQLNoSQL
NoSQL
 
Performance Management in ‘Big Data’ Applications
Performance Management in ‘Big Data’ ApplicationsPerformance Management in ‘Big Data’ Applications
Performance Management in ‘Big Data’ Applications
 
OpenStack and OpenFlow Demos
OpenStack and OpenFlow DemosOpenStack and OpenFlow Demos
OpenStack and OpenFlow Demos
 
NoSQL Data Stores: Introduzione alle Basi di Dati Non Relazionali
NoSQL Data Stores: Introduzione alle Basi di Dati Non RelazionaliNoSQL Data Stores: Introduzione alle Basi di Dati Non Relazionali
NoSQL Data Stores: Introduzione alle Basi di Dati Non Relazionali
 
Seattle Scalability - GigaSpaces / Cassandra
Seattle Scalability - GigaSpaces / CassandraSeattle Scalability - GigaSpaces / Cassandra
Seattle Scalability - GigaSpaces / Cassandra
 
Top Technology Trends for Virtualization dallas
Top Technology Trends for Virtualization dallasTop Technology Trends for Virtualization dallas
Top Technology Trends for Virtualization dallas
 
ScaleBase Webinar: Methods and Challenges to Scale Out a MySQL Database
ScaleBase Webinar: Methods and Challenges to Scale Out a MySQL DatabaseScaleBase Webinar: Methods and Challenges to Scale Out a MySQL Database
ScaleBase Webinar: Methods and Challenges to Scale Out a MySQL Database
 
Re-inventing the Database: What to Keep and What to Throw Away
Re-inventing the Database: What to Keep and What to Throw AwayRe-inventing the Database: What to Keep and What to Throw Away
Re-inventing the Database: What to Keep and What to Throw Away
 
There is NO CLOUD: Geeky Version
There is NO CLOUD: Geeky VersionThere is NO CLOUD: Geeky Version
There is NO CLOUD: Geeky Version
 
Software Defined Data Centers - June 2012
Software Defined Data Centers - June 2012Software Defined Data Centers - June 2012
Software Defined Data Centers - June 2012
 
Acunu & OCaml: Experience Report, CUFP
Acunu & OCaml: Experience Report, CUFPAcunu & OCaml: Experience Report, CUFP
Acunu & OCaml: Experience Report, CUFP
 
Benchmarking MongoDB and CouchBase
Benchmarking MongoDB and CouchBaseBenchmarking MongoDB and CouchBase
Benchmarking MongoDB and CouchBase
 

Más de Ben Stopford

10 Principals for Effective Event-Driven Microservices with Apache Kafka
10 Principals for Effective Event-Driven Microservices with Apache Kafka10 Principals for Effective Event-Driven Microservices with Apache Kafka
10 Principals for Effective Event-Driven Microservices with Apache KafkaBen Stopford
 
10 Principals for Effective Event Driven Microservices
10 Principals for Effective Event Driven Microservices10 Principals for Effective Event Driven Microservices
10 Principals for Effective Event Driven MicroservicesBen Stopford
 
The Future of Streaming: Global Apps, Event Stores and Serverless
The Future of Streaming: Global Apps, Event Stores and ServerlessThe Future of Streaming: Global Apps, Event Stores and Serverless
The Future of Streaming: Global Apps, Event Stores and ServerlessBen Stopford
 
A Global Source of Truth for the Microservices Generation
A Global Source of Truth for the Microservices GenerationA Global Source of Truth for the Microservices Generation
A Global Source of Truth for the Microservices GenerationBen Stopford
 
Building Event Driven Services with Kafka Streams
Building Event Driven Services with Kafka StreamsBuilding Event Driven Services with Kafka Streams
Building Event Driven Services with Kafka StreamsBen Stopford
 
NDC London 2017 - The Data Dichotomy- Rethinking Data and Services with Streams
NDC London 2017  - The Data Dichotomy- Rethinking Data and Services with StreamsNDC London 2017  - The Data Dichotomy- Rethinking Data and Services with Streams
Balancing Replication and Partitioning in a Distributed Java Database

  • 1.
  • 2. Traditional disk-oriented database architecture is showing its age. In-memory databases offer significant gains in performance as all data is freely available. There is no need to page to and from disk. But three issues have stunted their uptake: address spaces only being large enough for a subset of a typical user’s data, the ‘one more bit’ problem, and durability. Distributed in-memory databases solve these three problems but at the price of losing the single address space. This makes joins a problem. When data must be joined across multiple machines performance degradation is inevitable. Snowflake Schemas allow us to mix Partitioning and Replication so joins never hit the wire. But this model only goes so far. “Connected Replication” takes us a step further allowing us to make the best possible use of replication.
  • 3.
  • 4.
  • 5. The lay of the land: The main architectural constructs in the database industry
  • 6.
  • 8.
  • 9.
  • 10. Latency scale (ms, μs, ns, ps). Labelled points: Cross Continental Round Trip; Cross Network Round Trip; 1MB Disk/Network; 1MB Main Memory; Main Memory Ref; L2 Cache Ref; L1 Cache Ref. * L1 ref is about 2 clock cycles or 0.7ns. This is the time it takes light to travel 20cm.
  • 11.
  • 12.
  • 13.
  • 14.
  • 15.
  • 16.
  • 17.
  • 18.
  • 20.
  • 21. Taken from “OLTP Through the Looking Glass, and What We Found There” Harizopoulos et al
  • 22.
  • 23. Regular Database: Oracle, Sybase, MySql. Shared Nothing: Teradata, Vertica, Greenplum… In-Memory Database (drop disk): Times Ten, HSQL, KDB. SN In-Memory Database (distribute): Exasol, VoltDB, Hana. Distributed Caching: Coherence, Gemfire, Gigaspaces. ODC.
  • 24. Distributed Architecture Simplify the Contract. Stick to RAM
  • 25. 450 processes. 2TB of RAM. Oracle Coherence. Messaging (Topic Based) as a system of record (persistence).
  • 26. Access Layer: Java client API (two clients). Query Layer. Data Layer: Transactions, Mtms, Cashflows. Persistence Layer.
  • 27. Indexing Partitioning Replication
  • 28. But your storage is limited by the memory on a node
  • 29. Keys Fs-Fz Keys Xa-Yd Scalable storage, bandwidth and processing
  • 30.
  • 31.
  • 32.
  • 33. Trader, Party, Trade: Version 1, Version 2, Version 3, Version 4. …and you need versioning to do MVCC
  • 34. Trade Trader Party Party Trader Trade Party Trader Trade Party
  • 35. So better to use partitioning, spreading data around the cluster.
  • 36. Trader Party Trade Trade Trader Party
  • 37. Trader Party Trade Trade Trader Party
  • 38.
  • 39.
  • 40.
  • 41. This is what using Snowflake Schemas and the Connected Replication pattern is all about!
  • 42.
  • 43.
  • 44. Crosscutting Keys Common Keys
  • 45. Replicated Trader Party Trade Partitioned
  • 46.
  • 47.
  • 48.
  • 49.
  • 50.
  • 51.
  • 52. Chart (axis 0 to 150,000,000). Rows: Valuation Legs, Valuations, art Transaction Mapping, Cashflow Mapping, Party Alias, Transaction, Cashflows, Legs, Parties, Ledger Book, Source Book, Cost Centre, Product, Risk Organisation Unit, Business Unit, HCS Entity, Set of Books. Facts: => Big, common keys. Dimensions: => Small, crosscutting keys.
  • 53.
  • 54. Coherence’s KeyAssociation gives us this Trades MTMs Common Key
  • 55. Replicated Trader Party Trade Partitioned
  • 56. Query Layer Trader Party Trade Transactions Data Layer Mtms Cashflows Fact Storage (Partitioned)
  • 57. Dimensions (replicate). Facts (distribute/partition): Transactions, Mtms, Cashflows. Fact Storage (Partitioned)
  • 58. Chart (axis 0 to 150,000,000). Rows: Valuation Legs, Valuations, art Transaction Mapping, Cashflow Mapping, Party Alias, Transaction, Cashflows, Legs, Parties, Ledger Book, Source Book, Cost Centre, Product, Risk Organisation Unit, Business Unit, HCS Entity, Set of Books. Facts: => Big => Distribute. Dimensions: => Small => Replicate.
  • 59. We use a variant on a Snowflake Schema to partition big stuff that has the same key, and replicate small stuff that has crosscutting keys.
  • 61. Select Transaction, MTM, ReferenceData From MTM, Transaction, Ref Where Cost Centre = ‘CC1’
  • 62.
  • 63. Select Transaction, MTM, ReferenceData From MTM, Transaction, Ref Where Cost Centre = ‘CC1’. LBs[]=getLedgerBooksFor(CC1). SBs[]=getSourceBooksFor(LBs[]). So we have all the bottom level dimensions needed to query facts. Partitioned: Transactions, Mtms, Cashflows.
  • 64. Select Transaction, MTM, ReferenceData From MTM, Transaction, Ref Where Cost Centre = ‘CC1’. LBs[]=getLedgerBooksFor(CC1). SBs[]=getSourceBooksFor(LBs[]). So we have all the bottom level dimensions needed to query facts. Get all Transactions and MTMs (cluster side join) for the passed Source Books. Partitioned: Transactions, Mtms, Cashflows.
  • 65.
  • 66. Select Transaction, MTM, ReferenceData From MTM, Transaction, Ref Where Cost Centre = ‘CC1’. LBs[]=getLedgerBooksFor(CC1). SBs[]=getSourceBooksFor(LBs[]). So we have all the bottom level dimensions needed to query facts. Get all Transactions and MTMs (cluster side join) for the passed Source Books. Populate raw facts (Transactions) with dimension data before returning to client. Partitioned: Transactions, Mtms, Cashflows.
  • 67.
  • 68. Java client API. Replicated: Dimensions. Partitioned: Facts. We never have to do a distributed join!
  • 69. So all the big stuff is held partitioned. And we can join without shipping keys around and having intermediate results.
  • 70. Trader Party Trade Trade Trader Party
  • 71. Trader Party Version 1 Trade Trader Party Version 2 Trade Trader Party Version 3 Trade Trader Party Version 4 Trade
  • 72. Trade Trader Party Party Trader Trade Party Trader Trade Party
  • 73.
  • 74.
  • 75.
  • 76.
  • 77.
  • 78. Chart (axis 0 to 125,000,000). Rows: Valuation Legs, Valuations, rt Transaction Mapping, Cashflow Mapping, Party Alias, Transaction, Cashflows, Legs, Parties, Ledger Book, Source Book, Cost Centre, Product, Risk Organisation Unit, Business Unit, HCS Entity, Set of Books. Groups: Facts, Dimensions. Callout: This is a dimension. It has a different key to the Facts. And it’s BIG.
  • 79.
  • 80.
  • 81.
  • 82.
  • 83. Chart (axis 0 to 5,000,000). Rows: Party Alias, Parties, Ledger Book, Source Book, Cost Centre, Product, Risk Organisation Unit, Business Unit, HCS Entity, Set of Books.
  • 84. Chart (axis 20 to 5,000,000). Rows: Party Alias, Parties, Ledger Book, Source Book, Cost Centre, Product, Risk Organisation Unit, Business Unit, HCS Entity, Set of Books.
  • 85.
  • 86. So we only replicate ‘Connected’ or ‘Used’ dimensions
  • 87. Processing Layer: Dimension Caches (Replicated). Data Layer: Transactions, Mtms, Cashflows. Fact Storage (Partitioned). As new Facts are added, relevant Dimensions that they reference are moved to processing layer caches.
  • 88.
  • 89. Save Trade. Query Layer (with connected dimension caches). Data Layer (all normalised): Trade, Party Alias, Source Book, Ccy. Partitioned Cache, Cache Store, Trigger.
  • 90. Query Layer (with connected dimension caches). Data Layer (all normalised): Trade, Party Alias, Source Book, Ccy.
  • 91. Query Layer (with connected dimension caches). Data Layer (all normalised): Trade, Party Alias, Source Book, Ccy, Party, Ledger Book.
  • 92. ‘Connected Replication’ A simple pattern which recurses through the foreign keys in the domain model, ensuring only ‘Connected’ dimensions are replicated
  • 93.
  • 94. Java client API. Java schema. Java ‘Stored Procedures’ and ‘Triggers’.
  • 95.
  • 96.
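
The query walk-through on slides 61 to 68 can be sketched in plain Java. This is a minimal, single-process illustration under stated assumptions, not the deck's implementation and not the Coherence API: the class `CollocatedJoinSketch`, its methods, the hash-based routing and the string row format are all hypothetical. It only shows the shape of the idea: facts are routed by a common key so a trade's Transaction and MTM share a partition, dimensions are treated as replicated, and the join never crosses partitions.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Illustrative sketch only: every name here is hypothetical, not the Coherence API. */
final class CollocatedJoinSketch {

    /** One partition of fact storage: Transactions and MTMs keyed by the common trade id. */
    static final class Partition {
        final Map<String, String> sourceBookByTrade = new HashMap<>(); // Transaction facts
        final Map<String, String> mtmByTrade = new HashMap<>();        // MTM facts
    }

    private final List<Partition> partitions = new ArrayList<>();

    // Replicated dimensions: small, so this sketch assumes every node holds a full copy.
    private final Map<String, List<String>> ledgerBooksByCostCentre = new HashMap<>();
    private final Map<String, List<String>> sourceBooksByLedgerBook = new HashMap<>();

    CollocatedJoinSketch(int partitionCount) {
        for (int i = 0; i < partitionCount; i++) {
            partitions.add(new Partition());
        }
    }

    /** Routes on the common key only, so a trade's Transaction and MTM share a partition. */
    Partition partitionFor(String tradeId) {
        return partitions.get(Math.floorMod(tradeId.hashCode(), partitions.size()));
    }

    void putTransaction(String tradeId, String sourceBook) {
        partitionFor(tradeId).sourceBookByTrade.put(tradeId, sourceBook);
    }

    void putMtm(String tradeId, String value) {
        partitionFor(tradeId).mtmByTrade.put(tradeId, value);
    }

    void addDimension(String costCentre, String ledgerBook, String sourceBook) {
        ledgerBooksByCostCentre.computeIfAbsent(costCentre, k -> new ArrayList<>()).add(ledgerBook);
        sourceBooksByLedgerBook.computeIfAbsent(ledgerBook, k -> new ArrayList<>()).add(sourceBook);
    }

    /** Returns "tradeId|mtm|costCentre" rows for one cost centre, sorted lexicographically. */
    List<String> query(String costCentre) {
        // Step 1: walk the replicated dimensions in process to find the source books.
        List<String> sourceBooks = new ArrayList<>();
        for (String lb : ledgerBooksByCostCentre.getOrDefault(costCentre, Collections.emptyList())) {
            sourceBooks.addAll(sourceBooksByLedgerBook.getOrDefault(lb, Collections.emptyList()));
        }
        // Step 2: each partition joins its own Transactions and MTMs; nothing crosses partitions.
        List<String> rows = new ArrayList<>();
        for (Partition p : partitions) {
            for (Map.Entry<String, String> tx : p.sourceBookByTrade.entrySet()) {
                String mtm = p.mtmByTrade.get(tx.getKey());
                if (mtm != null && sourceBooks.contains(tx.getValue())) {
                    // Step 3: attach dimension data before returning to the client.
                    rows.add(tx.getKey() + "|" + mtm + "|" + costCentre);
                }
            }
        }
        Collections.sort(rows);
        return rows;
    }

    public static void main(String[] args) {
        CollocatedJoinSketch grid = new CollocatedJoinSketch(4);
        grid.addDimension("CC1", "LB1", "SB1");
        grid.putTransaction("T1", "SB1");
        grid.putMtm("T1", "10.5");
        grid.putTransaction("T2", "SB9"); // source book not linked to CC1
        grid.putMtm("T2", "99.0");
        System.out.println(grid.query("CC1")); // prints [T1|10.5|CC1]
    }
}
```

In a real grid the loop over partitions would run on the nodes that own them; the sketch keeps everything in one process so the routing rule is easy to see.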

Editor's notes

  1. Big data sets are held distributed and only joined on the grid to collocated objects. Small data sets are held in replicated caches so they can be joined in process (only ‘active’ data is held)
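
The ‘Connected Replication’ idea from slide 92, where only dimensions reachable from stored facts are replicated, can be sketched as a recursive walk over foreign keys. This is a hedged illustration, not the author's code: `ConnectedReplicationSketch`, `define`, `onFactSaved` and the `Type:id` key strings are all hypothetical, and the normalised store is modelled as a simple in-memory map.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Illustrative sketch only: hypothetical names, not taken from the deck or from Coherence. */
final class ConnectedReplicationSketch {

    /** Normalised store: entity key -> keys of the entities it references (its foreign keys). */
    private final Map<String, List<String>> foreignKeys = new HashMap<>();

    /** Keys of the dimensions currently held in the replicated caches. */
    private final Set<String> replicated = new LinkedHashSet<>();

    /** Registers an entity and the keys it references. */
    void define(String key, String... references) {
        foreignKeys.put(key, Arrays.asList(references));
    }

    /** Stands in for the trigger that fires when a fact is saved. */
    void onFactSaved(String factKey) {
        for (String ref : foreignKeys.getOrDefault(factKey, Collections.emptyList())) {
            replicate(ref);
        }
    }

    /** Recurses through foreign keys so every connected dimension is replicated once. */
    private void replicate(String dimensionKey) {
        if (!replicated.add(dimensionKey)) {
            return; // already replicated; this check also stops cycles
        }
        for (String ref : foreignKeys.getOrDefault(dimensionKey, Collections.emptyList())) {
            replicate(ref);
        }
    }

    Set<String> replicatedDimensions() {
        return Collections.unmodifiableSet(replicated);
    }

    public static void main(String[] args) {
        ConnectedReplicationSketch store = new ConnectedReplicationSketch();
        store.define("Trade:1", "PartyAlias:A", "SourceBook:S", "Ccy:USD");
        store.define("PartyAlias:A", "Party:P");
        store.define("SourceBook:S", "LedgerBook:L");
        store.define("SourceBook:Unused", "LedgerBook:L2"); // no fact references this one
        store.onFactSaved("Trade:1");
        System.out.println(store.replicatedDimensions());
    }
}
```

Here the unreferenced source book and its ledger book never reach the replicated set, which is the point of the pattern: replicated memory is spent only on dimensions that facts actually use.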