Apache Geode Clubhouse - WAN-based Replication

How to use the WAN Gateway feature of Apache Geode to implement multi-site and active-active failover, disaster recovery, and global scale applications.

  1. WAN Gateway: Multi-site and Active-active Design Patterns
     © Copyright 2016 Pivotal. All rights reserved.
  2. Using WAN Gateways
     • The multi-site capability connects geographically separated distributed systems.
     • Although you are presented with a façade that appears to be a single system, each distributed system actually behaves autonomously.
     • A multi-site installation based on the WAN Gateway consists of two or more loosely coupled distributed systems.
     • Each site manages its own distributed system, and region data is distributed to remote sites over one or more connections.
  3. Using WAN Gateways
     Each connection consists of a gateway sender in the sending site and a corresponding gateway receiver in the receiving site.
  4. Multi-site Topologies: Disaster Recovery
     Disaster recovery, or business continuity, is still the most popular design pattern for multi-site replication using the WAN Gateway. One important distinction: the Geode WAN Gateway is designed to be bi-directional, which makes fail-back much easier.
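     A minimal sketch of what bi-directional replication could look like in one site's cache.xml, assuming two sites whose distributed system IDs are 1 and 2. The sender id `to-site2`, the region name `orders`, and the IDs are illustrative assumptions, not taken from the slides.

     ```xml
     <!-- Site 1 (assumed distributed-system-id=1). Site 2 would mirror this,
          with its sender pointing back at remote-distributed-system-id="1". -->
     <cache>
       <!-- Sends local region events to site 2 -->
       <gateway-sender id="to-site2" parallel="true" remote-distributed-system-id="2"/>
       <!-- Accepts region events sent by site 2 (default port range) -->
       <gateway-receiver/>
       <!-- The region opts in to WAN distribution by naming the sender -->
       <region name="orders">
         <region-attributes refid="PARTITION" gateway-sender-ids="to-site2"/>
       </region>
     </cache>
     ```

     Because both sites host a sender and a receiver, updates made on the recovery site during an outage can flow back to the original site once it returns.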
  5. Multi-site Topologies: Active/Active
     If your use case is read-only, or updates are limited to specific entries belonging to a given user, it is easy to load balance across multiple sites hosting Geode clusters in an Active/Active configuration.
     [Diagram: Clients; Site 1 and Site 2, each with an IIS Farm, Apache Geode, and a DB; WAN Gateway between the sites]
  6. Multi-site Topologies: Active/Passive
     Alternatively, if you have heavy updates on random data entries, you might want to use an Active/Passive configuration.
     [Diagram: Clients; Site 1 and Site 2, each with an IIS Farm, Apache Geode, and a DB; WAN Gateway between the sites]
  7. Multi-site Topologies: Business Unit Active/Passive
     Or, if you have different groups of users who update different datasets, you can make each cluster active for one business unit and the backup for the other.
     [Diagram: Equity Users and Debt Users; Site 1 and Site 2, each with an IIS Farm, Apache Geode, and a DB; WAN Gateway between the sites]
  8. Multi-site Topologies: Geographically Separated
     A multi-site topology is often used to achieve locality of reference for performance.
  9. Multi-site Active-active Design Patterns
     1. Exchange Pattern: The client connects to every exchange it needs to write to, and uses the local copy for read-only access.
     [Diagram: sites NYSE, LSE, and TSE; labels "NYSE, TSE Read-only", "LSE, TSE Read-only", "NYSE, LSE Read-only"]
  10. Multi-site Active-active Design Patterns
      2. The "Realm Manager" Pattern: Use the “Command” pattern to request that an action be performed on your behalf. The request is forwarded to all distributed systems, but only the one with the right permission actually takes the action.
      [Diagram labels: "Read Only For This Customer" (twice), "Write Permission For This Customer" (once)]
  11. Multi-site Active-active Design Patterns
      3. Follow the Sun Pattern: This is the "Global book" pattern common in Financial Services.
      [Diagram label: "The token is here"]
  12. Multi-site Active-active Design Patterns
      4. Inventory Allocation Pattern: This pattern is commonly used when there are multiple trading venues and selling short is not allowed.
      [Diagram: four sites, each labeled "Partial Inventory"]
  13. Multi-site Active-active Design Patterns
      5. Apology-based computing: This is the pattern that Max Feingold refers to when he says: “At global scale, getting the truth is really really expensive.”
  14. Configuring WAN Gateway: Simple Configuration
      The minimum configuration you need is this:
          <cache>
            <gateway-sender id="NY" parallel="true" remote-distributed-system-id="1"/>
            ...
          </cache>
  15. Enabling Persistence for Queues: Overflowing Gateway Queues to Disk
      To overflow the gateway queues to disk and conserve memory, do this:
          <cache>
            <gateway-sender id="NY" parallel="false" remote-distributed-system-id="1"
                            enable-persistence="true" disk-store-name="gateway-disk-store"
                            maximum-queue-memory="200"/>
            ...
          </cache>
  16. Multiple Dispatcher Threads for Parallel WAN and Async Event Queues
      • Geode now defaults to 5 dispatcher threads for a parallel WAN gateway or async event queue.
      • If you find that your system is using too much CPU, reduce the thread count, for example by setting dispatcher-threads="1" in the gateway-sender attributes.
      • The default ordering policy is by key.
  17. Multiple Dispatcher Threads: Configuring Dispatcher Threads and Ordering Policy for a Serial Gateway
      To increase the number of dispatcher threads and set the ordering policy for a serial gateway sender, configure the sender in cache.xml:
          <cache>
            <gateway-sender id="NY" parallel="false" remote-distributed-system-id="1"
                            enable-persistence="true" disk-store-name="gateway-disk-store"
                            maximum-queue-memory="200" dispatcher-threads="10"
                            order-policy="key"/>
            ...
          </cache>
  18. Additional Benefits
      The WAN Gateway gives you additional benefits:
      • Site-wide rolling release: roll each site independently
      • Geode major release upgrades
      • Mix of cloud and on-prem
  19. Technical Details
  20. Using WAN Gateways: Consistency for WAN Updates
      • With a distributed WAN configuration, one or more gateway senders asynchronously queue and send region updates to another Geode cluster.
      • Multiple sites can send updates for the same region entry at the same time.
      • Because of a slow WAN connection, a cluster might also receive region updates after a considerable delay, after it has already applied more recent updates to the region.
  21. Using WAN Gateways: Consistency for WAN Updates
      • To ensure that WAN-propagated regions eventually reach a consistent state, each cluster first performs consistency checking on its regions before queuing updates to a gateway sender for WAN distribution.
      • In other words, region conflicts are first detected and resolved in the local cluster, using local timestamps and conflict detection algorithms.
  22. Using WAN Gateways: Partitioned Region Consistency
      • For a partitioned region, Geode maintains consistency by routing all updates on a given key to the Geode member that holds the primary copy of that key.
      • That member holds a lock on the key while distributing updates to other members that host a copy of the key.
      • Because all updates to a partitioned region are initially processed on the primary Geode member, all members apply the updates in the same order and consistency is maintained at all times.
  23. Using WAN Gateways: Replicated Region Consistency
      • For a replicated region, any member that hosts the region can update an entry and distribute that update to other members without locking the entry.
      • Two members can update the same entry at the same time (a concurrent update).
      • Because of network latency, an update made in one member can also reach other members late, after those members have already applied more recent updates to the entry (an out-of-order update).
  25. Using WAN Gateways: Replicated Region Consistency
      • If two members update the same entry at the same time, conflict checking ensures that all members eventually arrive at the same value, which is the value of one of the two concurrent updates.
      • If a member receives an out-of-order update (an update that arrives after one or more recent updates were applied), conflict checking ensures that the out-of-order update is discarded and not applied to the cache.
  26. Using WAN Gateways: Consistency for WAN Updates
      • In a default configuration, the cluster that receives the event examines the timestamp to determine whether the event should be applied.
      • If the timestamp of the update is earlier than the local timestamp, the cluster discards the event.
      • If the timestamp is the same as the local timestamp, the entry with the highest distributed system ID is applied (or kept).
  27. Using WAN Gateways: Discovery for Multi-Site Systems
      • Each Geode cluster in a WAN configuration uses locators to discover remote Geode clusters.
      • In the configuration for each locator in a WAN configuration, you must define a unique distributed-system-id property that identifies the local cluster.
      • A locator uses the remote-locators property to define the addresses of one or more locators in remote Geode clusters to use for WAN distribution.
  28. Using WAN Gateways: Discovery for Multi-Site Systems
      • When a locator starts up, it contacts each locator that is configured in the remote-locators property to exchange information about the available locators and gateway receivers in the cluster.
      • The locator also shares information about locators and gateway receivers in any other Geode clusters that have connected to the cluster.
      • Connected clusters can then use the shared gateway receiver information to distribute region events according to their configured gateway senders.
      • Each time a new locator starts up or an existing locator shuts down, the changed information is broadcast to other connected Geode clusters.
  29. Using WAN Gateways: Gateway Senders
      • A Geode cluster uses a gateway sender to distribute region events to another, remote Geode cluster.
      • You can create multiple gateway sender configurations to distribute region events to multiple remote clusters.
      • A gateway sender always communicates with a gateway receiver in a remote cluster.
      • Gateway senders do not communicate directly with other cache server instances.
      • Geode provides two types of gateway sender configurations: serial gateway senders and parallel gateway senders.
  30. Using WAN Gateways: Serial Gateway Senders
      • A serial gateway sender distributes region events from a single Geode server in the local cluster to a remote Geode cluster.
      • Although multiple regions can use the same serial gateway for distribution, a serial gateway uses a single logical event queue to dispatch events for all regions that use the gateway sender.
      • Because a serial gateway sender distributes all of a region's events through a single Geode member, it provides the most control over ordering region events as they are propagated across the WAN.
      • However, a serial gateway sender does not provide horizontal scaling of throughput for propagating events.
  31. Using WAN Gateways: Parallel Gateway Senders
      • While parallel gateway senders provide the best throughput for WAN propagation, they provide less control over event ordering.
      • With a parallel gateway sender, you cannot preserve event ordering for the region as a whole, because multiple Geode servers distribute the region events at the same time.
      • However, the ordering of events for a given partition can be preserved.
      Note: Replicated regions can only be configured to use serial gateway senders.
  32. Using WAN Gateways: Gateway Sender Queues
      • The queue that a gateway sender uses to distribute events to a remote site can be overflowed to disk as needed, to prevent the Geode member from running out of memory.
      • You should configure the maximum amount of memory that each queue uses, as well as the batch size and frequency for processing batches.
      • You should also configure these queues to persist to disk, so that a gateway sender can pick up where it left off when its member shuts down and is later restarted.
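      The queue settings named above map to attributes of the gateway-sender element in cache.xml. The following is a minimal sketch, assuming a disk store named gateway-disk-store is defined elsewhere in the same cache.xml; the values are illustrative, not recommendations.

      ```xml
      <cache>
        <!-- maximum-queue-memory: megabytes held in memory before overflow to disk
             batch-size:           maximum number of events sent in one batch
             batch-time-interval:  maximum milliseconds between batches
             enable-persistence:   keep the queue on disk across member restarts -->
        <gateway-sender id="NY" parallel="true" remote-distributed-system-id="1"
                        maximum-queue-memory="200"
                        batch-size="100"
                        batch-time-interval="1000"
                        enable-persistence="true"
                        disk-store-name="gateway-disk-store"/>
        <!-- disk store, regions, and other elements omitted -->
      </cache>
      ```

      Larger batches and longer intervals generally trade replication latency for fewer round trips across the WAN.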
  33. Using WAN Gateways: Multi-threaded Dispatcher
      • By default, gateway sender queues (even in serial gateway senders) use 5 threads to dispatch queued events.
      • If ordering is required on a serial gateway sender, you should set the number of dispatcher threads to 1.
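      As a small sketch of the ordering advice above, a serial sender limited to one dispatcher thread could be declared as follows. The sender id and remote ID reuse the values from the earlier slides.

      ```xml
      <cache>
        <!-- Serial sender with a single dispatcher thread, so queued events
             are dispatched by one thread in queue order -->
        <gateway-sender id="NY" parallel="false" remote-distributed-system-id="1"
                        dispatcher-threads="1"/>
      </cache>
      ```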
  34. Using WAN Gateways: High Availability for Gateway Senders
      • When a serial gateway sender configuration is deployed to multiple Geode members, only one "primary" sender is active at a given time. All other serial gateway sender instances are inactive "secondaries" that are available as backups if the primary sender shuts down.
      • Geode designates the first gateway sender to start up as the primary sender, and all other senders become secondaries.
      • As gateway senders start up and shut down in the distributed system, Geode ensures that the oldest running gateway sender operates as the primary.
  35. Using WAN Gateways: High Availability for Gateway Senders
      • A parallel gateway sender is deployed to multiple Geode members by default, and each Geode member that hosts primary buckets for a partitioned region actively distributes data to the remote Geode site.
      • When you use parallel gateway senders, high availability for WAN distribution is provided if you configure the partitioned region for redundancy.
      • With a redundant partitioned region, if a member that hosts primary buckets fails or is shut down, a Geode member that hosts a redundant copy of those buckets takes over WAN distribution for those buckets.
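      A minimal sketch of a redundant partitioned region attached to a parallel sender, assuming a region named `orders` (a hypothetical name) and one redundant copy of each bucket:

      ```xml
      <cache>
        <gateway-sender id="NY" parallel="true" remote-distributed-system-id="1"/>
        <region name="orders">
          <region-attributes refid="PARTITION" gateway-sender-ids="NY">
            <!-- One extra copy of every bucket, so another member can take over
                 WAN distribution if the member hosting the primary fails -->
            <partition-attributes redundant-copies="1"/>
          </region-attributes>
        </region>
      </cache>
      ```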
  36. Using WAN Gateways: Gateway Receivers
      • A gateway receiver configures a physical connection for receiving region events from gateway senders in one or more remote Geode clusters.
      • A gateway receiver applies each region event to the same region or partition that is hosted in the local Geode member. (An exception is thrown if the receiver receives an event for a region that it does not define.)
      • Gateway senders use any available gateway receiver in the target cluster to send region events.
      • You can deploy gateway receiver configurations to multiple Geode members as needed for high availability and load balancing.
      Note: There are issues with balancing of senders and receivers.
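      On the receiving site, a sketch of the matching cache.xml might look like this. The port range and the region name `orders` are illustrative assumptions; the region name has to match the region used on the sending site.

      ```xml
      <cache>
        <!-- Listen on an available port in the given range -->
        <gateway-receiver start-port="1530" end-port="1551"/>
        <!-- Incoming events are applied to the region with the same name,
             so it must also be defined in this cluster -->
        <region name="orders">
          <region-attributes refid="PARTITION"/>
        </region>
      </cache>
      ```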
  37. Multi-site Topologies: Parallel Multi-site Topology
      • This is the most often recommended topology.
      • All sites know about each other.
      • It is the most robust configuration: any one of the sites can go down without disrupting communication between the other sites.
      • A parallel topology also guarantees that no site receives multiple copies of the same message.
      The parallel multi-site topology is the recommended topology for most use cases. Think of it as a “mesh” configuration.
  38. Thank you
