High Availability Lync Server 2010

  1. High Availability
  2. High Availability in OCS 2007 / 2007 R2
     Office Communications Server (OCS) 2007 and R2
     [Diagram labels: Registration; Routing; Presence; Conferencing; HLB required for all traffic; Bob’s OC; Bob’s Phone]
     Architecture:
       • One monolithic Front End Service
       • Dependency on a single shared backend database (Registration, Routing, Presence, Conferencing)
  3. High Availability: Communications Server “14”
     Microsoft Communications Server “14”
     [Diagram labels: User Services Database (Presence and Conferencing); Registrar Database (Registration and Routing); Bob’s OC; Bob’s Phone]
     Architecture:
       • Registrar Role (Registration and Routing). Each registrar has its own SQL Express database
       • User Services Role (Presence and Conferencing)
       • Registrar and User Services are collocated in the datacenter (but on different servers)
       • All user end points register with the same Front End
       • Users are load balanced by Registrars using a Distributed Hash Algorithm
       • Registrar can be installed in remote locations
     Load balancing:
       • HLB is optional for SIP traffic (DNS LB is recommended)
       • HLB is still required for client-server HTTP traffic
  4. Resiliency Architecture
     [Diagram labels: Branch Office (SBA, Registrar); Data Center - EE Pool 1 and Data Center - EE Pool 2, each with Registrar (Registration & Routing), Presence, Conferencing, Backup Registrar Pool, and AD & DNS]
     User assignments:
       • Joe’s Primary Registrar = SBA, User Services = EE Pool 1
       • Alice’s Primary Registrar & User Services = EE Pool 2
       • Bob’s Primary Registrar & User Services = EE Pool 1
     Architecture:
       • Each user has a “Primary Registrar Pool”. Each Registrar Pool can have a “Backup Registrar Pool”.
       • The user’s client discovers a Registrar Pool through DNS SRV and is directed to its “Primary & Backup Registrar Pool”.
       • The Backup Registrar heartbeats the Primary Registrar. If a heartbeat is not received within the configurable failover interval (default = 120 sec for branch offices), the Backup starts accepting client registrations.
  5. Data Center Voice Resiliency
  6. Data Center Voice Resiliency (EE): Failover to Backup Data Center
     [Diagram labels: North America Data Center; Europe Data Center; CS “14” Pool 1; CS “14” Pool 2; CS “14” Edge1; CS “14” Edge2; Backup Registrar; Failover; WAN]
       • Communications Server “14” Pool. That Communications Server “14” Pool directs client to primary and backup SIP registrar
       • Client attempts to connect to the Primary Registrar Pool; if that fails, it connects to the Backup
       • Limited feature set available on failover
       • Enable/disable automatic failover; configurable failover interval
       • Automatic failback; configurable failback interval (no manual failback. Workaround: stop Front End Services on the Primary Registrar pool servers)
       • What happens if the Primary Data Center cannot be restored?
  7. Data Center Voice Resiliency (SE): Failover to Backup Data Center
     [Diagram labels: North America Data Center; Europe Data Center; CS “14” SE 1; CS “14” SE 2; CS “14” Edge1; CS “14” Edge2; Backup Registrar; Failover; WAN]
       • SE servers operate as separate systems
       • Client DNS SRV request discovers (one or multiple) Communications Server “14” SE. That Communications Server “14” SE server directs the client to the primary and backup SIP registrar
       • Client attempts to connect to the Primary Registrar; if that fails, it connects to the Backup
       • Limited feature set available on failover
       • Enable/disable automatic failover; configurable failover interval
       • Automatic failback; configurable failback interval (no manual failback. Workaround: stop Front End Services on the Primary Registrar servers)
       • If the Primary Data Center cannot be restored:
           • Restore the Central Management Server in the backup datacenter
           • Restore other services, including Presence and Conferencing, by “moving” users to the other Pool
  8. Data Center Voice Resiliency: Failover to Backup Data Center (Discovery)
     [Diagram labels: North America Data Center; Europe Data Center; CS “14” Director Pool; CS “14” Pool 1; CS “14” Pool 2; CS “14” Edge1; CS “14” Edge2; Backup Registrar; AD DS & DNS; WAN; step markers (1) to (6)]
       • Client DNS SRV request. Example: DNS SRV for _sipinternaltls._tcp.contoso.com
       • DNS SRV response includes:
           • CSDirectorPool.contoso.com:5061 Priority=0, Weight=10
           • CSPool2.contoso.com:5061 Priority=1, Weight=10
       • Client connects via TLS to the Communications Server “14” Director Pool. Sends SIP Register. Authenticates.
       • Communications Server “14” Director Pool redirects the client. SIP 301 includes the Primary & Backup Registrar pool
       • If the Primary Registrar Pool is available, the client connects and registers with it
       • Else the client connects and registers with the Backup Registrar Pool (CS Pool 2)
  9. Metropolitan Data Center Resiliency
  10. Metropolitan Data Center Resiliency: CS “14” Pool Extended Across Two Data Centers
      [Diagram labels: NY Data Center; NJ Data Center; Active SQL; Passive SQL; CS “14” Edge (x2); FE 1-2; FE 3-4; Low-Latency WAN]
        • Communications Server “14” pools operate as one logical system
        • Split Front End pool across two datacenters (all FEs active)
        • SQL geo cluster for the backend (stretched Virtual Local Area Network (VLAN))
        • Data replication is done by storage arrays (e.g. EMC SRDF, HP CLX EVA)
        • Requires low-latency WAN (15 milliseconds)
        • If one site is down, clients are serviced by FEs in the other site
        • Nearly all features available
        • PSTN termination may affect inbound calls
        • Failback has to be manually initiated
  11. Metropolitan Data Center Resiliency: CS “14” Pool Extended Across Two Data Centers
      [Diagram labels: NY Data Center; NJ Data Center; Active SQL; Passive SQL; CS “14” Edge (x2); FE 1-2; FE 3-4; Low-Latency WAN; DNS SRV; DNS Server; Pool.contoso.com]
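
The discovery flow on the "Failover to Backup Data Center (Discovery)" slide can be sketched in Python. This is a minimal illustration under stated assumptions, not Lync client code: `SrvRecord`, `order_srv`, `choose_registrar`, the reachability callback, and the host name `CSPool1.contoso.com` are hypothetical. The sketch orders SRV records by priority (lowest value first, as defined in RFC 2782) and then prefers the primary registrar pool over the backup pool named in the redirect.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Optional


@dataclass(frozen=True)
class SrvRecord:
    target: str
    port: int
    priority: int
    weight: int


def order_srv(records: Iterable[SrvRecord]) -> List[SrvRecord]:
    """Lowest priority value first; higher weight first within a priority.

    Simplification: RFC 2782 selects among equal-priority records at random,
    in proportion to weight. A deterministic order keeps the sketch testable.
    """
    return sorted(records, key=lambda r: (r.priority, -r.weight))


def choose_registrar(primary: str, backup: str,
                     is_reachable: Callable[[str], bool]) -> Optional[str]:
    """Return the primary pool if reachable, else the backup, else None."""
    for pool in (primary, backup):
        if is_reachable(pool):
            return pool
    return None


# Priorities, weights, and port follow the slide; host spellings are assumed.
records = [
    SrvRecord("CSPool2.contoso.com", 5061, priority=1, weight=10),
    SrvRecord("CSDirectorPool.contoso.com", 5061, priority=0, weight=10),
]
first_hop = order_srv(records)[0]

# Hypothetical outage: the primary pool is down and the backup answers.
down = {"CSPool1.contoso.com"}
registrar = choose_registrar("CSPool1.contoso.com", "CSPool2.contoso.com",
                             lambda host: host not in down)
```

With the slide's values, the Director Pool record sorts first because its priority value is lower, and the simulated outage makes the client fall through to the backup pool.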

Editor's notes

  • Slide Objective: In the following section, briefly explain what high availability means to us and what the major challenges have been in previous versions of OCS.
    Notes:
  • Slide Objective: Briefly explain the HA architecture of the legacy systems that preceded W14.
    Notes: For a clear understanding of the new system, highlight that Bob’s OC and his phone may be registered on different Front End Servers. In CS “14”, this behavior is no longer possible. Also highlight that a granular separation of the services (registration, routing, presence, and conferencing) is not possible with OCS 2007 and OCS 2007 R2.
    Also highlight that a pool with multiple Front End Servers always requires a hardware load balancer. Remind students that with OCS 2007 R2 only one instance of SQL Server was used within one pool.
  • Slide Objective: Explain the CS “14” HA process.
    Notes: Start by explaining that in CS “14” every user logs on to a predefined Front End server within one pool (briefly describe the hash algorithm that generates a logon sequence per SIP URI). Highlight clearly that Bob is connected to ONE Front End server, regardless of how many SIP endpoints he uses in parallel. For details, see nexthop.info (Introducing DNS Load Balancing in CS 14).
    Explain that every server within CS 14 has its own SQL Express database, which is used for several purposes; in this example, for registration and routing.
    HLB is optional and is NOT recommended for SIP traffic. DNS LB is recommended to make HLB configuration easier (Web only). To reduce failed call incidents due to HLB misconfiguration, it is desirable to get the HLB out of the main SIP routing path. HLB is advised for SIP traffic only where a customer plans to be in coexistence (OCS 2007 & CS “14”, OCS 2007 R2 & CS “14”) for a long time (e.g. six months or more).
  • Slide Objective: Discuss the architecture within CS “14”.
    Notes: Start by explaining that within CS “14” every SIP account has a primary and a secondary (backup) registrar pool. With this feature a SIP account is able to log on against various pools; the example above shows that a user (Bob) can also register against the SBA in the branch office. Make sure that students understand that presence and conferencing information from pool 1 (data center) is not transferred to pool 2 (data center 2). If pool 1 fails, users are able to log on against the backup pool (2), but not all features will be available while they are signed in on their backup pool. A diagram with details will be shown later in this presentation.
    What about branch office users?
      • Branch user’s Primary Registrar Pool = Survivable Branch Appliance (SBA)
      • Backup Registrar Pool = Data Center CS Pool
      • Branch users always register with the SBA Registrar (Primary) unless it is unavailable
  • Slide Objective:
    Notes:
  • Slide Objective: Discuss failover.
    Notes: In this case, CS is NOT split across two data centers; instead, there are two separate CS installations. Users can fail over from one to the other, but functionality is lost at failover if it relies on information stored in the CS backend. This accounts for the vast majority of “unavailable features”: user call forwarding settings, etc., can only be restored by manually copying those settings from a backup into the backend in the second pool. Voicemail issues here are the same as in the previous slide.
    Two modes for failover and failback:
      • Automatic: backup connection after a configurable interval
      • Manual: administrator switch enables connection (manual capability avoids failover on transient)
    What happens if the primary datacenter cannot be restored?
      • Restore the Central Management Server in the backup datacenter
      • Restore other services, including Presence and Conferencing, by “moving” users to the other Pool
  • Slide Objective: Discuss failover.
    Notes: In this case, CS is NOT split across two data centers; instead, there are two separate CS installations. Users can fail over from one to the other, but functionality is lost at failover if it relies on information stored in the CS backend. This accounts for the vast majority of “unavailable features”: user call forwarding settings, etc., can only be restored by manually copying those settings from a backup into the backend in the second pool. Voicemail issues here are the same as in the previous slide.
    Two modes for failover and failback:
      • Automatic: backup connection after a configurable interval
      • Manual: administrator switch enables connection (manual capability avoids failover on transient)
  • Slide Objective: Discuss failover.
    Notes:
      1. As in OCS 2007 and OCS 2007 R2, the client queries DNS SRV for a CS 14 pool FQDN (in this case a Director server). The DNS server then returns a Director Pool FQDN (it can also return multiple addresses for DNS load balancing purposes).
      2. A TLS SIP REGISTER request is sent to the Director server. The server returns a 401 certificate challenge (ensure that the audience understands that this certificate request is different from an AD certificate authority). CS “14” provides its own CA, which is used only for authentication purposes.
      3. The client connects to the CS “14” certificate service with its Windows credentials.
      4. The pool creates a certificate and returns it to the client as well as the server (it is more like a token issued by CS “14” to the client). It can only be used within CS.
      5. With the issued certificate, the client tries again to register against the pool. In this case the Director returns a 301 redirect message to redirect the client to the pool.
      6. If the primary pool becomes unavailable, the client automatically connects to the backup registrar.
  • Slide Objective:
    Notes:
  • Slide Objective: Explain how CS “14” works when split over two data centers.
    Notes: In this case, CS is effectively split across two data centers. If one data center is lost, the user fails over to the second and still has almost all features, much like the case where a single server is lost in an R2 CS pool.
    The ability to leave voicemail for the user is lost if the user’s DID number terminates in the failed data center and the associated gateways or SIP trunk connections are lost. This is expected to be a likely case.
    Response Group Service is an application running on the local pool.
    Further talking points:
      • If you plan to deploy this CS architecture, you need a geo cluster for SQL Server (this is required for SQL Server to function, not for the CS server).
      • The entire pool works as a logical unit.
  • Slide Objective: Explain how CS “14” works when split over two data centers.
    Notes: Once again, a more detailed flow:
      • The CS client queries a local DNS server (if necessary) to get the pool FQDN (remember this can also happen via the client cache or DHCP option 120).
      • Afterwards the client connects to its primary Front End server.
    In the event of a failover in datacenter NY:
      • The SQL Server cluster initiates a failover (the SQL Server in the backup datacenter becomes the active SQL Server).
      • The client automatically tries to connect to the next server in the generated logon list for the specific SIP URI. This continues until the client is able to log on to the next available server.
    What happens to the client: the client should sign out and sign in again within a short amount of time (media conversations should not break).
    Why are two options marked as not working in the deck? The ability to leave voicemail for the user is lost if the user’s DID number terminates in the failed data center and the associated gateways or SIP trunk connections are lost. This is expected to be a likely case.
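
The per-user logon list mentioned in these notes can be illustrated with rendezvous (highest-random-weight) hashing. This is a stand-in technique chosen for illustration; the notes do not specify the algorithm CS “14” actually uses. The function names `logon_sequence` and `pick_front_end` and the `fe*.contoso.com` host names are hypothetical.

```python
import hashlib
from typing import Iterable, List, Optional, Set


def logon_sequence(sip_uri: str, servers: Iterable[str]) -> List[str]:
    """Order servers for one SIP URI by a per-(user, server) hash score."""
    def score(server: str) -> int:
        digest = hashlib.sha256(f"{sip_uri.lower()}|{server}".encode()).digest()
        return int.from_bytes(digest[:8], "big")
    return sorted(servers, key=score, reverse=True)


def pick_front_end(sip_uri: str, servers: Iterable[str],
                   unavailable: Set[str]) -> Optional[str]:
    """Walk the user's list and return the first server that is still up."""
    for server in logon_sequence(sip_uri, servers):
        if server not in unavailable:
            return server
    return None


# Hypothetical four-server pool, mirroring FE 1-4 on the metropolitan slide.
front_ends = ["fe1.contoso.com", "fe2.contoso.com",
              "fe3.contoso.com", "fe4.contoso.com"]
sequence = logon_sequence("sip:bob@contoso.com", front_ends)
primary = sequence[0]

# Simulate losing Bob's first-choice server: he moves to the next in his list.
after_failure = pick_front_end("sip:bob@contoso.com", front_ends, {primary})
```

Because the order depends only on the SIP URI and the server names, every endpoint Bob signs in from computes the same list and lands on the same Front End, and different users spread across the pool.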

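
The heartbeat rule from the Resiliency Architecture slide (the backup starts accepting registrations once no heartbeat has been seen within the failover interval) can be sketched as a small state holder. The class name `BackupRegistrar`, its methods, and the failback interval value are assumptions made for this sketch; only the 120-second branch-office failover default comes from the slide. Time is passed in explicitly so the logic can be exercised without waiting.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class BackupRegistrar:
    failover_interval: float = 120.0   # seconds; slide default for branch offices
    failback_interval: float = 240.0   # seconds; hypothetical value
    auto_failover: bool = True
    last_heartbeat: float = 0.0
    healthy_since: Optional[float] = None
    accepting: bool = False            # True while serving failed-over clients

    def on_heartbeat(self, now: float) -> None:
        """Record that the primary registrar was seen alive at time `now`."""
        self.last_heartbeat = now
        if self.accepting and self.healthy_since is None:
            self.healthy_since = now   # primary is back; start failback timer

    def tick(self, now: float) -> bool:
        """Re-evaluate state; return True if clients may register here."""
        silent_for = now - self.last_heartbeat
        if silent_for > self.failover_interval:
            self.healthy_since = None  # primary lost (again)
            if self.auto_failover:
                self.accepting = True
        elif (self.accepting and self.healthy_since is not None
              and now - self.healthy_since >= self.failback_interval):
            self.accepting = False     # automatic failback
            self.healthy_since = None
        return self.accepting
```

Setting `auto_failover=False` models the slide's "Enable/Disable Automatic failover" switch: the backup then never starts accepting registrations on its own.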