4. Agenda
1 Cluster Introduction
- Cluster, what is it?
- Why cluster?
2 Basic configuration
- Basic cluster infrastructure with load balancing
- Two-node PLF cluster
3 Advanced configuration
- Hot spare server
- Asymmetric load
4 PLF cluster current status
- Story
- Bugs relating to configuration
- Bugs in the implementation
www.exoplatform.com - Copyright 2012 eXo Platform 4
5. Cluster, What is it?
− Highly scalable and highly available virtual server
− Fully transparent to the end user
Figure: Basic cluster infrastructure
6. Why cluster? (1)
− Challenges
− Internet traffic is growing dramatically (annual rate > 100%)
− Server workload is increasing rapidly
− Servers need to be more available
− Solutions
− Upgrade the software/hardware of servers
− Multi-server
− Cluster
− Grid
− Cloud
7. Why cluster? (2)
− Cluster Notions
− Scalability
− Load Balancing
− Load Balancing with an appliance
− Load Balancing with a software/hardware combination: mod_proxy, mod_jk, mod_cluster
− High Availability
10. Two-node PLF cluster
− Load Balancer: mod_jk
− Server cluster: 2 nodes (2 instances of PLF on the same machine or on 2 separate machines)
− Shared storage:
− NFS: indexing data, gadget key
− DB server: shared PLF data
11. Mod_jk(1)
− Connector that links the Tomcat servlet container to web servers
− Uses AJP (Apache JServ Protocol)
− Roles
− Load the servlet container adapter library and initialize it (before serving requests)
− Check incoming requests and forward those meant for servlets to the servlet container
12. Mod_jk(2)
− mod_jk configuration step-by-step
− Install Apache
− Add the mod_jk module to Apache
− Configure the mod_jk parameters in the Apache configuration (httpd.conf)
− Map URLs to workers
− JkMount <URL prefix> <Worker name>
− Configure the worker properties: host name or IP address, AJP port
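As an illustration of the last step, a worker definition file for a two-node setup might look like the sketch below. The worker names plfnode1 and plfnode2 follow the names used on the later slides; the file name workers.properties, the host addresses, the AJP port 8009 (Tomcat's default) and the balancer name loadbalancer are assumptions made for this example.

```properties
# workers.properties (sketch; hosts, ports and the balancer name are assumptions)

# Only the load balancer worker is exposed to JkMount
worker.list=loadbalancer

# First PLF node, reached over AJP 1.3
worker.plfnode1.type=ajp13
worker.plfnode1.host=192.168.1.10
worker.plfnode1.port=8009

# Second PLF node
worker.plfnode2.type=ajp13
worker.plfnode2.host=192.168.1.11
worker.plfnode2.port=8009

# Virtual worker that distributes requests across both nodes
worker.loadbalancer.type=lb
worker.loadbalancer.balance_workers=plfnode1,plfnode2
# Keep each user session on the node that created it
worker.loadbalancer.sticky_session=true
```

With such a file, a mapping like JkMount /* loadbalancer would send every request through the balancer. For sticky sessions to work, each worker name has to match the jvmRoute configured on the corresponding node.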
13. PLF cluster configuration(1)
− Notes
− Use the "all" profile to deploy the PLF server in cluster mode
− PLF uses an external connector (JDBC driver) to connect to the DB
− Add the external connector if necessary
− Configure datasource
− Config target
$TOMCAT_HOME/conf/server.xml (Tomcat server)
$JBOSS_HOME/server/all/gatein-ds.xml (JBoss server)
− Parameters
− User name/password, connector, database address
14. PLF cluster configuration(2)
− Configure NFS
− Config target
− $TOMCAT_HOME/gatein/conf/configuration.properties (Tomcat server)
− $JBOSS_HOME/server/all/conf/gatein/configuration.properties (JBoss server)
− Configure AJP route
− Config target
− $TOMCAT_HOME/conf/server.xml (Tomcat server)
− $JBOSS_HOME/server/all/deploy/jbossweb.sar/server.xml (JBoss server)
− Parameters
− Add a jvmRoute to name the server; mod_jk uses this name to route requests
15. PLF cluster configuration(3)
− Activate PLF configuration
Tomcat
− Switch JCR mode to cluster
− Add the cluster profile to the eXo profiles
− Add the preferIPv4Stack parameter (-Djava.net.preferIPv4Stack=true) so that synchronization messages are routed over IPv4
JBoss
− Switch JCR mode to cluster
− Add the cluster profile to the eXo profiles
− Add the preferIPv4Stack parameter (-Djava.net.preferIPv4Stack=true) so that synchronization messages are routed over IPv4
− Activate clustered Single Sign-On
17. PLF cluster configuration(5)
− Verify cluster
The console logs the cluster membership whenever a member joins or goes down:
INFO [MyClusterPartition] Dead members: 0 ([])
INFO [MyClusterPartition] New Members : 1 ([127.0.0.1:1099])
INFO [MyClusterPartition] All Members : 2 ([127.0.0.1:1199, 127.0.0.1:1099])
18. PLF cluster configuration(6)
− Common errors when configuring and using cluster mode
− Error on connection to the DB server
− Missing connector in lib: add the connector
− Wrong database address parameters: verify the database parameters
− DB isn't set to accept remote access: configure it to accept remote access
− Indexing data and database do not match
− Remove the indexing data and the database if possible
20. Advanced Configuration(1)
− Hot spare server
− Failover mechanism to provide reliability
− When a key component fails, the hot spare is switched into operation
21. Advanced configuration(2)
− Hot spare configuration
− Set hot spare server for main server
# Define preferred failover node for plfnode1
worker.plfnode1.redirect=plfnode2
− Deactivate hot spare server
# Disable plfnode2 for all requests except failover
worker.plfnode2.activation=disabled
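Both directives apply to workers that belong to the same load balancer. A possible excerpt of the worker definition file is sketched below; the balancer name loadbalancer is an assumption, and the host and port settings of each node are left out for brevity.

```properties
# Worker file excerpt (sketch; the balancer name is an assumption)
worker.list=loadbalancer
worker.loadbalancer.type=lb
# The hot spare still has to be listed as a member of the balancer
worker.loadbalancer.balance_workers=plfnode1,plfnode2

# plfnode1 handles normal traffic; if it fails, its requests go to plfnode2
worker.plfnode1.redirect=plfnode2

# plfnode2 receives no requests except those redirected on failover
worker.plfnode2.activation=disabled
```

The design keeps plfnode2 running and synchronized with the cluster, so it can take over without a cold start.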
22. Advanced Configuration(3)
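− Asymmetric load
− Used to divide the workload unevenly among servers
− Set an lbfactor for each server
worker.plfnode1.lbfactor=2
worker.plfnode2.lbfactor=8
− With these values, plfnode2 receives about four times as many requests as plfnode1 (roughly 80% versus 20%)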
24. Story
− Cluster supported since PLF 3.0.x
− EPP SP 5.1.x in cluster mode with RedHat
− Lack of testing and evaluation in PLF 3.0.x
− Cluster mode fully supported since PLF 3.5.1
− FQA test campaign for cluster
− Cluster configuration changes from PLF 3.0.x to 3.5.x
− Use shared file-system for security token store (EXOGTN-237)
− Reconfigure the parameters for cluster mode (PLF-630)
25. Bugs relating to configuration
− Too many warnings on console
− Related issue: PLF-2652
− Description: The same warnings are printed to the console repeatedly
− Workaround: Hide the log messages
− Tomcat: Add to conf/logging.properties
org.jgroups.level = ERROR
org.jgroups.handlers = java.util.logging.ConsoleHandler,6gatein.org.apache.juli.FileHandler
− JBoss: Add to server/all/conf/jboss-log4j.xml
<category name="org.jgroups">
<priority value="ERROR"/>
</category>
26. Bugs relating to configuration(2)
− Replication timeout
− Related issue: PLF-2752
− Description: Replication timeout errors appear on the console when running PLF in cluster mode with multiple instances
− Workaround: Increase the replication timeout
27. Bugs in implementation(1)
− Lack of synchronization
− Related issue: ECMS-2324
− Description: A job runs on only one server
→ Another job tries to access a resource that is being used by another job
− Suggested solution: Synchronize functions so that only one thread accesses a resource at a time
28. Bugs in implementation(2)
− Cache problem
− Related issues: ECMS-3337, SOC-1368, EXOGTN-978
− Description: A local cache is implemented to increase performance, but the caches are not synchronized across nodes
→ Updates are not visible on other nodes, or some objects are unavailable.
− Suggested solution: Use the cache only where necessary. Evict cached entries immediately when changes have to take effect on all cluster nodes.
29. Bugs in implementation(3)
− Problem with login session
− Related issues: EXOGTN-943, PLF-2730
− Description: The user login session isn't properly shared among the servers in the cluster.
→ Problem 1: Users cannot log in again after a login failure
→ Problem 2: When an instance fails, all clients using it have to log in again.
− Suggested solution: Broadcast user login sessions among the servers in the cluster
31. Conclusion and Perspective
− Conclusion
− PLF works quite well in cluster mode
− Some problems remain in cluster mode
− Perspective
− Use mod_cluster for cluster configuration
− Organize servers into several groups. Within a group, all servers share the same session