The document summarizes performance tests comparing the open source WMS servers MapServer and GeoServer. It describes testing various configurations including PostGIS versus shapefiles, concurrent requests, and reprojection. The results show that both servers can achieve fast performance but require optimization of settings like using FastCGI for MapServer and Java 6 for GeoServer. Overall it aims to identify areas for improvement in both software packages to enhance WMS performance.
1. W W W . R E F R A C T I O N S . N E T
WMS Performance Tests!
Mapserver & Geoserver
FOSS4G 2007
Shapefiles vs. PostGIS,
Concurrency,
and other exciting tests...
Presented by
Brock Anderson and
Justin Deoliveira
2. W W W . R E F R A C T I O N S . N E T
Presentation Outline
• Goals of testing.
• Quick review of WMS.
• Description of the test environment.
• Discussion of performance tests and results.
• Questions.
3. W W W . R E F R A C T I O N S . N E T
Goals
1. Compare performance of WMS GetMap
requests in Mapserver and Geoserver.
2. Identify configuration settings that will
improve performance.
3. Identify and fix inefficiencies in Geoserver.
We do not test stability, usability, etc.,
* We do not test styling or labelling.
We focus on vector input.
4. W W W . R E F R A C T I O N S . N E T
Keeping the tests fair
• Not an easy job!
• We tried to understand what each server does
under the hood to ensure we're not accidentally
performing unnecessary processing on either server.
5. W W W . R E F R A C T I O N S . N E T
Web Map Service (WMS)
http://server.org/wms?
request=getmap&
layers=states,lakes&
bbox=-85,36,-60,49&
format=png&... WMS
User A Map
6. W W W . R E F R A C T I O N S . N E T
Test Environment
Client Computer Server Computer
Apache 2.2.4 (with mod_fcgi)
Vector Data
Mapserver
4.10.2
WMS requests
JMeter Data
2.2 Tomcat 6.0.14
WMS requests Shapefiles
Geoserver
1.6 beta 3
Additional Server Specs: Dual core (1.8Ghz per core). 2GB RAM. 7200RPM disk. Linux. PostgreSQL
8.2.4. PostGIS 1.2.
7. W W W . R E F R A C T I O N S . N E T
Test #1: PostGIS vs. Shapefiles
• Two Data Sets:
3,000,000 Tiger roads in Texas
10,000 Tiger roads in Dallas, Texas
• Both data sets are in PostGIS and shapefile format.
• Spatial indexes on both data sets.
• Mapserver and Geoserver layers point at the data.
• Minimal styling.
• JMeter issues WMS requests to fetch ~1,000
features, limited by the 'bbox' parameter.
And the results are...
8. W W W . R E F R A C T I O N S . N E T
Test #1: PostGIS vs. Shapefiles
Mapserver Geoserver
400 386 400
350 350
Response time (millisec onds)
300 300
250 250
200 200
150 150
100 100
50 39 47 42 42 33
50 50 27
0 0
1,000 of 10,000 1,000 of 3,000,000 1,000 of 10,000 1,000 of 3,000,000
Notes: This test uses two different data sets: one with 3 million features, the other with 10,000. Each bar is
an average of 30 sample WMS requests, each using a different bounding box to fetch and draw appx. 1000
features (+/- 15%). The same 30 requests are executed for each scenario. One request at a time (no
concurrency). Mapserver and Geoserver use the same data. Mapserver is using FastCGI via Apache/mod_fcgi.
Spatial indexes on both data sets. Quadtree indexes generated by 'shptree'. No reprojection required.
Minimal styling. Responses are 1-bit PNG images.
9. W W W . R E F R A C T I O N S . N E T
Test #2: Concurrent Requests
• Using the same tiger roads data set with 10,000
records.
• We issue multiple requests with pseudo-random
BBOXes that fetch approximately 1,000 features.
• The main difference is that now we're issuing
multiple concurrent requests.
Let's see what happened...
10. W W W . R E F R A C T I O N S . N E T
Test #2: Concurrent Requests
Mapserver Geoserver
1500
1400
Response time (milliseconds)
1300
1200
1100
1000
900
800
700
500 600
300 400
100 200
-100 0
1 2 5 10 15 20 40 60 1 2 5 10 15 20 40 60
Notes: Data in PostGIS and shapefile formats. Mapserver and Geoserver use the same data. Mapserver is
using FastCGI via Apache/mod_fcgi. 20 FastCGI mapserv processes. Geoserver uses connection pooling with
20 connections. Spatial indexes on both data sets. No reprojection required. Minimal styling. Responses are
2-color PNG images. More details in the appendix.
11. W W W . R E F R A C T I O N S . N E T
... or Throughput, if you prefer
Mapserver Throughput Geoserver Throughput
80 80
70 70
Responses per second
60 60
50 50
40 40
30 30
20 20
10 10
0 0
1 2 5 10 15 20 40 60 1 2 5 10 15 20 40 60
# Concurrent Requests # Concurrent Requests
An alternative way to summarize the data collected
for the concurrency test. (Higher lines are better
here.)
12. W W W . R E F R A C T I O N S . N E T
Test #3: Reprojection
Mapserver (using PROJ to reproject) Geoserver (using Geotools to reproject)
80 80
31
70
22 70
19 21
6 9 12
Response time (milliseconds)
60
60
50 0 50
0 4
40
40
30
30
20
20
10
10
0
0 None Geog WGS84 – Geog WGS84 – Geog WGS84 – UTM 14N
UTM 14N UTM 14N SPS NAD83 WGS84 - SPS
None Geog WGS84 – Geog WGS84 – Geog WGS84 – UTM 14N WGS84
UTM 14N WGS84 UTM 14N NAD27 SPS NAD83 - SPS NAD83 WGS84 NAD27 NAD83
PROJ optimizes by assuming these source Geotools is slightly faster than
and target datums are equivalent. PROJ for these cases.
Currently Mapserver calls PROJ for every Geoserver simplifies geometry
before reprojecting.
vertex, but it could improve by batching
those into a single call.
13. W W W . R E F R A C T I O N S . N E T
CGI vs. FastCGI (Mapserver only)
90
81
80
Response time (milliseconds)
70
57
60
52
50
42 PostGIS
40 Shapefile
30
20
10
0
CGI FastCGI
Notes: Average of 30 samples. One request at a time (no concurrency). Each request fetches one
layer with 1000 features from a data set of 10,000. Spatial indices on both data sets. No
reprojection required. Minimal styling. Responses are 1-bit PNG images. The same binary file was
used for both CGI and FastCGI. FastCGI through Apache and mod_fcgi.
14. W W W . R E F R A C T I O N S . N E T
Breakdown of Mapserver Response Time
90
Time (in milliseconds)
80
70
Network delay
60 Write image
50 Draw
Fetch & store
40 Query
30 Connect to DB
Load map file
20
Start mapserv process
10
0
PostGIS Shapefile
• FastCGI eliminates Start mapserv process and
Connect to DB costs.
• The Write image step is dependant on output format.
15. W W W . R E F R A C T I O N S . N E T
Breakdown of Geoserver Response Time
404: Document not found
16. W W W . R E F R A C T I O N S . N E T
Servlet Container and Java (Geoserver
only)
Jetty 6.0.2 Tomcat 6.0.14
Response time (milliseconds)
200 179 200
160 160
120 120 Tomcat 6
95 95
doesn't
80 64 80 support 63
Java 1.4
40 40
0 0
Java 1.4 Java 5 Java 6 Java 1.4 Java 5 Java 6
• These results show average response times for the same
WMS request when Geoserver is backed by different
Servlet containers and Java versions.
• Using shapefile backend.
• Conclusion: Use Java 6!
17. W W W . R E F R A C T I O N S . N E T
Outcome of the tests
• Lots of performance optimizations to Geoserver
which will be available in version 1.6.
• Identified a few places where Mapserver can
improve too. (These will be reported as “bugs”
as time permits.)
• Both servers can be FAST, but require some
special configuration.
18. W W W . R E F R A C T I O N S . N E T
The Road to Speed
Mapserver Geoserver
Response time (milliseconds)
1000 1000
800 800
600 600
400 400
200 200
0
0
Start (CGI) Switch to Re-order Output Start Logging Transp. Output JVM Code
FastCGI 'epsg' file format Off styles off format settings change
Data sources with high All will be in Geoserver 1.6
connection overhead will
benefit much more from
FastCGI.
19. W W W . R E F R A C T I O N S . N E T
Performance Tips (Mapserver)
• Beware of
PROJECTION
'init=epsg:4326'
END
The “init=” syntax causes one lookup in the PROJ4 'epsg'
file for every occurrence in the map file. (Move your
most-used EPSG codes to the top of the 'epsg' file.)
• Use FastCGI instead of ordinary CGI. Instruction here:
http://mapserver.gis.umn.edu/docs/howto/fastcgi
• Ensure you have enough FastCGI processes.
20. W W W . R E F R A C T I O N S . N E T
Performance Tips (Geoserver)
• Geoserver has many features enabled by default. Gain
performance by disabling features you don't need.
– Transparent styles double draw time. Use opacity=1
in your SLD to disable.
– Antialiasing linework is costly. Try
'&format_options=antialias:none' to disable.
– Experiment with disabling “PNG native acceleration”
• Favour Java 6 over Java 5 over Java 1.4.
• JVM Settings: Increase heap size. Use -server switch.
• Experiment with different shapefile index depths.
• Turn off logging
21. W W W . R E F R A C T I O N S . N E T
How can the servers improve?
Mapserver Geoserver
• More efficient scanning • Various optimizations to
of shapefile quadtree the renderer.
indexes. [ Bug Reported ] [ Fixes Committed ]
• Batch PROJ calls when • More efficient scanning
doing on-the-fly of shapefile quadtree
reprojection. index. [ Fixes Committed ]
• Reduce number of 'epsg'
lookups on map files.
22. W W W . R E F R A C T I O N S . N E T
Questions? Contact Us.
Brock Anderson:
banders@refractions.net
Justin Deolivera:
jdeolive@openplans.org
23. W W W . R E F R A C T I O N S . N E T
General WMS Performance Tips
• Only fetch from your data source the features that will
be drawn, otherwise the servers have to spend time
scanning and discarding the unused ones.
• Output format affects response time. 256 color PNG is
faster to create than PNG24 on both servers.
• On-the-fly reprojection has a price. Store data in the
same projection it's most commonly requested in.
24. W W W . R E F R A C T I O N S . N E T
Appendix
Breakdown of Mapserver Response Time
The graph represents mapserv running in CGI mode to show all startup costs. Metrics for “Load map file”,
“Connect to DB”, “Fetch & store”, “Draw” and “Write image” were collected by modifying source code to
capture and log durarions of those operations. “Query” time measured with PostgreSQL's explain analyze
command. “Start mapserv process” + “Network delay” = difference between response times recorded by
JMeter and my custom mapserv logging which recorded the total time servicing a request.
PostGIS Shapefile
Start mapserv process 15ms 15ms
Load map file 3ms 3ms
Connect to DB 14ms n/a
Query 20ms n/a
Fetch 7ms n/a
Draw 11ms 28ms
Write image 8ms 8ms
Network delay 3ms 3ms
PostGIS vs Shapefiles
This test uses two different data sets: one with 3,000,000 features, the other with 10,000. Each request
fetches 1000 features by limiting with a 'bbox' WMS parameter. Each bar is an average of 30 samples.
One request at a time (no concurrency). Mapserver and Geoserver use the same data. Mapserver is using
FastCGI via Apache/mod_fcgi. Spatial indices on both data sets. The shapefile indices were generated
with 'shptree'. No reprojection required. Minimal styling. Responses are 2-color PNG images (indexed
color).
The unusual Mapserver result for the case of a 3 million record shapefile has been reported to the
Mapserver bug tracker: http://trac.osgeo.org/mapserver/ticket/2282
25. W W W . R E F R A C T I O N S . N E T
Appendix
Concurrency and Throughput
Notes: Data in PostGIS and shapefile formats. Mapserver and Geoserver use the same data. Mapserver is
using FastCGI via Apache/mod_fcgi. 20 FastCGI mapserv processes. Geoserver uses connection pooling
with 20 connections. Spatial indexes on both data sets. No reprojection required. Minimal styling.
Responses are 2-color PNG images (indexed color). “Concurrent” requests were fired in bursts with zero
ramp up (as near to simultaneously as possible). I.e. For the test of 10 concurrent requests, all ten
requests were fired at the same time. Once all the responses came back then the next burst of requests
went out. Requests use random bboxes which fetch ~1000 features. The same random bboxes are used
against both servers.
Mapserver (Response times) Geoserver (Response times)
PostGIS Shapefile PostGIS Shapefile
1 50 39 1 42 27
2 51 40
Response times are measured in
2 43 30
5 91 75 5 81 47
milliseconds. Throughput times
10 182 147 10 166 103
represent responses per second.
15 269 229 15 261 162 The concurrency level is the left-most
20 315 283 20 378 252 column in each table (1, 2, 5, 10, ...).
40 784 612 40 747 514
60 1269 905 60 1170 773
Mapserver (Throughput times) Geoserver (Throughput times)
PostGIS Shapefile PostGIS Shapefile
1 19.6 24.9 1 24.6 35.6
2 28.2 33.4 2 32.3 41.8
5 35.4 51.6 5 47.1 68.6
10 38.4 53.8 10 49.9 74.1
15 42.5 55 15 49.2 73.3
20 42.4 54.1 20 47.7 68
40 43.2 54.9 40 48.3 68
60 43.1 51.5 60 47.8 70.7
26. W W W . R E F R A C T I O N S . N E T
Appendix
Summary of Geoserver code changes made to improve performance:
• optimized access to the shapefile spatial index (it was reading tiny sections of the file instead of doing
some buffered access)
• figure out the optmimal palette out of the SLD style (when possible, that is, when antialiasing is off)
* don't access the dbf file when not necessary
* avoid unecessary operations, like duplicating over and over the same coordinate[] during rendering
(loading it, generalize, reproject, copy back in the geometry and so on, now the array it's copied just
once)
Raw list of changes here:
http://jira.codehaus.org/secure/ManageLinks.jspa?id=55176
Notas del editor
Brock:
Brock: WMS Performance
Brock: WMS Performance
Justin: -explain the two out of the box behave quite differently - fast cgi vs database connection pooling - tried to make as equal possible ==> GeoSever doing < 1 bit png output (ask andrea about appropriate terminology)
Justin: - Breif overview of WMS for the layman WMS Performance
Brock: WMS Performance
Brock: - random bounding box generation - exact same requests for each server - 10,000 subset of 3 million WMS Performance
Brock: Response time in milliseconds on the Y-axis. Three interesting ways to look at the graph - PostGIS vs. Shapefile - Indexing (large data set vs. small data set) - Mapserver vs. Geoserver - Geoserver had many code changes to improve performance. WMS Performance
Justin: - explain how the concurrency, we chose a “worst case” concurrency model, there are many others WMS Performance
Justin: - suprising results - expected postgis to perform much better under a load - however, read-only access means no locking for shapefiles - Mapserver with fast cgi with 20 processes - GeoServer with connectino pooling at 20 connections WMS Performance
Justin: - same data, different metric, througput - mapserver remains consistent for longer, geoserver starts to degrade around 15 concurrent requests (garbage collector?) Wms Performance
Brock: This test only scratches the surface. At least it gives a sense of how much on-the-fly reprojection affects response time. WMS Performance
Brock: What is FastCGI? Conclusion: use FastCGI unless you have compelling reasons no to. WMS Performance
Brock WMS Performance
WMS Performance
Justin WMS Performance
WMS Performance
Brock does mapserver Justin does GeoServer - out of the box geoserver much slower ==> defautl 24 bit rendering, logging on, default styling has transparency , anti-aliasing on by default