Gabriel Kerneis — Web server benchmark


How to benchmark a webserver

During my Master of Science, I benchmarked a number of web servers. I learned the hard way a few things that must and must not be done, and decided to share them to help whoever might need them.

The results of my experiments are summarized in a technical report.

There is a lot of documentation scattered over the Internet, and you will find a number of (hopefully helpful) links in the rest of this document. The Linux HTTP Benchmarking HOWTO is a good reference to start with.

What do you want to benchmark?

First of all, it is very important to decide what you want to benchmark. Unless you do this very precisely, it is pointless to seek technical solutions.

In my case, I wanted to compare a number of concurrency implementations. Benchmarking web servers was only a means to have a realistic application using those libraries. Keep in mind that, although I benchmarked realistic applications, I did not reproduce realistic load conditions. All I was interested in was to see how well the concurrency was handled, so I wrote very naive web servers, spawning a thread (or forking) on every incoming connection, and used the number of concurrent requests as the sole parameter of my study.
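
To make the "one thread per connection" design concrete, here is a minimal sketch in Python. It is not the code used for the experiments; the names start_server, handle and fetch, the fixed response and the default backlog of 128 are all illustrative assumptions.

```python
import socket
import threading

# Fixed response: the sketch ignores the request and always answers the same thing.
RESPONSE = (
    b"HTTP/1.0 200 OK\r\n"
    b"Content-Type: text/plain\r\n"
    b"Content-Length: 6\r\n"
    b"\r\n"
    b"hello\n"
)


def handle(conn):
    """Serve one connection, then close it."""
    with conn:
        conn.recv(4096)          # read (and ignore) the request
        conn.sendall(RESPONSE)


def serve(listener):
    """Accept loop: spawn one thread per incoming connection."""
    while True:
        try:
            conn, _addr = listener.accept()
        except OSError:          # the listener was closed
            return
        threading.Thread(target=handle, args=(conn,), daemon=True).start()


def start_server(host="127.0.0.1", port=0, backlog=128):
    """Start the naive server in a background thread; port 0 picks a free port."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    listener.bind((host, port))
    listener.listen(backlog)
    threading.Thread(target=serve, args=(listener,), daemon=True).start()
    return listener


def fetch(port, host="127.0.0.1"):
    """Tiny test client: send one request and return the raw response bytes."""
    chunks = []
    with socket.create_connection((host, port), timeout=5) as client:
        client.sendall(b"GET / HTTP/1.0\r\n\r\n")
        while True:
            chunk = client.recv(4096)
            if not chunk:
                break
            chunks.append(chunk)
    return b"".join(chunks)


if __name__ == "__main__":
    listener = start_server()
    print(fetch(listener.getsockname()[1]).decode())
    listener.close()
```

With a server this simple, the only knob left on the benchmark side is how many connections the client keeps open at once, which is the parameter studied here.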

If you want to know whether a given web server is efficient, this is most certainly not what you want to do. A realistic load implies some number of requests per second, with potential bursts, not a constant number of concurrent connections. But you should consider reading the rest of this document anyway, since some tricks are common to both situations.

You also need to decide whether you want to benchmark static or dynamic content.

The right tool for the right task

If you want to produce a simple benchmark:

If you want a realistic load:

  • httperf potentially combined with autobench
  • Tsung (should be useful if you need patterns, for instance for dynamic content)

Tuning your server

Some tips I wish I had known when I started.

Here is a checklist of things easily forgotten:

  • CPU: avoid power-saving mode, for instance with cpufreq-set -g performance.
  • File descriptors: raise the limit to (at least) the number of concurrent connections you wish to handle, using ulimit -n in your shell, or setrlimit(RLIMIT_NOFILE) in your server. Beware: some systems forbid you to raise the limit, and you might need to investigate a bit to find out how to unlock it.
  • Disable the logs of your server (you do not want to lose time logging thousands of requests instead of answering them).
  • Raise /proc/sys/net/core/somaxconn to the number of concurrent connections you want to handle. To understand why this is necessary, read the technical report or the excellent paper Measuring the Capacity of a Web Server (Banga and Druschel, Usenix 97). More on the fascinating topic of the accept() queue can be found in accept()able Strategies for Improving Web Server Performance (Brecht et al., Usenix 04) and in the libev manual. You should really read these references: this is one of the trickiest parts of a web server's behavior under heavy load, very easy to misinterpret (or to get wrong when you write a server). Using http_load is helpful when debugging this part because it reports the effective number of concurrent requests it has been able to perform, unlike Apache Bench, which will happily pretend it managed to reach 1024 concurrent connections when the server is in fact limited to 128.
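
The checklist above can be turned into a small pre-flight check. The following is only a sketch in Python, assuming a Linux machine with the usual /proc and /sys layout; the function names and the target of 1024 connections are made up for the example. Note that raising the limit this way only affects the current process and its children, and that the backlog passed to listen() is silently capped by somaxconn on Linux, which is how a server can end up limited to 128 while the client asks for 1024.

```python
import glob
import resource


def raise_fd_limit(wanted):
    """Raise the soft RLIMIT_NOFILE to `wanted`, capped by the hard limit.

    Returns the soft limit in effect afterwards.
    """
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    unlimited = resource.RLIM_INFINITY
    if soft == unlimited or soft >= wanted:
        return soft              # already high enough, nothing to do
    target = wanted if hard == unlimited else min(wanted, hard)
    resource.setrlimit(resource.RLIMIT_NOFILE, (target, hard))
    return resource.getrlimit(resource.RLIMIT_NOFILE)[0]


def read_somaxconn(path="/proc/sys/net/core/somaxconn"):
    """Return the kernel cap on the accept queue, or None if it cannot be read."""
    try:
        with open(path) as f:
            return int(f.read().strip())
    except (OSError, ValueError):
        return None


def effective_backlog(requested, somaxconn):
    """Backlog listen() really gets on Linux: the request capped by somaxconn."""
    return min(requested, somaxconn)


def cpu_governors(
    pattern="/sys/devices/system/cpu/cpu[0-9]*/cpufreq/scaling_governor",
):
    """Return the set of cpufreq governors in use (empty if cpufreq is absent)."""
    governors = set()
    for path in glob.glob(pattern):
        try:
            with open(path) as f:
                governors.add(f.read().strip())
        except OSError:
            pass
    return governors


if __name__ == "__main__":
    wanted = 1024  # hypothetical number of concurrent connections
    print("fd soft limit:", raise_fd_limit(wanted))
    cap = read_somaxconn()
    if cap is not None:
        print("effective accept backlog:", effective_backlog(wanted, cap))
    print("cpu governors:", sorted(cpu_governors()) or "unknown")
```

Running it on both the server and the client machines before each benchmark is a cheap way to catch a forgotten setting.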

You also need to tune the client: the same advice applies. Do not forget to use a client faster than your server (or to use several clients simultaneously) and to link them through a dedicated switch to ensure the bottleneck does not lie in the network.

Collecting and processing the data

It very much depends on the tools you use. I used the -g option of Apache Bench to get a gnuplot file, which I analyzed with R. I made some scripts and graphics available; they might help you get started. If you are totally lost with R, you might find Vincent Zoonekynd’s site useful (I learned almost everything I know about R there). Since processing a lot of data with R can take a long time (especially when you do things as naively as I did), there is a first step where I process all the information I need and dump it in .Rdata files, and a second step where I use those files to plot the results (that way, you can tweak the graphs without recomputing everything).
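
The same two-step idea (parse once, cache, plot later) can be sketched outside of R. The Python example below is not the scripts mentioned above; it assumes the ab -g file is tab-separated with the header starttime, seconds, ctime, dtime, ttime, wait, where seconds is an epoch timestamp and the other numeric columns are in milliseconds. The function names, the JSON cache format and the sample rows are made up for illustration.

```python
import csv
import io
import json

# Numeric columns kept from the ab -g output (assumed layout, see above).
FIELDS = ("seconds", "ctime", "dtime", "ttime", "wait")


def parse_ab_gnuplot(text):
    """Step 1a: turn the tab-separated ab -g output into a list of dicts."""
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    return [{name: int(rec[name]) for name in FIELDS} for rec in reader]


def save_cache(rows, path):
    """Step 1b: dump the parsed rows so that plotting never re-parses raw data."""
    with open(path, "w") as f:
        json.dump(rows, f)


def load_cache(path):
    """Step 2: reload the cached rows before plotting or computing statistics."""
    with open(path) as f:
        return json.load(f)


# Made-up sample in the assumed format, only used to exercise the parser.
SAMPLE = (
    "starttime\tseconds\tctime\tdtime\tttime\twait\n"
    "Mon Jan 05 10:00:00 2009\t1231149600\t0\t3\t3\t2\n"
    "Mon Jan 05 10:00:01 2009\t1231149601\t1\t5\t6\t4\n"
)

if __name__ == "__main__":
    rows = parse_ab_gnuplot(SAMPLE)
    print(len(rows), "requests, slowest total time:",
          max(r["ttime"] for r in rows), "ms")
```

In practice you would call save_cache once per benchmark run and have the plotting script start from load_cache, so that tweaking a graph costs seconds rather than a full re-parse.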

The first and last n requests (with n = your concurrency level) are often not significant and might be safely discarded. See the technical report and the graphs for more details.
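
Discarding those edge requests is a one-liner once the rows are in chronological order. Here is a hedged Python sketch; trim_edges and the "seconds" key are illustrative names, and the rows are sorted first because the output of your tool is not necessarily ordered by start time.

```python
def trim_edges(rows, concurrency, key="seconds"):
    """Drop the first and last `concurrency` requests, by start time.

    Returns an empty list when there are too few requests to keep any.
    """
    ordered = sorted(rows, key=lambda row: row[key])
    if len(ordered) <= 2 * concurrency:
        return []
    return ordered[concurrency:len(ordered) - concurrency]


if __name__ == "__main__":
    # Ten made-up requests, deliberately out of order, with concurrency 2.
    rows = [{"seconds": s, "ttime": 10 + s} for s in (3, 0, 9, 1, 7, 2, 8, 4, 6, 5)]
    kept = trim_edges(rows, concurrency=2)
    print([row["seconds"] for row in kept])
```

With ten requests and a concurrency level of 2, the six requests in the middle are kept.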

Contributed tips

Dariusz Panasiuk suggests:

In my tests I have used "wget" with "-p" option to fetch all images. I
started multiple wgets from if loop and used my firefox browsing history.
Curl is better to report time, but I can't find how to tell it to fetch all
page, and not only index.html:

curl -s -w "%{time_total}\n"  -o /dev/null -m 5 --url "http://www.bbc.co.uk" \
    -x proxy.local:8080

For more advanced tests I have used Polymix
http://www.web-polygraph.org/docs/workloads/polymix-3/