Optimization / Benchmarks

A collection of benchmark results and findings regarding code optimization.

MATSim Benchmark

Overview

The performance of MATSim depends on a lot of different factors:

To better understand the circumstances under which MATSim performs best, we created a simple benchmark (performance test) that runs 20 iterations of a sample scenario with different settings. If you run the benchmark on your machine, we would be happy if you could send us your results.

Download and Installation

Download the following zip-file: benchmark.zip [35MB]

Unzip the downloaded file.

Running the benchmark

java -Xmx500m -jar Benchmark.jar

This creates a directory named output containing files from the run. The test usually takes between 25 and 40 minutes. The benchmark requires Java 1.5 or newer and 150 MB of free disk space.

If you want to re-run the benchmark, rename or delete the output directory and run the test again.
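If you would rather keep the results of an earlier run than delete them, the renaming step can be automated. The following is a minimal sketch, not part of the benchmark itself: the class name ArchiveOutput and the timestamp suffix are our own choices, and it assumes Java 7 or newer for the java.nio.file API (the benchmark itself only needs Java 1.5).

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class ArchiveOutput {

    /**
     * Renames an existing output directory to "<name>-<suffix>" so that the
     * benchmark can create a fresh one. Returns the new path, or null if
     * there was no directory to move.
     */
    public static Path archive(Path outputDir, String suffix) throws IOException {
        if (!Files.exists(outputDir)) {
            return null; // first run: nothing to archive
        }
        Path target = outputDir.resolveSibling(outputDir.getFileName() + "-" + suffix);
        return Files.move(outputDir, target);
    }

    public static void main(String[] args) throws IOException {
        // "output" is the directory the benchmark writes to; the suffix is hypothetical.
        String suffix = String.valueOf(System.currentTimeMillis());
        Path moved = archive(Paths.get("output"), suffix);
        System.out.println(moved == null
                ? "No previous output directory found."
                : "Moved previous output to " + moved);
    }
}
```

Run it from the directory that contains Benchmark.jar before starting the benchmark again.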

Submitting benchmark results

Please send an email to benchmark AT matsim DOT org containing:

Benchmark results analysis

We collected some results and did a short analysis of them. Have a look at the results. If you want your results included as well, or have some interesting findings of your own, please submit your benchmark results to us.

MATSim Benchmark Results

The benchmark contains parts of the code running in parallel (the replanning part, using 4 threads) and other parts running single-threaded.
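To illustrate why the two kinds of phases react differently to the number of cores, here is a minimal sketch of a workload with one phase spread over a fixed pool of 4 threads and one phase running on a single thread. It is not MATSim code: the class PhaseSketch, the method names, and the busyWork placeholder are all hypothetical stand-ins, and it assumes Java 8 or newer.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class PhaseSketch {

    /** Placeholder for the work done on one chunk of agents (hypothetical). */
    static long busyWork(int seed) {
        long x = seed;
        for (int i = 0; i < 1_000_000; i++) {
            x = x * 31 + i;
        }
        return x;
    }

    /** Parallel phase: chunks are distributed over a fixed-size thread pool. */
    public static long parallelPhase(int numThreads, int numChunks) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(numThreads);
        try {
            List<Future<Long>> futures = new ArrayList<>();
            for (int i = 0; i < numChunks; i++) {
                final int chunk = i;
                futures.add(pool.submit(() -> busyWork(chunk)));
            }
            long sum = 0;
            for (Future<Long> f : futures) {
                sum += f.get(); // blocks until that chunk has finished
            }
            return sum;
        } finally {
            pool.shutdown();
        }
    }

    /** Single-threaded phase: the same chunks, processed one after another. */
    public static long serialPhase(int numChunks) {
        long sum = 0;
        for (int i = 0; i < numChunks; i++) {
            sum += busyWork(i);
        }
        return sum;
    }

    public static void main(String[] args) throws Exception {
        long t0 = System.nanoTime();
        parallelPhase(4, 64); // 4 threads, as in the replanning part of the benchmark
        long t1 = System.nanoTime();
        serialPhase(64);
        long t2 = System.nanoTime();
        System.out.println("parallel phase: " + (t1 - t0) / 1_000_000 + " ms");
        System.out.println("serial phase:   " + (t2 - t1) / 1_000_000 + " ms");
    }
}
```

Only the parallel phase can benefit from additional cores; the serial phase is bound by the speed of a single core and of the memory system.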

Speed comparison

The benchmark was run on some of our production servers with different versions of Java Virtual Machines. Each server has two single-core "AMD Opteron Processor 248" CPUs running at 2.2 GHz (date of purchase: fall 2004, so they have to be considered "old").
The Java Virtual Machines tested were:

benchmark performance results 1

The numbers shown in the graph above are the average of two runs of the benchmark for each configuration (Yeah, two isn't that big a sample, we know... but it should still be valid to demonstrate some findings).
What can be observed is the huge difference in execution time between the 32-bit and 64-bit versions of the virtual machines. The "compressed object pointers" (COP) feature of Sun's JVM 6u14 seems to compensate for this nicely, making the 64-bit version about as fast as the 32-bit version. The aggressive optimization options (AO), on the other hand, don't seem to influence the performance of MATSim drastically.
While the difference in total execution time between Sun's Java 5 and Java 6 versions seems more or less random, there seems to be a performance improvement for the multithreaded replanning part when changing from Java 5 to Java 6. Interestingly, despite IBM's worse multithreaded performance, it is able to catch up in the single-threaded parts and comes in with a total execution time similar to that of Sun's 64-bit JVMs.
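When comparing results across configurations like these, it helps to record exactly which JVM and which options a run used. The sketch below prints that information using standard system properties and the runtime MXBean. The class name JvmInfo is our own. We assume that the COP and AO settings above correspond to the HotSpot options -XX:+UseCompressedOops and -XX:+AggressiveOpts; the text above does not spell out the exact options that were used.

```java
import java.lang.management.ManagementFactory;
import java.util.List;

public class JvmInfo {

    /** Collects the JVM details that matter when comparing benchmark runs. */
    public static String describe() {
        StringBuilder sb = new StringBuilder();
        sb.append("java.version = ").append(System.getProperty("java.version")).append('\n');
        sb.append("java.vm.name = ").append(System.getProperty("java.vm.name")).append('\n');
        sb.append("java.vm.vendor = ").append(System.getProperty("java.vm.vendor")).append('\n');
        sb.append("os.arch = ").append(System.getProperty("os.arch")).append('\n');
        // "sun.arch.data.model" is specific to Sun/Oracle-derived JVMs ("32" or "64");
        // other JVMs may not set it, hence the fallback value.
        sb.append("sun.arch.data.model = ")
          .append(System.getProperty("sun.arch.data.model", "unknown")).append('\n');
        // Options passed to the JVM itself, e.g. -Xmx500m or -XX:+UseCompressedOops.
        List<String> jvmArgs = ManagementFactory.getRuntimeMXBean().getInputArguments();
        sb.append("JVM arguments = ").append(jvmArgs).append('\n');
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.print(describe());
    }
}
```

Running it with the same java command and options as the benchmark, for example java -Xmx500m -XX:+UseCompressedOops JvmInfo, shows what the benchmark JVM would report.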

The benchmark was also run on other server machines:

benchmark speed results 2

Comparing the AMD servers running at 2.2 GHz and 3.0 GHz, the most obvious difference is the time for the replanning, explainable by the different number of cores the machines have. Interestingly, the remaining execution time didn't really improve by the change in CPU speed, leading to the guess that the performance of the memory controller or the memory bus is limiting the speed of MATSim (both AMD servers seem to have a front side bus of 1000 MHz).

The Intel servers were massively faster than the AMD machines. We do not yet know whether this is due to the different memory controller, the faster memory bus, or Sun's JDK simply being better optimized for Intel processors. Either way, the difference is striking. And each newer generation of Intel processors seems to deliver a real performance improvement, even when running at a lower clock speed.

Finally, the benchmark was also run on some of our laptops: an Apple MacBook Pro with an Intel Core 2 Duo processor clocked at 2.33 GHz (model from fall 2006), running Mac OS X 10.5.7, and an IBM/Lenovo laptop with the same Intel Core 2 Duo processor at 2.33 GHz, running a 64-bit Gentoo Linux (kernel 2.6.28). Both laptops have a front side bus of 667 MHz, so from a technical viewpoint they are very similar.

On the Apple MacBook Pro, the following Java Virtual Machines were used:

On the Lenovo Laptop, an OpenJDK 6u0 64-bit JVM was used.

benchmark speed results 3

 

Surprisingly, Apple managed to make their 64-bit Java 6 a lot faster than the older 32-bit Java 5. Of course, it could also mean that their Java 5 offering is just very slow... The Mac port of OpenJDK 6 ("Soylatte") is even a bit faster, but that is likely due to the difference between 64-bit and 32-bit.

Are you able to run the MATSim benchmark even faster? Please tell us so! We're very interested in your benchmark results.

Memory usage comparison

MATSim periodically writes information about its memory usage to the logfile. Plotting this information gives a jagged line running from left to right. The highs and lows in the plot are explained by the Java garbage collector, which only frees memory from time to time. Still, one can estimate the absolute minimum of memory required by MATSim by looking at the lower parts of the curve (these are the points right after a garbage collection has run, showing the memory that could not be collected).
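The idea of reading the lower parts of the curve can be sketched in code. The example below is not how MATSim produces its log; it only shows one way to sample used heap with the standard Runtime API and to pick out the low points of a sawtooth-shaped series. The class MemoryFloor, its method names, and the sample values in main are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

public class MemoryFloor {

    /** Heap currently in use by this JVM, in bytes. */
    public static long usedHeapBytes() {
        Runtime rt = Runtime.getRuntime();
        return rt.totalMemory() - rt.freeMemory();
    }

    /**
     * Returns the samples that sit at the bottom of a drop: lower than the
     * previous sample and not higher than the next one. In a memory plot these
     * are the points where a garbage collection has most likely just run.
     */
    public static List<Long> troughs(long[] usedBytes) {
        List<Long> result = new ArrayList<>();
        for (int i = 1; i < usedBytes.length; i++) {
            boolean dropped = usedBytes[i] < usedBytes[i - 1];
            boolean risesNext = i + 1 == usedBytes.length || usedBytes[i + 1] >= usedBytes[i];
            if (dropped && risesNext) {
                result.add(usedBytes[i]);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Made-up series (in MB) with the typical sawtooth shape, for illustration only.
        long[] samples = {100, 180, 250, 120, 200, 260, 110, 190};
        System.out.println("low points: " + troughs(samples));
        System.out.println("used heap right now: " + usedHeapBytes() / (1024 * 1024) + " MB");
    }
}
```

Not every collection frees everything that is unreachable, so these low points are an upper estimate of the memory actually still in use at that moment.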

Comparing the memory consumption in Sun's (currently) latest Java VM version holds no real surprises.

benchmark memory results

It can be clearly seen that the 64-bit JVM uses the largest amount of memory, because each object pointer takes up 8 bytes. The 32-bit JVM uses the least memory. The 64-bit JVM with compressed object pointers lies somewhere in between. I would have expected it to be comparable to the 32-bit JVM, but it still uses a bit more memory for unknown reasons. In any case, it is handy to know that larger scenarios can now be loaded on a machine with 32 GB of memory or less. The memory savings compared to a 64-bit JVM without compressed object pointers should grow with the size of the scenario and with the level of detail of the simulated network, so this feature looks really promising.
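The effect of the pointer size alone can be estimated with simple arithmetic. The sketch below multiplies a number of object references by 8 bytes (64-bit without compression) and by 4 bytes (32-bit, or 64-bit with compressed object pointers). The class name and the reference count are hypothetical, and the calculation deliberately ignores object headers and alignment, so it covers only part of the difference seen in the plot.

```java
public class ReferenceFootprint {

    /** Memory taken by the references themselves, in bytes. */
    public static long referenceBytes(long numReferences, int bytesPerReference) {
        return numReferences * bytesPerReference;
    }

    public static void main(String[] args) {
        long numReferences = 50_000_000L; // hypothetical count, not measured from MATSim
        long wide = referenceBytes(numReferences, 8);   // plain 64-bit pointers
        long narrow = referenceBytes(numReferences, 4); // 32-bit or compressed pointers
        System.out.println("8-byte references: " + wide / (1024 * 1024) + " MB");
        System.out.println("4-byte references: " + narrow / (1024 * 1024) + " MB");
        System.out.println("difference:        " + (wide - narrow) / (1024 * 1024) + " MB");
    }
}
```

Because the difference is proportional to the number of references, it grows with the size of the scenario.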
 

Benchmark: Parallel Events Handling and JDEQSim

The following shows two different benchmarks using JDEQSim, parallel events handling, or both.

The first benchmark uses the ivtch-osm network (~60k links), while the second one uses a navteq network with many more links.

QueueSim vs. JDEQSim; parallel Events Handling

Zrh 10%, ivtch-osm network; computing times per iteration (computer = cluster4 = 2x "Dual-Core AMD Opteron Processor 2222", 3.0 GHz, 1000 MHz FSB).  Runs 669, 676, 678, 679

QueueSim vs. JDEQSim; parallel Events Handling

Computing times per iteration.  Scenario = navteq network of Switzerland; computer = cluster4 = servers with 2x "Dual-Core AMD Opteron Processor 2222", 3.0 GHz, 1000 MHz FSB.

How To Speed Up MATSim Runs

The following list gives some hints on how to speed up MATSim simulation runs.

