The most important rules:
The following chapters contain more information for writing code for MATSim.
As the project is growing, we need a minimal set of guidelines to ensure the stability of the MATSim project and its further development. I try to keep this list as short as possible.
if statements and other stuff. A notable exception are line lengths (we have no problem with lines up to 132 characters).
org.matsim.* do not reference other classes outside of the org.matsim.*-package except for classes provided by libraries in the libs directory. Especially, org.matsim.*-classes must not reference playground-classes. Also playground classes must not depend on code within src/test/java.
mvn compile test-compile from the command line within the project top-level directory.
/* *********************************************************************** *
* project: org.matsim.*
* ${file_name}
* *
* *********************************************************************** *
* *
* copyright : (C) ${year} by the members listed in the COPYING, *
* LICENSE and WARRANTY file. *
* email : info at matsim dot org *
* *
* *********************************************************************** *
* *
* This program is free software; you can redistribute it and/or modify *
* it under the terms of the GNU General Public License as published by *
* the Free Software Foundation; either version 2 of the License, or *
* (at your option) any later version. *
* See also COPYING, LICENSE and WARRANTY file *
* *
* *********************************************************************** */
${filecomment}
${package_declaration}
${typecomment}
${type_declaration}
examples, which is maintained by Michael Balmer and Marcel Rieser – please contact them if you want to add your own examples to this directory.
getTravTime() >> getTravelTime()
getDist() >> getDistance()
Id is an abbreviation for Identifier, the 'd' is thus usually a lowercase letter.
create*(), e.g. createLink(). newLink()
Abstract, e.g. AbstractPersonAlgorithm.
I like SomeInterfaceIImpl, if no more specific class-name is suitable, e.g. PlanImpl implements Plan.
org.matsim.* unless you are the maintainer of a package/module in org.matsim.*, then you can also commit to this package. Only a small group of persons has the right to commit code to org.matsim.core.*, if you're not one of them, do not commit code in there. Talk to one of the core committers if you need changes or want to contribute code into the core.
org.matsim.*, run first the test cases before committing and make sure there are no test failures.
test/input/*. See the detailed discussion of this topic.
KDE has a nice list of additional recommendations for committing code to a repository. The list seems mostly reasonable also for our project, but we don't want to regulate too much, but hope for people's sanity when committing code.
Some questions have come up about committing data to the MATSim project on SourceForge. To clarify what kind of data may be committed and where to put it, here is a short guideline:
src" and its subdirectories. Do not commit any data files under "src"!
config.xml" files are allowed under "src/playground/<username>".
test/src", you are allowed to add required data to "test/input" or "test/scenario". Then please observe the following constraints:
test/scenario".
test/input" or "test/scenario". Note that bigger files may be compressed with gzip since MATSim supports reading and writing gzipped files with org.matsim.utils.io.IOUtils.getBufferedReader() and IOUtils.getBufferedWriter() respectively.
test/input" or "test/scenario". Remember: SourceForge is public to everyone!
So, to make it short: You are not allowed to add any data to the SourceForge repository except:
config.xml-files in your playground
test/input/<name of your test>" or "test/scenario/…".
Our main reason to have tests is to ensure we do not lose existing functionality when adding new functionality. Another reason, hopefully getting less important, is to figure out what influence bugs had, or what is influenced by a bugfix.
The following list is not complete and likely never will be. It is meant to give you an idea of what could be tested:
The tests are automatically run every night. When you log in on matsim.org, you'll find a link in the left navigation area ("developer info") where you can look at the nightly status of the tests. If errors or failures occur, an email is sent to Marcel Rieser who will then inform people as necessary.
To run all tests, create a new Run-Configuration. Select to run all Tests in src/test/java with a JUnit 4 runner.

In the Arguments tab, do not forget to increase the available memory for the JVM (-Xmx600m), then click "Run".
export MAVEN_OPTS=-Xmx600m
mvn test
Maven should output a summary, listing the number of all tests as well as the number of failed tests (if any).
(The following list has been updated for JUnit 4.7. Many existing tests written for JUnit 3.8 still follow different guidelines.)
MyModuleTest.
@Test and name your methods that contain your tests with a starting "test", e.g.: @Test public void testGetParameter();
assert keyword from the Java language to check conditions, but use the methods provided by JUnit: Assert.assertEquals(x, y); Assert.assertTrue(x); Assert.assertNull(x); Assert.assertNotNull(y);
Assert.assertEquals("different events files.", checksum1, checksum2);
Assert.assertNotNull("did not find person 1.", population.getPersons().get(1));
double d1, d2;
Assert.assertEquals(d1, d2); // this matches assertEquals(Object, Object); likely not what you wanted...
Assert.assertEquals(d1, d2, MatsimTestUtils.EPSILON); // this is the right way to test for double values
@Rule MatsimTestRule, see Mini-Introduction to JUnit 4.7.
Config loadConfig(String filename);
getOutputDirectory(); getInputDirectory(); getClassInputDirectory(); getPackageInputDirectory();
EPSILON for comparing double values (see above).
MatsimTestUtils.loadConfig().
MatsimTestUtils.getOutputDirectory().
test/input/<package>/<class>/<test>/. Example:
test/input/org/matsim/mymodule/MyModuleUtils/testUtilMethod/config.xml
is a configuration file for the test org.matsim.mymodule.MyModuleUtils.testUtilMethod().
JUnit is a framework for writing tests in Java. We currently use JUnit 4.7 for our tests, but many older tests are still written as JUnit 3.8. Thus, the following information is retained only for better understanding of existing tests.
JUnit 3.8 uses three main concepts:
A small example of a test and testcase:
public class MyTests extends TestCase {
protected void setUp() throws Exception {
super.setUp();
// your code here...
}
protected void tearDown() throws Exception {
super.tearDown();
// your code here...
}
public final void testOne() {
// your code here...
assertEquals(expected, actual);
}
public final void testTwo() {
// your code here...
assertNotNull(someObject);
}
}
This code defines two tests (testOne, testTwo). Additionally, there are two methods setUp() and tearDown(), which are automatically called by JUnit when executing the TestCase. Make sure to call Gbl.reset() in tearDown() when you use the class Gbl or Config within your tests, so that the following TestCases are not influenced by your TestCase.
JUnit offers many different assert-statements which should be used to verify the results of your tests. Do not use the standard assert() offered by Java, as it must be explicitly enabled to be executed!
When writing Tests for MATSim, do not extend the class TestCase, but extend MatsimTestCase, as that one offers some convenient methods related to MATSim (see Guidelines for more details).
JUnit is a framework for writing tests in Java. We currently use JUnit 4.7 for our tests, which still supports the older JUnit 3.8 syntax.
A simple example of a testcase:
import org.junit.Assert;
import org.junit.Ignore;
import org.junit.Test;
public class MyTests {
@Test
public final void testOne() {
// your code here...
Assert.assertEquals(expectedValue, actualValue);
}
@Test
public final void testTwo() {
// your code here...
Assert.assertNotNull(someObject);
}
@Test @Ignore("not yet fully implemented")
public final void testThree() {
// your code here...
// TODO complete test
}
}
This code defines three tests (testOne, testTwo, testThree). JUnit offers many different assert-statements which should be used to verify the results of your tests. Do not use the standard assert() offered by Java, as it must be explicitly enabled to be evaluated! If a test is not yet fully implemented but you still want to commit the code, add the annotation @Ignore to the test. In that case, the test will not be executed. Note that your code must still compile when committing ignored tests.
If you need to read in or write out files to disk, use MatsimTestUtils as a rule:
import org.junit.Assert;
import org.junit.Rule;
import org.junit.Test;
import org.matsim.testcases.MatsimTestUtils;
public class MyTests {
@Rule public MatsimTestUtils utils = new MatsimTestUtils();
@Test
public final void testOne() {
String inputFile = utils.getInputDirectory() + "myFile.txt";
// your code here...
String outputFile = utils.getOutputDirectory() + "myFile.txt";
Assert.assertEquals(expectedValue, actualValue);
}
}
Please note that many methods of MatsimTestUtils can only be used in the actual test (marked with @Test), but not in a constructor or other initialization methods (e.g. in @Before).
Maven maintains an explicit dependency graph that records which code needs which other code in order to compile. If your code requires code from other developers, either from a different playground or in the form of an external Java library, this dependency must be manually registered in order to keep the code compiling.
Imagine you want to write code that relies on classes in someone else's playground. To do this, you have to specify this dependency on the playground of the other person:


First, make sure the library has a license that is compatible with the GPL that MATSim uses. Gnu.org has an extensive list of compatible and non-compatible licenses. If the license is not compatible, do not use the library. Also, do not use SNAPSHOT-versions of libraries as a dependency. Only use stable versions.
Note: The following text will always mention "playground", but the same steps can also be used to create a new contrib-project inside "contribs".
In the playgrounds-project, there exists a directory _template which can be used as a starting point. This template already contains the directory structure and svn meta data for a typical playground. It is best to copy the template using svn commands. That is:
cd /path/to/workspace/playgrounds/
svn copy _template nameOfNewPlayground
nameOfNewPlayground/pom.xml, change the Artifact Id and the project name from "_template" to the name of the new playground.
playgrounds/pom.xml
_template directory in Eclipse or in the Explorer! In every case except "svn copy", the metadata will not be correctly updated, resulting in a corrupt svn checkout!
Contributions to the core of MATSim (packages org.matsim.*) require high quality and stability. Thus, it is usually not desired to develop new concepts directly in an org.matsim-package, even if the code should later be located there. Instead, contributions to the core should go the following way:
org.matsim.core.api.experimental or another appropriate package.
org.matsim.api (or other package) on the request of the MATSim Committee.
Every public class should have the following items:
All these items should be written in a Javadoc block atop of the class.
MATSim is developed by several persons on different platforms. So we need a basic set of settings to work successfully together.
In the Eclipse Preferences, please set the Text File Encoding to "UTF-8", and the New Text File Line Delimiter to "Unix" (in General > Workspace).

Exceptions are an important concept of Java, as they allow signalling special conditions that need special handling. Java differentiates between two types of exceptions: checked and unchecked. Checked exceptions need to be declared by a method, and code calling it needs to handle the possible exceptions, for example with a try-catch block:
public void readFile(final String filename) throws IOException {
// IOException is a checked exception, it needs to be declared as "throws"
if (!new File(filename).exists()) {
throw new IOException("File not found!");
}
// continue with normal code if file exists
}
public void doCalculation(final int a, final int b) {
if (b == 0) {
// a RuntimeException is an unchecked exception, it does not have to be declared
throw new RuntimeException("b cannot be zero!");
}
}
public void someMethod() {
try {
readFile("foo.bar");
// a checked exception must be handled, either with try-catch or by declaring a "throws" on this method
} catch (IOException e) {
// handle exception
}
doCalculation(3, 0);
}
Java itself makes use of checked exceptions in many places, most notably in many I/O related methods (IOException). As many programmers dislike handling checked exceptions correctly (either handling them or declaring them and moving the handling further up the caller chain), one can often observe code like the following:
try {
readFile("foo.bar");
} catch (IOException e) {
e.printStackTrace(); // DO NOT DO IT LIKE THIS!
}
Do not do it like this, as your code will then continue to run as if there was no problem, but you may be missing data in your code, or data was not written out, etc. So you'll spend a lot of time wondering why your application did not do what it should have done.
Better wrap the checked exception in an unchecked exception and re-throw it:
try {
readFile("foo.bar");
} catch (IOException e) {
throw new RuntimeException("Could not read the file foo.bar.", e); // This is better
}
In addition, you can still declare the RuntimeException, such that other people using your method know that there could be an exception and optionally handle it:
/**
 * Does some calculation with a and b.
 *
 * @throws RuntimeException if b is zero
 */
public void doCalculation(final int a, final int b) throws RuntimeException {
if (b == 0) {
throw new RuntimeException("b cannot be zero!");
}
// continue calculation
}
Optionally, you could even create your own (unchecked) Exception:
public class MyCustomException extends RuntimeException {
public MyCustomException(final String message) {
super(message);
}
}
Most of MATSim-T is written in Java and requires Java 1.5 to run. While Java is widely known, we stumble from time to time over certain features or specialities we'd like to highlight. So this is the place to collect interesting and informative stuff about Java which might (or might not) have some relation to our code.
Note: I consider the following as a rather advanced topic / optimization. I stumbled upon it while testing a new Java-profiler and thought I document it, 'cause others may be interested in it. – Cheers, Marcel
While the Collections in Java are really nice and useful, they are not always the best solution performance-wise. In a recent case, a TreeMap<Integer, Integer> was used for travel time lookups. Each link in the network had such a map, storing a (departure) time and the corresponding travel time at that departure time. When searching for routes in the network, this map was accessed really a lot of times. As the queried time was not necessarily a key in the map, often the first entry of a tailmap (a map containing all objects whose key is equal or larger than a certain value) was requested to get the travel time at the next possible departure time.
When testing a profiler, I realized that Integer.intValue() was called a crazy number of times, using a lot of time in total, even if I had this function-call nowhere explicitly in my method. Well, to find the correct entry in the map, the Integer-keys had to be compared, which was done by accessing their basic type values. That explained the many calls to this function, which made me think of how to implement the travel time lookup without using Integer-objects, but using the basic type int.
Replacing the map (sorted by key) with two int-arrays (int[]), one containing the keys, the other the corresponding values, made it possible to use a binary search on the key-array (Arrays.binarySearch()) to find the correct array-index and thus to easily access the corresponding travel time in the value-array. With this change, no Integers were used, thus no (un-)boxing had to be done, and the accessed memory in the array should not be as scattered as the single map-entry objects, leading to faster access. Finally the code ran in 25% less time!
So, what did we learn from this?
I do not say that we should no longer use Collections, but in certain places where they are accessed really a lot of times, it may help to think a bit further and maybe try some other data structure for an additional speed-up. A profiler (such as YourKit) may clearly help to find such places.
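The array-based lookup described above could look roughly like the following sketch. The class name TravelTimeTable, its fields and the return value -1 for "no entry" are assumptions made for illustration, not the actual MATSim code; only Arrays.binarySearch() is taken from the text.

```java
import java.util.Arrays;

/** Hypothetical helper: travel time lookup backed by two parallel int arrays. */
final class TravelTimeTable {
    private final int[] departureTimes; // keys, sorted ascending, no duplicates
    private final int[] travelTimes;    // travelTimes[i] belongs to departureTimes[i]

    TravelTimeTable(final int[] departureTimes, final int[] travelTimes) {
        if (departureTimes.length != travelTimes.length) {
            throw new IllegalArgumentException("arrays must have the same length");
        }
        this.departureTimes = departureTimes.clone();
        this.travelTimes = travelTimes.clone();
    }

    /**
     * Returns the travel time stored for the first departure time >= time,
     * i.e. the equivalent of the first entry of a tail map,
     * or -1 (an assumed marker value) if there is no such entry.
     */
    int getTravelTime(final int time) {
        int index = Arrays.binarySearch(this.departureTimes, time);
        if (index < 0) {
            // key not found: binarySearch returns (-(insertion point) - 1)
            index = -index - 1;
        }
        if (index >= this.departureTimes.length) {
            return -1;
        }
        return this.travelTimes[index];
    }
}
```

No Integer objects are created during the lookup, so no boxing or unboxing takes place.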
Memory usage is a critical point in multi agent simulations, just because there usually are so many of them. Thus, every single byte that is used to describe a single agent enlarges the memory consumption of the whole system massively.
Our agents have several attributes (like sex, car-availability, employment status), their plans contain activities with a type and legs with a mode. Each of these attributes is stored as a String. Considering that even an empty String takes up to 40 bytes in Java and characters in Java are always 16bit, even a small string like "f" (for Person.sex) or "car" (for Leg.mode) takes a lot of memory, and that's for every single agent, leg and activity!
It is quite obvious that it does not make sense to have more than one string with the same content multiple times, especially when every instance uses so much memory. This is where object pools are often used: a so called pool stores commonly used objects exactly once, and other objects can refer to these objects instead of holding their own, identical, instances. In our case this means that instead of having separate instances of String for every "f", "car" or other attribute, each occurring value exists only once as a String-object, and all the agents, legs and activities only reference the corresponding pool object instead of storing their own instance. Having a population of 200k agents with 5 attributes would contain 1mio strings—that would be more than 40MB of RAM alone, not yet counting the memory used by plans, leg-modes, activity-types. But when using our "object pool" we only need about 10 or 15 different Strings (depending on the number of different values in the attributes), instead of 1mio!
The class String offers already such an internal object pool, so that this optimization can be used without much additional code:
String.intern();
The documentation to String.intern() reads as follows:
Returns a canonical representation for the string object.
A pool of strings, initially empty, is maintained privately by the class String.
When the intern method is invoked, if the pool already contains a string equal to this String object as determined by the equals(Object) method, then the string from the pool is returned. Otherwise, this String object is added to the pool and a reference to this String object is returned.
So, the setters for such string-attributes can now be written like the following one:
class Act {
String type = null;
...
public void setType(String type) {
this.type = type.intern();
}
}
Using this simple trick, the memory consumption was reduced by over 25% when reading in 160k agents, while the time used for reading the agents did not change significantly (50secs vs 49secs). Reading larger populations should result in even larger memory savings, as most likely no additional strings have to be added to the pool, and all attributes can just reference to them.
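The effect of the pool can be observed with a small stand-alone sketch. The class InternDemo and its method are made up for this illustration; only String.intern() itself comes from the text above.

```java
/** Hypothetical demo class, not part of MATSim. */
final class InternDemo {
    /** Returns true if both references point to the very same String object. */
    static boolean sameInstance(final String a, final String b) {
        return a == b; // identity comparison on purpose, not equals()
    }

    public static void main(final String[] args) {
        // new String(...) forces two distinct objects with equal content
        String mode1 = new String("car");
        String mode2 = new String("car");
        System.out.println(sameInstance(mode1, mode2));                   // false
        System.out.println(sameInstance(mode1.intern(), mode2.intern())); // true
    }
}
```

After interning, both attributes refer to one shared object, so only one copy of the characters is kept in memory.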
Using Java 1.6.0 can cause OutOfMemoryErrors while parsing huge XML files. For details see: bugs.sun.com/bugdatabase/view_bug.do?bug_id=6536111
HashSet/HashMap do not specify in which order elements are iterated. Testing such structures is very difficult (sorting is always necessary for comparison). It gets even trickier if tests fail randomly, as debugging will then be very hard. Use LinkedHashSet/LinkedHashMap instead. They are iterated in insertion order.
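A minimal sketch of the difference, using a made-up class IterationOrderDemo: a LinkedHashMap returns its keys in the order they were inserted, so a test can compare against a fixed expected list without sorting.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical demo class, not part of MATSim. */
final class IterationOrderDemo {
    /** Returns the keys in the order in which the given map iterates over them. */
    static List<String> keysInIterationOrder(final Map<String, Integer> map) {
        return new ArrayList<String>(map.keySet());
    }

    public static void main(final String[] args) {
        Map<String, Integer> activities = new LinkedHashMap<String, Integer>();
        activities.put("work", 1);
        activities.put("home", 2);
        activities.put("leisure", 3);
        // insertion order is preserved
        System.out.println(keysInIterationOrder(activities)); // [work, home, leisure]
    }
}
```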
Our simulations depend heavily on random numbers. But despite using random numbers, we still want the simulations to be deterministic, so that results can be reproduced by running the same scenario a second time.
Usually, random numbers in Java are generated by calling Math.random():
double d = Math.random();
While this method returns a random number, its generator is seeded differently on each run, so a second run of the same scenario would draw other random numbers and the simulation would no longer be deterministic. To overcome this problem, an instance of java.util.Random can be generated and initialized with a custom random seed:
Random random = new Random();
random.setSeed(1234);
...
double d = random.nextDouble();
In this case, every time the code is run, we get the same random numbers. As drawing random numbers is widely used in the code, MATSim-T offers a global instance of Random, which is automatically initialized with the seed specified in the configuration file:
import org.matsim.core.gbl.MatsimRandom;
...
double d = MatsimRandom.random.nextDouble();
What value should be used as random seed? Is a value of 1 better than the value 86294? As both numbers have the same probability to be chosen in the range from 0 to Integer.MAX_VALUE, any value is equally good for a random seed.
But this is only half of the story. We all know, that these random numbers are only pseudo-random—and they depend on the chosen seed!
PlanAlgorithms could be executed in parallel in multiple threads, e.g. during replanning. As the exact order of execution with multiple threads is not deterministic, the usage of MatsimRandom.random would lead to non-deterministic results. Instead, every instance of a PlanAlgorithm should have its own random number generator. The best way to achieve this is to give PlanAlgorithms that use random numbers a constructor that accepts an object of type java.util.Random. When instantiating PlanAlgorithms, one can use MatsimRandom.getLocalInstance() to obtain a Random-object that can be passed to the PlanAlgorithm. A Random object received by getLocalInstance() is already correctly initialized to return useful random numbers (see below, Problems when setting the random seed).
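The constructor pattern can be sketched as follows. RandomChoiceAlgorithm is a hypothetical stand-in for a PlanAlgorithm and uses only java.util.Random, so that the example runs without MATSim; in MATSim code, the Random passed in would come from MatsimRandom.getLocalInstance() as described above.

```java
import java.util.Random;

/** Hypothetical stand-in for a PlanAlgorithm that needs random numbers. */
final class RandomChoiceAlgorithm {
    private final Random random; // owned by this instance, never shared between threads

    RandomChoiceAlgorithm(final Random random) {
        this.random = random;
    }

    /** Picks an index in the range [0, numberOfOptions). */
    int chooseOption(final int numberOfOptions) {
        return this.random.nextInt(numberOfOptions);
    }

    public static void main(final String[] args) {
        // each instance gets its own generator; the seed 4711 is arbitrary
        RandomChoiceAlgorithm algo1 = new RandomChoiceAlgorithm(new Random(4711));
        RandomChoiceAlgorithm algo2 = new RandomChoiceAlgorithm(new Random(4711));
        // both print the same value, regardless of which instance is called first
        System.out.println(algo1.chooseOption(10));
        System.out.println(algo2.chooseOption(10));
    }
}
```

Because no generator is shared, the numbers drawn by one instance do not depend on how often or in which order other instances draw theirs.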
In our code, we set the random seed at the start of every iteration, so that we can restart a simulation at any iteration. The code was similar to the example code below:
int baseSeed; // the random seed specified in the configuration file
...
for (int iteration = 0; iteration < 1000; iteration++) {
MatsimRandom.random.setSeed(baseSeed + iteration);
...
}
After running several iterations we realized that the first agent was never chosen for re-planning (remember, they get "randomly" chosen for re-planning). A little bit of research revealed, that the first random number drawn after setting the random seed depends heavily on the random seed! Only a slight change in the random seed (in our case always +1 for each iteration) resulted in only a slight change in the value of the random number. The following figure shows the distribution of the first and second drawn random number after setting different random seeds. As can be clearly seen, the first drawn number only moves in a very small range. The second drawn numbers have a better distribution when the seed is only changed a little.
To overcome this problem, we decided that after setting a random seed, we draw one random number and immediately throw it away, as it does not seem random enough.
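The effect can be reproduced with plain java.util.Random. The following sketch uses a made-up class SeedDemo; it measures how far the first used random number varies over 100 consecutive seeds, once without and once with discarding the first draw. The seed range 1 to 100 is an arbitrary assumption.

```java
import java.util.Random;

/** Hypothetical demo class, not part of MATSim. */
final class SeedDemo {
    /** Seeds a new generator, optionally discards one draw, and returns the next double. */
    static double firstUsedDraw(final long seed, final boolean discardFirst) {
        Random random = new Random(seed);
        if (discardFirst) {
            random.nextDouble(); // thrown away: strongly correlated with the seed
        }
        return random.nextDouble();
    }

    /** Largest minus smallest value over the seeds baseSeed .. baseSeed + count - 1. */
    static double spread(final long baseSeed, final int count, final boolean discardFirst) {
        double min = Double.POSITIVE_INFINITY;
        double max = Double.NEGATIVE_INFINITY;
        for (int i = 0; i < count; i++) {
            double d = firstUsedDraw(baseSeed + i, discardFirst);
            min = Math.min(min, d);
            max = Math.max(max, d);
        }
        return max - min;
    }

    public static void main(final String[] args) {
        System.out.println("spread without discarding: " + spread(1, 100, false));
        System.out.println("spread with discarding:    " + spread(1, 100, true));
    }
}
```

Without discarding, the first numbers for neighbouring seeds stay within a very narrow band, which matches the observation that the same agents were never chosen.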
As a lot of functionality in MATSim is created by PhD students, there is often a problem maintaining this functionality after the respective students finished their work and leave university. In order to better communicate which features are "standard MATSim" which will (and have to) be maintained by the MATSim core developers, and which features are just "single-developer functionality", MATSim introduces the concept of "MATSim core" and "MATSim extensions".
The core will be maintained by the core developers, and should contain central functionality which is likely to stay in MATSim forever. Extensions can provide new, but stable, functionality developed to solve specific problems which can be of interest to others in the MATSim community.
Extensions will—as long as they compile and pass all tests—also be packaged for releases and be thus optional parts of MATSim releases. This requires that extensions follow certain guidelines, also in order to keep code maintenance and user support in reasonable bounds.
There are numerous hints that inheritance is not very stable under refactoring; see, for example, Bloch, "Effective Java".
We therefore suggest preferring composition (=delegation) over inheritance where this is possible. It is only possible when the class that one wants to inherit from implements an interface. In that case, the following is possible (with Eclipse):
1. Write a class skeleton as follows:
class MyClass implements XXXInterface {
private XXXInterface delegate = new XXXImplementation(...) ;
}
2. In eclipse, go to "source"/"generate delegate methods" and follow the instructions.
This will delegate all method calls to MyClass to the delegate. Now you can modify some of the delegate methods as you like.
(This sometimes seems to provide less access than inheritance, but I don't think this is true as long as you assume that internal variables/fields are always private.)
It is called "composition" since you can do this with more than one interface/delegate. The classical example is something like
class MyCar implements HasSteering, HasBrakes, HasGears {
private HasSteering steeringDelegate = new PowerSteering(...) ;
private HasBrakes brakesDelegate = new SimpleBrakes(...) ;
private HasGears gearsDelegate = new ElectronicGears(...) ;
}
where Eclipse's "generate delegate methods" will produce methods such as
public void steerToRight( double value ) {
steeringDelegate.steerToRight( value ) ;
}
...
public void brake( double value ) {
brakesDelegate.brake( value ) ;
}
As one can see, this now allows operations such as
MyCar car ...
...
car.brake( 3. ) ;
car.steerToRight( 0.3 ) ;
that is, the car is now composed of its internals.
In MATSim, we often expose the delegation, that is, the syntax is
car.getBrakes().brake( 3. ) ;
car.getSteering().steerToRight( 0.3 ) ;
This has the advantage that, if you extend the interfaces, you do not need to adapt every implementation. It is, clearly, not an option if you want to write a class (such as a PlanStrategy) that is later inserted into the code – that has to fulfill the contract defined by the interface.
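Both variants, forwarding methods and exposed delegation, can be put together in a small runnable sketch. All names here (DelegatingCar, the getBrakeForce() method, the clamping of negative values) are assumptions chosen for illustration and only loosely follow the HasBrakes example above.

```java
/** Hypothetical interface, loosely following the HasBrakes example. */
interface HasBrakes {
    void brake(double value);
    double getBrakeForce();
}

/** Hypothetical plain implementation that just remembers the last value. */
final class SimpleBrakes implements HasBrakes {
    private double brakeForce = 0.0;

    public void brake(final double value) {
        this.brakeForce = value;
    }

    public double getBrakeForce() {
        return this.brakeForce;
    }
}

/** Composition: implements the interface, but forwards the work to a delegate. */
final class DelegatingCar implements HasBrakes {
    private final HasBrakes brakesDelegate = new SimpleBrakes();

    public void brake(final double value) {
        // a modified delegate method: negative values are clamped before forwarding
        this.brakesDelegate.brake(Math.max(0.0, value));
    }

    public double getBrakeForce() {
        return this.brakesDelegate.getBrakeForce();
    }

    /** Exposed delegation, allowing car.getBrakes().brake(...). */
    HasBrakes getBrakes() {
        return this.brakesDelegate;
    }
}
```

Note that a call through getBrakes() goes straight to the delegate and bypasses any modification made in the forwarding method, which is one thing to keep in mind when exposing the delegate.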
Generics can be quite powerful, but also very cumbersome if used wrongly. Thus, respect the following rules: