JavaDoc and the Art of Speccing

by Arnout J. Kuiper

Introduction

I've worked on a lot of Java projects, and each time I see the same problems occur, especially on the larger projects with larger teams. One of these problems I'll highlight here: speccing (or better: the lack of speccing).

In larger projects, it comes down to defining interfaces, which can help to reduce the complexity, and also to divide the project into subprojects. This is the place where most projects 'fail': poorly defined interfaces.

All interfaces should be specced out, because other teams do not (and should not) know the inner workings of the different subsystems. One of the ways to spec is using JavaDoc. A lot of people use JavaDoc, but the quality of the documentation is often poor. For example:

/**
 * This interface defines a Filter.
 */
public interface Filter {

  /**
   * Main method.
   * @return a boolean.
   */
  public boolean doIt(String id, int nr)
      throws FilterException;

}

How does this help the developer who needs to program against this interface? Not much. Here are some questions that bother the developer when he/she sees this interface:

So a lot of things need some improvement.

The class itself

The JavaDoc of the class itself should at least describe the purpose of the class. A few examples of how to use the class wouldn't hurt either. If there are concerns for interface implementers, or subclassers, place them here.

This is also the place to describe the lifecycle of the objects. E.g. "the init() method is called once when this object is brought into service, and the destroy() method is called once when this object is taken out of service.", "When servicing a request, the methods are called in the following order: processForm(), process(), saveState(), render()".
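As an illustration, a class-level comment that spells out such a lifecycle might look like the sketch below. The interface name PageComponent, the recording implementation and the Container driver are hypothetical, and the note for implementers is an assumed example of such a concern; only the method names come from the text above. The code uses generics, which are newer than the Java versions discussed in this article.

```java
import java.util.ArrayList;
import java.util.List;

/**
 * A component that contributes one part of a page (hypothetical example).
 *
 * <p>Lifecycle: {@link #init()} is called once when this object is brought
 * into service, and {@link #destroy()} is called once when it is taken out
 * of service. When servicing a request, the methods are called in the
 * following order: {@link #processForm()}, {@link #process()},
 * {@link #saveState()}, {@link #render()}.
 *
 * <p>Note for implementers (assumed for this sketch): an instance is used
 * by one thread at a time, so no synchronization is required.
 */
interface PageComponent {
  void init();
  void processForm();
  void process();
  void saveState();
  void render();
  void destroy();
}

/** Implementation that only records which methods were called, in order. */
class RecordingComponent implements PageComponent {
  final List<String> calls = new ArrayList<>();

  public void init()        { calls.add("init"); }
  public void processForm() { calls.add("processForm"); }
  public void process()     { calls.add("process"); }
  public void saveState()   { calls.add("saveState"); }
  public void render()      { calls.add("render"); }
  public void destroy()     { calls.add("destroy"); }
}

/** Hypothetical container that drives a component the way the JavaDoc promises. */
class Container {
  /** Brings the component into service, handles one request, takes it out of service. */
  static void runFullLifecycle(PageComponent component) {
    component.init();
    component.processForm();
    component.process();
    component.saveState();
    component.render();
    component.destroy();
  }
}
```

With the order written down in the JavaDoc, both the implementer of a component and the author of the container can be held to the same contract.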

The class data members

Data members should also have JavaDoc comments. This documentation should describe what the data member represents, and for what purpose it is used. When possible, the documentation should also contain some statements about the value of the data member. E.g. "This value is set in the constructor, and should never be null.".

This will help the developer who needs to change some code in the class. If for instance the existing code relies on the fact that some data member is never null, it would be a problem when some newly added code assigns a null value to that data member, only because the constraints on the data member were not documented correctly.

These constraints on data members can be seen as invariants.
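A minimal sketch of documented data members follows. The Customer class and its fields are hypothetical and only serve to show where the invariants are written down and how the code keeps them true.

```java
import java.util.Objects;

class Customer {

  /**
   * The display name of the customer, as printed on invoices.
   * This value is set in the constructor and is never null or empty.
   */
  private final String name;

  /**
   * The number of orders that have not been shipped yet.
   * Invariant: this value is never negative.
   */
  private int openOrders;

  Customer(String name) {
    // Enforce the documented invariant on 'name' at the only place it is set.
    this.name = Objects.requireNonNull(name, "name must not be null");
    if (name.isEmpty()) {
      throw new IllegalArgumentException("name must not be empty");
    }
  }

  String getName() {
    return name;
  }

  int getOpenOrders() {
    return openOrders;
  }

  void orderPlaced() {
    openOrders++;
  }

  void orderShipped() {
    // Refuse the change that would break the invariant on 'openOrders'.
    if (openOrders == 0) {
      throw new IllegalStateException("no open orders to ship");
    }
    openOrders--;
  }
}
```

Someone who later adds a method to this class can read from the comments that openOrders must never drop below zero, instead of having to deduce it from the existing code.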

The methods

The description

Each method should have a description. OK, that's pretty obvious, but try to be specific. For instance, take a withdraw method in a class Account. It withdraws something from an account. But does it only do that inside its own object or does it also update a database, or send a JMS message?

Checklist

  • Are external systems involved?
  • Is there collaboration with other objects?
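To make the checklist concrete, here is one possible description for the withdraw method mentioned above. The AuditLog collaborator, the use of cents instead of a floating-point amount, and the exact behaviour are assumptions made for this sketch; BalanceException is defined here only to keep the example self-contained.

```java
/** Hypothetical collaborator; stands in for a database, a JMS queue or similar. */
interface AuditLog {
  void record(String entry);
}

/** Hypothetical checked exception for a failed withdrawal. */
class BalanceException extends Exception {
  BalanceException(String message) {
    super(message);
  }
}

class Account {

  /** The current balance in cents. Invariant: never negative. */
  private long balanceInCents;

  /** Receives one entry per successful withdrawal. Never null. */
  private final AuditLog auditLog;

  Account(long openingBalanceInCents, AuditLog auditLog) {
    this.balanceInCents = openingBalanceInCents;
    this.auditLog = auditLog;
  }

  /**
   * Withdraws money from this account.
   *
   * <p>Effects: decreases the balance held in this object and records
   * exactly one entry in the {@link AuditLog} that was passed to the
   * constructor. No other objects or external systems are involved; in
   * particular, no database is updated and no message is sent.
   *
   * @param amountInCents the amount to withdraw, in cents. Must be
   *        greater than zero.
   * @throws BalanceException
   *         Thrown when there is not enough money in the account. In that
   *         case the balance is unchanged and nothing is recorded.
   */
  void withdraw(long amountInCents) throws BalanceException {
    if (amountInCents <= 0) {
      throw new IllegalArgumentException("amountInCents must be greater than zero");
    }
    if (amountInCents > balanceInCents) {
      throw new BalanceException("not enough money in the account");
    }
    balanceInCents -= amountInCents;
    auditLog.record("withdraw " + amountInCents);
  }

  /** @return the current balance in cents; zero or more. */
  long getBalanceInCents() {
    return balanceInCents;
  }
}
```

The description answers both checklist questions: it names the one collaborator that is touched and states explicitly which external systems are not.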

The parameters

To make a solid specification, all parameters should be described precisely, including the constraints on the parameters. This may seem dull and unnecessary. E.g. if a parameter is a string called phonenumber, it is pretty obvious what it means. But is it? May it be null? May it be empty? What is its format? Must the country code be included? Are dashes or parentheses used to separate the area code from the subscriber number? Must leading zeros be included? These are all valid (and real) questions.

When working on a project for a large telco, different systems used different conventions for the notation of the phone number, and so it was always unclear what format to pass to certain methods. Needless to say, this was not documented correctly. Some examples of these numbers: "+31-(0)10-1234567", "31101234567", "010-1234567", "0101234567", "101234567". As you see, enough confusion.

So to prevent confusion, specify the input parameters as precisely as possible in the @param tag.
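A sketch of what such a @param description could look like for the phone number follows. The Subscriber class and the chosen convention (country code first, digits only, 8 to 15 digits) are assumptions for this example, not the convention of any real system.

```java
import java.util.regex.Pattern;

class Subscriber {

  // Assumed convention: 8 to 15 digits, the first digit not being zero.
  private static final Pattern PHONE_FORMAT = Pattern.compile("[1-9][0-9]{7,14}");

  /** The phone number in the format described at the constructor. Never null. */
  private final String phoneNumber;

  /**
   * Creates a subscriber.
   *
   * @param phoneNumber the phone number of the subscriber. Must not be
   *        null or empty. Format: digits only, 8 to 15 of them, starting
   *        with the country code; no "+", spaces, dashes or parentheses,
   *        and no leading zero before the area code.
   *        Example: "31101234567".
   */
  Subscriber(String phoneNumber) {
    if (phoneNumber == null || !PHONE_FORMAT.matcher(phoneNumber).matches()) {
      throw new IllegalArgumentException("invalid phoneNumber: " + phoneNumber);
    }
    this.phoneNumber = phoneNumber;
  }

  /** @return the phone number in the format described at the constructor; never null. */
  String getPhoneNumber() {
    return phoneNumber;
  }
}
```

Note that a format check alone cannot catch every mistake: "101234567" consists of nine digits and passes the pattern, although the country code is missing. Only the written specification tells the caller what the digits are supposed to mean.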

It is possible that a method changes the value of a parameter that is passed into it (thus the parameter is actually an input/output parameter). E.g. a java.util.List is passed in, and the method adds some objects to that list. This behaviour should be documented with the parameter. Try to be precise about what changes are made to the value.

Checklist

  • May the parameter be null?
  • If the parameter is a String, may it be empty?
  • If the parameter is an int/long/float/double, what is the range?
  • Are there parameters with a special meaning?
  • Is the value of the parameter changed in this method?
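The last item of the checklist, a parameter that is changed by the method, could be documented along the following lines. CustomerRegistry and copyNamesTo are hypothetical names chosen for this sketch.

```java
import java.util.ArrayList;
import java.util.List;

class CustomerRegistry {

  /** The registered names in registration order. Never null, may be empty. */
  private final List<String> names = new ArrayList<>();

  /**
   * Registers a customer.
   *
   * @param name the name of the customer. Must not be null or empty.
   */
  void register(String name) {
    names.add(name);
  }

  /**
   * Copies the names of all registered customers into the given list.
   *
   * @param target input/output parameter. Must not be null and must be
   *        modifiable. The names are appended to the end of this list in
   *        registration order; elements that are already in the list are
   *        neither removed nor reordered.
   * @return the number of names that were appended; zero or more.
   */
  int copyNamesTo(List<String> target) {
    target.addAll(names);
    return names.size();
  }
}
```

A caller can now tell from the documentation alone that passing in a list with existing content is safe, and where the new elements end up.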

The return value

When the method returns something, it should be documented using the @return tag. This description should be as precise as possible. E.g. when the return value can never be null, mention that in the documentation. This way, programmers who use your method know precisely what to expect: when it is documented that the return value is never null, a null check on the return value can be omitted in their code.

It is possible that the return value exposes some internal data structure to the outside world. This happens a lot with collections. In a lot of cases, an internal java.util.List is returned directly, without cloning it first. This makes it possible for the outside world to change the contents of the internal list. Most of the time this is unwanted behaviour, but it is often done for simplicity's sake. In these cases we trust that no one will change the returned value. The opposite can also happen: when the internal data structure changes, the change will also be visible through all the references that have been returned already. So the best way is to clone the data structure before returning it. But when that is not possible for some reason, document this behaviour.
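The difference between returning a copy and returning a view of the internal list can be sketched as follows. The Company class and its two accessor names are hypothetical; Collections.unmodifiableList and the ArrayList copy constructor are standard java.util calls.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class Company {

  /** The names of the customers. Never null, may be empty. */
  private final List<String> customers = new ArrayList<>();

  void addCustomer(String name) {
    customers.add(name);
  }

  /**
   * @return a non-null, possibly empty copy of the list of customers.
   *         The copy is a snapshot: later changes to the company are not
   *         visible in it, and the caller is free to modify it.
   */
  List<String> getCustomersCopy() {
    return new ArrayList<>(customers);
  }

  /**
   * @return a non-null, possibly empty, read-only view of the list of
   *         customers. The view is live: customers that are added to the
   *         company later become visible in lists returned earlier.
   */
  List<String> getCustomersView() {
    return Collections.unmodifiableList(customers);
  }
}
```

The copy costs an allocation per call but fully decouples the caller from the internal state. The read-only view is cheap and protects the internal list against modification, but it still shows later changes, so that behaviour belongs in the @return description.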

Checklist

  • Is the return value never null?
  • If the return value is a String, can it be empty?
  • If the return value is an int/long/float/double, what is the range?
  • Are there return values with a special meaning?

Example

/**
 * ...
 * @return the non-null and non-empty name of the company.
 * ...
 */
public String getCompanyName();

/**
 * ...
 * @return the non-null and possibly empty list of all the
 *         customers of the company. Only reading is allowed
 *         on this list. No modifications may be made to it.
 * ...
 */
public List getCustomers();

The exceptions

In the JavaDoc of the method, describe the exceptions that can be thrown using the @throws tag. Also describe in what situations such an exception is thrown.

You should only mention checked exceptions that can come out of the method (that is, the exceptions listed in the 'throws' clause of your method signature). Unchecked exceptions shouldn't be listed here, because they shouldn't occur in the first place; and if you describe them, people might rely on them.

For instance, one day you decide to check whether the input parameters of a certain method adhere to the constraints documented in the parameter section, and to throw an IllegalArgumentException when they are invalid. So far so good. Next you describe this IllegalArgumentException in the exception section of the documentation, together with the situation in which it occurs. Then someday a lazy programmer who wants to use this method does not check any of the parameters passed into it, because as documented it will throw an IllegalArgumentException when they are not valid; for him/her it is much easier to catch that exception than to make sure that the parameters are correct.

At a later time, you decide to remove the check on the parameters for performance reasons, and hey, this should be possible, because according to the documentation of the parameters no one is allowed to pass illegal values into the method anyway. Sounds logical, but you forgot about a very crucial factor in the equation: the lazy/stupid/... programmer. These programmers are very real, and exist in most organizations. Because you gave them a clue by describing the IllegalArgumentException, they may decide to take the simple way of catching exceptions instead of going through all the hassle of checking parameter values up front. They do not see the future bugs this introduces as their problem. "It works; so why bother." is their most used phrase.

Example

/**
 * ...
 * @throws BalanceException
 *         Thrown when there is not enough money in the client's account.
 * ...
 */
public void withdraw(double amount)
    throws BalanceException;

Parameter checking

You are not finished after you have established a solid specification. Unfortunately we do not live in a world where nobody is sloppy or lazy, and where nobody makes a mistake. Although it is clear what the parameter requirements are for a certain method (because of the splendid specification), once in a while you still might receive parameters that are invalid, due to whatever reason.

This is why it wouldn't hurt to add some checks on the input parameters to check whether they adhere to the constraints described in the documentation. This way, if somebody makes a mistake and passes an illegal value, it will be detected and the problem can be fixed. This checking can be done in several ways. The preferred way is:

/**
 * ...
 * @param days The number of days to add. This value must be
 *             greater than 5.
 * ...
 */
public void someMethod(int days) {
  if (days <= 5) {
    throw new IllegalArgumentException("days must be greater than 5");
  }
  ...
}

As of J2SE 1.4, one can also use assertions to check the parameters:

/**
 * ...
 * @param days The number of days to add. This value must be
 *             greater than 5.
 * ...
 */
public void someMethod(int days) {
  assert days > 5;
  ...
}

For now it seems that the first approach is the preferred way, because it is more descriptive, and is compatible with previous Java versions. The downside is that this checking cannot easily be turned off after deployment, whereas assertions are disabled by default and only run when they are enabled with the -ea switch. Which way you choose is up to you, as long as you add some checks.

Specification test harness

Now that the specification is clear, it is rather trivial to create a test harness that checks whether implementations implement the specification correctly. The test harness should focus on adherence to the specification rather than on the implementation. For instance, if you specced out an interface, create a test suite that runs its tests against the interface rather than against a specific implementation. This way, if you have multiple implementations of the same interface, you can reuse the test suite just by running it against each implementation.
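A minimal sketch of such a reusable check, written without any test framework, is shown below. The IdFilter interface, its contract and the three implementations are hypothetical and only illustrate the idea of testing against the interface.

```java
/** Hypothetical, fully specified filter interface. */
interface IdFilter {
  /**
   * Decides whether an id passes this filter.
   *
   * @param id the id to test. Must not be null; may be empty.
   * @return true if the id passes. The empty id never passes, and the
   *         same id always gives the same answer.
   */
  boolean accept(String id);
}

class PrefixFilter implements IdFilter {
  public boolean accept(String id) {
    return id.startsWith("NL");
  }
}

class MaxLengthFilter implements IdFilter {
  public boolean accept(String id) {
    return !id.isEmpty() && id.length() <= 8;
  }
}

/** Violates the specification: it lets the empty id through. */
class AcceptAllFilter implements IdFilter {
  public boolean accept(String id) {
    return true;
  }
}

/** Checks only what the JavaDoc of IdFilter promises, nothing more. */
class IdFilterSpecification {
  /** Throws an AssertionError when the filter breaks the documented contract. */
  static void check(IdFilter filter) {
    if (filter.accept("")) {
      throw new AssertionError("the empty id must never pass");
    }
    String[] samples = {"NL123", "BE1", "a-very-long-identifier"};
    for (String id : samples) {
      if (filter.accept(id) != filter.accept(id)) {
        throw new AssertionError("the same id must always give the same answer: " + id);
      }
    }
  }
}
```

Running IdFilterSpecification.check against a PrefixFilter and a MaxLengthFilter reuses the same checks for both, while an AcceptAllFilter is rejected because it breaks the documented rule for the empty id. Tests for behaviour that is specific to one implementation, such as which prefix is accepted, belong in a separate suite for that implementation.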

Trust me, I've seen a lot of cases where people created separate test suites for each implementation of an interface.

About the author

Arnout J. Kuiper is a Java architect within the Sun Java Center in the Netherlands, with a broad interest in Java and web-related technologies.