2012-03-15

Closures in Java

Oftentimes I hear the complaint "Java should have closures". When I ask why, people tell me they hate writing boilerplate code like this:

public List<String> resourceToList(String name) {
  List<String> list = new ArrayList<>();
  try {
    InputStream stream = getClass().getClassLoader().getResourceAsStream(name);
    try (BufferedReader reader = new BufferedReader(new InputStreamReader(stream))) {
      for (String l = reader.readLine(); l != null; l = reader.readLine()) {
        if (l.isEmpty()) break;
        list.add(l);
      }
    }
    return list;
  } catch (IOException e) {
    throw new RuntimeException("Can't read resource " + name, e);
  }
}


After all, the only truly meaningful statements are the ones that deal with the list; everything else is I/O plumbing. I could change just those statements to count the lines, write the lines to another file, etc. And each time I'd have to copy-paste all the I/O stuff. If only Java had closures, we could write the following utility method just once:
  
public void readResourceByLine(String name, (String)->Void closure) {
  try {
    InputStream stream = getClass().getClassLoader().getResourceAsStream(name);
    try (BufferedReader reader = new BufferedReader(new InputStreamReader(stream))) {
      for (String l = reader.readLine(); l != null; l = reader.readLine()) {
        closure.call(l);
      }
    }
  } catch (IOException e) {
    throw new RuntimeException("Can't read resource " + name, e);
  }
}


And then the method we really wanted to write would look like this:

public List<String> resourceToList(String name) {
  List<String> list = new ArrayList<>();
  readResourceByLine(name, (String l) -> {
    if (l.isEmpty()) break;
    list.add(l);
  });
  return list;
}  


Isn't that clear, concise and less error-prone?

So what's a closure? It's a code block that can be passed as a method argument. Note that the scope of the code block closes over the method call, hence the name "closure". This means that in the block of code that gets passed to the method we have access to the variable "list" and hitting "break" within the code block interrupts execution of the called method.

Although this syntax is very clean indeed, I believe you can apply a similar style of writing in Java today. First let me define a few reusable function signatures:
  
/** Statement with one argument, the equivalent of (A1) -> Void */
public interface St1<A1> {
  void call(A1 arg1);
}


/** Function with one argument, the equivalent of (A1) -> R */
public interface Fn1<A1, R> {
  R call(A1 arg1);
}

/** Function with two arguments, the equivalent of (A1, A2) -> R */
public interface Fn2<A1, A2, R> {
  R call(A1 arg1, A2 arg2);
}


Etc. We can keep defining variants of St and Fn with more arguments.
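To make the Fn1 signature a bit more concrete, here's a minimal sketch of how a method could consume it. The Fn1Demo class and its map() and lengths() methods are hypothetical names I made up for illustration; only the Fn1 interface comes from the definitions above, and it's repeated so the sketch compiles on its own.

```java
import java.util.ArrayList;
import java.util.List;

/** Function with one argument, repeated here so the sketch is self-contained. */
interface Fn1<A1, R> {
  R call(A1 arg1);
}

class Fn1Demo {
  /** Hypothetical helper: applies fn to each element and collects the results. */
  static <A, R> List<R> map(List<A> input, Fn1<A, R> fn) {
    List<R> result = new ArrayList<>();
    for (A element : input) {
      result.add(fn.call(element));
    }
    return result;
  }

  /** Hypothetical usage: turns a list of words into a list of their lengths. */
  static List<Integer> lengths(List<String> words) {
    return map(words, new Fn1<String, Integer>() {
      public Integer call(String word) {
        return word.length();
      }
    });
  }
}
```

The loop lives in map() once, and each caller only supplies the one expression it actually cares about.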

Now, by making use of the St1 interface, we can extract all the boilerplate code from the example:

public void readResourceByLine(String name, St1<String> closure) {
  try {
    InputStream stream = getClass().getClassLoader().getResourceAsStream(name);
    try (BufferedReader reader = new BufferedReader(new InputStreamReader(stream))) {
      for (String l = reader.readLine(); l != null; l = reader.readLine()) {
        closure.call(l);
      }
    }
  } catch (IOException e) {
    throw new RuntimeException("Can't read resource " + name, e);
  }
}

public List<String> resourceToList(String name) {
  final List<String> list = new ArrayList<>();
  readResourceByLine(name, new St1<String>() {
    public void call(String l) {
      list.add(l);
  }});
  return list;
}


Admittedly I had to replace the syntactic sugar "->" with the more verbose "new St1<String>() { public void call(...) }", but I think it's still pretty damn readable. If I ever saw the above in production code instead of the copy-pasted boilerplate, it'd be a good day.

You might have noticed that I had to declare the "list" variable final, because in Java an anonymous inner class only has access to the final variables of the enclosing block. In most cases I find this not to be an issue, unless you want to do something like:

public int countLinesOfResource(String name) {
  final int count = 0;
  readResourceByLine(name, new St1<String>() {
    public void call(String l) {
      count++;
  }});
  return count;
}


This will not compile, because a final variable can't be incremented. But you could wrap types like Integer so you have a final variable which you dereference to get to the real value:

public class Mutable<T> {
  private T value;

  public Mutable(T value) {
    set(value);
  }

  public T get() {
    return value;
  }

  public void set(T value) {
    this.value = value;
  }
}

public int countLinesOfResource(String name) {
  final Mutable<Integer> count = new Mutable<>(0);
  readResourceByLine(name, new St1<String>() {
    public void call(String l) {
      count.set(count.get() + 1);
  }});
  return count.get();
}


But what about that break statement in the original code? We can't do that in real Java, can we? No. What we can do is immediately return from the "call" method and remember that a "break" event occurred.

public List<String> resourceToList(String name) {
  final List<String> list = new ArrayList<>();
  readResourceByLine(name, new BreakSt1<String>() {
    public void call(String l) {
      if (l.isEmpty()) { breaking(); return; }
      list.add(l);
  }});
  return list;
}


In the above code "St1" has changed from an interface to an abstract class "BreakSt1" which remembers whether its breaking() method has been called or not.

public abstract class BreakSt1<I> extends Breakable implements St1<I> {
}

public abstract class Breakable {
  private boolean broken = false;

  public final void breaking() {
    broken = true;
  }

  public final boolean broken() {
    return broken;
  }
}


It now becomes part of the design contract that immediately after calling the call() method you must check whether the closure was broken and handle that accordingly.

public void readResourceByLine(String name, BreakSt1<String> closure) {
  try {
    InputStream stream = getClass().getClassLoader().getResourceAsStream(name);
    try (BufferedReader reader = new BufferedReader(new InputStreamReader(stream))) {
      for (String l = reader.readLine(); l != null; l = reader.readLine()) {
        closure.call(l);
        if (closure.broken()) break;
      }
    }
  } catch (IOException e) {
    throw new RuntimeException("Can't read resource " + name, e);
  }
}


I think this is the most elegant solution. Methods that consume a closure have the choice between the St1 interface and the BreakSt1 abstract class. If they choose the latter they are responsible for handling the break correctly. As a user of the method, when passing in a closure the expected type will tell you whether you have the option to break out of the control flow. And writing breaking(); return; instead of break; is not too bad.
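The same pattern should carry over to closures that return a value. Here's a rough sketch of what a BreakFn1 could look like, by analogy with BreakSt1. Its exact shape is my assumption, and BreakFn1Demo, mapUntilBroken() and upperCaseUntilEmpty() are hypothetical names used only for illustration. Fn1 and Breakable are repeated so the sketch compiles on its own.

```java
import java.util.ArrayList;
import java.util.List;

interface Fn1<A1, R> {
  R call(A1 arg1);
}

abstract class Breakable {
  private boolean broken = false;

  public final void breaking() {
    broken = true;
  }

  public final boolean broken() {
    return broken;
  }
}

/** Assumed shape of BreakFn1: an Fn1 that can also signal a break. */
abstract class BreakFn1<A1, R> extends Breakable implements Fn1<A1, R> {
}

class BreakFn1Demo {
  /** Hypothetical consumer: maps elements until the closure signals a break. */
  static <A, R> List<R> mapUntilBroken(List<A> input, BreakFn1<A, R> fn) {
    List<R> result = new ArrayList<>();
    for (A element : input) {
      R value = fn.call(element);
      if (fn.broken()) break; // the contract: check right after call()
      result.add(value);
    }
    return result;
  }

  /** Hypothetical usage: upper-cases lines and stops at the first empty one. */
  static List<String> upperCaseUntilEmpty(List<String> lines) {
    return mapUntilBroken(lines, new BreakFn1<String, String>() {
      public String call(String l) {
        if (l.isEmpty()) { breaking(); return null; }
        return l.toUpperCase();
      }
    });
  }
}
```

Note that the value returned alongside breaking() is thrown away by the consumer, so returning null there is harmless in this sketch.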

So far, I've tried to show that by defining a package of interfaces and abstract classes like Mutable, St1, BreakSt1, Fn1, BreakFn1, Fn2, etc. the casual use of anonymous inner classes doesn't have to be much more verbose than a real closure. I've argued that in most cases it's good enough that an anonymous inner class closes over only the final variables of the calling scope. And I've offered a simple pattern that you can follow for dealing with breaks inside a closure.

Having hopefully convinced you that mostly you can have your closures and stay with Java, I would like to end by asserting that mostly you don't need them.

Because you see, when I find myself wanting to use a closure, nine out of ten times I'm invoking that closure inside a loop. And Java has a strong built-in pattern for abstracting away loop control: iterators.

What if I implemented the readResourceByLine() method as an Iterable and AutoCloseable class instead?

public List<String> resourceToList(String name) {
  List<String> list = new ArrayList<>();
  try (ResourceLineReader lines = new ResourceLineReader(name)) {
    for (String l : lines) {
      if (l.isEmpty()) break;
      list.add(l);
    }
  }
  return list;
}


public int countLinesOfResource(String name) {
  int count = 0;
  try (ResourceLineReader lines = new ResourceLineReader(name)) {
    for (String l : lines) {
      count++;
    }
  }
  return count;
}


Isn't that even more readable than the stuff with the pseudo closures?

Let's go ahead and convert the readResourceByLine() method into the ResourceLineReader class.

What does the class have to do?
  1.   Open a resource as a stream
  2.   Provide an iterator for reading line-by-line from the stream
  3.   Close the stream

Step 1: To keep our code short and to the point, the class will take care of the first task and delegate the rest to another class called LineReader.

public class ResourceLineReader implements Iterable<String>, AutoCloseable {
  private LineReader lineReader;

  public ResourceLineReader(String name) {
    InputStream stream = getClass().getClassLoader().getResourceAsStream(name);
    lineReader = new LineReader(new InputStreamReader(stream));
  }

  public Iterator<String> iterator() {
    return lineReader.iterator();
  }

  public void close() {
    lineReader.close();
  }      
}


Step 2: As you can see below, LineReader takes a Reader and performs the second task: instantiating an iterator that reads line by line from it.

public class LineReader extends AutoClose<BufferedReader> implements Iterable<String> {

  public LineReader(Reader in) {
    closeable = new BufferedReader(in);
  }

  public Iterator<String> iterator() {
    return new BufferedIterator<String>() {
      protected String produceValue() {
        try {
          return closeable.readLine();
        } catch (IOException e) {
          throw new RuntimeException("Can't read from " + closeable, e);
        }
      }
    };
  }
}


Step 3: Like ResourceLineReader before it, the LineReader class has a single concern. It doesn't care where the stream comes from, and it doesn't care to close it either. That third task of closing the stream is delegated to a class it inherits from: AutoClose.

public abstract class AutoClose<C extends Closeable> implements AutoCloseable {
  protected C closeable;
  

  public final void close() {
    try {
      if (closeable != null) {
        closeable.close();
      }
    } catch (IOException e) {
      throw new RuntimeException("Can't close " + closeable, e);
    }
  }
}


You might also have noticed that when the LineReader instantiates an Iterator it doesn't implement all the methods of that interface. Why should it have to? The task is simple. I have one thing I want to do iteratively: call the readLine() method on the BufferedReader. So that's all I should have to write.

The details of the iterator pattern are well known and can be tucked away. The BufferedIterator used here is an abstract class which asks you to implement a single method called produceValue() and the rest is taken care of.

public abstract class BufferedIterator<V> implements Iterator<V> {
  private V nextValue;

  protected abstract V produceValue();

  private V iterate() {
    if (nextValue == null) {
      nextValue = produceValue();
    }
    return nextValue;
  }

  public final boolean hasNext() {
    return iterate() != null;
  }

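  public final V next() {
    V value = iterate();
    if (value == null) {
      // Honor the Iterator contract when the source is exhausted.
      throw new java.util.NoSuchElementException();
    }
    nextValue = null;
    return value;
  }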

  public final void remove() {
    throw new UnsupportedOperationException();
  }      
}
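BufferedIterator isn't tied to readers. As a minimal sketch, here's the same class wrapped around Queue.poll(), which (like readLine()) returns null when there's nothing left. QueueDrainDemo, drain() and drainToList() are hypothetical names of my own, and BufferedIterator is repeated in compact form so the sketch compiles on its own.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Queue;

/** Compact copy of BufferedIterator; a null from produceValue() means "done". */
abstract class BufferedIterator<V> implements Iterator<V> {
  private V nextValue;

  protected abstract V produceValue();

  private V iterate() {
    if (nextValue == null) {
      nextValue = produceValue();
    }
    return nextValue;
  }

  public final boolean hasNext() {
    return iterate() != null;
  }

  public final V next() {
    V value = iterate();
    nextValue = null;
    return value;
  }

  public final void remove() {
    throw new UnsupportedOperationException();
  }
}

class QueueDrainDemo {
  /** Hypothetical helper: iterates a queue by polling it until it is empty. */
  static <T> Iterable<T> drain(final Queue<T> queue) {
    return new Iterable<T>() {
      public Iterator<T> iterator() {
        return new BufferedIterator<T>() {
          protected T produceValue() {
            return queue.poll(); // poll() returns null once the queue is empty
          }
        };
      }
    };
  }

  /** Hypothetical usage: a plain for-each loop over the drained queue. */
  static List<String> drainToList(Queue<String> queue) {
    List<String> result = new ArrayList<>();
    for (String s : drain(queue)) {
      result.add(s);
    }
    return result;
  }
}
```

The one restriction is that the wrapped call must use null to mean "no more values", so this approach doesn't fit sources that can legitimately produce null.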

  
To recap:
  • ResourceLineReader opens a resource as a stream and passes that to LineReader.
  • LineReader takes a reader and makes it iterable by wrapping a BufferedIterator around it.
  • BufferedIterator turns whatever method call you give it into an iterator, in this example the readLine() method of the BufferedReader.
  • ResourceLineReader, and the LineReader it delegates to, are I/O classes and thus must be AutoCloseable. We've implemented that interface in the AutoClose class.

So, I started out wanting to extract boilerplate code with the magic of closures and in the end I fared just as well without them. I think iterators are great, and Iterable with the for-each loop has been in Java since 1.5.

If you still want to hold your breath for Java 8 check out Project Lambda.
http://openjdk.java.net/projects/lambda/
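For comparison, here's a small sketch of the St1 example in the lambda syntax that Java 8 ended up shipping with. I'm using java.util.function.Consumer as a stand-in for St1, and reading from a Reader instead of a classpath resource to keep the sketch self-contained. LambdaDemo, readByLine() and toList() are hypothetical names.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

class LambdaDemo {
  /** Hypothetical variant of readResourceByLine() that takes any Reader. */
  static void readByLine(Reader in, Consumer<String> closure) {
    try (BufferedReader reader = new BufferedReader(in)) {
      for (String l = reader.readLine(); l != null; l = reader.readLine()) {
        closure.accept(l);
      }
    } catch (IOException e) {
      throw new RuntimeException("Can't read lines", e);
    }
  }

  /** Hypothetical usage: collects all lines of a string into a list. */
  static List<String> toList(String text) {
    List<String> list = new ArrayList<>();
    // The lambda captures "list", which is effectively final.
    readByLine(new StringReader(text), l -> list.add(l));
    return list;
  }
}
```

Even there, a lambda can't "break" out of the loop in the method that calls it, so the breaking() pattern or the iterator approach stays relevant.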

Or, import ch.lambdaj.function.closure.*
http://code.google.com/p/lambdaj/wiki/Closures

That's where I got the inspiration for this article:

public List<String> resourceToList(String name) {
  List<String> list = new ArrayList<>();
  Closure1<String> add = closure(String.class); {
    of(list).add(var(String.class));
  }
  readResourceByLine(name, add);
  return list;
}


I hope that with these examples you'll feel empowered to write lots of clean reusable code!