sobota 21. února 2015

A Proper Visitor Implementation

As you probably know, Visitor is a design pattern originally defined by Gamma, Helm, Johnson and Vlissides in their book called Design Patterns - Elements of Reusable Object-Oriented Software. There, the intent of this pattern was described as:

Represent an operation to be performed on the elements of an object structure. Visitor lets you define a new operation without changing the classes of the elements on which it operates.
Today, the Internet is flooded with web pages describing individual patterns, including Visitor. However, these pages only include trivial examples, which provide little help when implementing Visitor in a real-world application. Unfortunately, the Design Patterns book does not include any more advanced examples either.

The Problem

In my experience, a typical usage of this pattern is as follows: You've got a tree-like data structure on which you need to implement multiple different operations. For example:
  1. calculate certain properties (tree depth, average number of child nodes, etc),
  2. convert the tree to a different format (serialize it to a string, etc),
  3. etc.
Because a tree is inherently a recursive data structure, it is a good idea to follow the Divide and conquer principle when implementing these operations - to assemble the result for a given node based on the results for it's child nodes. This is where the simple implementations fall short.

A Solution

In this article I'd like to present a slightly modified implementation that supports easy and efficient access to sub-results.

As an example, suppose that we need to store and evaluate boolean expressions. A boolean expression can be represented as a tree-like structure, with atomic propositions as leaf nodes and boolean operations as intermediary nodes. The evaluation of such an expression is a great case for Visitor.

To represent such functions, we'll need to build a hierarchy consisting of the following classes:
  1. Expression - a generic boolean expression,
  2. Atom - an atomic proposition,
  3. Negation - a negation of an expression,
  4. Conjunction - a conjunction of two expression,
  5. Disjunction - a disjunction of two expression.
Let's first focus on the Visitor implementation. The following is a generic Visitor interface. For every operation (such as evaluating an expression), we'll create a separate implementation.

public interface Visitor<T> {
    public T visit(Atom atom);
    public T visit(Negation neg);
    public T visit(Conjunction con);
    public T visit(Disjunction dis);
}

It is important to note that this interface is parametrized - here T stands for the type of result of the represented operation. Every method accepts an expression of some kind as an argument and returns the result of applying the represented operation on that expression. This allows us to implement these methods using recursive calls according to the expression structure, as we'll see below. This is where this approach differs from those presented in existing articles.

In our case, T is going to be Boolean, as evaluating a boolean expression results into a single boolean value. For other operations this could be different.

Let's now take a look at the implementation.

public class Evaluate implements Visitor<Boolean> {
    private final Map<Atom, Boolean> valuation;

    public Evaluate(Map<Atom, Boolean> valuation) {
        this.valuation = new HashMap<>(valuation);
    }

    @Override
    public Boolean visit(Atom atom) {
        Boolean result = valuation.get(atom);
        if (result == null) {
            throw new IllegalArgumentException(
                "Missing valuation for atom [" + atom + "]."
            );
        }
        return result;
    }

    @Override
    public Boolean visit(Negation neg) {
        Boolean subResult = neg.getExpr().accept(this);
        return !subResult;
    }

    @Override
    public Boolean visit(Conjunction con) {
        Boolean leftSubResult = con.getLeftExpr().accept(this);
        Boolean rightSubResult = con.getRightExpr().accept(this);
        return leftSubResult && rightSubResult;
    }

    @Override
    public Boolean visit(Disjunction dis) {
        Boolean leftSubResult = dis.getLeftExpr().accept(this);
        Boolean rightSubResult = dis.getRightExpr().accept(this);
        return leftSubResult || rightSubResult;
     }
}

Please note again how applying the Divide and conquer principle resulted into a very clear and concise implementation.

The last piece of puzzle is the Expression interface. It looks as follows.

public interface Expression {
    public <T> T accept(Visitor<T> visitor);
}

The important part is the accept method declaration. It is generic, which allows to use it together with a Visitor of any result type.

For the sake of completeness, let's now look at the rest of the Expression hierarchy.

public class Atom implements Expression {
    @Override
    public <T> T accept(Visitor<T> visitor) {
        return visitor.visit(this);
    }
}

public class Negation implements Expression {
    private final Expression expr;

    public Negation(Expression expr) {
        this.expr = expr;
    }

    public Expression getExpr() {
        return expr;
    }

    @Override
    public <T> T accept(Visitor<T> visitor) {
        return visitor.visit(this);
    }
}

public class Disjunction implements Expression {
    private final Expression leftExpr;
    private final Expression rightExpr;

    public Disjunction(Expression leftExpr, Expression rightExpr) {
        this.leftExpr = leftExpr;
        this.rightExpr = rightExpr;
    }

    public Expression getLeftExpr() {
        return leftExpr;
    }

    public Expression getRightExpr() {
        return rightExpr;
    }

    @Override
    public <T> T accept(Visitor<T> visitor) {
        return visitor.visit(this);
    }
}

Lastly, the following snippet shows how to put all the pieces together to evaluate a sample boolean expression.

Atom a = new Atom();
Atom b = new Atom();
Atom c = new Atom();

Expression expr = new Disjunction(
    new Conjunction(
        a,
        new Negation(
            b
        )
    ),
    c
);

Map<Atom, Boolean> valuation = new HashMap<>();
valuation.put(a, true);
valuation.put(b, true);
valuation.put(c, false);

Visitor<Boolean> evaluate = new Evaluate(valuation);
System.out.println(expr.accept(evaluate));

Conclusion

Compared to traditionaly presented Visitor implementations, the one described in this article offers greater flexibility and allows to take advantage of the Divide and conquer principle, which is very convenient when dealing with recursive data-structures. The cost is a certain complexity increase due to the introduction of generic types.

If you have any questions and/or comments, please don't hesitate to contact me. I'll appreciate any feedback!