Etymon

Friday, August 27, 2004

Object.equals()

Like many languages Java provides two equality methods to test objects for equivalence. The first is accessed via the '==' operator and does a simple heap reference comparison. The other is a standard method on Object, equals() which is intended to be overridden by the programmer to perform a value-based comparison. Now the equivalence operator is defined by the language spec, and ultimately implemented by the JVM developers, hence not our concern. Unfortunately the java community has failed to properly document the subtleties involved in writing a correct equals method, and so I continue to come across incorrect implementations and worse, incorrect recommendations.

But first we have to understand what we are dealing with. Fundamentally when we are defining an equals method on a class A, what we are defining is an Equivalence Relation over the set of all objects of class A. This is important because it highlights that we can define our equals method to return anything we want provided we meet three rules.

  • Reflexivity: a.equals(a) -> true
  • Symmetry: a.equals(b) <-> b.equals(a)
  • Transitivity: a.equals(b) && b.equals(c) -> a.equals(c)
If we break any of these rules someone will suffer subtle and devious bugs.

Node: To save typing I'm going to leave out the hashCode method in each of these examples, but be aware if you override equals() you *must* override hashCode() to match.

One consequence of the requirements is that we must ensure that any test we use in our equals implementation meets these requirements as well. So lets consider the standard mistake:

class A {
  int n1;
  public boolean equals(Object o) {
    if (!(o instanceof A)) {
      return false;
    } else {
      A rhs = (A)o;
      return this.n1 == rhs.n1;
    }
  }
}
This is a very common implementation of equals, it has even made its way into a number of Java books. Worse, the bug only manifests in the presence of implementation inheritance of concrete classes.
class B extends A { 
  int n2;
  public boolean equals(Object o) {
    if (!(o instanceof B)) {
      return false;
    } else {
      B rhs = (B)o;
      return this.n1 == rhs.n1 && this.n2 == rhs.n2;
    }
  }
}
To demonstrate the bug lets define a simple function on A's:
static printEqual(A a1, A a2) {
  if (a1.equals(a2)) {
    System.out.println("EQUAL");
  } else {
    System.out.println("NOT-EQUAL");
  }
}
And test it:
  A test1 = new A();
  A test2 = new B();
  printEqual(test1, test2);
      => Prints "EQUAL"
  printEqual(test2, test1);
      => Prints "NOT-EQUAL" !!!!
And we've broken symmetry. This will bite you. There is no situation when an asymmetrical equality is not a bug. Unfortunately it is scary how many places on the net this is presented as the exemplar equals() implementation.

So what does a correct equals method look like. Well there are two. The first is very straight forward, and should be the default choice.

class A {
  int n1;
  public boolean equals(Object o) {
    if (o == this) {
      return true;
    } else if (o == null || !getClass().equals(o.getClass())) {
      return false;
    } else {
      A rhs = (A)o;
      return this.n1 == rhs.n1;
    }
  }
}

class B extends A { 
  int n2;
  public boolean equals(Object o) {
    if (!super.equals(o)) {
      return false;
    } else {
      B rhs = (B)o;
      return this.n2 == rhs.n2;


    }
  }
}
Now test1 != test2 AND test2 != test1. Also, by delegating from B to A at the start of B::equals() we ensure any inaccessible state in A is handled appropriately, and more importantly, that the equality code for A isn't duplicated in B. If you can't delegate from B->A like this, then it suggests that your B isn't actually an A, and you should check your class design (you probably should be using delegation rather than inheritance).

There are times however when you would really like an B to compare equal to an A. The classic example is the circle/ellipse or square/rectangle shape examples. It would be really nice if an ellipse with equal axes compared equal to an equivalent instance of a circle sub-class. The way we can achieve this is to recognise that the core requirement of a correct equals implementation is that the operations we use to implement meet the same reflexivity/symmetry/transitivity requirements as the method itself. Next note that instanceof is only asymmetric in the subclass; and in fact any operation in the subclass will lead to asymmetry! So if we do need subclassed equality, we can't permit a subclass to override the equals() method.

abstract class Shape {
  public boolean equals(Object o) {
    if (o == this) {
      return true;
    } else if (o instanceof Shape) {
      // !! Note: instanceof is safe here as Shape is abstract.
      Shape rhs = (Shape)o;
      return [shape specific comparison --- maybe position];
    } else {
      return false;
    }
  }
}

class Ellipse extends Shape { 
  int minor, major;

  public int getMinor() { return minor; }
  public int getMajor() { return major; }

  // !!! NOTE THE USE OF *final* to prevent asymmetry!
  // Also note the exclusive use of accessors.
  public final boolean equals(Object o) {
    if (!super.equals(o)) {
      return false;
    } else if (getClass().equals(o.getClass())) {
      Eclipse rhs = (Eclipse)o;
      return this.getMajor == rhs.getMajor()
          && this.getMinor == rhs.getMinor();
    } else {
      return false;
    }
  }
}

public Circle extends Ellipse {
  public Circle(int radius) { major = minor = radius; }
  public int getRadius()    { return major; }
}
Note that this is probably a rather clumbersome hierachy. A better approach would be:
public abstract class Ellipse {
  public abstract int getMajor();
  public abstract int getMinor();
  public final boolean equals(Object o) {
    if (!super.equals(o)) {
      return false;
    } else {
      Eclipse rhs = (Eclipse)o;
      return this.getMajor == rhs.getMajor()
          && this.getMinor == rhs.getMinor();
    }
  }
}

public class EllipseImpl {
  int major, minor;
  ...
}

public class CircleImpl {
  int radius;
  public int getMajor() { return radius; }
  ...
}
Note that an CircleImpl will still compare equal with an appropriate EllipseImpl.

Please. If you are programming Java, believe me, these really are the only two options open to you. I really am very tired of tracking down the subtle bugs that result when this advice is ignored.

6 Comments:

  • This post has been removed by a blog administrator.

    By Blogger Roberto Iza, at 9:21 AM, November 23, 2005  

  • Looking for US.... If you are like me it seems like there is never enough time in the day.. You get home late and leave early.. then the weekends thats another story by itself. My wife has found it easier to hire somebody if you we need help with home remodeling. She fond a great site for home remodeling at http://2rcs.com and wanted to share it with you.. Im glad to find help when I need it.. I hope you have great luck remodeling your home...

    By Anonymous home remodeling, at 6:38 PM, January 31, 2006  

  • Nice write up! Thanks. I am dropping the following code on the default solution however:

    if (o == this) {
    return true;
    }

    I'll just let the "else" portion figure that one out (but please correct me if I missed something). I just like less lines of code, and on average this may run faster as the case of comparing an object to itself is statistically less in the code I write.

    By Anonymous Anonymous, at 8:02 AM, April 22, 2006  

  • This post has been removed by the author.

    By Blogger Ehren, at 2:06 AM, February 08, 2009  

  • In the first example, if you remove o == this I'm pretty sure equals will return false if both objects are null.

    By Blogger Ehren, at 2:09 AM, February 08, 2009  

  • By definition, both objects cannot be null. An object has to be non-null for equals to be called on it.

    By Blogger Andrae Muys, at 9:48 AM, February 09, 2009  

Post a Comment

<< Home