Identity and Equality of JPA Entities

JPA entities contain an id field (or property) annotaded with @Id. For this post I will assume a single field, but compound ids are possible as well.

Entity objects are business objects, so it is generally accepted that we will have to supply equals and hashCode for entity classes. Let me point out here, that it’s very unlikely for our business code to call any of these methods directly, because we don’t normally need explicit comparisons of entity objects in our business code, but we use collections which in turn may call equals and hashcode for looking up or adding entries.

We have several expectations regarding the semantics of equals and collection operations expecially when looking at entity objects:
We have several expectations regarding the semantics of equals and collection operations expecially when looking at entity objects:

  1. Objects based on the same database entry should be the same regarding equals and objects from different db rows should be different

a) including objects from different entity managers,
b) including detached objects,
c) including new (transient) objects,
d) even if some non-id attribute has been changed (they will end up in ‚their‘ db records!).

  1. Hashcode-based collections (e. g. HashSet) should behave friendly, i. e.

a) adding multiple objects should be possible – even for new (transient) objects,
b) the collection should remain intact even if contained objects get persisted into the db.

If the entity has a business id, i. e. the id attribute has some business meaning and is set by the constructor, the expectations can easily be met by using exactly the id attribute in equals and hashCode.

If you choose to compare all fields instead of just the id, your implementation breaks expectation 1.d). If you choose to not implement equals and hashCode at all and rather use the methods derived from Object, your implementation breaks expectations 1.a) and 1.b).

So, as an intermediate conclusion, implement equals and hashCode in your entity classes and base them on just the id attribute(s):


@Entity
public class Foo {

@Id
private String id;

...

public Foo(String id, ...) {
this.id = id;
...
}

public int hashCode() {
return (this.id == null) ? 0 : this.id.hashCode();
}

public boolean equals(Object obj) {
if (this == obj) {
return true;
}
if (obj == null) {
return false;
}
if (getClass() != obj.getClass()) {
return false;
}
Foo other = (Foo) obj;
if (this.id == null) {
return other.id == null;
}
return this.id.equals(other.id);
}

Things get more complicated, if you want to use generated ids. JPA supports you in this with the annotation @GeneratedValue which has to be placed on the id attribute in addition to @Id. The generator works for integral number fields and it’s best to abstain from primitive types in order to have the additional value null for unset ids. So you would use Integer, Long or BigInteger depending on the amount of data entries expected:

@Entity
public class Foo {

@Id
@GeneratedValue(strategy = GenerationType.IDENTITY)
private Integer id;

The problems with generated ids looking at identity and equality arise from the fact, that the entity manager will set the ids late – they may be populated not until the commit of the inserting transaction.

To begin with you will use the id field for comparison in equals and in hashCode as you did for business ids before. But if you do that in exactly the way shown above, you break expectation 2.a), because equals will evaluate all new (transient) objects as equal, as their ids are all null.

You can fix this by modifying equals such that it returns false if any of the compared ids are still unset:

public boolean equals(Object obj) {
...
if (this.id == null) {
return false;
}
return this.id.equals(other.id);
}

While this now meets expectation 2.a), it still breaks expectation 2.b): If you add some new entities to a HashSet and persist them afterwards, the collection will be corrupt due to the changed hash codes.

And bad news: There is no way to get around this, if using generated ids! In many application this still is no issue, because the program sequence „Create multiple objects“, „add them to a hash-based collection“, „persist the objects“, „use the collection“ is not very common and can be circumvented by persisting (and flushing) the objects before they get added to the collection.

Please take into account, that the usage of a hash-based collection may not be directly visible and instead be hidden as association collection in some other entity class.

What are your options, if you want to use generated ids and your business logic suffers from the hash-based collection problem discussed above? Well, the solution is to use an id populated early, i. e. in the constructor, which can be generated easily and – most advisable for performance – without hitting the database. java.util.UUID offeres the desired functionalty. It supplies 128 bit integers commonly expressed as 36 character string (32 hex digits and 4 dashes). java.util.UUID uses random numbers as base and offers a distribution which makes duplicates very unlikely. Other implementations exist, which use MAC addresses and random numbers.

@Entity
public class Foo {

@Id
private String id;

private String description;

public Foo(String description) {
this.id = UUID.randomUUID().toString();
...

By using uuids you cherry-pick the advantages from having early set ids while still having generated values.

The size of uuids may be painful. They are roughly four times the size of an integer, but storage requirements depend on your database product. Remember that they are ids, so every foreign key in the db has the same size. If that is a problem, you can resort to having an early set uuid for supporting equals and hashCode, which is not annotated with @Id. Instead you add another integer field as JPA generated id, i. a. annotated with @Id @GeneratedValue. Now all tables contain a number as primary key as well as a non-primary uuid column, but all foreign keys are just numbers.

So, the subject of JPA ids, which seems easy at first glance, is far from that in reality. You may use the following recipe when designing entity classes:

  • If your entity contains some identifying business attribute, take this as id. Let equals and hashCode use exactly this attribute. All expectations expressed above will be met
    (-> ChemicalElement in showcase).
  • If you don’t find a suitable business id, and …
    • if the hash-based collection problem discussed above is no problem for your application, use a @Id @GeneratedValue annotated integer id. Let equals and hashCode use exactly this attribute, but modify equals such that unset ids render a false return value. All expectations expressed above will be met with the exception of 2.b)
      (-> LabExperiment3NonNullIdEquality in showcase).
    • you want / have to use early set ids, use uuids.
      • If db size does not really matter or your model contains just a few associations, use the uuids directly as JPA ids. Let equals and hashCode use exactly this attribute. All expectations expressed above will be met
        (-> LabExperiment4UUIdEquality in showcase).
      • If you care about the storage consumption of foreign keys in your database, use the uuid just for equals and hashCode and add a separate @Id @GeneratedValue id to your class. Let equals and hashCode use exactly the uuid attribute. All expectations expressed above will be met
        (-> LabExperiment5AddIdEquality in showcase).

There is a showcase on https://github.com/GEDOPLAN/jpa-equals-demo demonstrating the various options. You can find the referenced classes there.

See you in our trainings at Berlin, Bielefeld, Cologne or your site!
http://gedoplan-it-training.de/


 

Advertisements

Über Dirk Weil
Dirk Weil ist seit 1998 als Berater im Bereich Java tätig. Als Geschäftsführer der GEDOPLAN GmbH in Bielefeld ist er für die Konzeption und Realisierung von Informationssystemen auf Basis von Java EE verantwortlich. Seine langjährige Erfahrung in der Entwicklung anspruchsvoller Unternehmenslösungen machen ihn zu einem kompetenten Ansprechpartner und anerkannten Experten auf dem Gebiet Java EE. Er ist Autor in Fachmagazinen, hält Vorträge und leitet Seminare und Workshops aus einem eigenen Java-Curriculum.

Kommentar verfassen

Trage deine Daten unten ein oder klicke ein Icon um dich einzuloggen:

WordPress.com-Logo

Du kommentierst mit Deinem WordPress.com-Konto. Abmelden /  Ändern )

Google+ Foto

Du kommentierst mit Deinem Google+-Konto. Abmelden /  Ändern )

Twitter-Bild

Du kommentierst mit Deinem Twitter-Konto. Abmelden /  Ändern )

Facebook-Foto

Du kommentierst mit Deinem Facebook-Konto. Abmelden /  Ändern )

Verbinde mit %s

%d Bloggern gefällt das: