JPA performance tip – 1: surrogate keys.

This post is the first in series of JPA performance tips.

The tips are result of my working with JPA and facing issues which are not documented.

These are non-obvious performance pitfalls which if you do not take care from beginning may result in lot of rework or worse non resolvable issues making you grin and bear.

So, please use my experience and build your systems correctly from the beginning.

Surrogate key: The surrogate key is the primary key with no business meaning. Generally the surrogate key type of primary key is of two types:

  • Usually generated with database sequence (but not always)
  • GUID generator.

The primary key generated with sequence is of type  Number (translating to long in java).We will discuss how a ‘long’ type of primary key can improve the performance in a database.

Let us take an  example tables

‘Person’ : contains the details of a person

‘Person_phones’ : One or more phones for a user

The Person entity class is created as follows

public class Person {

private String id;

@OneToMany(cascade = CascadeType.ALL, fetch = FetchType.EAGER, mappedBy = "person")
private Set<Person_phones> phones = new HashSet<Person_phones>();

private String firstName;
private String lastName;
private String type;

The Person_phones class is:

public class Person_phones {

private String id;

private String phoneNumber;

@JoinColumn(name="person", referencedColumnName="id")
private Person person;


To create the person and person phone following code snippet is given. The id in both person and person_phone table is assigned using setter methods.

public String create(String firstName, String lastName, String type)
Person per = new Person();

Person_phones phone = new Person_phones("9873692304", per);

Person newPer =;
System.out.println("New ID: " + newPer.getId());

return "success";


Here the CRUD operations are performed using the CRUDRepository interface provided by spring.

If we run the code with show_sql true lets see the output

<!– HTML generated using –>

Hibernate: select as id1_1_1_, person0_.first_name as first_na2_1_1_, person0_.last_name as last_nam3_1_1_, person0_.type as type4_1_1_, phones1_.person as person3_1_3_, as id1_2_3_, as id1_2_0_, phones1_.person as person3_2_0_, phones1_.phone_number as phone_nu2_2_0_ from person person0_ left outer join person_phones phones1_ on where
Hibernate: insert into person (id, first_name, last_name, type) values (null, ?, ?, ?)
Hibernate: select as id1_2_0_, person_pho0_.person as person3_2_0_, person_pho0_.phone_number as phone_nu2_2_0_ from person_phones person_pho0_ where

Hibernate: insert into person_phones (person, phone_number, id) values (?, ?, ?)


We see that there is a Select query before an insert !!!!!

Now if the person and person_phone contains millions of records, consider the cost of one extra select query. But the point to explore is why did this happen ??

For getting the answer, we need to go into the code of CRUDRepository by Spring. When we call the the control goes to the following class and code snippet

@Transactional(readOnly = true)
public class SimpleJpaRepository<T, ID extends Serializable> implements JpaRepository<T, ID>,
JpaSpecificationExecutor<T> {

public <S extends T> S save(S entity) {

if (entityInformation.isNew(entity)) {
return entity;
} else {
return em.merge(entity);


The entityInformation.isNew(s entity), checks the primary key type. if the primary key type is primitive, i.e. long, int etc it returns true and e.persist is called.

In case pf PK being String the String (as in our case)  or any other class (as in composite key) the  entityInformation.isNew(s entity) returns false and em.merge() called.

In case of em.merge(), the hibernate checks if the entity is already existing by making the select query. Hence we get select query in above example.

if the  primary key is primitive and surrogate (generation type auto, sequence or table) i.e. the not the application data, the is called and  and there is only insert query saving on valuable IO.

Conclusion: Always have surrogate keys for all tables. The relationship (oneToMany, manyToOne) can be maintained outside the primary  keys.  The non-primitive primary keys and composite keys will have a performance impact while inserting new rows.

This article can also be viewed at

Leave a Reply

Your email address will not be published. Required fields are marked *