Calling flush() every 100 rows out of 10 000 slows the transaction


I have a sample project using spring-boot with spring-data-jpa and a Postgres DB with one table.

I'm trying to INSERT 10 000 records in a loop into the table and measure the execution time, with the flush() method from the EntityManager class either enabled or disabled for every 100 records.

The expected result is that the execution time with flush() enabled is much less than with it disabled, but I actually get the opposite result.

package sample;

import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.stereotype.Service;

@Service
public class UserService {

    @Autowired
    UserRepository userRepository;

    public User save(User user) {
        return userRepository.save(user);
    }
}

package sample;

import org.springframework.data.jpa.repository.JpaRepository;
import org.springframework.stereotype.Repository;

@Repository
public interface UserRepository extends JpaRepository<User, Long> {
}

package sample;

import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.boot.CommandLineRunner;
import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;
import org.springframework.context.annotation.Bean;
import org.springframework.data.jpa.repository.config.EnableJpaRepositories;
import org.springframework.transaction.annotation.Transactional;

import javax.persistence.EntityManager;
import javax.persistence.PersistenceContext;

@SpringBootApplication
@EnableJpaRepositories(considerNestedRepositories = true)
public class Application {

    public static void main(String[] args) {, args);
    }

    @Autowired
    private UserService userService;

    @PersistenceContext
    EntityManager entityManager;

    @Bean
    public CommandLineRunner addUsers() {
        return new CommandLineRunner() {
            @Transactional
            public void run(String... args) throws Exception {
                long incoming = System.currentTimeMillis();
                for (int i = 1; i <= 10000; i++) {
           User("name_" + i));
                    if (i % 100 == 0) {
                        entityManager.flush();
                        entityManager.clear();
                    }
                }
                entityManager.close();
                System.out.println("Time: " + (System.currentTimeMillis() - incoming));
            }
        };
    }
}


Make sure you enable JDBC batching in your persistence provider configuration. If you're using Hibernate, set the JDBC batch size in your Spring properties (100, or some other reasonable value).
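For Hibernate behind Spring Boot, the batch-size setting typically looks like this in application.properties; the value 100 is an assumption chosen here to match the flush interval used in the loop above:

```properties
# one INSERT batch per 100 rows, matching the flush()/clear() interval
spring.jpa.properties.hibernate.jdbc.batch_size=100
```

Hibernate also offers hibernate.order_inserts/hibernate.order_updates to group statements by table, which is often paired with batching, but the batch size alone is what enables batching at all.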

Without batching enabled, my guess is that the performance regression comes from the overhead of clearing the persistence context every 100 entities, but I'm not sure about that (you'd have to measure).


Actually, enabling or disabling JDBC batching will not change the fact that flushing every once in a while is no faster than not flushing manually at all. A manual flush() does not control how the flushing is done (batched statements vs. one insert at a time); it controls when the flush to the database happens.

So what you're comparing is the following:

  1. With flush() every 100 objects: each flush inserts 100 instances into the database, and this happens 10 000 / 100 = 100 times.
  2. Without flush(): all 10 000 objects accumulate in the persistence context in memory, and 10 000 inserts are issued when the transaction commits.

JDBC batching, on the other hand, affects how the flushing occurs, but the number of statements issued is the same with flush() as without it.
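To make the statement count concrete, here is a minimal sketch in plain Java (no JPA; the class and method names are made up for illustration) that models the persistence context as an in-memory buffer and counts the INSERTs issued in both scenarios:

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of a persistence context: persisted entities queue up
// in memory and are written out as INSERT statements on flush().
class ToyPersistenceContext {
    private final List<String> pending = new ArrayList<>();
    int statementsIssued = 0;
    int flushCalls = 0;

    void persist(String entity) { pending.add(entity); }

    void flush() {
        flushCalls++;
        statementsIssued += pending.size(); // one INSERT per pending entity
        pending.clear();                    // mirrors clear() after flush()
    }

    public static void main(String[] args) {
        // Scenario 1: flush every 100 entities.
        ToyPersistenceContext withFlush = new ToyPersistenceContext();
        for (int i = 1; i <= 10_000; i++) {
            withFlush.persist("name_" + i);
            if (i % 100 == 0) withFlush.flush();
        }

        // Scenario 2: no manual flush; everything goes out at commit time.
        ToyPersistenceContext noFlush = new ToyPersistenceContext();
        for (int i = 1; i <= 10_000; i++) noFlush.persist("name_" + i);
        noFlush.flush(); // the single implicit flush at commit

        System.out.println(withFlush.flushCalls + " flushes, "
                + withFlush.statementsIssued + " inserts vs "
                + noFlush.flushCalls + " flush, "
                + noFlush.statementsIssued + " inserts");
        // prints: 100 flushes, 10000 inserts vs 1 flush, 10000 inserts
    }
}
```

Either way the database receives 10 000 INSERTs; the manual flush() only changes when they are sent, which is why it cannot make the transaction faster by itself.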

The benefit of flushing and clearing every once in a while in a loop is to avoid a possible OutOfMemoryError due to the cache holding too many objects.

