Spring Data & JPA
The layer where most production Spring apps spend their complexity budget: mapping object graphs to relational schemas, generating queries from method names, and keeping transactions and lazy loading under control. This chapter goes from JpaRepository to N+1 diagnosis, propagation edge cases, and audit columns.
Spring Data abstractions
Spring Data eliminates boilerplate DAO implementations. You declare an interface; the framework generates the implementation at runtime.
Repository hierarchy
| Interface | Adds |
|---|---|
| Repository<T, ID> | Marker — no methods; enables Spring Data repository detection |
| CrudRepository<T, ID> | save, findById, findAll, deleteById, count |
| PagingAndSortingRepository<T, ID> | findAll(Pageable), findAll(Sort) |
| JpaRepository<T, ID> | JPA-specific: flush, saveAndFlush, deleteAllInBatch (formerly deleteInBatch, now deprecated), getReferenceById |
public interface OrderRepository extends JpaRepository<Order, Long> {
List<Order> findByCustomerIdAndStatus(String customerId, OrderStatus status);
Optional<Order> findByExternalRef(String externalRef);
}
At startup, JpaRepositoryFactoryBean creates a JDK dynamic proxy implementing your interface. Each method routes to SimpleJpaRepository (built-in CRUD) or a QueryMethod parsed from the method name / @Query. No bytecode generation of implementation classes—you get a proxy delegating to shared infrastructure.
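To make the proxy mechanism concrete, here is a minimal plain-JDK sketch of the same idea: one shared handler behind an interface, routing by method name. It is not Spring Data's actual implementation. `Account`, `AccountRepo`, and `RepoProxyFactory` are hypothetical names, and an in-memory list stands in for the database.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.function.Function;

// Hypothetical domain type and repository interface, used only for this sketch.
record Account(long id, String email) {}

interface AccountRepo {
    Account save(Account account);
    Optional<Account> findByEmail(String email);
}

final class RepoProxyFactory {
    // Property accessors the "derived query" branch can look up by name.
    private static final Map<String, Function<Account, Object>> PROPERTIES =
            Map.of("Id", Account::id, "Email", Account::email);

    static AccountRepo create() {
        List<Account> table = new ArrayList<>(); // stands in for the database
        InvocationHandler handler = (proxy, method, args) -> {
            String name = method.getName();
            if (name.equals("save")) {           // built-in CRUD path
                table.add((Account) args[0]);
                return args[0];
            }
            if (name.startsWith("findBy")) {     // path derived from the method name
                Function<Account, Object> accessor =
                        PROPERTIES.get(name.substring("findBy".length()));
                return table.stream()
                        .filter(a -> accessor.apply(a).equals(args[0]))
                        .findFirst();
            }
            throw new UnsupportedOperationException(name);
        };
        // No implementation class is written or generated by us: the JDK builds the proxy.
        return (AccountRepo) Proxy.newProxyInstance(
                AccountRepo.class.getClassLoader(),
                new Class<?>[] {AccountRepo.class},
                handler);
    }

    public static void main(String[] args) {
        AccountRepo repo = create();
        repo.save(new Account(1L, "a@example.com"));
        System.out.println(repo.findByEmail("a@example.com").isPresent());
    }
}
```

The interface never gets a hand-written implementation; every call lands in the single handler, which is roughly the role the shared repository infrastructure plays.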
Prefer Optional<T> return types for single-result queries—Spring Data translates empty results to Optional.empty() instead of returning null.
Entity mapping
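JPA maps Java classes to relational tables. Hibernate is the default JPA provider in Spring Boot. Entities must have an identifier and a no-arg constructor; the JPA specification requires that constructor to be public or protected (Hibernate tolerates narrower visibility in many cases, but do not rely on it).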
@Entity
@Table(name = "orders", indexes = @Index(name = "idx_orders_customer", columnList = "customer_id"))
public class Order {
@Id
@GeneratedValue(strategy = GenerationType.IDENTITY)
private Long id;
@Column(name = "customer_id", nullable = false, length = 64)
private String customerId;
@Enumerated(EnumType.STRING)
@Column(nullable = false, length = 32)
private OrderStatus status;
@Column(name = "created_at", nullable = false)
private Instant createdAt;
protected Order() {} // JPA requirement
public Order(String customerId, OrderStatus status) {
this.customerId = customerId;
this.status = status;
this.createdAt = Instant.now();
}
}
@GeneratedValue strategies
Choosing the wrong strategy causes performance issues or ID collisions across databases.
| Strategy | How it works | When to use |
|---|---|---|
| IDENTITY | DB auto-increment (INSERT returns ID) | PostgreSQL, MySQL, SQL Server; simplest, but Hibernate must run the INSERT at persist() time to learn the ID |
| SEQUENCE | Separate sequence object; Hibernate can batch allocations | Oracle, PostgreSQL — better for bulk inserts; use @SequenceGenerator |
| TABLE | Emulates sequence via lock table | Legacy portability — slow; avoid in new code |
| AUTO | Provider picks based on dialect | Dev convenience — be explicit in production |
@Id
@GeneratedValue(strategy = GenerationType.SEQUENCE, generator = "order_seq")
@SequenceGenerator(name = "order_seq", sequenceName = "order_id_seq", allocationSize = 50)
private Long id;
IDENTITY prevents Hibernate JDBC batching on inserts, because each row's ID is only available once its own INSERT has run. For high-volume ingest, use SEQUENCE with allocationSize matching the database sequence's INCREMENT BY (a mismatch risks duplicate or widely gapped IDs), or assign UUIDs in application code.
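The benefit of a large allocationSize is easiest to see by counting round trips. The sketch below is a simplified model, not Hibernate's optimizer: `FakeSequence` and `BlockIdAllocator` are hypothetical names, and the sequence is assumed to be created with an increment equal to the allocation size.

```java
// Stand-in for a database sequence whose INCREMENT BY equals allocationSize (assumed).
final class FakeSequence {
    private final int incrementBy;
    private long current;
    int roundTrips = 0; // how many times the "database" was asked for a value

    FakeSequence(long start, int incrementBy) {
        this.incrementBy = incrementBy;
        this.current = start - incrementBy;
    }

    long nextVal() {
        roundTrips++;
        current += incrementBy;
        return current;
    }
}

// Hands out IDs from an in-memory block and only calls the sequence when the block is used up.
final class BlockIdAllocator {
    private final FakeSequence sequence;
    private final int allocationSize;
    private long next = 0;
    private long blockEnd = 0; // exclusive upper bound of the current block

    BlockIdAllocator(FakeSequence sequence, int allocationSize) {
        this.sequence = sequence;
        this.allocationSize = allocationSize;
    }

    long nextId() {
        if (next >= blockEnd) {            // block exhausted: one round trip
            next = sequence.nextVal();
            blockEnd = next + allocationSize;
        }
        return next++;
    }

    public static void main(String[] args) {
        FakeSequence seq = new FakeSequence(1, 50);
        BlockIdAllocator ids = new BlockIdAllocator(seq, 50);
        for (int i = 0; i < 120; i++) ids.nextId();
        System.out.println("round trips for 120 ids: " + seq.roundTrips);
    }
}
```

Because IDs are known before any INSERT runs, the inserts themselves remain free to be batched, which is the property IDENTITY lacks.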
Field mapping
Control column names, nullability, length, and how Java types persist.
| Annotation | Purpose |
|---|---|
| @Column | Name, nullable, length, unique, columnDefinition |
| @Transient | Not persisted — computed fields, caches on entity (use sparingly) |
| @Enumerated(STRING) | Store enum name — readable, survives enum reorder (preferred) |
| @Enumerated(ORDINAL) | Store 0,1,2 — fragile if enum order changes |
| @Lob | Large object — CLOB/BLOB; consider external object storage for big files |
| @Convert | Custom AttributeConverter — e.g. JSON column, encrypted strings |
@Converter
class JsonMapConverter implements AttributeConverter<Map<String, String>, String> {
private static final ObjectMapper MAPPER = new ObjectMapper();
@Override
public String convertToDatabaseColumn(Map<String, String> attribute) {
if (attribute == null) return null; // keep SQL NULL instead of storing the JSON text "null"
try { return MAPPER.writeValueAsString(attribute); }
catch (JsonProcessingException e) { throw new IllegalArgumentException(e); }
}
@Override
public Map<String, String> convertToEntityAttribute(String dbData) {
if (dbData == null) return null; // readValue rejects null input
try { return MAPPER.readValue(dbData, new TypeReference<>() {}); }
catch (JsonProcessingException e) { throw new IllegalArgumentException(e); }
}
}
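The table's preference for @Enumerated(STRING) over ORDINAL can be illustrated without JPA at all. This is a sketch under stated assumptions: `StatusV1` and `StatusV2` are hypothetical enums representing the same type before and after a constant is inserted, and the store/read helpers only mimic what each mode would write to and read from the column.

```java
// The enum as it looked when existing rows were written (hypothetical).
enum StatusV1 { NEW, PAID, SHIPPED }

// The same enum after a later release inserted a constant in the middle (hypothetical).
enum StatusV2 { NEW, PENDING_PAYMENT, PAID, SHIPPED }

final class EnumStorageDemo {
    // What ORDINAL mode would put in the column.
    static int storeOrdinal(StatusV1 status) { return status.ordinal(); }

    // What STRING mode would put in the column.
    static String storeName(StatusV1 status) { return status.name(); }

    // Reading old rows with the new enum definition.
    static StatusV2 readOrdinal(int stored) { return StatusV2.values()[stored]; }
    static StatusV2 readName(String stored) { return StatusV2.valueOf(stored); }

    public static void main(String[] args) {
        System.out.println(readOrdinal(storeOrdinal(StatusV1.PAID))); // silently a different status
        System.out.println(readName(storeName(StatusV1.PAID)));      // still PAID
    }
}
```

The ordinal path does not fail; it quietly returns the wrong constant, which is why the problem tends to surface late.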
Relationship mapping
Object graphs map to foreign keys and join tables. Every association has an owning side (with the FK) and optionally an inverse side (mappedBy).
| Annotation | Default fetch | Typical mapping |
|---|---|---|
| @ManyToOne | EAGER | Child → parent FK column |
| @OneToMany | LAZY | Parent → collection; inverse of ManyToOne |
| @OneToOne | EAGER | Profile ↔ User; either side can own FK |
| @ManyToMany | LAZY | Join table; prefer explicit link entity in production |
@Entity
public class Order {
@Id @GeneratedValue(strategy = GenerationType.IDENTITY)
private Long id;
@OneToMany(mappedBy = "order", cascade = CascadeType.PERSIST, orphanRemoval = true)
private List<OrderLine> lines = new ArrayList<>();
public void addLine(OrderLine line) {
lines.add(line);
line.setOrder(this);
}
}
@Entity
public class OrderLine {
@Id @GeneratedValue(strategy = GenerationType.IDENTITY)
private Long id;
@ManyToOne(fetch = FetchType.LAZY, optional = false)
@JoinColumn(name = "order_id", nullable = false)
private Order order;
}
CascadeType — use with caution
| Cascade | Propagates |
|---|---|
| PERSIST | persist() to associated entities |
| MERGE | merge() on detached graphs |
| REMOVE | remove() cascades deletes |
| ALL | All of the above + refresh, detach |
CascadeType.ALL includes REMOVE, so it can delete far more than intended: on @ManyToOne, removing one child also removes the parent it shares with its siblings, and on a large @OneToMany, removing the parent loads and deletes every child. Reserve cascading removal and orphanRemoval = true for true parent-child composition (Order → OrderLine), never for shared reference entities.
FetchType — defaults and overrides
Always set FetchType.LAZY on @ManyToOne and @OneToOne in production: the JPA default for both is EAGER, which causes accidental joins or extra selects on every load.
@Entity
public class Student {
@ManyToMany
@JoinTable(
name = "student_course",
joinColumns = @JoinColumn(name = "student_id"),
inverseJoinColumns = @JoinColumn(name = "course_id")
)
private Set<Course> courses = new HashSet<>();
}
Replace @ManyToMany with an explicit Enrollment entity when you need extra columns (enrolledAt, grade, status). Join tables without entities can't carry metadata and complicate queries.
Embeddables
Value objects embedded in the same table—address, money, date ranges—without a separate entity lifecycle.
@Embeddable
public record Address(
@Column(name = "street") String street,
@Column(name = "city") String city,
@Column(name = "postal_code") String postalCode
) {}
@Entity
public class Customer {
@Id @GeneratedValue(strategy = GenerationType.IDENTITY)
private Long id;
@Embedded
@AttributeOverrides({
@AttributeOverride(name = "street", column = @Column(name = "billing_street")),
@AttributeOverride(name = "city", column = @Column(name = "billing_city"))
})
private Address billingAddress;
}
Inheritance strategies
Map class hierarchies to relational schema. Each strategy trades storage normalization against query performance.
| Strategy | Schema | Trade-offs |
|---|---|---|
| SINGLE_TABLE | One table, discriminator column | Fast reads; sparse nullable columns; default strategy |
| JOINED | Base table + subclass tables | Normalized; joins on every polymorphic query |
| TABLE_PER_CLASS | Table per concrete class | Polymorphic queries use UNION — poor performance; avoid |
@Entity
@Inheritance(strategy = InheritanceType.SINGLE_TABLE)
@DiscriminatorColumn(name = "payment_type")
public abstract class Payment { @Id @GeneratedValue Long id; }
@Entity
@DiscriminatorValue("CARD")
public class CardPayment extends Payment { private String lastFour; }
@Entity
@DiscriminatorValue("BANK")
public class BankPayment extends Payment { private String iban; }
N+1 query problem
Load N parent rows with one query, then touch a lazy association on each of them: Hibernate fires one additional query per parent, N in total. It is the most common JPA performance bug in production.
sequenceDiagram
participant App as Service
participant EM as EntityManager
participant DB as Database
App->>EM: findAll Orders
EM->>DB: SELECT star FROM orders
DB-->>EM: 100 rows
loop For each order access lines
App->>EM: get lines lazy
EM->>DB: SELECT star FROM order_line WHERE order_id equals id
end
Note over DB: 1 plus 100 equals 101 queries
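The arithmetic in the diagram can be reproduced with a statement counter. This is a minimal simulation, not Hibernate: `CountingDb` is a hypothetical stand-in that counts every "SQL statement" it is asked to run.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical fake database that counts every statement issued against it.
final class CountingDb {
    private final int orderCount;
    int statements = 0;

    CountingDb(int orderCount) { this.orderCount = orderCount; }

    List<Long> selectOrderIds() {                    // SELECT ... FROM orders
        statements++;
        List<Long> ids = new ArrayList<>();
        for (long id = 1; id <= orderCount; id++) ids.add(id);
        return ids;
    }

    List<String> selectLinesByOrderId(long orderId) { // SELECT ... WHERE order_id = ?
        statements++;
        return List.of("line-of-order-" + orderId);
    }

    List<String> selectOrdersJoinLines() {           // one joined SELECT
        statements++;
        List<String> rows = new ArrayList<>();
        for (long id = 1; id <= orderCount; id++) rows.add("order-" + id + "/line-of-order-" + id);
        return rows;
    }
}

final class NPlusOneDemo {
    // One query for the parents, then one per parent when its lines are touched.
    static int lazyTraversal(int orders) {
        CountingDb db = new CountingDb(orders);
        for (long id : db.selectOrderIds()) db.selectLinesByOrderId(id);
        return db.statements;
    }

    // Parents and children fetched together in a single statement.
    static int joinedFetch(int orders) {
        CountingDb db = new CountingDb(orders);
        db.selectOrdersJoinLines();
        return db.statements;
    }

    public static void main(String[] args) {
        System.out.println("lazy traversal: " + lazyTraversal(100));
        System.out.println("joined fetch:   " + joinedFetch(100));
    }
}
```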
Detection
- Enable spring.jpa.show-sql=true (dev only) or logging: logging.level.org.hibernate.SQL=DEBUG
- Hibernate statistics: spring.jpa.properties.hibernate.generate_statistics=true
- p6spy or datasource proxy — count statements per request
- APM tools (Datadog, New Relic) — spike in query count per endpoint
Fix 1: JOIN FETCH in JPQL
@Query("SELECT DISTINCT o FROM Order o JOIN FETCH o.lines WHERE o.status = :status")
List<Order> findWithLinesByStatus(@Param("status") OrderStatus status);
Fix 2: @EntityGraph
@Entity
@NamedEntityGraph(name = "Order.withLines", attributeNodes = @NamedAttributeNode("lines"))
public class Order { /* ... */ }
@EntityGraph("Order.withLines")
List<Order> findByStatus(OrderStatus status);
Fix 3: @BatchSize
@Entity
public class Order {
@OneToMany(mappedBy = "order")
@BatchSize(size = 25)
private List<OrderLine> lines;
}
// Hibernate: SELECT ... WHERE order_id IN (?,?,... 25 ids) — reduces N to N/25
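As a rough model of what batching buys, the sketch below partitions parent IDs into IN-list groups and counts the resulting statements. It assumes every batch is filled to the configured size; the exact batch sizes Hibernate issues can vary by version and settings. `BatchFetchSketch` is a hypothetical name.

```java
import java.util.ArrayList;
import java.util.List;

final class BatchFetchSketch {
    // Split the parent IDs into groups of at most batchSize, one IN (...) query per group.
    static <T> List<List<T>> partition(List<T> ids, int batchSize) {
        List<List<T>> batches = new ArrayList<>();
        for (int from = 0; from < ids.size(); from += batchSize) {
            int to = Math.min(from + batchSize, ids.size());
            batches.add(new ArrayList<>(ids.subList(from, to)));
        }
        return batches;
    }

    // One query for the parents plus ceil(parentCount / batchSize) queries for the children.
    static int totalStatements(int parentCount, int batchSize) {
        int batches = (parentCount + batchSize - 1) / batchSize;
        return 1 + batches;
    }

    public static void main(String[] args) {
        System.out.println("100 parents, batch 25: " + totalStatements(100, 25) + " statements");
    }
}
```

Batching does not remove the extra queries, it only makes their number grow with N divided by the batch size rather than with N.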
Fix 4: DTO projections
Don't load entities at all—query only needed columns into a DTO or interface projection.
public interface OrderSummary {
Long getId();
String getCustomerId();
long getLineCount(); // JPQL COUNT yields a Long
}
@Query("""
SELECT o.id AS id, o.customerId AS customerId, COUNT(l) AS lineCount
FROM Order o LEFT JOIN o.lines l
GROUP BY o.id, o.customerId
""")
List<OrderSummary> findSummaries();
Explain N+1 with concrete numbers: 1 query for list + N for each lazy collection access. Best fix depends on use case: JOIN FETCH for always-needed associations, EntityGraph for optional graphs, DTO for read-only API responses.
Query methods
Spring Data parses method names into queries, or you supply JPQL/SQL explicitly. Know when derived queries stop scaling.
Derived query method naming
| Prefix | Example | Generated intent |
|---|---|---|
| find…By / get…By | findByEmail | SELECT … WHERE email = ? |
| count…By | countByStatus | COUNT … WHERE status = ? |
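| exists…By | existsBySku | Existence check; issued as a query limited to one row, so it stops at the first match |
| delete…By | deleteByCreatedAtBefore | Loads the matching entities, then removes them one by one so cascades and callbacks fire; must run inside a transaction (e.g. @Transactional on the service) |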
Keywords: And, Or, Between, LessThan, GreaterThan, Like, In, OrderBy, IgnoreCase, Containing.
Page<Order> findByCustomerIdAndStatusOrderByCreatedAtDesc(
String customerId, OrderStatus status, Pageable pageable);
List<Order> findTop10ByStatusOrderByCreatedAtDesc(OrderStatus status);
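To show how a method name can carry a whole query, here is a deliberately simplified parser. It is not Spring Data's parser: it handles only And-joined equality predicates plus an optional OrderBy clause, and `DerivedNameSketch` and the alias `e` are hypothetical.

```java
final class DerivedNameSketch {
    // Turn e.g. "findByCustomerIdAndStatusOrderByCreatedAtDesc" into a JPQL-like string.
    static String toJpql(String entity, String methodName) {
        String rest = methodName.substring(methodName.indexOf("By") + 2);
        String[] parts = rest.split("OrderBy", 2);          // predicates | ordering
        // Split on "And" only when an upper-case letter follows, so that a
        // property such as "Brand" is not cut in half.
        String[] props = parts[0].split("And(?=[A-Z])");

        StringBuilder jpql = new StringBuilder("SELECT e FROM " + entity + " e WHERE ");
        for (int i = 0; i < props.length; i++) {
            if (i > 0) jpql.append(" AND ");
            jpql.append("e.").append(decapitalize(props[i])).append(" = ?").append(i + 1);
        }
        if (parts.length == 2) {
            String direction = parts[1].endsWith("Desc") ? "DESC" : "ASC";
            String property = parts[1].replaceAll("(Asc|Desc)$", "");
            jpql.append(" ORDER BY e.").append(decapitalize(property)).append(' ').append(direction);
        }
        return jpql.toString();
    }

    private static String decapitalize(String s) {
        return Character.toLowerCase(s.charAt(0)) + s.substring(1);
    }

    public static void main(String[] args) {
        System.out.println(toJpql("Order", "findByCustomerIdAndStatusOrderByCreatedAtDesc"));
    }
}
```

Even this toy version hints at why long derived names stop scaling: every extra condition lengthens the name, and anything beyond simple predicates reads better as an explicit @Query.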
@Query — JPQL and native SQL
@Query("SELECT o FROM Order o WHERE o.createdAt >= :since AND o.status IN :statuses")
List<Order> findRecent(@Param("since") Instant since, @Param("statuses") Collection<OrderStatus> statuses);
@Query(value = """
SELECT o.* FROM orders o
WHERE o.customer_id = :customerId
ORDER BY o.created_at DESC
LIMIT :limit
""", nativeQuery = true)
List<Order> findRecentNative(@Param("customerId") String customerId, @Param("limit") int limit);
@Modifying — UPDATE/DELETE
@Modifying(clearAutomatically = true, flushAutomatically = true)
@Query("UPDATE Order o SET o.status = :newStatus WHERE o.id = :id")
int updateStatus(@Param("id") Long id, @Param("newStatus") OrderStatus newStatus);
@Modifying queries bypass the persistence context—managed entities in memory become stale. Use clearAutomatically = true or evict affected entities. Must run inside a transaction.
Projections
| Type | Mechanism |
|---|---|
| Interface closed projection | Getter names match entity properties — Spring Data generates proxy |
| Class-based DTO | Constructor expression in JPQL: SELECT new com.acme.OrderDto(o.id, o.status) |
| Dynamic projection | Caller passes the projection type as a Class<T> argument, e.g. <T> List<T> findByStatus(OrderStatus status, Class<T> type) |
Transactions
Spring's declarative transactions wrap service methods in AOP proxies. JPA requires a transaction for writes and for keeping the persistence context open during the unit of work.
@Service
public class OrderService {
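private final OrderRepository orderRepository;
private final InventoryClient inventoryClient;
public OrderService(OrderRepository orderRepository, InventoryClient inventoryClient) {
this.orderRepository = orderRepository;
this.inventoryClient = inventoryClient;
}
@Transactional
public Order placeOrder(PlaceOrderCommand cmd) {
Order order = orderRepository.save(new Order(cmd.customerId()));
// Joins this TX only if reserve() does local work through the same transaction manager;
// a remote call is not undone when this TX rolls back.
inventoryClient.reserve(cmd.sku(), cmd.qty());
return order;
}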
@Transactional(readOnly = true)
public OrderDto getOrder(long id) {
return orderRepository.findById(id)
.map(OrderDto::from)
.orElseThrow(() -> new OrderNotFoundException(id));
}
}
Propagation levels — concrete scenarios
| Propagation | Behavior | Scenario |
|---|---|---|
| REQUIRED (default) | Join existing TX or create new | Normal service method — 95% of usage |
| REQUIRES_NEW | Suspend current TX; always new TX | Audit log that must commit even if outer TX rolls back |
| NESTED | Savepoint within existing TX | Partial rollback of sub-operation (JDBC savepoints; rare with JPA) |
| SUPPORTS | Join if exists; non-transactional otherwise | Read helpers called from both TX and non-TX code |
| NOT_SUPPORTED | Suspend TX; run without | Long-running report that shouldn't hold DB connection |
| MANDATORY | Must have existing TX; else exception | Internal DAO called only from transactional services |
| NEVER | Must not have TX; else exception | Enforce non-transactional side effects |
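The difference between REQUIRED and REQUIRES_NEW in the table can be modelled with a stack of write buffers. This is a toy model, not Spring's transaction manager: `MiniTxManager` is hypothetical, "commit" means copying a buffer into a list, and "rollback" means discarding it.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

final class MiniTxManager {
    final List<String> committed = new ArrayList<>();               // the durable "database"
    private final Deque<List<String>> active = new ArrayDeque<>();  // open transactions, innermost first

    // REQUIRED: join the current transaction if there is one, otherwise start one.
    void required(Runnable work) {
        if (!active.isEmpty()) {
            work.run();
            return;
        }
        runInNewTransaction(work);
    }

    // REQUIRES_NEW: always start a fresh transaction; the outer one waits, suspended, on the stack.
    void requiresNew(Runnable work) {
        runInNewTransaction(work);
    }

    // Writes go to whichever transaction is currently innermost.
    void write(String row) {
        active.peek().add(row);
    }

    private void runInNewTransaction(Runnable work) {
        List<String> buffer = new ArrayList<>();
        active.push(buffer);
        try {
            work.run();
            committed.addAll(buffer);   // commit
        } finally {
            active.pop();               // after an exception the buffer is simply dropped: rollback
        }
    }

    public static void main(String[] args) {
        MiniTxManager tx = new MiniTxManager();
        try {
            tx.required(() -> {
                tx.write("order");
                tx.requiresNew(() -> tx.write("audit"));
                throw new IllegalStateException("outer transaction fails");
            });
        } catch (IllegalStateException expected) {
            // the outer transaction rolled back
        }
        System.out.println(tx.committed);
    }
}
```

The audit row survives because it was committed by its own transaction before the outer one failed, which is the scenario the REQUIRES_NEW row describes.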
Isolation levels
| Level | Prevents | Cost |
|---|---|---|
| READ_UNCOMMITTED | Nothing (dirty reads allowed) | Lowest isolation; rarely used (PostgreSQL treats it as READ_COMMITTED) |
| READ_COMMITTED | Dirty reads | PostgreSQL/Oracle default; good for most apps |
| REPEATABLE_READ | Dirty and non-repeatable reads | MySQL InnoDB default; the SQL standard still permits phantom reads at this level |
| SERIALIZABLE | Dirty reads, non-repeatable reads, and phantoms | Highest consistency; contention, deadlocks, or serialization failures |
Self-invocation trap
@Service
public class BrokenOrderService {
public void process(long id) {
doTransactionalWork(id); // NO proxy — @Transactional ignored!
}
@Transactional
void doTransactionalWork(long id) { /* ... */ }
}
// Fix: inject self (careful with cycles), move to another bean, or use AspectJ weaving
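The trap can be reproduced with a plain JDK proxy, with no Spring involved. In this sketch the invocation handler stands in for the transaction interceptor, and `Worker`, `PlainWorker`, and `InterceptDemo` are hypothetical names.

```java
import java.lang.reflect.Proxy;
import java.util.ArrayList;
import java.util.List;

interface Worker {
    void process();
    void doTransactionalWork();
}

final class PlainWorker implements Worker {
    @Override
    public void process() {
        doTransactionalWork(); // a call on `this`, so it never passes through the proxy
    }

    @Override
    public void doTransactionalWork() { /* ... */ }
}

final class InterceptDemo {
    // Returns the names of the methods the proxy actually intercepted.
    static List<String> run() {
        List<String> intercepted = new ArrayList<>();
        Worker target = new PlainWorker();
        Worker proxy = (Worker) Proxy.newProxyInstance(
                Worker.class.getClassLoader(),
                new Class<?>[] {Worker.class},
                (p, method, args) -> {
                    intercepted.add(method.getName()); // where a transaction would begin
                    return method.invoke(target, args);
                });
        proxy.process();
        return intercepted;
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

Only the outer call is recorded; the inner call happens inside the target object, out of the proxy's sight, which is why moving the method to another bean fixes it.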
Rollback behavior
Default: rollback on unchecked exceptions (RuntimeException, Error). Checked exceptions do not trigger rollback unless configured: @Transactional(rollbackFor = IOException.class).
@Transactional is AOP around advice via TransactionInterceptor. readOnly=true hints Hibernate: FlushMode.MANUAL, no dirty checking flush—optimization for read paths. Put transactions on @Service, not @Repository (Spring Data repos are transactional for single operations already).
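@Transactional on private methods is ignored (the proxy can't intercept them). Catching an exception inside the method without rethrowing it prevents rollback, because the interceptor never sees it; rollbackFor does not help there. Log and rethrow, or mark the transaction explicitly with TransactionAspectSupport.currentTransactionStatus().setRollbackOnly().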
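The default rollback rule and the effect of rollbackFor can be written down as a small predicate. This is a simplified sketch: Spring's real rule matching also handles noRollbackFor and picks the closest match in the exception hierarchy. `RollbackRules` is a hypothetical name.

```java
import java.io.IOException;
import java.util.List;

final class RollbackRules {
    // True when the exception would cause a rollback under the rules described above.
    static boolean rollbackOn(Throwable ex, List<Class<? extends Throwable>> rollbackFor) {
        for (Class<? extends Throwable> type : rollbackFor) {
            if (type.isInstance(ex)) return true;   // explicitly configured
        }
        // Default: unchecked exceptions and Errors roll back, checked exceptions do not.
        return ex instanceof RuntimeException || ex instanceof Error;
    }

    public static void main(String[] args) {
        System.out.println(rollbackOn(new IOException("disk"), List.of()));
        System.out.println(rollbackOn(new IOException("disk"), List.of(IOException.class)));
        System.out.println(rollbackOn(new IllegalStateException("bug"), List.of()));
    }
}
```

Note that the predicate is only ever consulted for exceptions that leave the method; one that is caught and swallowed inside never reaches it.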
Hibernate & persistence context
The persistence context (JPA's first-level cache) is a session-scoped map of managed entities. Understanding entity states explains dirty checking, lazy loading, and LazyInitializationException.
Entity states
| State | Meaning | How you get there |
|---|---|---|
| Transient | Not associated with persistence context | new Order() |
| Managed | Tracked; changes flushed at commit | persist(), find(), within @Transactional |
| Detached | Was managed; context closed | TX ended, clear(), serialized to JSON and back |
| Removed | Scheduled for DELETE on flush | remove() on managed entity |
stateDiagram-v2
[*] --> Transient: new entity
Transient --> Managed: persist or merge
Managed --> Detached: transaction ends
Detached --> Managed: merge
Managed --> Removed: remove
Removed --> Detached: flush delete
Detached --> [*]
First-level cache
Within a transaction, repeated findById(1L) returns the same instance—no second SELECT. Identity map guarantees referential consistency inside the unit of work.
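An identity map is small enough to sketch directly. The class below is a toy, not Hibernate's persistence context: `MiniPersistenceContext` is hypothetical, the loader function stands in for a SELECT, and there is no dirty checking or flushing.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.LongFunction;

final class MiniPersistenceContext<T> {
    private final Map<Long, T> managed = new HashMap<>(); // id -> the one managed instance
    private final LongFunction<T> loader;                 // stands in for "SELECT ... WHERE id = ?"
    int selects = 0;

    MiniPersistenceContext(LongFunction<T> loader) {
        this.loader = loader;
    }

    T find(long id) {
        T existing = managed.get(id);
        if (existing != null) return existing;  // cache hit: same instance, no query
        selects++;
        T loaded = loader.apply(id);
        managed.put(id, loaded);
        return loaded;
    }

    // Like EntityManager.clear(): previously returned instances are no longer tracked.
    void clear() {
        managed.clear();
    }

    public static void main(String[] args) {
        MiniPersistenceContext<StringBuilder> ctx =
                new MiniPersistenceContext<>(id -> new StringBuilder("order-" + id));
        System.out.println(ctx.find(1L) == ctx.find(1L));
        System.out.println("selects: " + ctx.selects);
    }
}
```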
Second-level cache
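SessionFactory-scoped cache shared across transactions. Nothing is cached until a region factory is configured (hibernate.cache.region.factory_class) and the entity opts in with @Cacheable. Providers: JCache implementations such as Ehcache or Caffeine's JCache adapter for in-process caching, Infinispan for clustered caches.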
@Entity
@Cacheable
@org.hibernate.annotations.Cache(usage = CacheConcurrencyStrategy.READ_WRITE)
public class Product {
@Id private Long id;
private String sku;
private String name;
}
LazyInitializationException
Accessing a lazy association after the persistence context closed throws LazyInitializationException — classic stack trace mentions "no Session" or "could not initialize proxy."
@Transactional(readOnly = true)
public Order getOrder(long id) {
return orderRepository.findById(id).orElseThrow(); // TX ends here
}
// Controller — outside TX
order.getLines().size(); // LazyInitializationException
Proper fixes:
- Fetch needed associations inside TX (JOIN FETCH, EntityGraph)
- Return DTOs from service—not entities with lazy graphs
- @Transactional on the method that traverses the graph (if truly needed)
Open Session In View (OSIV) — spring.jpa.open-in-view=true (Boot default) keeps session open through view rendering. Masks LazyInitializationException but causes lazy loads during JSON serialization—hidden N+1 in controllers. Disable in prod APIs: spring.jpa.open-in-view=false and fetch explicitly in services.
Since Spring Boot 2.0, startup logs a warning when spring.jpa.open-in-view is not set explicitly. Boot 3 still defaults to true; set it to false for REST microservices.
Session management in Spring
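JpaTransactionManager binds an EntityManager to the current thread for the duration of each transaction, and Spring Data repositories participate automatically. When a singleton bean needs an EntityManager directly, inject it with @PersistenceContext so it receives the shared, transaction-scoped proxy rather than holding a single long-lived instance.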
Auditing
Automatic population of created/modified timestamps and user IDs—standard in enterprise schemas without manual setter calls in every service method.
@Configuration
@EnableJpaAuditing(auditorAwareRef = "auditorProvider")
class JpaAuditingConfig {
@Bean
AuditorAware<String> auditorProvider() {
return () -> Optional.ofNullable(SecurityContextHolder.getContext())
.map(SecurityContext::getAuthentication)
.filter(Authentication::isAuthenticated)
.map(Authentication::getName);
}
}
@MappedSuperclass
@EntityListeners(AuditingEntityListener.class)
public abstract class AuditableEntity {
@CreatedDate
@Column(nullable = false, updatable = false)
private Instant createdAt;
@LastModifiedDate
@Column(nullable = false)
private Instant updatedAt;
@CreatedBy
@Column(updatable = false, length = 64)
private String createdBy;
@LastModifiedBy
@Column(length = 64)
private String updatedBy;
}
@Entity
public class Order extends AuditableEntity {
@Id @GeneratedValue(strategy = GenerationType.IDENTITY)
private Long id;
}
Use Instant (UTC) for audit timestamps—not LocalDateTime without zone. For system jobs without security context, AuditorAware should return Optional.of("system"), not empty (which skips @CreatedBy).
Combine JPA auditing with DB-level triggers for compliance-heavy domains (immutable audit trail). Application auditing is convenient; database triggers survive direct SQL and admin tools.