
Reverse Engineering JPA Entities

Complete Code
The end result of the code developed in this document can be found in the GitHub monorepo springboot-demo-projects, under the tag persistence-integration.

The Problem

Writing JPA entity classes by hand from an existing database schema is tedious. If your schema has 20 tables you get 20 entity files to create, annotate, and keep in sync whenever the schema changes. Miss a column, get a nullable wrong, or forget a relationship and you're debugging at runtime.

The alternative: generate the entities at build time directly from the SQL schema. No running database required. No manual entity writing. The build reads the schema, spins up an in-memory H2 instance, introspects it, and writes the entity source files before your code even compiles.

How It Works

  1. A dedicated hibernateTools Gradle dependency configuration pulls in the code generation tooling, completely isolated from your compile and runtime classpath.
  2. The generateEntities Gradle task loads your SQL schema into an in-memory H2 database, connects via JDBC, and uses Hibernate Tools to introspect the schema and write entity source files into build/generated/sources/hibernate/.
  3. That generated directory is registered as an additional source root, so entities compile transparently alongside your hand-written code.
  4. The task is wired to run before compileJava / compileKotlin / compileGroovy, so generated types are always available at compilation time.

New Files

Files to Create/Modify
.
├── build.gradle                  # or build.gradle.kts in Kotlin
├── ...                           # other root files omitted
└── src
    ├── main
    │   ├── ...                   # source code omitted
    │   └── resources
    │       ├── application-dev.yaml
    │       ├── application.yaml
    │       ├── hibernate.reveng.xml
    │       ├── hibernate-tools.properties
    │       ├── logback-spring.xml
    │       ├── openapi.yaml
    │       ├── sakila-data.sql
    │       ├── sakila-schema.sql
    │       └── templates/hibernate/pojo/Pojo.ftl
    └── test/...                  # test sources omitted

The additions fall into two groups: resource files that drive the generation (hibernate.reveng.xml, hibernate-tools.properties, Pojo.ftl, the SQL schema and data files) and the build.gradle changes that wire everything together.

Step 1 — Add Resource Files

sakila-schema.sql

This is the source of truth. The ongoing example in this guide uses the Sakila sample database, but the same setup applies to whatever schema you're working with. Just swap sakila-schema.sql for your own DDL file and update the task accordingly.

Hibernate Tools never connects to your real database. It only reads this SQL file, loads it into an in-memory H2 instance, and introspects that. This file must accurately reflect the production schema. If production has a column your SQL file doesn't, the generated entity won't have it. If the types don't match, the generated mappings will be wrong. Treat this file with the same care you'd give the database itself.
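To make that concrete, here is an abridged sketch of the kind of DDL the introspector reads, one Sakila-style table with simplified types. The file in the repo is the source of truth; this fragment is purely illustrative:

```sql
-- Abridged, illustrative sketch; not the full Sakila DDL.
CREATE TABLE actor (
    actor_id    SMALLINT AUTO_INCREMENT PRIMARY KEY,
    first_name  VARCHAR(45) NOT NULL,
    last_name   VARCHAR(45) NOT NULL,
    last_update TIMESTAMP   NOT NULL
);
```

Every column here becomes a field on the generated Actor entity, with nullability and lengths carried over from the DDL.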

hibernate.reveng.xml

resources/hibernate.reveng.xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE hibernate-reverse-engineering PUBLIC
        "-//Hibernate/Hibernate Reverse Engineering DTD 3.0//EN"
        "http://hibernate.org/dtd/hibernate-reverse-engineering-3.0.dtd">

<hibernate-reverse-engineering>
    <type-mapping>
        <sql-type jdbc-type="TINYINT" hibernate-type="java.lang.Integer"/>
        <sql-type jdbc-type="SMALLINT" hibernate-type="java.lang.Integer"/>
        <sql-type jdbc-type="BIT" hibernate-type="java.lang.Boolean"/>
        <sql-type jdbc-type="TIMESTAMP" hibernate-type="java.time.LocalDateTime"/>
        <sql-type jdbc-type="DATE" hibernate-type="java.time.LocalDate"/>
    </type-mapping>

    <table-filter match-schema="PUBLIC" match-name=".*"/>
</hibernate-reverse-engineering>

The <type-mapping> block overrides the default JDBC-to-Java type mappings. Without it, Hibernate Tools would map TIMESTAMP to a raw java.sql.Timestamp and TINYINT to a primitive byte. The overrides give you clean java.time.* equivalents and proper boxed types instead.

The <table-filter> line tells Hibernate Tools to reverse-engineer every table in the PUBLIC schema, which is all of Sakila.
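If you only want a subset of tables, the filter can also be narrowed. A sketch, assuming you want to skip migration bookkeeping tables (the table name pattern here is hypothetical; check the hibernate-reverse-engineering DTD for the exact attribute set):

```xml
<!-- Exclusions sit alongside the catch-all inclusion -->
<table-filter match-schema="PUBLIC" match-name="FLYWAY_.*" exclude="true"/>
<table-filter match-schema="PUBLIC" match-name=".*"/>
```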

hibernate-tools.properties

resources/hibernate-tools.properties
hibernate.connection.driver_class=org.h2.Driver
hibernate.connection.username=sa
hibernate.connection.password=
hibernate.dialect=org.hibernate.dialect.H2Dialect
hibernate.connection.provider_class=org.hibernate.engine.jdbc.connections.internal.DriverManagerConnectionProviderImpl

This file supplies the static JDBC connection details. Notice there's no hibernate.connection.url here. That property is injected dynamically by the Gradle task at generation time, pointing at the SQL schema file via an H2 INIT=RUNSCRIPT connection string. Keeping it out of this file avoids hardcoding an absolute path.
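The injected URL is plain string assembly. A minimal Java mirror of what the Gradle task builds (the class and method names here are illustrative, not part of the build):

```java
// Illustrative sketch of the JDBC URL the generateEntities task injects.
// H2's INIT=RUNSCRIPT runs the DDL file when the connection opens, so the
// database is populated before Hibernate Tools introspects it.
public class H2UrlSketch {
    static String initUrl(String dbPath, String sqlPath) {
        return "jdbc:h2:file:" + dbPath + ";INIT=RUNSCRIPT FROM '" + sqlPath + "'";
    }

    public static void main(String[] args) {
        System.out.println(initUrl("build/tmp/sakila-h2",
                "src/main/resources/sakila-schema.sql"));
    }
}
```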

Pojo.ftl

resources/templates/hibernate/pojo/Pojo.ftl
<#-- Hibernate Tools 6.x Compatible Template -->
<#-- Available objects: pojo, clazz -->
${pojo.getPackageDeclaration()}

import jakarta.persistence.*;
import lombok.AllArgsConstructor;
import lombok.Builder;
import lombok.EqualsAndHashCode;
import lombok.Getter;
import lombok.NoArgsConstructor;
import lombok.Setter;

import java.io.Serial;
import java.io.Serializable;
import java.math.BigDecimal;
import java.time.LocalDate;
import java.time.LocalDateTime;
import java.util.HashSet;
import java.util.Set;

<#-- Determine if this is a composite key class (Embeddable) -->
<#assign className = pojo.getDeclarationName()>
<#assign isCompositeKey = !pojo.hasIdentifierProperty() && className?ends_with("Id")>

<#-- For entities: check if they use a composite key -->
<#assign usesCompositeKey = false>
<#assign compositeKeyTypeName = "">
<#assign compositeKeyFields = []>
<#if !isCompositeKey>
<#list pojo.getAllPropertiesIterator() as property>
<#assign javaType = pojo.getJavaTypeName(property, true)>
<#if property.name == "id" && javaType?ends_with("Id")>
<#assign usesCompositeKey = true>
<#assign compositeKeyTypeName = javaType>
</#if>
</#list>
</#if>

/**
* ${className} generated by Hibernate Tools
*/
<#if isCompositeKey>
@Embeddable
@Getter
@Setter
@NoArgsConstructor
@AllArgsConstructor
@Builder
@EqualsAndHashCode
public class ${className} implements Serializable {
<#else>
@Entity
@Table(name = "${clazz.table.name}"<#if clazz.table.schema?? && clazz.table.schema?has_content>, schema = "${clazz.table.schema}"</#if>)
@Getter
@Setter
@NoArgsConstructor
@AllArgsConstructor
@Builder
public class ${className} implements Serializable {
</#if>

@Serial
private static final long serialVersionUID = 1L;

<#-- Iterate over all properties -->
<#list pojo.getAllPropertiesIterator() as property>
<#assign propertyName = property.name>
<#assign javaType = pojo.getJavaTypeName(property, true)>
<#assign valueTypeName = property.value.class.simpleName>

<#-- Check if this is the identifier -->
<#assign isId = pojo.hasIdentifierProperty() && pojo.getIdentifierProperty().name == propertyName>

<#-- Check if this is a composite/embedded id -->
<#assign isEmbeddedId = propertyName == "id" && javaType?ends_with("Id")>

<#if isCompositeKey>
<#-- For composite key classes, just generate columns without @Id -->
<#if (property.value.columns)?? && property.value.columns?has_content>
<#list property.value.columns as column>
@Column(name = "${column.name}"<#if !column.nullable>, nullable = false</#if>)
<#break>
</#list>
<#elseif (property.value.columnIterator)??>
<#assign columnIterator = property.value.columnIterator>
<#if columnIterator.hasNext()>
<#assign column = columnIterator.next()>
@Column(name = "${column.name}"<#if !column.nullable>, nullable = false</#if>)
</#if>
</#if>
private ${javaType} ${propertyName};

<#else>
<#-- Regular entity logic -->

<#-- Handle EmbeddedId (composite primary key reference) -->
<#if isEmbeddedId>
@EmbeddedId
private ${javaType} ${propertyName};

<#-- Generate @Id annotation for simple primary key -->
<#elseif isId>
@Id
@GeneratedValue(strategy = GenerationType.IDENTITY)
<#if (property.value.columns)?? && property.value.columns?has_content>
<#list property.value.columns as column>
@Column(name = "${column.name}")
<#break>
</#list>
<#elseif (property.value.columnIterator)??>
<#assign columnIterator = property.value.columnIterator>
<#if columnIterator.hasNext()>
<#assign column = columnIterator.next()>
@Column(name = "${column.name}")
</#if>
</#if>
private ${javaType} ${propertyName};

<#-- Handle ManyToOne relationships -->
<#elseif valueTypeName == "ManyToOne">
@ManyToOne(fetch = FetchType.LAZY)
<#-- Determine column name for potential @MapsId -->
<#assign columnName = "">
<#if (property.value.columns)?? && property.value.columns?has_content>
<#list property.value.columns as column>
<#assign columnName = column.name>
<#break>
</#list>
<#elseif (property.value.columnIterator)??>
<#assign columnIterator = property.value.columnIterator>
<#if columnIterator.hasNext()>
<#assign column = columnIterator.next()>
<#assign columnName = column.name>
</#if>
</#if>
<#-- If entity uses composite key, add @MapsId -->
<#if usesCompositeKey && columnName?has_content>
<#-- Convert ACTOR_ID -> actorId for @MapsId value -->
<#assign mapsIdValue = columnName?lower_case?replace("_", " ")?capitalize?replace(" ", "")?uncap_first>
@MapsId("${mapsIdValue}")
</#if>
<#if columnName?has_content>
@JoinColumn(name = "${columnName}")
</#if>
private ${javaType} ${propertyName};

<#-- Handle OneToMany relationships (Set, Bag, List) -->
<#elseif valueTypeName == "Set" || valueTypeName == "Bag" || valueTypeName == "List">
<#--
Derive mappedBy value from property name pattern.
Property naming pattern: {targetEntity}For{ColumnSuffix} (e.g., filmsForLanguageId)
Inverse property pattern: {thisEntity}By{ColumnSuffix} (e.g., languageByLanguageId)
-->
<#assign mappedByValue = className?uncap_first>
<#if propertyName?contains("For")>
<#assign forIndex = propertyName?index_of("For")>
<#if (forIndex > 0) && (forIndex < propertyName?length - 3)>
<#assign columnSuffix = propertyName?substring(forIndex + 3)>
<#assign mappedByValue = className?uncap_first + "By" + columnSuffix>
</#if>
</#if>
@OneToMany(fetch = FetchType.LAZY, mappedBy = "${mappedByValue}")
@Builder.Default
private ${javaType} ${propertyName} = new HashSet<>(0);

<#-- Handle regular columns (BasicValue, SimpleValue) -->
<#elseif valueTypeName == "BasicValue" || valueTypeName == "SimpleValue">
<#if (property.value.columns)?? && property.value.columns?has_content>
<#list property.value.columns as column>
@Column(name = "${column.name}"<#if !column.nullable>, nullable = false</#if><#if (column.length)?? && column.length != 255 && javaType == "String">, length = ${column.length?c}</#if>)
<#break>
</#list>
<#elseif (property.value.columnIterator)??>
<#assign columnIterator = property.value.columnIterator>
<#if columnIterator.hasNext()>
<#assign column = columnIterator.next()>
@Column(name = "${column.name}"<#if !column.nullable>, nullable = false</#if>)
</#if>
</#if>
private ${javaType} ${propertyName};

<#-- Handle Component/Embedded types -->
<#elseif valueTypeName == "Component">
@Embedded
private ${javaType} ${propertyName};

<#-- Fallback for unknown types -->
<#else>
// TODO: Unknown mapping type "${valueTypeName}" for property "${propertyName}"
private ${javaType} ${propertyName};

</#if>
</#if>
</#list>
}

A FreeMarker template that controls the exact shape of each generated entity class. Hibernate Tools calls it once per table and passes a pojo object describing the table's columns and relationships. The template decides what annotations to emit, how to handle composite keys, and how to wire up @OneToMany and @ManyToOne relationships.
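The trickiest line is the @MapsId derivation, where a join column name like ACTOR_ID has to become the actorId field name inside the embedded id class. The FreeMarker chain (?lower_case, ?replace, ?capitalize, ?uncap_first) is dense; mirrored in plain Java (class and method names here are illustrative) it does this:

```java
// Plain-Java mirror of the template's ACTOR_ID -> actorId conversion.
public class MapsIdNameSketch {
    static String toMapsIdValue(String columnName) {
        StringBuilder out = new StringBuilder();
        boolean upperNext = false;
        for (char c : columnName.toLowerCase().toCharArray()) {
            if (c == '_') {
                upperNext = true;  // underscore marks a word boundary
            } else {
                out.append(upperNext ? Character.toUpperCase(c) : c);
                upperNext = false;
            }
        }
        return out.toString();     // first word stays lower-case, like ?uncap_first
    }

    public static void main(String[] args) {
        System.out.println(toMapsIdValue("ACTOR_ID"));    // actorId
        System.out.println(toMapsIdValue("LAST_UPDATE")); // lastUpdate
    }
}
```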

All three language variants share the same structural logic — composite key detection, @Id / @EmbeddedId / @ManyToOne / @OneToMany / @Column handling — but differ in what they emit:

  • Java uses Lombok (@Getter, @Setter, @Builder, etc.) and private fields.
  • Kotlin has a toKotlinType() FreeMarker helper that converts Java type names to Kotlin equivalents (Integer → Int, Set<X> → MutableSet<X>). Composite key classes become data class for structural equality; regular entities are plain class to avoid JPA proxying issues.
  • Groovy swaps Lombok for Groovy AST transforms (@CompileStatic, @Builder, @EqualsAndHashCode) and drops explicit field visibility since Groovy's property mechanism handles that.

Because Hibernate Tools always emits .java files, the Kotlin and Groovy tasks include a post-generation step that renames the generated files to .kt or .groovy respectively. The Java task skips this entirely.
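The rename itself is simple file manipulation. A hedged sketch of what such a post-generation step might do (the real tasks live in the repo's build scripts; the class here is illustrative):

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Stream;

// Illustrative post-generation rename: .java -> .kt (or .groovy).
public class RenameSketch {
    static void renameSources(Path dir, String newExtension) throws IOException {
        List<Path> sources;
        try (Stream<Path> files = Files.walk(dir)) {
            // Collect first so files are never renamed mid-traversal
            sources = files.filter(p -> p.toString().endsWith(".java")).toList();
        }
        for (Path p : sources) {
            Files.move(p, Path.of(p.toString().replaceAll("\\.java$", "." + newExtension)));
        }
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempDirectory("generated-entities");
        Files.writeString(tmp.resolve("Actor.java"), "class Actor {}");
        renameSources(tmp, "kt");
        System.out.println(Files.exists(tmp.resolve("Actor.kt"))); // true
    }
}
```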

Step 2 — Wire Up build.gradle

build.gradle
// ...
configurations {
    // ...
    hibernateTools
}
// ...
dependencies {
    // ...
    def h2Version = '2.4.240'
    def hibernateVersion = '7.2.6.Final'
    hibernateTools "com.h2database:h2:${h2Version}"
    hibernateTools "org.hibernate.tool:hibernate-tools-ant:${hibernateVersion}"
    hibernateTools "org.hibernate.orm:hibernate-core:${hibernateVersion}"

    developmentOnly "com.h2database:h2:${h2Version}"
    testRuntimeOnly "com.h2database:h2:${h2Version}"
    developmentOnly 'org.springframework.boot:spring-boot-devtools' // auto-enables the H2 console in dev
    implementation 'org.springframework.boot:spring-boot-starter-data-jpa'
    testImplementation 'org.springframework.boot:spring-boot-starter-test'
}
// ...
tasks.register('generateEntities') {
    group = 'build'
    description = 'Reverse engineers resources/sakila-schema.sql into JPA Entities'

    def outputDir = layout.buildDirectory.dir("generated/sources/hibernate")
    def sqlFile = layout.projectDirectory.file("src/main/resources/sakila-schema.sql")
    def revengFile = layout.projectDirectory.file("src/main/resources/hibernate.reveng.xml")
    def basePropsFile = layout.projectDirectory.file("src/main/resources/hibernate-tools.properties")
    def templateDir = layout.projectDirectory.dir("src/main/resources/templates/hibernate")

    inputs.file(sqlFile)
    inputs.file(revengFile)
    inputs.file(basePropsFile).optional()
    inputs.dir(templateDir).optional()
    outputs.dir(outputDir)

    doLast {
        def tempPropsFile = layout.buildDirectory.file("tmp/hibernate-tools.properties").get().asFile
        tempPropsFile.parentFile.mkdirs()

        def h2DbDir = layout.buildDirectory.dir("tmp").get().asFile
        h2DbDir.mkdirs()
        h2DbDir.listFiles()?.findAll { it.name.startsWith("sakila-h2") }?.each { it.delete() }

        def h2DbPath = layout.buildDirectory.file("tmp/sakila-h2").get().asFile.absolutePath.replace('\\', '/')
        def sqlPath = sqlFile.asFile.absolutePath.replace('\\', '/')

        // Pre-load the schema into a file-backed H2 database via INIT=RUNSCRIPT
        def h2Loader = new URLClassLoader(
                configurations.hibernateTools.collect { it.toURI().toURL() } as URL[],
                ClassLoader.systemClassLoader
        )
        def jdbcProps = new java.util.Properties()
        jdbcProps.setProperty('user', 'sa')
        jdbcProps.setProperty('password', '')
        def h2Driver = h2Loader.loadClass('org.h2.Driver').getDeclaredConstructor().newInstance()
        def initConn = h2Driver.connect("jdbc:h2:file:${h2DbPath};INIT=RUNSCRIPT FROM '${sqlPath}'", jdbcProps)
        initConn.close()
        h2Loader.close()

        // Merge the static properties with the dynamic connection URL
        def props = new Properties()
        if (basePropsFile.asFile.exists()) {
            basePropsFile.asFile.withInputStream { props.load(it) }
        }
        props.setProperty('hibernate.connection.url', "jdbc:h2:file:${h2DbPath}")

        tempPropsFile.withOutputStream { props.store(it, null) }

        def destDir = outputDir.get().asFile
        destDir.mkdirs()

        ant.taskdef(
                name: 'hibernatetool',
                classname: 'org.hibernate.tool.ant.HibernateToolTask',
                classpath: configurations.hibernateTools.asPath
        )

        ant.hibernatetool(
                destdir: destDir,
                templatepath: templateDir.asFile
        ) {
            jdbcconfiguration(
                    propertyfile: tempPropsFile,
                    revengfile: revengFile.asFile,
                    packagename: "${project.group}.${project.name}.generated.entity",
                    detectmanytomany: true,
                    detectoptimisticlock: true
            )

            hbm2java(jdk5: true, ejb3: true)
        }
    }
}

sourceSets {
    main {
        java {
            // ...
            srcDir(layout.buildDirectory.dir("generated/sources/hibernate"))
        }
    }
}

tasks.named('compileJava') {
    // ...
    dependsOn 'generateEntities'
}
// ...

A few things worth calling out in the diff:

hibernateTools configuration is a separate Gradle dependency bucket that exists only for code generation. It holds the Hibernate Tools Ant runner, Hibernate ORM core, and H2, none of which belong in your application's compile or runtime classpath. Keeping them here means they do their job at build time and disappear.

Runtime and compile dependencies add the standard Spring Data JPA and H2 jars your application needs at runtime. The Kotlin variant also requires the kotlin("plugin.jpa") Gradle plugin, which generates the no-arg constructors that JPA needs under the hood, something Kotlin classes don't provide by default.

generateEntities task does the work in three steps:

  1. Reads hibernate-tools.properties, appends a dynamic hibernate.connection.url that points H2 at the schema SQL file via INIT=RUNSCRIPT FROM '...', and writes the merged properties to a temp file.
  2. Calls the HibernateToolTask Ant task, passing that temp properties file, hibernate.reveng.xml, the output directory, and the FreeMarker template directory. Hibernate Tools does the rest.
  3. Renames the emitted .java files to .kt or .groovy where needed (Kotlin and Groovy only).

Source set registration adds build/generated/sources/hibernate/ to the appropriate source set so the compiler picks it up. Kotlin wires generateEntities before both compileKotlin and the kapt stub generation task, since annotation processing runs even earlier and also needs the generated types available.
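For the Kotlin variant, that wiring would look roughly like this in the Groovy DSL (the kapt task name is an assumption based on the standard Kotlin JVM and kapt plugins; adjust to your build):

```groovy
// Generated entities must exist before Kotlin compilation AND kapt stub generation
tasks.named('compileKotlin') { dependsOn 'generateEntities' }
tasks.named('kaptGenerateStubsKotlin') { dependsOn 'generateEntities' }
```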

What Gets Generated

After running ./gradlew generateEntities (or any task that depends on it, like build), you'll find entity source files under:

build/generated/sources/hibernate/<group>/<name>/generated/entity/

Every table in the schema becomes one entity class, compiled transparently alongside your hand-written code. You don't reference this directory manually. The source set registration takes care of it.

About the // TODO: Unknown mapping type "EnhancedBasicValue" Comments

You'll likely see comments like this scattered through the generated files:

// TODO: Unknown mapping type "EnhancedBasicValue" for property "firstName"
private String firstName;

EnhancedBasicValue is an internal Hibernate ORM 7.x type that replaced the older BasicValue in some mappings. The Pojo.ftl template above only matches BasicValue and SimpleValue by simple class name, so it doesn't recognize the newer type. It's a compatibility gap between the template and the ORM version. When the template hits an unrecognized mapping type, it falls back to emitting the bare field and leaving a TODO comment instead of generating the full @Column(name = "FIRST_NAME") annotation.

Why Tests Still Pass

Spring Boot's default physical naming strategy (CamelCaseToUnderscoresNamingStrategy in Boot 3.x) automatically converts camelCase field names to snake_case column names: firstName → first_name. H2 is case-insensitive and PostgreSQL stores unquoted column names lowercase by default, so both happen to match what the strategy produces. The missing @Column annotations don't cause a mismatch.
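A rough sketch of the conversion the default strategy effectively applies to simple field names (the real implementation in Hibernate/Spring handles digits, acronyms, and quoting; the class here is illustrative):

```java
// Rough camelCase -> snake_case sketch, approximating the default
// physical naming strategy's treatment of simple field names.
public class SnakeCaseSketch {
    static String toSnakeCase(String camel) {
        StringBuilder out = new StringBuilder();
        for (char c : camel.toCharArray()) {
            if (Character.isUpperCase(c)) {
                out.append('_').append(Character.toLowerCase(c));
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(toSnakeCase("firstName"));  // first_name
        System.out.println(toSnakeCase("lastUpdate")); // last_update
    }
}
```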

When It Would Become a Problem

  • If a column name doesn't follow the camelCase → snake_case convention.
  • If you need explicit @Column annotations for documentation or to override the naming strategy.

For a demo project using the default naming strategy against Sakila (which follows snake_case consistently), the generated entities work correctly as-is. The TODOs are noise from a known hibernate-tools/Hibernate 7.x incompatibility. You can ignore them.
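If the TODOs do bother you, one hedged option (untested here, and dependent on the exact value-class names your Hibernate version uses) is to widen the fallback condition in Pojo.ftl so the new type takes the regular-column branch:

```
<#-- Treat EnhancedBasicValue like the other basic column types -->
<#elseif valueTypeName == "BasicValue" || valueTypeName == "SimpleValue" || valueTypeName == "EnhancedBasicValue">
```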

Schema Changes in the Real World

How Spring Talks to Databases covers code-first vs. database-first approaches in detail. The short version: code-first is convenient for local development, database-first (Flyway, Liquibase) is safer for production.

With reverse engineering, the flow sits closest to the database-first approach — but the "catch up" step is automated. Say production needs a new column, last_updated on the film table. You update sakila-schema.sql to include the new column, then run ./gradlew generateEntities (or just build). The task re-introspects the schema and regenerates all entity files. The new field appears automatically in the entity — you don't touch it by hand.

The tradeoff is that you own the SQL file. When the real database changes, you need to keep sakila-schema.sql in sync, otherwise the generated entities drift from reality. In practice this means your schema migration (Flyway, Liquibase, or a plain SQL script) and the update to your local SQL file should happen together, ideally in the same commit.
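Concretely, for the last_updated example above, the same DDL change should land in two places in the same commit, the production migration and the local schema file:

```sql
-- 1) New Flyway/Liquibase migration shipped to production
--    (file name and version would be project-specific):
ALTER TABLE film ADD COLUMN last_updated TIMESTAMP;

-- 2) The CREATE TABLE film statement in sakila-schema.sql gains the
--    same column, so the next generateEntities run regenerates the field.
```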

Honest Take: You Probably Won't See This in the Wild

In nearly every real codebase encountered in the field, entities were written by hand. The reverse engineering approach appeared exactly once in a production project.

Most teams either started their project from scratch (schema and entities grow together) or inherited a legacy codebase where someone already wrote the entities years ago. The use case for generating them from SQL at build time is genuinely narrow: you have an existing schema you don't own, you need JPA entities fast, and you don't want to write hundreds of fields manually.

So keep this in your toolkit. Understand what the tooling does and how to set it up. Just don't be surprised if you spend an entire career and never encounter it in a team codebase.