WIP: AVRO-4223 Gradle plugin for generating Java code#3614

Open
frevib wants to merge 99 commits into
apache:mainfrom
frevib:AVRO-4223-gradle-plugin

@frevib frevib commented Jan 5, 2026

What is the purpose of the change

Gradle plugin to generate Java code from Avro files

Verifying this change

This change added tests and can be verified as follows:

cd to avro/lang/java/gradle-plugin

./gradlew test

Documentation

Release

https://plugins.gradle.org/plugin/eu.eventloopsoftware.avro-gradle-plugin

0.0.2 is released and fully works with AVSC files:

0.0.5

0.0.8

0.1.0 this release adds Protocol support.

0.1.1 Fix issue with Gradle multi project, where sources would not appear on the classpath

Installation instructions: https://github.com/frevib/avro/blob/AVRO-4223-gradle-plugin/lang/java/gradle-plugin/README.md#version

An official release will be done in the coming month

Add license files
Format with Spotless
Add Spotless config
Format
Format
@frevib frevib marked this pull request as ready for review February 2, 2026 15:42
Format
@martin-g

martin-g commented Feb 3, 2026

Member

I have created https://issues.apache.org/jira/browse/INFRA-27616 for the requirement from Gradle to prove the ownership of avro.apache.org DNS domain.

private fun instantiateAdditionalVelocityTools(velocityToolsClassesNames: List<String>): List<Any> {
    return velocityToolsClassesNames.map { velocityToolClassName ->
        try {
            Class.forName(velocityToolClassName).getDeclaredConstructor().newInstance()

Member

Why does this use the current class's loader?
loadLogicalTypesFactories() and doCompile() use the thread's context class loader.

Author

I've used the same as in the Maven plugin:

Class<?> klass = Class.forName(velocityToolClassName);

and:

protected URLClassLoader createClassLoader() throws DependencyResolutionRequiredException, MalformedURLException {
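
A minimal, self-contained sketch of why the two lookups discussed above can disagree. It is not taken from either plugin: the class name ClassLoaderLookupDemo and both helper methods are hypothetical. The one-argument Class.forName resolves through the loader that defined the calling class, while the other path asks whatever loader is currently set as the thread's context class loader.

```java
import java.net.URL;
import java.net.URLClassLoader;

public class ClassLoaderLookupDemo {

    /** Looks the class up through the loader that defined this (calling) class. */
    static boolean visibleViaCallerLoader(String className) {
        try {
            Class.forName(className);
            return true;
        } catch (ClassNotFoundException e) {
            return false;
        }
    }

    /** Looks the class up through the current thread's context class loader. */
    static boolean visibleViaContextLoader(String className) {
        ClassLoader context = Thread.currentThread().getContextClassLoader();
        if (context == null) {
            context = ClassLoader.getSystemClassLoader();
        }
        try {
            context.loadClass(className);
            return true;
        } catch (ClassNotFoundException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String ownName = ClassLoaderLookupDemo.class.getName();
        Thread thread = Thread.currentThread();
        ClassLoader original = thread.getContextClassLoader();
        try {
            // A loader with no URLs and no parent only sees JDK bootstrap classes.
            thread.setContextClassLoader(new URLClassLoader(new URL[0], null));
            System.out.println("caller loader sees it:  " + visibleViaCallerLoader(ownName));
            System.out.println("context loader sees it: " + visibleViaContextLoader(ownName));
        } finally {
            thread.setContextClassLoader(original);
        }
    }
}
```

Which loader is the right one for Velocity tools depends on where users are expected to put those classes, so this only illustrates the difference and does not settle the question.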

Comment thread lang/java/gradle-plugin/README.md Outdated
compileSchemaTask.protocolFiles.from(
    project.fileTree(sourceDirectory).apply {
        setIncludes(includesProtocol)
        setExcludes(extension.excludes.get())

Member

Same here - testExcludes

@frevib frevib Mar 21, 2026

Author

We're not making a distinction between main and test includes/excludes like in the Maven plugin. If you compile for the test classpath, the same value is used from includedSchemaFiles, excludedSchemaFiles, includedProtocolFiles and excludedProtocolFiles.

It's a bit of a design choice, but I thought this distinction was unnecessary. Thoughts?

</execution>
<execution>
<id>run-gradle-task-publish</id>
<phase>deploy</phase>

Member

We use deploy to deploy -SNAPSHOTs for the Maven artefacts.
AFAIK Gradle plugins repo does not allow -SNAPSHOTs

@frevib frevib Mar 21, 2026

Author

Correct, the Gradle Plugin Portal does not support snapshots. It works differently from Sonatype, where you first push and later release via the Sonatype UI. With the Gradle publish plugin, running ./gradlew publishPlugins immediately publishes the plugin to the portal.

You can, however, publish a -SNAPSHOT to the local Maven repo with ./gradlew publishToMavenLocal

Comment thread lang/java/gradle-plugin/build.gradle.kts Outdated
@frevib frevib requested a review from Sineaggi February 16, 2026 07:37
Fix source directory
Fix bug with source dependencies
Release 0.1.1
Update Kotlin plugin

Update readme
Remove unused testExcludes

Use excludes from configuration
Use GradleException
Fix deprecated API
Clarify docs
Update Javadoc

Cleanup unneeded code
Bump version
Add release information
@RyanSkraba

Contributor

Hello! I haven't been following this, and I'm catching up, but I'm really grateful for the work you've already put into this! I am wholeheartedly for getting the gradle plugin moving for the next Avro release.

As a donation, are you willing to move the package into the org.apache.avro namespace instead of eu.eventloop? We might need to recreate the ticket with INFRA.

Does anybody have concerns about integrating Kotlin code into the Avro repo? I'm not against it, but I'm not as familiar with Kotlin.

Comment on lines +64 to +66
for (sourceFile in schemaFileTree.files) {
    parser.parse(sourceFile)
}

@ebAtUelzener ebAtUelzener Apr 15, 2026

This loop assumes that a single pass over all files is sufficient to parse all input files; however, in the case of an unresolved schema exception, multiple passes may be required.

The old plugin solves this inside its SchemaResolver by delaying the parsing of these unresolved schemas (see call on line 67) and then requeueing them after each successfully parsed schema (see the call on line 55).

This can easily happen, for instance, in the default value validation step for record fields when using an enum that has not been parsed yet because of how the file tree walk has ordered the schema files.

@ebAtUelzener ebAtUelzener Apr 15, 2026

As a POC, I've tried using a custom task definition extending AbstractCompileTask that changes the loop slightly to allow multiple passes until nothing parses successfully anymore. This works for the mentioned issue, but I am not sure whether the protocol parsing also needs logic like this.

val remainingWork: MutableList<File> = schemaFileTree.files.toMutableList()
val delayedWork: MutableList<File> = mutableListOf()
val suppressedExceptions: MutableList<Exception> = mutableListOf()
while (remainingWork.isNotEmpty()) {
  val nextWork = remainingWork.removeFirst()
  try {
    parser.parse(nextWork)

    if (delayedWork.isNotEmpty()) {
      delayedWork.forEach { remainingWork.addLast(it) }
      delayedWork.clear()
    }
  } catch (e: RuntimeException) {
    if (e.message?.contains("unresolved schema") == true) {
      delayedWork.add(nextWork)
      suppressedExceptions.add(e)
    } else {
      throw e
    }
  }
}
if (delayedWork.isNotEmpty()) {
  val e = GradleException("Unable to parse some schema files, see suppressed exceptions.")
  suppressedExceptions.forEach { e.addSuppressed(it) }
  throw e
}

Contributor

This loop assumes that a single pass over all files is sufficient to parse all input files; however, in the case of an unresolved schema exception, multiple passes may be required.

This is incorrect: the new parser delays resolving unknown schemas until returning them. This is also why the parse methods return a ParseResult: this is a lazy construct, and allows you to resolve the schemas later.

As a bonus, this also allows resolving circular references between files. A loop that parses and resolves each file cannot do this, no matter how often you loop.
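
A toy model of the distinction drawn here, written as a sketch and not using Avro's API: files are reduced to a type name plus the names they reference, and every identifier below (DeferredResolutionSketch, eagerPassSucceeds, unresolvedAfterDeferredPass) is hypothetical. The eager variant insists that each reference is already known when its file is read; the deferred variant registers every name first and only then looks the references up, which is what makes file order and circular references harmless.

```java
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

public class DeferredResolutionSketch {

    /** Eager model: every reference must already be known when its file is read. */
    static boolean eagerPassSucceeds(Map<String, List<String>> filesInOrder) {
        Set<String> seen = new HashSet<>();
        for (Map.Entry<String, List<String>> file : filesInOrder.entrySet()) {
            seen.add(file.getKey());
            if (!seen.containsAll(file.getValue())) {
                return false; // a reference points at a type that is read later
            }
        }
        return true;
    }

    /** Deferred model: register all names first, then look each reference up once. */
    static Set<String> unresolvedAfterDeferredPass(Map<String, List<String>> filesInOrder) {
        Set<String> known = filesInOrder.keySet();
        Set<String> missing = new TreeSet<>();
        for (List<String> references : filesInOrder.values()) {
            for (String reference : references) {
                if (!known.contains(reference)) {
                    missing.add(reference);
                }
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        // The record is read before the enum it refers to.
        Map<String, List<String>> recordFirst = new LinkedHashMap<>();
        recordFirst.put("some.MyRecord", List.of("some.MyEnum"));
        recordFirst.put("some.MyEnum", List.of());

        // Two types that refer to each other across files.
        Map<String, List<String>> circular = new LinkedHashMap<>();
        circular.put("some.A", List.of("some.B"));
        circular.put("some.B", List.of("some.A"));

        System.out.println(eagerPassSucceeds(recordFirst));           // false
        System.out.println(unresolvedAfterDeferredPass(recordFirst)); // []
        System.out.println(eagerPassSucceeds(circular));              // false
        System.out.println(unresolvedAfterDeferredPass(circular));    // []
    }
}
```

In this model no ordering of the circular pair lets the eager pass succeed, while the deferred pass is indifferent to order.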

@ebAtUelzener ebAtUelzener May 29, 2026

Thanks for clearing that up. I drew my conclusion too early by looking at the plugins at too high a level.

This does not mean that there is no problem at all, however. Perhaps the cause lies inside the validator or in how it is used, which would explain the problem I mentioned.

For reference, when a problem like the one I initially mentioned arises, you can observe that the task fails with a cause inside the validation in Schema.validateDefault:

org.gradle.api.tasks.TaskExecutionException: Execution failed for task ':avroGenerateJavaClasses'
<gradle stack>
Caused by: org.apache.avro.AvroTypeException: Invalid default for field myEnum: "NONE" not a {"type":"record","name":"UnresolvedSchema_1","namespace":"org.apache.avro.compiler","doc":"unresolved schema","fields":[],"org.apache.avro.idl.unresolved.name":"some.MyEnum"}
	at org.apache.avro.Schema.validateDefault(Schema.java:1720)
	at org.apache.avro.Schema$Field.<init>(Schema.java:579)
	at org.apache.avro.Schema.parseField(Schema.java:1904)
	at org.apache.avro.Schema.parseRecord(Schema.java:1872)
	at org.apache.avro.Schema.parse(Schema.java:1836)
	at org.apache.avro.Schema$Parser.parse(Schema.java:1539)
	at org.apache.avro.Schema$Parser.parseInternal(Schema.java:1524)
	at org.apache.avro.JsonSchemaParser.parse(JsonSchemaParser.java:86)
	at org.apache.avro.JsonSchemaParser.parse(JsonSchemaParser.java:77)
	at org.apache.avro.SchemaParser.parse(SchemaParser.java:245)
	at org.apache.avro.SchemaParser.parse(SchemaParser.java:137)
	at org.apache.avro.SchemaParser.parse(SchemaParser.java:103)
	at org.apache.avro.SchemaParser.parse(SchemaParser.java:88)
	at eu.eventloopsoftware.avro.gradle.plugin.tasks.CompileAvroSchemaTask.compileSchemas(CompileAvroSchemaTask.kt:65)
	at eu.eventloopsoftware.avro.gradle.plugin.tasks.CompileAvroSchemaTask.compileSchema(CompileAvroSchemaTask.kt:42)
<gradle stack>

This can be reproduced (somewhat; it depends on the order in which the files are sent to the parser) with two files, an enum and a record, like so:

Enum schema:

{
  "type": "enum",
  "name": "MyEnum",
  "namespace": "some",
  "symbols": [
    "SOME",
    "NONE"
  ]
}

Record schema:

{
  "type": "record",
  "name": "MyRecord",
  "namespace": "some",
  "fields": [
    {
      "name": "myEnum",
      "type": "MyEnum",
      "default": "NONE"
    }
  ]
}

The question is: is this a bug inside the avro-compiler package rather than an issue here, and if so, should the plugin accommodate this bug in the meantime or not?

Contributor

Good find! This is caused by a mistimed check on the default value.
What happens is that the default is checked immediately, but the reference is resolved later.

I've created a bug report for this: AVRO-4265

@siddharthapd

siddharthapd commented Apr 30, 2026

Hey @frevib

I gave version 0.1.2 a shot. For plain, simple AVSC files it works, but it fails when adding custom conversions and a custom logical type factory.

build.gradle

avro {
    sourceDirectory = "src/main/avro"
    customConversions = ["com.test.custom.AbcdConversion"]
    customLogicalTypeFactories = ["com.test.custom.AbcdLogicalTypeFactory"]
}

tasks.named("compileJava") { dependsOn(tasks.named("avroGenerateJavaClasses")) }

ClassNotFoundException

https://github.com/frevib/avro/blob/AVRO-4223-gradle-plugin/lang/java/gradle-plugin/src/main/kotlin/eu/eventloopsoftware/avro/gradle/plugin/tasks/AbstractCompileTask.kt#L94C9-L94C113
compiler.addCustomConversion(Thread.currentThread().getContextClassLoader().loadClass(customConversion))

Caused by: java.io.IOException: java.lang.ClassNotFoundException: com.test.custom.AbcdConversion
        at eu.eventloopsoftware.avro.gradle.plugin.tasks.AbstractCompileTask.doCompile(AbstractCompileTask.kt:97)
        at eu.eventloopsoftware.avro.gradle.plugin.tasks.CompileAvroSchemaTask.compileSchemas(CompileAvroSchemaTask.kt:68)
        ... 131 more

I feel there is a bit of a challenge in using the thread's context class loader from a Gradle task.

./gradlew clean check --info
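
One possible direction, sketched under the assumption that the conversion class lives on the project's own classpath and is therefore not visible to the loader the task runs with. This is not the plugin's code: ClasspathLoaderSketch, loaderFor and canLoad are hypothetical names, and the build/classes/java/main directory is only a placeholder for wherever the compiled classes actually are. The idea is to build a URLClassLoader over explicit classpath entries and load the configured class names through it.

```java
import java.io.File;
import java.net.MalformedURLException;
import java.net.URL;
import java.net.URLClassLoader;
import java.util.List;

public class ClasspathLoaderSketch {

    /** Builds a loader over the given entries; the parent is consulted first. */
    static URLClassLoader loaderFor(List<File> classpathEntries, ClassLoader parent) {
        URL[] urls = new URL[classpathEntries.size()];
        for (int i = 0; i < urls.length; i++) {
            try {
                urls[i] = classpathEntries.get(i).toURI().toURL();
            } catch (MalformedURLException e) {
                throw new IllegalArgumentException("Bad classpath entry: " + classpathEntries.get(i), e);
            }
        }
        return new URLClassLoader(urls, parent);
    }

    /** Reports whether the loader (or one of its parents) can load the class. */
    static boolean canLoad(ClassLoader loader, String className) {
        try {
            loader.loadClass(className);
            return true;
        } catch (ClassNotFoundException e) {
            return false;
        }
    }

    public static void main(String[] args) throws Exception {
        // Placeholder inputs; pass a real classes directory and class name to try it.
        File classesDir = new File(args.length > 0 ? args[0] : "build/classes/java/main");
        String className = args.length > 1 ? args[1] : "com.test.custom.AbcdConversion";

        ClassLoader parent = Thread.currentThread().getContextClassLoader();
        try (URLClassLoader loader = loaderFor(List.of(classesDir), parent)) {
            System.out.println(className + " loadable: " + canLoad(loader, className));
        }
    }
}
```

Whether the plugin should build such a loader from a Gradle configuration, and which one, is a design decision this sketch leaves open.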

setCompilerProperties(compiler)
try {
    for (customConversion in customConversions.get()) {
        compiler.addCustomConversion(Thread.currentThread().getContextClassLoader().loadClass(customConversion))

@siddharthapd siddharthapd Apr 30, 2026

I think the thread context class loader lookup is throwing the error.
I have shared my response separately.

@hamidentur

@frevib Hi, is there a rough timeline for when the PR will be finalized?

@frevib

frevib commented Jun 16, 2026

Author

@hamidentur we were working on getting support from the Avro team to get this merged. Now I see @RyanSkraba is here to help get this PR into main.

@siddharthapd has found an issue that I will fix ASAP. When @RyanSkraba @opwvhk are OK, we can merge this. I think @martin-g has, or knows who has the Gradle Plugin Portal account to actually release this plugin on the Gradle Plugin Portal. I'll be here to provide support and bugfixes.

At our company many teams are using the plugin for production code. n = 1, but it's been thoroughly tested.

@martin-g

Member

@martin-g has, or knows who has the Gradle Plugin Portal account to actually release this plugin on the Gradle Plugin Portal

The ASF Infra team could/should do it.
I just tried to help you with https://issues.apache.org/jira/browse/INFRA-27616. The ticket has been closed due to inactivity.

Labels

build Java Pull Requests for Java binding

8 participants