An aggregate’s API defines its inputs and outputs. The types of inputs and outputs the API exposes will depend on the Creek extensions installed. The quick-start tutorial series focuses on the Kafka Streams extension, and the aggregate’s API will be defined by the Kafka topics it exposes.
ProTip: In Data Mesh terminology, each aggregate output topic and, to a lesser extent, each input topic may define a data product: a defined data set, with a documented schema, that the aggregate manages and exposes to the rest of the architecture and organisation.
In this case, the aggregate will expose a single output: the
twitter.handle.usage output topic, defined during the
first quick-start tutorial.
An aggregate’s API is defined in an aggregate descriptor: a Java class that captures the metadata about the aggregate, including its inputs and outputs.
The aggregate template repository, used to bootstrap a new repository during the
Basic Kafka Streams tutorial, provides an empty aggregate descriptor.
So far, this descriptor has been left untouched by the quick-start tutorial series.
The aggregate descriptor can be found in the
Note: To avoid descriptor name clashes, the name of the aggregate descriptor is derived from the aggregate name,
i.e. the repository name.
For example, a repository named
basic-kafka-streams-demo would contain an aggregate descriptor named
Whereas a repository named
ks-aggregate-api-demo would contain a
The steps below define the API of the tutorial’s aggregate:
Define a Creek aggregate API
Locate the aggregate’s descriptor: this is a class named
<something>AggregateDescriptor in the
To keep things consistent, rename the class to
OccurrenceAggregateDescriptor. This should help avoid confusion
later due to potentially different class names.
handle-occurrence-service’s descriptor: declared in the
class, within the
services module. Copy its
TweetHandleUsageStream declaration into
It should look like the following:
This adds an output topic and
registers it with the descriptor.
Update the service descriptor
There are now two definitions of the same topic: one in the aggregate descriptor and one in the service descriptor. Such duplication should be avoided.
TweetHandleUsageStream declaration in the
HandleOccurrenceServiceDescriptor class to use the aggregate’s topic declaration:
Referencing the aggregate’s topic descriptor defines, in code, that the service’s output topic is part of the aggregate’s API.
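As a rough illustration of where the two steps above end up, the sketch below declares a topic once in an aggregate descriptor and has the service descriptor reuse that declaration. It is self-contained and deliberately does not use Creek's real classes: `TopicSketch`, `AggregateDescriptorSketch`, `ServiceDescriptorSketch` and the `register` and `outputs` helpers are all hypothetical stand-ins, and the `String` and `Integer` key and value types are assumed. Only the topic name and the `TweetHandleUsageStream` field name come from the tutorial.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a topic descriptor. NOT a Creek type.
record TopicSketch(String name, Class<?> keyType, Class<?> valueType) {}

// Stand-in for the aggregate descriptor: the single place the topic is defined.
final class AggregateDescriptorSketch {

    private static final List<TopicSketch> OUTPUTS = new ArrayList<>();

    // Declaring the topic also registers it as an output of the aggregate.
    // Key and value types are assumed for illustration.
    public static final TopicSketch TweetHandleUsageStream =
            register(new TopicSketch("twitter.handle.usage", String.class, Integer.class));

    private AggregateDescriptorSketch() {}

    static List<TopicSketch> outputs() {
        return List.copyOf(OUTPUTS);
    }

    private static TopicSketch register(TopicSketch topic) {
        OUTPUTS.add(topic);
        return topic;
    }
}

// Stand-in for the service descriptor: it reuses the aggregate's declaration
// rather than defining the same topic a second time.
final class ServiceDescriptorSketch {

    private static final List<TopicSketch> OUTPUTS = new ArrayList<>();

    public static final TopicSketch TweetHandleUsageStream =
            register(AggregateDescriptorSketch.TweetHandleUsageStream);

    private ServiceDescriptorSketch() {}

    static List<TopicSketch> outputs() {
        return List.copyOf(OUTPUTS);
    }

    private static TopicSketch register(TopicSketch topic) {
        OUTPUTS.add(topic);
        return topic;
    }
}
```

Because the service refers to the aggregate's field, a change to the topic's name or types only needs to be made in one place, and the compiler keeps the two descriptors in step.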
Testing the changes
To check that the changes are correct, run the build:
This will compile the changes and run the tests. The build should be green.
A word about dependencies
Before moving on, it’s worth having a quiet word about the dependencies on the
If you were to look at the Gradle build file in the
dependencies block looks like the following:
The module has a single direct production dependency: the
creek-kafka-metadata jar, which contains the topic descriptor and config types
used within the aggregate’s descriptor.
Because the API module is shared code, it is strongly advised, as the comment states, to avoid adding production dependencies to this module, other than the metadata jars of specific Creek extensions.
The Creek extension metadata jars themselves deliberately do not provide implementations for the types they define.
Instead, the aggregate template repository provides a default implementation, which aggregates are free to customise
as needed, without any risk of causing dependency conflicts in projects that include
api jars from multiple aggregates.
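The split described above, where a shared jar defines types and each aggregate supplies its own implementation, can be sketched in plain Java. This is only an illustration of the pattern, not Creek's actual metadata types: `TopicDescriptorSketch`, `DefaultTopicDescriptorSketch` and the `partitions` attribute are hypothetical names, and the partition count used with them is arbitrary.

```java
// Imagine this interface ships in a shared metadata jar:
// it defines the type only and carries no implementation or extra dependencies.
interface TopicDescriptorSketch {
    String name();

    int partitions();
}

// Imagine this record lives in the aggregate's own repository.
// Each aggregate can change it freely, because other projects only depend on the interface.
record DefaultTopicDescriptorSketch(String name, int partitions)
        implements TopicDescriptorSketch {

    // Compact constructor: validate the values supplied by the aggregate.
    DefaultTopicDescriptorSketch {
        if (name == null || name.isBlank()) {
            throw new IllegalArgumentException("name must not be blank");
        }
        if (partitions <= 0) {
            throw new IllegalArgumentException("partitions must be positive");
        }
    }
}
```

Code that consumes descriptors would be written against `TopicDescriptorSketch` only, so two aggregates with different implementations can sit on the same classpath without their implementation classes conflicting.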
ProTip: Creek deliberately minimises the surface area of the shared code used in
To avoid dependency hell, it is strongly recommended that you do the same.
There’s nothing worse than not being able to patch a production issue due to a dependency conflict!