With the topic resources defined in the last step, it’s time to write a simple Kafka Streams topology to perform the business logic of this service.
The service will search each tweet’s text for occurrences of Twitter handles, e.g. @katyperry.
For each handle found, it will produce a record mapping the Twitter handle to its number of occurrences.
For example, if a tweet contained the handle @katyperry twice, then it would produce a record with a key of @katyperry and a value of 2.
Define the stream topology
The aggregate template provided a shell TopologyBuilder class.
Flesh out the class’s build method to match what’s below:
ProTip: The example code deliberately names each step in the topology. This is good practice. Relying on default naming can result in topology evolution issues in the future. Internal store and topic names incorporate the step name. With default naming, the name of a step, and hence the store or topic, can change if steps are added or removed. This can lead to unintentional changes in internal topic names. If such a change was deployed, any unprocessed messages in the old topics would be skipped.
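To make the naming issue concrete, here is a toy model in plain Java. It is a sketch under an assumption: it only imitates the shape of Kafka Streams' default names (a step type followed by a zero-padded counter) and does not reproduce the library's actual numbering rules. The class DefaultNamingDemo and its defaultNames method are hypothetical and exist only for illustration.

```java
import java.util.ArrayList;
import java.util.List;

final class DefaultNamingDemo {

    // Toy model only: builds names shaped like Kafka Streams' defaults,
    // i.e. a step type followed by a zero-padded position counter.
    static List<String> defaultNames(final List<String> stepTypes) {
        final List<String> names = new ArrayList<>();
        for (int i = 0; i < stepTypes.size(); i++) {
            names.add(String.format("KSTREAM-%s-%010d", stepTypes.get(i), i));
        }
        return names;
    }

    public static void main(final String[] args) {
        // Original topology: source -> flatMap -> sink.
        System.out.println(defaultNames(List.of("SOURCE", "FLATMAP", "SINK")));
        // Same topology with a filter inserted before the flatMap step.
        System.out.println(defaultNames(List.of("SOURCE", "FILTER", "FLATMAP", "SINK")));
    }
}
```

In this model the flatMap step is named KSTREAM-FLATMAP-0000000001 in the first topology and KSTREAM-FLATMAP-0000000002 in the second, even though the step itself did not change. An explicitly named step keeps its name regardless of what is added around it.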
The topology consumes the TweetTextStream we defined in the service’s descriptor, transforms it in the extractHandles method, and produces any output to the service’s output topic.
The Creek Kafka Streams extension provides type-safe access to the input and output topic metadata, serde, and Kafka cluster properties, allowing engineers and the code to focus on the business logic.
As a single input record can result in zero or more output records, depending on the occurrences of Twitter handles in the tweet text, we use the flatMap method to invoke the extractHandles method.
The details of the extractHandles method aren’t particularly important in the context of demonstrating Creek’s functionality.
A simple solution might look like this:
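As a minimal sketch in plain Java, assuming a handle is an @ followed by 1 to 15 letters, digits or underscores. The class name HandleExtractor and the Map return type are illustrative assumptions. In the topology itself, the mapper passed to flatMap would return an Iterable of KeyValue pairs, one per map entry.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class HandleExtractor {

    // Assumption: '@' followed by 1 to 15 word characters counts as a handle.
    // This is a simplification; it would also match the tail of an email address.
    private static final Pattern HANDLE = Pattern.compile("@\\w{1,15}");

    // Returns each handle found in the text, mapped to its number of occurrences.
    // The map is empty when the text contains no handles.
    static Map<String, Integer> extractHandles(final String tweetText) {
        final Map<String, Integer> counts = new LinkedHashMap<>();
        final Matcher matcher = HANDLE.matcher(tweetText);
        while (matcher.find()) {
            // Add 1 to the running count for this handle, starting from 1 if unseen.
            counts.merge(matcher.group(), 1, Integer::sum);
        }
        return counts;
    }

    public static void main(final String[] args) {
        System.out.println(extractHandles("@katyperry on tour! Go @katyperry, says @BarackObama"));
    }
}
```

A tweet with no handles yields an empty map, which corresponds to the zero-output-records case, while a tweet mentioning several distinct handles yields one entry per handle.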
…and that’s the production code of the service complete!
The Name instance defined in the TopologyBuilder doesn’t add much in this example, but as topologies get more complex and are broken down into multiple builder classes, it really comes into its own.
Check out its JavaDoc to see how it can be used to help avoid topology node name clashes.