Writing an application in Kotlin™
Visit the Kotlin recipe on GitHub.
Use the Kotlin programming language
In this recipe, you will use the Kotlin programming language to create an Apache Flink application that counts, every 5 seconds, the number of words posted to an Apache Kafka topic.
Flink natively supports Java and Scala, but you can use any JVM language to create your application.
This Apache Flink recipe is self-contained: you can copy it and run it directly from your favorite editor. There is no need to download Apache Flink or Apache Kafka.
Kotlin project setup
This project setup is based on the official Apache Flink Maven quickstart for the Java API.
To add support for Kotlin, you add the Kotlin dependencies to the Maven project and configure them correctly. For full details, see the Kotlin Maven build tool documentation.
Add dependency on Kotlin standard library
The kotlin-stdlib-jdk8 artifact can be used with JDK 8 and JDK 11 (or higher).
Configure source/test directories
You need to explicitly define your Kotlin source and test directories in Maven, so that you can reference them in the configuration of the Kotlin Maven plugin.
Set up the Kotlin Maven plugin
The Kotlin Maven plugin uses the previously configured directories and binds Kotlin compilation to the appropriate Maven lifecycle phases.
Configure main class in the shade-plugin
By configuring your main class, Maven can properly create a fat JAR when you are building the recipe.
The input records
The recipe uses the Kafka topic input, which contains String records.
"Frodo Baggins"
"Meriadoc Brandybuck"
"Boromir"
"Gollum"
"Sauron"
...
Count the number of words
The recipe uses the Apache Flink Kafka connector in the application to connect to your Apache Kafka broker.
The recipe uses a Kotlin data class. The class has public and mutable properties, with a no-arg constructor so that Flink treats it as a POJO. This has performance benefits, and more importantly allows schema evolution (i.e., you can add/remove fields) when using this class for state. An immutable data class would be serialized via Kryo, because Flink does not have built-in support for such classes.
Alternatively, you could use Avro types, or implement custom TypeSerializers for your data types.
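As a minimal sketch of the POJO-friendly data class described above: the class name WordCountEvent and its two fields are hypothetical, chosen only for illustration, and the recipe's actual class may differ. Declaring every property with var makes it mutable, and giving every constructor parameter a default value makes the Kotlin compiler generate a no-arg constructor on the JVM.

```kotlin
// Hypothetical class name and fields, for illustration only.
// `var` gives each property a public getter and setter (mutable).
// Because every parameter has a default value, Kotlin also generates
// a public no-arg constructor on the JVM.
data class WordCountEvent(
    var word: String = "",
    var count: Int = 0
)

fun main() {
    // Create an instance through the no-arg constructor, then mutate it,
    // which is how a POJO-style serializer would populate it.
    val event = WordCountEvent()
    event.word = "Gollum"
    event.count = 1
    println(event) // prints "WordCountEvent(word=Gollum, count=1)"
}
```

If the properties were declared with val instead, the class would be immutable, which is the case the text above describes as falling back to Kryo.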
Each word and the number of times it appears are stored in this data class. Every 5 seconds, the application outputs each word and how many times it appeared in that time frame. Since the data is unbounded, the output to the console never stops.
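The 5-second counting described above can be illustrated in plain Kotlin, without any Flink API. This is only a sketch of the windowing idea under stated assumptions: the types TimedWord and WindowCount and the function countPerWindow are hypothetical, each record is treated as a single word with a non-negative timestamp in milliseconds, and the input is a finite list. The real application works on an unbounded stream and emits results as each window closes.

```kotlin
// Hypothetical helper types for illustration; this is not Flink API.
data class TimedWord(val word: String, val timestampMillis: Long)
data class WindowCount(val windowStartMillis: Long, val word: String, val count: Int)

// Assigns each record to a fixed, non-overlapping window of the given size
// (5 seconds by default) and counts how often each word occurs per window.
// Assumes non-negative timestamps.
fun countPerWindow(
    records: List<TimedWord>,
    windowSizeMillis: Long = 5_000L
): List<WindowCount> =
    records
        // Window start = timestamp rounded down to a multiple of the window size.
        .groupBy { Pair(it.timestampMillis - it.timestampMillis % windowSizeMillis, it.word) }
        .map { (key, group) -> WindowCount(key.first, key.second, group.size) }
        // Sort by window, then by word, so the output order is deterministic.
        .sortedWith(compareBy({ it.windowStartMillis }, { it.word }))

fun main() {
    val records = listOf(
        TimedWord("Boromir", 1_000),
        TimedWord("Gollum", 2_500),
        TimedWord("Boromir", 4_999),
        TimedWord("Boromir", 5_000),
        TimedWord("Sauron", 7_200)
    )
    // One line per (window, word) pair.
    countPerWindow(records).forEach { println(it) }
}
```

In this sample, the record at 4,999 ms still belongs to the first window, while the record at exactly 5,000 ms starts the second one, so "Boromir" is counted twice in the first window and once in the second.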
The full recipe
This recipe is self-contained. You can run the WordCountTest#testProductionJob test to see the full recipe in action. That test uses an embedded Kafka and Flink instance, so you can run it directly via Maven or in your favorite editor, such as IntelliJ IDEA or Visual Studio Code.
See the comments and tests included in the code for more details.