Class WordCount
java.lang.Object
org.apache.flink.streaming.examples.dsv2.wordcount.WordCount
Implements the "WordCount" program using the DataStream API V2. The program computes a simple
word occurrence histogram over text files. The job currently runs in streaming mode; support for
batch mode execution is planned for the future.
The input is one or more plain text files with lines separated by a newline character.
Usage:
- --input <path>: A list of input files and / or directories to read. If no input is provided, the program is run with default data from WordCountData.
- --discovery-interval <duration>: Turns the file reader into a continuous source that will monitor the provided input directories every interval and read any new files.
- --output <path>: The output directory where the Job will write the results. If no output path is provided, the Job will print the results to stdout.
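To make the three options above concrete, here is a minimal plain-Java sketch of how such arguments could be collected. It is not the parsing code used by the example itself: the class WordCountArgsSketch and its fields are hypothetical, and the sketch assumes that --input may be repeated and that the interval is given in ISO-8601 form (for example PT10S), which may differ from the duration format Flink accepts.

```java
import java.time.Duration;
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;

// Hypothetical helper for illustration only; not part of Flink.
final class WordCountArgsSketch {
    final List<String> inputs = new ArrayList<>();
    Optional<Duration> discoveryInterval = Optional.empty();
    Optional<String> output = Optional.empty();

    static WordCountArgsSketch parse(String[] args) {
        WordCountArgsSketch parsed = new WordCountArgsSketch();
        for (int i = 0; i < args.length; i += 2) {
            String flag = args[i];
            if (i + 1 >= args.length) {
                throw new IllegalArgumentException("Missing value for " + flag);
            }
            String value = args[i + 1];
            switch (flag) {
                case "--input":
                    // Assumption: the flag may be repeated to list several paths.
                    parsed.inputs.add(value);
                    break;
                case "--discovery-interval":
                    // Assumption: ISO-8601 duration such as "PT10S".
                    parsed.discoveryInterval = Optional.of(Duration.parse(value));
                    break;
                case "--output":
                    parsed.output = Optional.of(value);
                    break;
                default:
                    throw new IllegalArgumentException("Unknown option " + flag);
            }
        }
        return parsed;
    }

    public static void main(String[] args) {
        WordCountArgsSketch parsed = parse(args);
        // Mirror the documented defaults: built-in data and stdout.
        System.out.println("inputs: "
                + (parsed.inputs.isEmpty() ? "default data" : parsed.inputs.toString()));
        System.out.println("continuous: " + parsed.discoveryInterval.isPresent());
        System.out.println("output: " + parsed.output.orElse("stdout"));
    }
}
```

An empty inputs list corresponds to the documented fallback to the default data, and an absent output corresponds to printing results to stdout.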
This example shows how to:
- Write a simple Flink program with the DataStream API V2
- Use tuple data types
- Write and use a user-defined process function
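The tokenizer and counter steps can be sketched in plain Java without any Flink dependency. This is only an illustration of the logic, not the DataStream API V2 code: the class WordCountSketch and its methods are hypothetical, Map.Entry stands in for a tuple type, and splitting on non-word characters after lower-casing is an assumption about how sentences are broken into words.

```java
import java.util.AbstractMap.SimpleImmutableEntry;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for the two process functions; uses no Flink classes.
final class WordCountSketch {

    // Tokenizer step: split a line into words and emit a (word, 1) pair per word.
    static List<Map.Entry<String, Integer>> tokenize(String line) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        // Assumption: words are runs of word characters, compared case-insensitively.
        for (String token : line.toLowerCase().split("\\W+")) {
            if (!token.isEmpty()) {
                pairs.add(new SimpleImmutableEntry<>(token, 1));
            }
        }
        return pairs;
    }

    // Counter step: keep a running count per word and emit the updated
    // (word, count) pair for every received record, as a streaming job would.
    static List<Map.Entry<String, Integer>> count(List<Map.Entry<String, Integer>> pairs) {
        Map<String, Integer> running = new HashMap<>();
        List<Map.Entry<String, Integer>> updates = new ArrayList<>();
        for (Map.Entry<String, Integer> pair : pairs) {
            int updated = running.merge(pair.getKey(), pair.getValue(), Integer::sum);
            updates.add(new SimpleImmutableEntry<>(pair.getKey(), updated));
        }
        return updates;
    }

    public static void main(String[] args) {
        for (Map.Entry<String, Integer> update : count(tokenize("To be, or not to be"))) {
            System.out.println(update.getKey() + "," + update.getValue());
        }
    }
}
```

Because the counter emits an update for every record rather than a single final total, a word that appears twice produces two output pairs, the second carrying the count 2.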
Please note that if you intend to run this example in an IDE, you must first add the following VM option: "--add-opens=java.base/java.util=ALL-UNNAMED". This is necessary because the module system in JDK 17+ restricts some reflective operations.
Please note that the DataStream API V2 is a new set of APIs intended to gradually replace the original DataStream API. It is currently in the experimental stage and is not yet fully available for production use.
Nested Class Summary
Nested Classes
- static final class: Implements a word counter as a user-defined ProcessFunction that counts received words in streaming mode.
- static final class: Implements the string tokenizer that splits sentences into words as a user-defined ProcessFunction.
Constructor Details
WordCount
public WordCount()
Method Details
main
Throws: Exception