Package org.apache.flink.formats.csv
Class RowCsvInputFormat
java.lang.Object
org.apache.flink.api.common.io.RichInputFormat<OT,org.apache.flink.core.fs.FileInputSplit>
org.apache.flink.api.common.io.FileInputFormat<T>
org.apache.flink.formats.csv.AbstractCsvInputFormat<org.apache.flink.types.Row>
org.apache.flink.formats.csv.RowCsvInputFormat
- All Implemented Interfaces:
Serializable, org.apache.flink.api.common.io.InputFormat<org.apache.flink.types.Row,org.apache.flink.core.fs.FileInputSplit>, org.apache.flink.core.io.InputSplitSource<org.apache.flink.core.fs.FileInputSplit>
Input format that reads CSV into Row.
Differences from the old CSV format, org.apache.flink.api.java.io.RowCsvInputFormat:
1. When a row is too short, the new format emits the row and fills the remaining fields with null. The old format skips the row.
2. The new format removes the escape character. The old format keeps it.
Limitations of the new CSV input format that can still be improved:
1. The comment character cannot be configured. It is "#".
2. Multi-character field delimiters cannot be configured.
3. Reading the first N is not supported; it throws an exception.
4. Only the following line delimiters can be configured: "\r", "\n", or "\r\n".
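The short-row difference described above can be illustrated with a minimal, JDK-only sketch. It is not the Flink implementation: the class ShortRowSketch and the methods newStyle and oldStyle are hypothetical names introduced here, and the naive split on "," stands in for real CSV parsing. The handling of rows that are too long is also an assumption, since this page does not describe that case.

```java
import java.util.Arrays;

// Hypothetical illustration only; none of these names exist in Flink.
final class ShortRowSketch {

    // New-format behavior as described: emit the row and fill the
    // remaining fields with null when the line has too few fields.
    static Object[] newStyle(String[] parsedFields, int arity) {
        Object[] row = new Object[arity]; // all slots start as null
        // Assumption: extra fields are dropped to keep the sketch simple.
        int n = Math.min(parsedFields.length, arity);
        System.arraycopy(parsedFields, 0, row, 0, n);
        return row;
    }

    // Old-format behavior as described: a too-short row is skipped,
    // modeled here by returning null instead of a row.
    static Object[] oldStyle(String[] parsedFields, int arity) {
        if (parsedFields.length < arity) {
            return null;
        }
        return Arrays.copyOf(parsedFields, arity, Object[].class);
    }

    public static void main(String[] args) {
        // A line with 2 fields read against a 4-field schema.
        String[] fields = "1,alice".split(",", -1);
        System.out.println(Arrays.toString(newStyle(fields, 4))); // [1, alice, null, null]
        System.out.println(oldStyle(fields, 4) == null);          // true
    }
}
```

A consumer that migrates from the old format should therefore expect rows with trailing null fields where it previously saw no row at all.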
Nested Class Summary
Nested Classes
Nested classes/interfaces inherited from class org.apache.flink.api.common.io.FileInputFormat
org.apache.flink.api.common.io.FileInputFormat.FileBaseStatistics, org.apache.flink.api.common.io.FileInputFormat.InputSplitOpenThread
-
Field Summary
Fields inherited from class org.apache.flink.formats.csv.AbstractCsvInputFormat
csvInputStream, csvSchema
Fields inherited from class org.apache.flink.api.common.io.FileInputFormat
currentSplit, enumerateNestedFiles, INFLATER_INPUT_STREAM_FACTORIES, minSplitSize, numSplits, openTimeout, READ_WHOLE_SPLIT_FLAG, splitLength, splitStart, stream, unsplittable
-
Method Summary
Modifier and Type, Method, Description
static RowCsvInputFormat.Builder builder(org.apache.flink.api.common.typeinfo.TypeInformation<org.apache.flink.types.Row> typeInfo, org.apache.flink.core.fs.Path... filePaths)
Create a builder.
org.apache.flink.types.Row nextRecord(org.apache.flink.types.Row record)
void open(org.apache.flink.core.fs.FileInputSplit split)
boolean reachedEnd()
Methods inherited from class org.apache.flink.api.common.io.FileInputFormat
acceptFile, close, configure, createInputSplits, decorateInputStream, extractFileExtension, getFilePaths, getFileStats, getFileStats, getInflaterInputStreamFactory, getInputSplitAssigner, getMinSplitSize, getNestedFileEnumeration, getNumSplits, getOpenTimeout, getSplitLength, getSplitStart, getStatistics, getSupportedCompressionFormats, registerInflaterInputStreamFactory, setFilePath, setFilePath, setFilePaths, setFilePaths, setFilesFilter, setMinSplitSize, setNestedFileEnumeration, setNumSplits, setOpenTimeout, testForUnsplittable, toString
Methods inherited from class org.apache.flink.api.common.io.RichInputFormat
closeInputFormat, getRuntimeContext, openInputFormat, setRuntimeContext
-
Method Details
-
open
- Specified by:
open in interface org.apache.flink.api.common.io.InputFormat<org.apache.flink.types.Row,org.apache.flink.core.fs.FileInputSplit>
- Overrides:
open in class AbstractCsvInputFormat<org.apache.flink.types.Row>
- Throws:
IOException
-
reachedEnd
public boolean reachedEnd()
-
nextRecord
- Throws:
IOException
-
builder
public static RowCsvInputFormat.Builder builder(org.apache.flink.api.common.typeinfo.TypeInformation<org.apache.flink.types.Row> typeInfo, org.apache.flink.core.fs.Path... filePaths)
Create a builder.
-