Class SourceOperator<OUT,SplitT extends org.apache.flink.api.connector.source.SourceSplit>
- Type Parameters:
OUT - The output type of the operator.
- All Implemented Interfaces:
Serializable, org.apache.flink.api.common.state.CheckpointListener, AvailabilityProvider, OperatorEventHandler, KeyContext, KeyContextHandler, TimestampsAndWatermarks.WatermarkUpdateListener, StreamOperator<OUT>, StreamOperatorStateHandler.CheckpointedStreamOperator, YieldingOperator<OUT>, PushingAsyncDataInput<OUT>
This operator implements PushingAsyncDataInput, which is naturally compatible with one-input processing in the runtime stack.
Important Note on Serialization: The SourceOperator inherits the Serializable interface from the StreamOperator, but is in fact NOT serializable. The
operator must only be instantiated in the StreamTask from its factory.
-
Nested Class Summary
Nested classes/interfaces inherited from interface org.apache.flink.runtime.io.AvailabilityProvider
AvailabilityProvider.AvailabilityHelper
Nested classes/interfaces inherited from interface org.apache.flink.streaming.runtime.io.PushingAsyncDataInput
PushingAsyncDataInput.DataOutput<T>
-
Field Summary
Fields inherited from class org.apache.flink.streaming.api.operators.AbstractStreamOperator
combinedWatermark, config, lastRecordAttributes1, lastRecordAttributes2, latencyStats, LOG, metrics, output, processingTimeService, stateHandler, stateKeySelector1, stateKeySelector2, timeServiceManager
Fields inherited from interface org.apache.flink.runtime.io.AvailabilityProvider
AVAILABLE
-
Constructor Summary
Constructors
SourceOperator(StreamOperatorParameters<OUT> parameters, org.apache.flink.util.function.FunctionWithException<org.apache.flink.api.connector.source.SourceReaderContext, org.apache.flink.api.connector.source.SourceReader<OUT, SplitT>, Exception> readerFactory, OperatorEventGateway operatorEventGateway, org.apache.flink.core.io.SimpleVersionedSerializer<SplitT> splitSerializer, org.apache.flink.api.common.eventtime.WatermarkStrategy<OUT> watermarkStrategy, ProcessingTimeService timeService, org.apache.flink.configuration.Configuration configuration, String localHostname, boolean emitProgressiveWatermarks, StreamTask.CanEmitBatchOfRecordsChecker canEmitBatchOfRecords, Map<String, Boolean> watermarkIsAlignedMap)
-
Method Summary
Modifier and Type, Method - Description
void close() - This method is called at the very end of the operator's life, both in the case of a successful completion of the operation, and in the case of a failure and canceling.
DataInputStatus emitNext(PushingAsyncDataInput.DataOutput<OUT> output) - Pushes elements to the output from the current data input, and returns the input status to indicate whether more data is available in the current input.
void finish() - This method is called at the end of data processing.
void handleOperatorEvent(OperatorEvent event)
void initializeState(StateInitializationContext context) - Stream operators with state which can be restored need to override this hook method.
void initReader() - Initializes the reader.
protected void initSourceMetricGroup()
void notifyCheckpointAborted(long checkpointId)
void notifyCheckpointComplete(long checkpointId)
void open() - This method is called immediately before any elements are processed. It should contain the operator's initialization logic, e.g. state initialization.
protected void setup(StreamTask<?, ?> containingTask, StreamConfig config, Output<StreamRecord<OUT>> output)
void snapshotState(StateSnapshotContext context) - Stream operators with state that want to participate in a snapshot need to override this hook method.
void splitFinished(String splitId) - Notifies that a split has finished.
void updateCurrentEffectiveWatermark(long watermark) - Updates the effective watermark.
void updateCurrentSplitWatermark(String splitId, long watermark) - Notifies about changes to per-split watermarks.
void updateIdle(boolean isIdle) - Should be called whenever the idle status changes.
Methods inherited from class org.apache.flink.streaming.api.operators.AbstractStreamOperator
beforeInitializeStateHandler, getContainingTask, getCurrentKey, getExecutionConfig, getInternalTimerService, getKeyedStateBackend, getKeyedStateStore, getMetricGroup, getOperatorConfig, getOperatorID, getOperatorName, getOperatorStateBackend, getOrCreateKeyedState, getPartitionedState, getPartitionedState, getProcessingTimeService, getRuntimeContext, getStateKeySelector1, getStateKeySelector2, getTimeServiceManager, getUserCodeClassloader, hasKeyContext1, hasKeyContext2, initializeState, isAsyncStateProcessingEnabled, isUsingCustomRawKeyedState, prepareSnapshotPreBarrier, processLatencyMarker, processLatencyMarker1, processLatencyMarker2, processRecordAttributes, processRecordAttributes1, processRecordAttributes2, processWatermark, processWatermark, processWatermark1, processWatermark1, processWatermark2, processWatermark2, processWatermarkStatus, processWatermarkStatus, processWatermarkStatus1, processWatermarkStatus2, reportOrForwardLatencyMarker, setCurrentKey, setKeyContextElement1, setKeyContextElement2, setMailboxExecutor, setProcessingTimeService, snapshotState, useSplittableTimers
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
Methods inherited from interface org.apache.flink.runtime.io.AvailabilityProvider
isApproximatelyAvailable, isAvailable
Methods inherited from interface org.apache.flink.streaming.api.operators.KeyContextHandler
hasKeyContext
Methods inherited from interface org.apache.flink.streaming.api.operators.StreamOperator
getOperatorAttributes
-
Constructor Details
-
SourceOperator
public SourceOperator(StreamOperatorParameters<OUT> parameters, org.apache.flink.util.function.FunctionWithException<org.apache.flink.api.connector.source.SourceReaderContext, org.apache.flink.api.connector.source.SourceReader<OUT, SplitT>, Exception> readerFactory, OperatorEventGateway operatorEventGateway, org.apache.flink.core.io.SimpleVersionedSerializer<SplitT> splitSerializer, org.apache.flink.api.common.eventtime.WatermarkStrategy<OUT> watermarkStrategy, ProcessingTimeService timeService, org.apache.flink.configuration.Configuration configuration, String localHostname, boolean emitProgressiveWatermarks, StreamTask.CanEmitBatchOfRecordsChecker canEmitBatchOfRecords, Map<String, Boolean> watermarkIsAlignedMap)
-
-
Method Details
-
setup
protected void setup(StreamTask<?, ?> containingTask, StreamConfig config, Output<StreamRecord<OUT>> output)
- Overrides:
setup in class AbstractStreamOperator<OUT>
-
initSourceMetricGroup
@VisibleForTesting protected void initSourceMetricGroup() -
initReader
Initializes the reader. The code from this method should ideally happen in the constructor or even in the operator factory. It has to happen here at a slightly later stage because of the lazy metric initialization.
Calling this method explicitly is an optional way to have the reader initialized a bit earlier than in open(), as needed by the SourceOperatorStreamTask.
This code should move to the constructor once the metric groups are available at task setup time.
- Throws:
Exception
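The ordering described for initReader() can be illustrated with a small standalone sketch. It does not use Flink classes: `LazyReaderOperator`, `Reader`, and the supplier-based factory are hypothetical stand-ins, and the behavior that a second call to `initReader()` is a no-op is an assumption inferred from the note that an explicit early call is optional.

```java
import java.util.function.Supplier;

/** Hypothetical stand-in for a source reader; not a Flink type. */
interface Reader {
    String name();
}

/** Sketch of an operator that creates its reader lazily, at most once. */
class LazyReaderOperator {
    private final Supplier<Reader> readerFactory;
    private Reader reader;      // stays null until initReader() runs
    private int factoryCalls;   // counts factory invocations, for illustration only

    LazyReaderOperator(Supplier<Reader> readerFactory) {
        this.readerFactory = readerFactory;
    }

    /** Creates the reader on the first call; later calls do nothing (assumed behavior). */
    void initReader() {
        if (reader != null) {
            return;
        }
        factoryCalls++;
        reader = readerFactory.get();
    }

    /** open() falls back to initReader() in case nobody called it earlier. */
    void open() {
        initReader();
    }

    Reader reader() {
        return reader;
    }

    int factoryCalls() {
        return factoryCalls;
    }
}
```

With this shape, a task that needs the reader early can call `initReader()` itself, and the later `open()` call does not create a second reader.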
-
getSourceMetricGroup
-
open
Description copied from class: AbstractStreamOperator
This method is called immediately before any elements are processed. It should contain the operator's initialization logic, e.g. state initialization.
The default implementation does nothing.
- Specified by:
open in interface StreamOperator<OUT>
- Overrides:
open in class AbstractStreamOperator<OUT>
- Throws:
Exception - An exception in this method causes the operator to fail.
-
finish
Description copied from interface: StreamOperator
This method is called at the end of data processing.
The method is expected to flush all remaining buffered data. Exceptions during this flushing of buffered data should be propagated, in order to cause the operation to be recognized as failed, because the last data items are not processed properly.
After this method is called, no more records can be produced for the downstream operators.
WARNING: It is not safe to use this method to commit any transactions or other side effects! You can use this method to flush any buffered data that can later on be committed, e.g. in a CheckpointListener.notifyCheckpointComplete(long).
NOTE: This method does not need to close any resources. You should release external resources in the StreamOperator.close() method.
- Specified by:
finish in interface StreamOperator<OUT>
- Overrides:
finish in class AbstractStreamOperator<OUT>
- Throws:
Exception - An exception in this method causes the operator to fail.
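The division of work between finish(), notifyCheckpointComplete(long), and close() described above can be sketched without any Flink dependency. `BufferingStage` and its three lists are hypothetical and exist only to show where flushing, committing, and releasing resources would go; a real operator would write to an external system instead.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical buffering stage; illustrates the finish/commit/close split only. */
class BufferingStage implements AutoCloseable {
    private final List<String> buffer = new ArrayList<>();
    private final List<String> flushed = new ArrayList<>();    // written, not yet committed
    private final List<String> committed = new ArrayList<>();  // visible side effects
    private boolean closed;

    void process(String record) {
        buffer.add(record);
    }

    /** End of data: flush what is buffered, but do not commit. */
    void finish() {
        flushed.addAll(buffer);
        buffer.clear();
    }

    /** Commit only once a checkpoint covering the flushed data has completed. */
    void notifyCheckpointComplete(long checkpointId) {
        committed.addAll(flushed);
        flushed.clear();
    }

    /** Release resources; runs on success, failure, and cancellation. Emits nothing. */
    @Override
    public void close() {
        buffer.clear();
        closed = true;
    }

    List<String> flushed() {
        return flushed;
    }

    List<String> committed() {
        return committed;
    }

    boolean isClosed() {
        return closed;
    }
}
```

Keeping the commit out of `finish()` means that a failure between flushing and the next completed checkpoint leaves no committed side effects behind.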
-
stop
-
close
Description copied from interface: StreamOperator
This method is called at the very end of the operator's life, both in the case of a successful completion of the operation, and in the case of a failure and canceling.
This method is expected to make a thorough effort to release all resources that the operator has acquired.
NOTE: It cannot emit any records! If you need to emit records at the end of processing, do so in the StreamOperator.finish() method.
- Specified by:
close in interface StreamOperator<OUT>
- Overrides:
close in class AbstractStreamOperator<OUT>
- Throws:
Exception
-
emitNext
Description copied from interface: PushingAsyncDataInput
Pushes elements to the output from the current data input, and returns the input status to indicate whether more data is available in the current input.
This method should be non-blocking.
- Specified by:
emitNext in interface PushingAsyncDataInput<OUT>
- Throws:
Exception
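A non-blocking emitNext that reports an input status can be sketched as follows. This is a simplified, single-threaded illustration rather than the operator's implementation: `QueueInput`, the `Status` enum and its three values, and the `Consumer`-based output are all hypothetical names chosen for the example.

```java
import java.util.ArrayDeque;
import java.util.Queue;
import java.util.concurrent.CompletableFuture;
import java.util.function.Consumer;

/** Hypothetical status values; the real input status type may differ. */
enum Status { MORE_AVAILABLE, NOTHING_AVAILABLE, END_OF_INPUT }

/** Sketch of a non-blocking input: emit at most one element per call, never wait. */
class QueueInput {
    private final Queue<String> pending = new ArrayDeque<>();
    private CompletableFuture<Void> available = new CompletableFuture<>();
    private boolean noMoreInput;

    /** Producer side: enqueue an element and signal availability. */
    void add(String element) {
        pending.add(element);
        available.complete(null);
    }

    /** Producer side: no further elements will arrive. */
    void markEndOfInput() {
        noMoreInput = true;
        available.complete(null);
    }

    /** Completed when a call to emitNext can make progress. */
    CompletableFuture<Void> getAvailableFuture() {
        return available;
    }

    /** Pushes one element if present and reports what the caller should do next. */
    Status emitNext(Consumer<String> output) {
        String next = pending.poll();
        if (next != null) {
            output.accept(next);
        }
        if (!pending.isEmpty()) {
            return Status.MORE_AVAILABLE;
        }
        if (noMoreInput) {
            return Status.END_OF_INPUT;
        }
        // Nothing left for now: hand out a fresh, uncompleted future instead of blocking.
        available = new CompletableFuture<>();
        return Status.NOTHING_AVAILABLE;
    }
}
```

A caller would loop on `emitNext` while it returns `MORE_AVAILABLE` and otherwise wait on `getAvailableFuture()` before calling again, which keeps the calling thread free for other work.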
-
snapshotState
Description copied from class: AbstractStreamOperator
Stream operators with state that want to participate in a snapshot need to override this hook method.
- Specified by:
snapshotState in interface StreamOperatorStateHandler.CheckpointedStreamOperator
- Overrides:
snapshotState in class AbstractStreamOperator<OUT>
- Parameters:
context - context that provides information and means required for taking a snapshot
- Throws:
Exception
-
getAvailableFuture
- Specified by:
getAvailableFuture in interface AvailabilityProvider
- Returns:
a future that is completed if the respective provider is available.
-
initializeState
Description copied from class: AbstractStreamOperator
Stream operators with state which can be restored need to override this hook method.
- Specified by:
initializeState in interface StreamOperatorStateHandler.CheckpointedStreamOperator
- Overrides:
initializeState in class AbstractStreamOperator<OUT>
- Parameters:
context - context that allows registering different states.
- Throws:
Exception
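The snapshotState and initializeState hooks form a pair: what one writes, the other must be able to read back. The sketch below shows that round trip for a reader's splits, assuming the state consists of versioned, serialized splits (the constructor's splitSerializer parameter suggests this, but this page does not spell it out). `OffsetSplit`, `SnapshottingReader`, and the byte layout are hypothetical, and the hooks take plain lists instead of Flink's context objects.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical split: a path plus the offset already read. Not a Flink type. */
final class OffsetSplit {
    final String path;
    final long offset;

    OffsetSplit(String path, long offset) {
        this.path = path;
        this.offset = offset;
    }
}

/** Sketch of a reader whose in-flight splits are snapshotted and restored as bytes. */
class SnapshottingReader {
    static final int VERSION = 1;
    private final List<OffsetSplit> splits = new ArrayList<>();

    void addSplit(OffsetSplit split) {
        splits.add(split);
    }

    List<OffsetSplit> splits() {
        return splits;
    }

    /** Snapshot hook: serialize every split the reader currently owns. */
    List<byte[]> snapshotState() {
        List<byte[]> state = new ArrayList<>();
        for (OffsetSplit split : splits) {
            byte[] path = split.path.getBytes(StandardCharsets.UTF_8);
            // Layout (assumed): version, path length, path bytes, offset.
            ByteBuffer buffer = ByteBuffer.allocate(4 + 4 + path.length + 8);
            buffer.putInt(VERSION).putInt(path.length).put(path).putLong(split.offset);
            state.add(buffer.array());
        }
        return state;
    }

    /** Restore hook: rebuild the splits from a previous snapshot. */
    void initializeState(List<byte[]> state) {
        for (byte[] entry : state) {
            ByteBuffer buffer = ByteBuffer.wrap(entry);
            int version = buffer.getInt();
            if (version != VERSION) {
                throw new IllegalStateException("Unsupported split version: " + version);
            }
            byte[] path = new byte[buffer.getInt()];
            buffer.get(path);
            splits.add(new OffsetSplit(new String(path, StandardCharsets.UTF_8), buffer.getLong()));
        }
    }
}
```

Writing the version first lets a later format change be detected at restore time rather than producing silently corrupted splits.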
-
notifyCheckpointComplete
- Specified by:
notifyCheckpointComplete in interface org.apache.flink.api.common.state.CheckpointListener
- Overrides:
notifyCheckpointComplete in class AbstractStreamOperator<OUT>
- Throws:
Exception
-
notifyCheckpointAborted
- Specified by:
notifyCheckpointAborted in interface org.apache.flink.api.common.state.CheckpointListener
- Overrides:
notifyCheckpointAborted in class AbstractStreamOperator<OUT>
- Throws:
Exception
-
handleOperatorEvent
- Specified by:
handleOperatorEvent in interface OperatorEventHandler
-
updateIdle
public void updateIdle(boolean isIdle)
Description copied from interface: TimestampsAndWatermarks.WatermarkUpdateListener
Should be called whenever the idle status changes.
- Specified by:
updateIdle in interface TimestampsAndWatermarks.WatermarkUpdateListener
-
updateCurrentEffectiveWatermark
public void updateCurrentEffectiveWatermark(long watermark)
Description copied from interface: TimestampsAndWatermarks.WatermarkUpdateListener
Updates the effective watermark. If an output becomes idle, call updateIdle(boolean) instead of updating the watermark to Long.MAX_VALUE, because the output needs to distinguish between idleness and a real watermark.
- Specified by:
updateCurrentEffectiveWatermark in interface TimestampsAndWatermarks.WatermarkUpdateListener
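Why idleness is reported separately from the watermark can be seen in a minimal sketch. `OutputStatus` is a hypothetical listener, not a Flink class, and the rule that watermarks only move forward is an assumption made for the illustration. Under that rule, signaling idleness by pushing Long.MAX_VALUE would leave the watermark stuck at the maximum once the output resumes, whereas a separate flag leaves it untouched.

```java
/** Hypothetical downstream view of a source output; not a Flink type. */
class OutputStatus {
    private long watermark = Long.MIN_VALUE;
    private boolean idle;

    /** Real progress: watermarks only move forward (assumed rule). */
    void updateCurrentEffectiveWatermark(long newWatermark) {
        watermark = Math.max(watermark, newWatermark);
    }

    /** Idleness is tracked separately and leaves the watermark untouched. */
    void updateIdle(boolean isIdle) {
        idle = isIdle;
    }

    long watermark() {
        return watermark;
    }

    boolean isIdle() {
        return idle;
    }
}
```

After `updateIdle(true)` followed by `updateIdle(false)`, the output can continue from its last real watermark.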
-
updateCurrentSplitWatermark
Description copied from interface: TimestampsAndWatermarks.WatermarkUpdateListener
Notifies about changes to per-split watermarks.
- Specified by:
updateCurrentSplitWatermark in interface TimestampsAndWatermarks.WatermarkUpdateListener
-
splitFinished
Description copied from interface: TimestampsAndWatermarks.WatermarkUpdateListener
Notifies that a split has finished.
- Specified by:
splitFinished in interface TimestampsAndWatermarks.WatermarkUpdateListener
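One plausible way for a listener to use the two per-split callbacks is plain bookkeeping: remember the latest watermark of each active split and forget a split once it finishes. The sketch below shows only that bookkeeping; `SplitWatermarkTracker` and `minActiveWatermark()` are hypothetical, and this page does not state what the operator itself does with the values.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.OptionalLong;

/** Hypothetical tracker of per-split watermarks; not a Flink type. */
class SplitWatermarkTracker {
    private final Map<String, Long> splitWatermarks = new HashMap<>();

    /** Records the latest watermark reported for a split. */
    void updateCurrentSplitWatermark(String splitId, long watermark) {
        splitWatermarks.put(splitId, watermark);
    }

    /** A finished split no longer takes part in any per-split calculation. */
    void splitFinished(String splitId) {
        splitWatermarks.remove(splitId);
    }

    /** Lowest watermark among active splits, or empty if none are tracked. */
    OptionalLong minActiveWatermark() {
        return splitWatermarks.values().stream().mapToLong(Long::longValue).min();
    }
}
```

Removing finished splits matters here because a split that never reports again would otherwise hold the minimum back indefinitely.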
-
getSourceReader
-