- checkOutputSpecs(JobContext) - Method in class org.apache.hadoop.tools.mapred.CopyOutputFormat
-
- checksumsAreEqual(FileSystem, Path, FileChecksum, FileSystem, Path) - Static method in class org.apache.hadoop.tools.util.DistCpUtils
-
Utility to compare checksums for the paths specified.
- clone() - Method in class org.apache.hadoop.tools.DistCpOptions
-
- close() - Method in class org.apache.hadoop.tools.mapred.lib.DynamicRecordReader
-
Implementation of RecordReader::close().
- close() - Method in class org.apache.hadoop.tools.util.ThrottledInputStream
-
- commitJob(JobContext) - Method in class org.apache.hadoop.tools.mapred.CopyCommitter
-
- compareFs(FileSystem, FileSystem) - Static method in class org.apache.hadoop.tools.util.DistCpUtils
-
- CONF_LABEL_ATOMIC_COPY - Static variable in class org.apache.hadoop.tools.DistCpConstants
-
Constants mapping to command line switches/input options
- CONF_LABEL_BANDWIDTH_MB - Static variable in class org.apache.hadoop.tools.DistCpConstants
-
- CONF_LABEL_COPY_STRATEGY - Static variable in class org.apache.hadoop.tools.DistCpConstants
-
- CONF_LABEL_DELETE_MISSING - Static variable in class org.apache.hadoop.tools.DistCpConstants
-
- CONF_LABEL_DISTCP_JOB_ID - Static variable in class org.apache.hadoop.tools.DistCpConstants
-
DistCp job id for consumers of DistCp
- CONF_LABEL_IGNORE_FAILURES - Static variable in class org.apache.hadoop.tools.DistCpConstants
-
- CONF_LABEL_LISTING_FILE_PATH - Static variable in class org.apache.hadoop.tools.DistCpConstants
-
- CONF_LABEL_LOG_PATH - Static variable in class org.apache.hadoop.tools.DistCpConstants
-
- CONF_LABEL_MAX_MAPS - Static variable in class org.apache.hadoop.tools.DistCpConstants
-
- CONF_LABEL_META_FOLDER - Static variable in class org.apache.hadoop.tools.DistCpConstants
-
- CONF_LABEL_OVERWRITE - Static variable in class org.apache.hadoop.tools.DistCpConstants
-
- CONF_LABEL_PRESERVE_STATUS - Static variable in class org.apache.hadoop.tools.DistCpConstants
-
- CONF_LABEL_SKIP_CRC - Static variable in class org.apache.hadoop.tools.DistCpConstants
-
- CONF_LABEL_SOURCE_LISTING - Static variable in class org.apache.hadoop.tools.DistCpConstants
-
- CONF_LABEL_SSL_CONF - Static variable in class org.apache.hadoop.tools.DistCpConstants
-
- CONF_LABEL_SSL_KEY_STORE_LOCATION - Static variable in class org.apache.hadoop.tools.DistCpConstants
-
Conf label for SSL Key-store location.
- CONF_LABEL_SSL_KEYSTORE - Static variable in class org.apache.hadoop.tools.DistCpConstants
-
- CONF_LABEL_SSL_TRUST_STORE_LOCATION - Static variable in class org.apache.hadoop.tools.DistCpConstants
-
Conf label for SSL Trust-store location.
- CONF_LABEL_SYNC_FOLDERS - Static variable in class org.apache.hadoop.tools.DistCpConstants
-
- CONF_LABEL_TARGET_FINAL_PATH - Static variable in class org.apache.hadoop.tools.DistCpConstants
-
- CONF_LABEL_TARGET_WORK_PATH - Static variable in class org.apache.hadoop.tools.DistCpConstants
-
- CONF_LABEL_TOTAL_BYTES_TO_BE_COPIED - Static variable in class org.apache.hadoop.tools.DistCpConstants
-
- CONF_LABEL_TOTAL_NUMBER_OF_RECORDS - Static variable in class org.apache.hadoop.tools.DistCpConstants
-
- CONF_LABEL_WORK_PATH - Static variable in class org.apache.hadoop.tools.DistCpConstants
-
- CopyCommitter - Class in org.apache.hadoop.tools.mapred
-
The CopyCommitter class is DistCp's OutputCommitter implementation.
- CopyCommitter(Path, TaskAttemptContext) - Constructor for class org.apache.hadoop.tools.mapred.CopyCommitter
-
Create an output committer
- CopyListing - Class in org.apache.hadoop.tools
-
The CopyListing abstraction is responsible for how the list of
sources and targets is constructed, for DistCp's copy function.
- CopyListing(Configuration, Credentials) - Constructor for class org.apache.hadoop.tools.CopyListing
-
Protected constructor, to initialize configuration.
- CopyMapper - Class in org.apache.hadoop.tools.mapred
-
Mapper class that executes the DistCp copy operation.
- CopyMapper() - Constructor for class org.apache.hadoop.tools.mapred.CopyMapper
-
- CopyMapper.Counter - Enum in org.apache.hadoop.tools.mapred
-
Hadoop counters for the DistCp CopyMapper.
- CopyOutputFormat<K,V> - Class in org.apache.hadoop.tools.mapred
-
The CopyOutputFormat is the Hadoop OutputFormat used in DistCp.
- CopyOutputFormat() - Constructor for class org.apache.hadoop.tools.mapred.CopyOutputFormat
-
- createRecordReader(InputSplit, TaskAttemptContext) - Method in class org.apache.hadoop.tools.mapred.lib.DynamicInputFormat
-
Implementation of InputFormat::createRecordReader().
- createRecordReader(InputSplit, TaskAttemptContext) - Method in class org.apache.hadoop.tools.mapred.UniformSizeInputFormat
-
Implementation of InputFormat::createRecordReader().
- DEFAULT_BANDWIDTH_MB - Static variable in class org.apache.hadoop.tools.DistCpConstants
-
- DEFAULT_MAPS - Static variable in class org.apache.hadoop.tools.DistCpConstants
-
- description - Variable in class org.apache.hadoop.tools.util.RetriableCommand
-
- DistCp - Class in org.apache.hadoop.tools
-
DistCp is the main driver-class for DistCpV2.
- DistCp(Configuration, DistCpOptions) - Constructor for class org.apache.hadoop.tools.DistCp
-
Public Constructor.
- DistCpConstants - Class in org.apache.hadoop.tools
-
Utility class to hold commonly used constants.
- DistCpConstants() - Constructor for class org.apache.hadoop.tools.DistCpConstants
-
- DistCpOptions - Class in org.apache.hadoop.tools
-
The Options class encapsulates all DistCp options.
- DistCpOptions(List<Path>, Path) - Constructor for class org.apache.hadoop.tools.DistCpOptions
-
Constructor, to initialize source/target paths.
- DistCpOptions(Path, Path) - Constructor for class org.apache.hadoop.tools.DistCpOptions
-
Constructor, to initialize source/target paths.
- DistCpOptions(DistCpOptions) - Constructor for class org.apache.hadoop.tools.DistCpOptions
-
Copy constructor.
- DistCpOptions.FileAttribute - Enum in org.apache.hadoop.tools
-
- DistCpOptionSwitch - Enum in org.apache.hadoop.tools
-
Enumeration mapping configuration keys to distcp command line
options.
- DistCpUtils - Class in org.apache.hadoop.tools.util
-
Utility functions used in DistCp.
- DistCpUtils() - Constructor for class org.apache.hadoop.tools.util.DistCpUtils
-
- doBuildListing(Path, DistCpOptions) - Method in class org.apache.hadoop.tools.CopyListing
-
The interface to be implemented by sub-classes, to create the source/target file listing.
- doBuildListing(Path, DistCpOptions) - Method in class org.apache.hadoop.tools.FileBasedCopyListing
-
Implementation of CopyListing::buildListing().
- doBuildListing(Path, DistCpOptions) - Method in class org.apache.hadoop.tools.GlobbedCopyListing
-
Implementation of CopyListing::buildListing().
- doBuildListing(Path, DistCpOptions) - Method in class org.apache.hadoop.tools.SimpleCopyListing
-
The interface to be implemented by sub-classes, to create the source/target file listing.
- doBuildListing(SequenceFile.Writer, DistCpOptions) - Method in class org.apache.hadoop.tools.SimpleCopyListing
-
- doExecute(Object...) - Method in class org.apache.hadoop.tools.mapred.RetriableDirectoryCreateCommand
-
Implementation of RetriableCommand::doExecute().
- doExecute(Object...) - Method in class org.apache.hadoop.tools.mapred.RetriableFileCopyCommand
-
Implementation of RetriableCommand::doExecute().
- doExecute(Object...) - Method in class org.apache.hadoop.tools.util.RetriableCommand
-
Implement this interface-method to define the command-logic that will be
retried on failure (i.e.
- DUPLICATE_INPUT - Static variable in class org.apache.hadoop.tools.DistCpConstants
-
- DynamicInputFormat<K,V> - Class in org.apache.hadoop.tools.mapred.lib
-
DynamicInputFormat implements the "Worker pattern" for DistCp.
- DynamicInputFormat() - Constructor for class org.apache.hadoop.tools.mapred.lib.DynamicInputFormat
-
- DynamicRecordReader<K,V> - Class in org.apache.hadoop.tools.mapred.lib
-
The DynamicRecordReader is used in conjunction with the DynamicInputFormat
to implement the "Worker pattern" for DistCp.
- DynamicRecordReader() - Constructor for class org.apache.hadoop.tools.mapred.lib.DynamicRecordReader
-
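The doExecute() entries above describe command-logic that is retried on failure. The sketch below illustrates that pattern using only the Java standard library. It is not Hadoop's RetriableCommand: the names RetrySketch, FlakyCommand and RetrySketchDemo, the fixed attempt budget, and the unchecked exception thrown on exhaustion are all assumptions made for illustration.

```java
/** Demo entry point; every class in this sketch is hypothetical, not Hadoop code. */
class RetrySketchDemo {
    public static void main(String[] args) {
        FlakyCommand command = new FlakyCommand(2, 3);
        Object result = command.execute("payload");
        System.out.println(result + " after " + command.calls + " calls");
    }
}

/** Hypothetical base class: subclasses supply doExecute(), callers use execute(). */
abstract class RetrySketch {
    protected final String description; // used in the failure message
    private final int maxAttempts;      // assumption: a fixed attempt budget

    RetrySketch(String description, int maxAttempts) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        this.description = description;
        this.maxAttempts = maxAttempts;
    }

    /** The logic to be retried; any exception counts as a failed attempt. */
    protected abstract Object doExecute(Object... arguments) throws Exception;

    /** Calls doExecute() until it succeeds or the attempt budget is spent. */
    public Object execute(Object... arguments) {
        Exception last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return doExecute(arguments);
            } catch (Exception e) {
                last = e; // remember the failure and try again
            }
        }
        throw new IllegalStateException("Gave up on: " + description, last);
    }
}

/** Demo command that fails a fixed number of times before succeeding. */
class FlakyCommand extends RetrySketch {
    int calls = 0; // number of times doExecute() has been invoked
    private final int failuresBeforeSuccess;

    FlakyCommand(int failuresBeforeSuccess, int maxAttempts) {
        super("flaky demo command", maxAttempts);
        this.failuresBeforeSuccess = failuresBeforeSuccess;
    }

    @Override
    protected Object doExecute(Object... arguments) throws Exception {
        calls++;
        if (calls <= failuresBeforeSuccess) {
            throw new java.io.IOException("simulated transient failure");
        }
        return arguments[0]; // echo the first argument on success
    }
}
```

Keeping the retry loop in the base class means each concrete command only has to describe one attempt, which is the division of labour the doExecute() descriptions suggest.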
- getAtomicWorkPath() - Method in class org.apache.hadoop.tools.DistCpOptions
-
Get work path for atomic commit.
- getAttribute(char) - Static method in enum org.apache.hadoop.tools.DistCpOptions.FileAttribute
-
- getBytesPerSec() - Method in class org.apache.hadoop.tools.util.ThrottledInputStream
-
Getter for the read-rate from this stream, since creation.
- getBytesToCopy() - Method in class org.apache.hadoop.tools.CopyListing
-
Return the total bytes that DistCp should copy for the source paths.
This doesn't consider whether a file is identical and should be skipped during copy.
- getBytesToCopy() - Method in class org.apache.hadoop.tools.FileBasedCopyListing
-
Return the total bytes that DistCp should copy for the source paths.
This doesn't consider whether a file is identical and should be skipped during copy.
- getBytesToCopy() - Method in class org.apache.hadoop.tools.GlobbedCopyListing
-
Return the total bytes that DistCp should copy for the source paths.
This doesn't consider whether a file is identical and should be skipped during copy.
- getBytesToCopy() - Method in class org.apache.hadoop.tools.SimpleCopyListing
-
Return the total bytes that DistCp should copy for the source paths.
This doesn't consider whether a file is identical and should be skipped during copy.
- getCommitDirectory(Job) - Static method in class org.apache.hadoop.tools.mapred.CopyOutputFormat
-
Getter for the final commit-directory.
- getConfigLabel() - Method in enum org.apache.hadoop.tools.DistCpOptionSwitch
-
Get Configuration label for the option
- getCopyListing(Configuration, Credentials, DistCpOptions) - Static method in class org.apache.hadoop.tools.CopyListing
-
Public Factory method with which the appropriate CopyListing implementation may be retrieved.
- getCopyStrategy() - Method in class org.apache.hadoop.tools.DistCpOptions
-
Get the copy strategy to use.
- getCredentials() - Method in class org.apache.hadoop.tools.CopyListing
-
Get credentials to update the delegation tokens for accessed FS objects
- getCurrentKey() - Method in class org.apache.hadoop.tools.mapred.lib.DynamicRecordReader
-
Implementation of RecordReader::getCurrentKey().
- getCurrentValue() - Method in class org.apache.hadoop.tools.mapred.lib.DynamicRecordReader
-
Implementation of RecordReader::getCurrentValue().
- getFileSize(Path, Configuration) - Static method in class org.apache.hadoop.tools.util.DistCpUtils
-
Retrieves size of the file at the specified path.
- getFormatter() - Static method in class org.apache.hadoop.tools.util.DistCpUtils
-
- getInt(Configuration, String) - Static method in class org.apache.hadoop.tools.util.DistCpUtils
-
Utility to retrieve a specified key from a Configuration.
- getLogPath() - Method in class org.apache.hadoop.tools.DistCpOptions
-
Get output directory for writing distcp logs.
- getLong(Configuration, String) - Static method in class org.apache.hadoop.tools.util.DistCpUtils
-
Utility to retrieve a specified key from a Configuration.
- getMapBandwidth() - Method in class org.apache.hadoop.tools.DistCpOptions
-
Get the map bandwidth in MB
- getMaxMaps() - Method in class org.apache.hadoop.tools.DistCpOptions
-
Get the max number of maps to use for this copy
- getNumberOfPaths() - Method in class org.apache.hadoop.tools.CopyListing
-
Return the total number of paths to distcp, including directories.
This doesn't consider whether a file/dir is already present and should be skipped during copy.
- getNumberOfPaths() - Method in class org.apache.hadoop.tools.FileBasedCopyListing
-
Return the total number of paths to distcp, including directories.
This doesn't consider whether a file/dir is already present and should be skipped during copy.
- getNumberOfPaths() - Method in class org.apache.hadoop.tools.GlobbedCopyListing
-
Return the total number of paths to distcp, including directories.
This doesn't consider whether a file/dir is already present and should be skipped during copy.
- getNumberOfPaths() - Method in class org.apache.hadoop.tools.SimpleCopyListing
-
Return the total number of paths to distcp, including directories.
This doesn't consider whether a file/dir is already present and should be skipped during copy.
- getOption() - Method in enum org.apache.hadoop.tools.DistCpOptionSwitch
-
Get CLI Option corresponding to the distcp option
- getOutputCommitter(TaskAttemptContext) - Method in class org.apache.hadoop.tools.mapred.CopyOutputFormat
-
- getProgress() - Method in class org.apache.hadoop.tools.mapred.lib.DynamicRecordReader
-
Implementation of RecordReader::getProgress().
- getRelativePath(Path, Path) - Static method in class org.apache.hadoop.tools.util.DistCpUtils
-
Gets the relative path of a child path with respect to a root path.
For ex.
- getSourceFileListing() - Method in class org.apache.hadoop.tools.DistCpOptions
-
File path (hdfs:// or file://) that contains the list of actual
files to copy
- getSourcePaths() - Method in class org.apache.hadoop.tools.DistCpOptions
-
Getter for sourcePaths.
- getSplits(JobContext) - Method in class org.apache.hadoop.tools.mapred.lib.DynamicInputFormat
-
Implementation of InputFormat::getSplits().
- getSplits(JobContext) - Method in class org.apache.hadoop.tools.mapred.UniformSizeInputFormat
-
Implementation of InputFormat::getSplits().
- getSslConfigurationFile() - Method in class org.apache.hadoop.tools.DistCpOptions
-
Get path where the ssl configuration file is present to use for hftps://
- getStrategy(Configuration, DistCpOptions) - Static method in class org.apache.hadoop.tools.util.DistCpUtils
-
Returns the class that implements a copy strategy.
- getStringDescriptionFor(long) - Static method in class org.apache.hadoop.tools.util.DistCpUtils
-
- getSwitch() - Method in enum org.apache.hadoop.tools.DistCpOptionSwitch
-
Get Switch symbol
- getTargetPath() - Method in class org.apache.hadoop.tools.DistCpOptions
-
Getter for the targetPath.
- getTotalBytesRead() - Method in class org.apache.hadoop.tools.util.ThrottledInputStream
-
Getter for the number of bytes read from this stream, since creation.
- getTotalSleepTime() - Method in class org.apache.hadoop.tools.util.ThrottledInputStream
-
Getter for the total time spent in sleep.
- getWorkingDirectory(Job) - Static method in class org.apache.hadoop.tools.mapred.CopyOutputFormat
-
Getter for the working directory.
- GlobbedCopyListing - Class in org.apache.hadoop.tools
-
GlobbedCopyListing implements the CopyListing interface, to create the copy
listing-file by "globbing" all specified source paths (wild-cards and all.)
- GlobbedCopyListing(Configuration, Credentials) - Constructor for class org.apache.hadoop.tools.GlobbedCopyListing
-
Constructor, to initialize the configuration.
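The getBytesPerSec(), getTotalBytesRead() and getTotalSleepTime() entries above describe a stream that tracks its read-rate since creation. The sketch below shows one way such a wrapper could be written with only the Java standard library. It is not Hadoop's ThrottledInputStream: the class names ThrottleSketch and ThrottleDemo, the 50 ms sleep step, the whole-second rate calculation and the drain() helper are assumptions made for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

/** Demo entry point; every class in this sketch is hypothetical, not Hadoop code. */
class ThrottleDemo {
    /** Reads the stream to the end and returns the wrapper's byte count. */
    static long drain(ThrottleSketch in) {
        byte[] buffer = new byte[256];
        try {
            while (in.read(buffer, 0, buffer.length) != -1) {
                // discard the data; only the counters matter here
            }
            return in.getTotalBytesRead();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        ThrottleSketch in =
            new ThrottleSketch(new ByteArrayInputStream(new byte[1000]), 1_000_000L);
        System.out.println(drain(in) + " bytes, slept " + in.getTotalSleepTime() + " ms");
    }
}

/** Hypothetical wrapper that sleeps before a read whenever the rate is over the cap. */
class ThrottleSketch extends InputStream {
    private static final long SLEEP_MILLIS = 50; // assumption: fixed back-off step
    private final InputStream in;
    private final long maxBytesPerSec;
    private final long startNanos = System.nanoTime();
    private long bytesRead = 0;
    private long totalSleepMillis = 0;

    ThrottleSketch(InputStream in, long maxBytesPerSec) {
        this.in = in;
        this.maxBytesPerSec = maxBytesPerSec;
    }

    @Override
    public int read() throws IOException {
        throttle();
        int b = in.read();
        if (b != -1) bytesRead++;
        return b;
    }

    @Override
    public int read(byte[] buf, int off, int len) throws IOException {
        throttle();
        int n = in.read(buf, off, len);
        if (n > 0) bytesRead += n;
        return n;
    }

    /** Sleeps in small steps until the average rate drops to the cap or below. */
    private void throttle() throws IOException {
        while (getBytesPerSec() > maxBytesPerSec) {
            try {
                Thread.sleep(SLEEP_MILLIS);
                totalSleepMillis += SLEEP_MILLIS;
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IOException("interrupted while throttling", e);
            }
        }
    }

    /** Average rate since creation; the first partial second counts as one second. */
    public long getBytesPerSec() {
        long elapsedSec = (System.nanoTime() - startNanos) / 1_000_000_000L;
        return elapsedSec == 0 ? bytesRead : bytesRead / elapsedSec;
    }

    public long getTotalBytesRead() { return bytesRead; }

    public long getTotalSleepTime() { return totalSleepMillis; }

    @Override
    public void close() throws IOException { in.close(); }
}
```

Because the rate is averaged over the stream's whole lifetime, a burst early on is paid back by sleeping later, so the loop always terminates as elapsed time grows.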
- setAtomicCommit(boolean) - Method in class org.apache.hadoop.tools.DistCpOptions
-
Set if data needs to be committed atomically
- setAtomicWorkPath(Path) - Method in class org.apache.hadoop.tools.DistCpOptions
-
Set the work path for atomic commit
- setBlocking(boolean) - Method in class org.apache.hadoop.tools.DistCpOptions
-
Set if DistCp should run blocking or non-blocking
- setCommitDirectory(Job, Path) - Static method in class org.apache.hadoop.tools.mapred.CopyOutputFormat
-
Setter for the final directory for DistCp (where files copied will be
moved, atomically.)
- setCopyStrategy(String) - Method in class org.apache.hadoop.tools.DistCpOptions
-
Set the copy strategy to use.
- setCredentials(Credentials) - Method in class org.apache.hadoop.tools.CopyListing
-
Set Credentials store, on which FS delegation tokens will be cached
- setDeleteMissing(boolean) - Method in class org.apache.hadoop.tools.DistCpOptions
-
Set if files only present in target should be deleted
- setIgnoreFailures(boolean) - Method in class org.apache.hadoop.tools.DistCpOptions
-
Set if failures during copy should be ignored
- setLogPath(Path) - Method in class org.apache.hadoop.tools.DistCpOptions
-
Set the log path where distcp output logs are stored.
Uses JobStagingDir/_logs by default.
- setMapBandwidth(int) - Method in class org.apache.hadoop.tools.DistCpOptions
-
Set per map bandwidth
- setMaxMaps(int) - Method in class org.apache.hadoop.tools.DistCpOptions
-
Set the max number of maps to use for copy
- setOverwrite(boolean) - Method in class org.apache.hadoop.tools.DistCpOptions
-
Set if files should always be overwritten on target
- setRetryPolicy(RetryPolicy) - Method in class org.apache.hadoop.tools.util.RetriableCommand
-
Fluent-interface to change the RetryHandler.
- setSkipCRC(boolean) - Method in class org.apache.hadoop.tools.DistCpOptions
-
Set if checksum comparison should be skipped while determining if
source and destination files are identical
- setSourcePaths(List<Path>) - Method in class org.apache.hadoop.tools.DistCpOptions
-
Setter for sourcePaths.
- setSslConfigurationFile(String) - Method in class org.apache.hadoop.tools.DistCpOptions
-
Set the SSL configuration file path to use with hftps:// (local path)
- setSyncFolder(boolean) - Method in class org.apache.hadoop.tools.DistCpOptions
-
Set if source and target folder contents should be sync'ed up
- setup(Mapper<Text, FileStatus, Text, Text>.Context) - Method in class org.apache.hadoop.tools.mapred.CopyMapper
-
Implementation of the Mapper::setup() method.
- setWorkingDirectory(Job, Path) - Static method in class org.apache.hadoop.tools.mapred.CopyOutputFormat
-
Setter for the working directory for DistCp (where files will be copied
before they are moved to the final commit-directory.)
- shouldAtomicCommit() - Method in class org.apache.hadoop.tools.DistCpOptions
-
Should the data be committed atomically?
- shouldBlock() - Method in class org.apache.hadoop.tools.DistCpOptions
-
Should DistCp be running in blocking mode?
- shouldDeleteMissing() - Method in class org.apache.hadoop.tools.DistCpOptions
-
Should target files missing in source be deleted?
- shouldIgnoreFailures() - Method in class org.apache.hadoop.tools.DistCpOptions
-
Should failures be logged and ignored during copy?
- shouldOverwrite() - Method in class org.apache.hadoop.tools.DistCpOptions
-
Should files be overwritten always?
- shouldPreserve(DistCpOptions.FileAttribute) - Method in class org.apache.hadoop.tools.DistCpOptions
-
Checks if the input attribute should be preserved or not
- shouldSkipCRC() - Method in class org.apache.hadoop.tools.DistCpOptions
-
Should the CRC/checksum check be skipped while checking whether files are identical?
- shouldSyncFolder() - Method in class org.apache.hadoop.tools.DistCpOptions
-
Should the data be sync'ed between source and target paths?
- SHUTDOWN_HOOK_PRIORITY - Static variable in class org.apache.hadoop.tools.DistCp
-
Priority of the ResourceManager shutdown hook.
- SimpleCopyListing - Class in org.apache.hadoop.tools
-
The SimpleCopyListing is responsible for making the exhaustive list of
all files/directories under its specified list of input-paths.
- SimpleCopyListing(Configuration, Credentials) - Constructor for class org.apache.hadoop.tools.SimpleCopyListing
-
Protected constructor, to initialize configuration.
- sortListing(FileSystem, Configuration, Path) - Static method in class org.apache.hadoop.tools.util.DistCpUtils
-
Sort sequence file containing FileStatus and Text as key and value respectively
- SUCCESS - Static variable in class org.apache.hadoop.tools.DistCpConstants
-
Constants for DistCp return code to shell / consumer of ToolRunner's run