public static class SequenceFile.Sorter
extends java.lang.Object
For best performance, applications should make sure that the Writable.readFields(DataInput) implementation of their keys is
very efficient. In particular, it should avoid allocating memory.
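The advice above can be illustrated with a minimal sketch. `ReusableBytesKey` is a hypothetical class, not part of Hadoop; it only mirrors the `write`/`readFields` shape of `Writable` so that it compiles without Hadoop on the classpath. It assumes a simple length-prefixed byte layout and shows one way to make `readFields` overwrite the instance in place instead of allocating on every call.

```java
import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;
import java.util.Arrays;

/**
 * Hypothetical key type (an assumption for illustration, not a Hadoop class).
 * It mirrors the write/readFields shape of Writable without depending on Hadoop.
 */
final class ReusableBytesKey {
    private byte[] bytes = new byte[16]; // backing buffer, reused across reads
    private int length;                  // number of valid bytes in the buffer

    /** Copies the given bytes into the reusable buffer. */
    void set(byte[] src) {
        ensureCapacity(src.length);
        System.arraycopy(src, 0, bytes, 0, src.length);
        length = src.length;
    }

    /** Serializes the key as a 4-byte length followed by the payload. */
    void write(DataOutput out) throws IOException {
        out.writeInt(length);
        out.write(bytes, 0, length);
    }

    /** Overwrites this instance in place; allocates only if the buffer must grow. */
    void readFields(DataInput in) throws IOException {
        int newLength = in.readInt();
        ensureCapacity(newLength);
        in.readFully(bytes, 0, newLength);
        length = newLength;
    }

    private void ensureCapacity(int needed) {
        if (needed > bytes.length) {
            bytes = new byte[Math.max(needed, bytes.length * 2)];
        }
    }

    /** Returns a copy of the valid bytes (this accessor does allocate). */
    byte[] copyBytes() {
        return Arrays.copyOf(bytes, length);
    }

    /** Exposes the backing array so callers can confirm it is being reused. */
    byte[] backingArray() {
        return bytes;
    }

    int getLength() {
        return length;
    }
}
```

Reading many records into a single `ReusableBytesKey` instance touches the allocator only when a record is larger than any seen before, which is the property the note above asks for.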
| Modifier and Type | Class | Description |
|---|---|---|
| static interface | SequenceFile.Sorter.RawKeyValueIterator | The interface to iterate over raw keys/values of SequenceFiles. |
| class | SequenceFile.Sorter.SegmentDescriptor | This class defines a merge segment. |
| Constructor | Description |
|---|---|
| Sorter(FileSystem fs, java.lang.Class<? extends WritableComparable> keyClass, java.lang.Class valClass, Configuration conf) | Sort and merge files containing the named classes. |
| Sorter(FileSystem fs, RawComparator comparator, java.lang.Class keyClass, java.lang.Class valClass, Configuration conf) | Sort and merge using an arbitrary RawComparator. |
| Sorter(FileSystem fs, RawComparator comparator, java.lang.Class keyClass, java.lang.Class valClass, Configuration conf, SequenceFile.Metadata metadata) | Sort and merge using an arbitrary RawComparator. |
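To make the idea of a raw comparator concrete, here is a small sketch of comparing keys directly in their serialized form. `RawIntCompare` is a hypothetical helper, not a Hadoop class; its `compare` method only borrows the argument shape of `RawComparator.compare(byte[] b1, int s1, int l1, byte[] b2, int s2, int l2)`. It assumes each key is a 4-byte big-endian int, so no key object has to be deserialized to order two records.

```java
/**
 * Hypothetical helper (an assumption for illustration, not a Hadoop class).
 * Mirrors the argument shape of RawComparator.compare.
 */
final class RawIntCompare {
    /**
     * Compares two serialized keys, each assumed to be a 4-byte big-endian int
     * starting at offsets s1 and s2. The lengths l1 and l2 are accepted to match
     * the raw-comparator shape but are not needed for fixed-width keys.
     */
    static int compare(byte[] b1, int s1, int l1, byte[] b2, int s2, int l2) {
        return Integer.compare(readInt(b1, s1), readInt(b2, s2));
    }

    /** Decodes a big-endian int from four bytes starting at off. */
    private static int readInt(byte[] b, int off) {
        return ((b[off] & 0xff) << 24)
             | ((b[off + 1] & 0xff) << 16)
             | ((b[off + 2] & 0xff) << 8)
             |  (b[off + 3] & 0xff);
    }
}
```

A real comparator passed to the `Sorter` constructors would implement Hadoop's `RawComparator` interface; the sketch only shows the byte-level comparison such an implementation might perform.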
| Modifier and Type | Method | Description |
|---|---|---|
| SequenceFile.Writer | cloneFileAttributes(Path inputFile, Path outputFile, Progressable prog) | Clones the attributes (like compression) of the input file and creates a corresponding Writer. |
| int | getFactor() | |
| int | getMemory() | |
| SequenceFile.Sorter.RawKeyValueIterator | merge(java.util.List<SequenceFile.Sorter.SegmentDescriptor> segments, Path tmpDir) | Merges the list of segments of type SegmentDescriptor. |
| SequenceFile.Sorter.RawKeyValueIterator | merge(Path[] inNames, boolean deleteInputs, int factor, Path tmpDir) | Merges the contents of files passed in Path[]. |
| SequenceFile.Sorter.RawKeyValueIterator | merge(Path[] inNames, boolean deleteInputs, Path tmpDir) | Merges the contents of files passed in Path[] using a max factor value that is already set. |
| void | merge(Path[] inFiles, Path outFile) | Merge the provided files. |
| SequenceFile.Sorter.RawKeyValueIterator | merge(Path[] inNames, Path tempDir, boolean deleteInputs) | Merges the contents of files passed in Path[]. |
| void | setFactor(int factor) | Set the number of streams to merge at once. |
| void | setMemory(int memory) | Set the total amount of buffer memory, in bytes. |
| void | setProgressable(Progressable progressable) | Set the progressable object in order to report progress. |
| void | sort(Path[] inFiles, Path outFile, boolean deleteInput) | Perform a file sort from a set of input files into an output file. |
| void | sort(Path inFile, Path outFile) | The backwards compatible interface to sort. |
| SequenceFile.Sorter.RawKeyValueIterator | sortAndIterate(Path[] inFiles, Path tempDir, boolean deleteInput) | Perform a file sort from a set of input files and return an iterator. |
| void | writeFile(SequenceFile.Sorter.RawKeyValueIterator records, SequenceFile.Writer writer) | Writes records from RawKeyValueIterator into a file represented by the passed writer. |
public Sorter(FileSystem fs, java.lang.Class<? extends WritableComparable> keyClass, java.lang.Class valClass, Configuration conf)
Parameters:
fs - input FileSystem.
keyClass - input keyClass.
valClass - input valClass.
conf - input Configuration.

public Sorter(FileSystem fs, RawComparator comparator, java.lang.Class keyClass, java.lang.Class valClass, Configuration conf)
Sort and merge using an arbitrary RawComparator.
Parameters:
fs - input FileSystem.
comparator - input RawComparator.
keyClass - input keyClass.
valClass - input valClass.
conf - input Configuration.

public Sorter(FileSystem fs, RawComparator comparator, java.lang.Class keyClass, java.lang.Class valClass, Configuration conf, SequenceFile.Metadata metadata)
Sort and merge using an arbitrary RawComparator.
Parameters:
fs - input FileSystem.
comparator - input RawComparator.
keyClass - input keyClass.
valClass - input valClass.
conf - input Configuration.
metadata - input metadata.

public void setFactor(int factor)
Parameters:
factor - factor.

public int getFactor()
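
The factor is described on this page as the number of streams to merge at once, that is, the maximum merge fan-in. The following sketch illustrates that idea in plain Java. It is not Hadoop's implementation: `FanInMergeSketch` and its methods are hypothetical names, the runs are in-memory integer lists rather than files, and intermediate results stay in memory instead of going to a temporary directory. It merges sorted runs in passes so that no single pass combines more than `factor` runs.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import java.util.PriorityQueue;

/** Hypothetical illustration of a merge with bounded fan-in (not Hadoop code). */
final class FanInMergeSketch {
    /** Merges sorted runs, never combining more than factor runs in one pass. */
    static List<Integer> merge(List<List<Integer>> sortedRuns, int factor) {
        if (factor < 2) {
            throw new IllegalArgumentException("factor must be at least 2");
        }
        Deque<List<Integer>> pending = new ArrayDeque<>(sortedRuns);
        if (pending.isEmpty()) {
            return new ArrayList<>();
        }
        // Each pass takes up to `factor` runs and replaces them with one merged run.
        while (pending.size() > 1) {
            List<List<Integer>> batch = new ArrayList<>();
            while (batch.size() < factor && !pending.isEmpty()) {
                batch.add(pending.pollFirst());
            }
            pending.addLast(mergeOnce(batch));
        }
        return new ArrayList<>(pending.pollFirst());
    }

    /** Single k-way merge of the given sorted runs using a min-heap. */
    private static List<Integer> mergeOnce(List<List<Integer>> runs) {
        // Heap entries are {value, run index, position within that run}.
        PriorityQueue<int[]> heap =
            new PriorityQueue<>((a, b) -> Integer.compare(a[0], b[0]));
        for (int r = 0; r < runs.size(); r++) {
            if (!runs.get(r).isEmpty()) {
                heap.add(new int[] {runs.get(r).get(0), r, 0});
            }
        }
        List<Integer> out = new ArrayList<>();
        while (!heap.isEmpty()) {
            int[] head = heap.poll();
            out.add(head[0]);
            List<Integer> run = runs.get(head[1]);
            int next = head[2] + 1;
            if (next < run.size()) {
                heap.add(new int[] {run.get(next), head[1], next});
            }
        }
        return out;
    }
}
```

In this sketch a smaller factor means more passes over the data with fewer runs open at a time, while a larger factor means fewer passes with more runs open at once.
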
public void setMemory(int memory)
Parameters:
memory - buffer memory.

public int getMemory()

public void setProgressable(Progressable progressable)
Parameters:
progressable - input Progressable.

public void sort(Path[] inFiles, Path outFile, boolean deleteInput) throws java.io.IOException
Parameters:
inFiles - the files to be sorted
outFile - the sorted output file
deleteInput - should the input files be deleted as they are read?
Throws:
java.io.IOException - raised on errors performing I/O.

public SequenceFile.Sorter.RawKeyValueIterator sortAndIterate(Path[] inFiles, Path tempDir, boolean deleteInput) throws java.io.IOException
Parameters:
inFiles - the files to be sorted
tempDir - the directory where temp files are created during sort
deleteInput - should the input files be deleted as they are read?
Throws:
java.io.IOException - raised on errors performing I/O.

public void sort(Path inFile, Path outFile) throws java.io.IOException
Parameters:
inFile - the input file to sort.
outFile - the sorted output file.
Throws:
java.io.IOException - raised on errors performing I/O.

public SequenceFile.Sorter.RawKeyValueIterator merge(java.util.List<SequenceFile.Sorter.SegmentDescriptor> segments, Path tmpDir) throws java.io.IOException
Merges the list of segments of type SegmentDescriptor.
Parameters:
segments - the list of SegmentDescriptors
tmpDir - the directory to write temporary files into
Throws:
java.io.IOException - raised on errors performing I/O.

public SequenceFile.Sorter.RawKeyValueIterator merge(Path[] inNames, boolean deleteInputs, Path tmpDir) throws java.io.IOException
Parameters:
inNames - the array of path names
deleteInputs - true if the input files should be deleted when unnecessary
tmpDir - the directory to write temporary files into
Throws:
java.io.IOException - raised on errors performing I/O.

public SequenceFile.Sorter.RawKeyValueIterator merge(Path[] inNames, boolean deleteInputs, int factor, Path tmpDir) throws java.io.IOException
Parameters:
inNames - the array of path names
deleteInputs - true if the input files should be deleted when unnecessary
factor - the factor that will be used as the maximum merge fan-in
tmpDir - the directory to write temporary files into
Throws:
java.io.IOException - raised on errors performing I/O.

public SequenceFile.Sorter.RawKeyValueIterator merge(Path[] inNames, Path tempDir, boolean deleteInputs) throws java.io.IOException
Parameters:
inNames - the array of path names
tempDir - the directory for creating temp files during merge
deleteInputs - true if the input files should be deleted when unnecessary
Throws:
java.io.IOException - raised on errors performing I/O.

public SequenceFile.Writer cloneFileAttributes(Path inputFile, Path outputFile, Progressable prog) throws java.io.IOException
Parameters:
inputFile - the path of the input file whose attributes should be cloned
outputFile - the path of the output file
prog - the Progressable to report status during the file write
Throws:
java.io.IOException - raised on errors performing I/O.

public void writeFile(SequenceFile.Sorter.RawKeyValueIterator records, SequenceFile.Writer writer) throws java.io.IOException
Parameters:
records - the RawKeyValueIterator
writer - the Writer created earlier
Throws:
java.io.IOException - raised on errors performing I/O.

Copyright © 2008–2025 Apache Software Foundation. All rights reserved.