@Private
@Unstable
public final class TaskPool
extends java.lang.Object
org.apache.hadoop.fs.s3a.commit.Tasks which came from
the Netflix committer patch.
Apache Iceberg has its own version of this, with a common ancestor
at some point in its history.
A key difference with this class is that the iterator is always,
internally, an RemoteIterator.
This is to allow tasks to be scheduled while incremental operations
such as paged directory listings are still collecting in results.
While awaiting completion, this thread spins and sleeps a time of
SLEEP_INTERVAL_AWAITING_COMPLETION, which, being a
busy-wait, is inefficient.
There's an implicit assumption that remote IO is being performed, and
so this is not impacting throughput/performance.
History:
This class came with the Netflix contributions to the S3A committers
in HADOOP-13786.
It was moved into hadoop-common for use in the manifest committer and
anywhere else it is needed, and renamed in the process as
"Tasks" has too many meanings in the hadoop source.
The iterator was then changed from a normal java iterable
to a hadoop RemoteIterator.
This allows a task pool to be supplied with incremental listings
from object stores, scheduling work as pages of listing
results come in, rather than blocking until the entire
directory/directory tree etc has been enumerated.
There is a variant of this in Apache Iceberg in
org.apache.iceberg.util.Tasks
That is not derived from any version in the hadoop codebase, it
just shares a common ancestor somewhere in the Netflix codebase.
It is the more sophisticated version.| Modifier and Type | Class | Description |
|---|---|---|
static class |
TaskPool.Builder<I> |
Builder for task execution.
|
static interface |
TaskPool.FailureTask<I,E extends java.lang.Exception> |
Callback invoked on a failure.
|
static interface |
TaskPool.Submitter |
Interface to whatever lets us submit tasks.
|
static interface |
TaskPool.Task<I,E extends java.lang.Exception> |
Callback invoked to process an item.
|
| Modifier and Type | Method | Description |
|---|---|---|
static <I> TaskPool.Builder<I> |
foreach(I[] items) |
|
static <I> TaskPool.Builder<I> |
foreach(java.lang.Iterable<I> items) |
Create a task builder for the iterable.
|
static <I> TaskPool.Builder<I> |
foreach(RemoteIterator<I> items) |
Create a task builder for the remote iterator.
|
public static <I> TaskPool.Builder<I> foreach(java.lang.Iterable<I> items)
I - type of result.items - item source.public static <I> TaskPool.Builder<I> foreach(RemoteIterator<I> items)
I - type of result.items - item source.public static <I> TaskPool.Builder<I> foreach(I[] items)
Copyright © 2008–2025 Apache Software Foundation. All rights reserved.