org.apache.hadoop.mapreduce.v2.app.speculate
Class ExponentiallySmoothedTaskRuntimeEstimator

java.lang.Object
  extended by org.apache.hadoop.mapreduce.v2.app.speculate.ExponentiallySmoothedTaskRuntimeEstimator
All Implemented Interfaces:
TaskRuntimeEstimator

public class ExponentiallySmoothedTaskRuntimeEstimator
extends Object

This estimator applies exponential smoothing to the rate of progress per unit of wall-clock time. Conceivably we could write an estimator that instead smooths the time per unit of progress, and it could produce different results.
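
To make the idea of smoothing a progress rate concrete, here is a minimal, self-contained sketch. It is not the Hadoop implementation: the class name SmoothedRateSketch, the lambdaMillis time constant, and the exp(-dt / lambda) weighting are all assumptions chosen for illustration. It tracks one attempt, blends each new progress-rate reading into a running average, and projects a total runtime that includes the time already elapsed, returning -1 when no reading is available.

```java
/**
 * Illustrative sketch only; every name here is hypothetical and not part of Hadoop.
 * Smooths progress per millisecond for a single attempt and projects its total runtime.
 */
final class SmoothedRateSketch {
    private final double lambdaMillis;   // assumed smoothing time constant
    private final long startTimestamp;   // when the attempt was enrolled
    private double smoothedRate = -1.0;  // progress per ms; negative means "no reading yet"
    private double lastProgress = 0.0;   // progress is a fraction in [0, 1]
    private long lastTimestamp;

    SmoothedRateSketch(double lambdaMillis, long startTimestamp) {
        this.lambdaMillis = lambdaMillis;
        this.startTimestamp = startTimestamp;
        this.lastTimestamp = startTimestamp;
    }

    /** Folds a new (progress, timestamp) observation into the smoothed rate. */
    void update(double progress, long timestamp) {
        long dt = timestamp - lastTimestamp;
        if (dt <= 0 || progress < lastProgress) {
            return; // ignore out-of-order or regressing updates
        }
        double newRate = (progress - lastProgress) / dt;
        if (smoothedRate < 0.0) {
            smoothedRate = newRate; // the first reading seeds the average
        } else {
            // The old value keeps more weight when the new reading covers a short interval.
            double oldWeight = Math.exp(-dt / lambdaMillis);
            smoothedRate = oldWeight * smoothedRate + (1.0 - oldWeight) * newRate;
        }
        lastProgress = progress;
        lastTimestamp = timestamp;
    }

    /** Projected total runtime in ms, including elapsed time, or -1 if unknown. */
    long estimatedRuntime() {
        if (smoothedRate <= 0.0) {
            return -1L;
        }
        long elapsed = lastTimestamp - startTimestamp;
        long remaining = Math.round((1.0 - lastProgress) / smoothedRate);
        return elapsed + remaining;
    }
}
```

In this sketch a larger lambdaMillis keeps more of the old value on each update, so the estimate reacts more slowly to short bursts or stalls in reported progress.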


Nested Class Summary
static class ExponentiallySmoothedTaskRuntimeEstimator.SmoothedValue
           
 
Field Summary
protected  AppContext context
           
protected  Set<Task> doneTasks
           
protected  Map<Job,DataStatistics> mapperStatistics
           
protected  Map<Job,DataStatistics> reducerStatistics
           
protected  Map<org.apache.hadoop.mapreduce.v2.api.records.TaskAttemptId,Long> startTimes
           
 
Constructor Summary
ExponentiallySmoothedTaskRuntimeEstimator()
           
 
Method Summary
 long attemptEnrolledTime(org.apache.hadoop.mapreduce.v2.api.records.TaskAttemptId attemptID)
           
 void contextualize(org.apache.hadoop.conf.Configuration conf, AppContext context)
           
protected  DataStatistics dataStatisticsForTask(org.apache.hadoop.mapreduce.v2.api.records.TaskId taskID)
           
 void enrollAttempt(TaskAttemptStatusUpdateEvent.TaskAttemptStatus status, long timestamp)
           
 long estimatedNewAttemptRuntime(org.apache.hadoop.mapreduce.v2.api.records.TaskId id)
          Estimates how long a new attempt on this task will take if we start one now.
 long estimatedRuntime(org.apache.hadoop.mapreduce.v2.api.records.TaskAttemptId id)
          Estimate a task attempt's total runtime.
 long runtimeEstimateVariance(org.apache.hadoop.mapreduce.v2.api.records.TaskAttemptId id)
          Computes the width of the error band of our estimate of the task runtime as returned by TaskRuntimeEstimator.estimatedRuntime(TaskAttemptId).
 long thresholdRuntime(org.apache.hadoop.mapreduce.v2.api.records.TaskId taskID)
          Find a maximum reasonable execution wallclock time.
 void updateAttempt(TaskAttemptStatusUpdateEvent.TaskAttemptStatus status, long timestamp)
           
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

context

protected AppContext context

startTimes

protected final Map<org.apache.hadoop.mapreduce.v2.api.records.TaskAttemptId,Long> startTimes

mapperStatistics

protected final Map<Job,DataStatistics> mapperStatistics

reducerStatistics

protected final Map<Job,DataStatistics> reducerStatistics

doneTasks

protected final Set<Task> doneTasks
Constructor Detail

ExponentiallySmoothedTaskRuntimeEstimator

public ExponentiallySmoothedTaskRuntimeEstimator()
Method Detail

contextualize

public void contextualize(org.apache.hadoop.conf.Configuration conf,
                          AppContext context)
Specified by:
contextualize in interface TaskRuntimeEstimator

estimatedRuntime

public long estimatedRuntime(org.apache.hadoop.mapreduce.v2.api.records.TaskAttemptId id)
Description copied from interface: TaskRuntimeEstimator
Estimate a task attempt's total runtime. Includes the time already elapsed.

Parameters:
id - the TaskAttemptId of the attempt we are asking about
Returns:
our best estimate of the attempt's runtime, or -1 if we don't have enough information yet to produce an estimate.

runtimeEstimateVariance

public long runtimeEstimateVariance(org.apache.hadoop.mapreduce.v2.api.records.TaskAttemptId id)
Description copied from interface: TaskRuntimeEstimator
Computes the width of the error band of our estimate of the task runtime as returned by TaskRuntimeEstimator.estimatedRuntime(TaskAttemptId).

Parameters:
id - the TaskAttemptId of the attempt we are asking about
Returns:
the width of the error band of our estimate of the attempt's runtime, or -1 if we don't have enough information yet to produce an estimate.

updateAttempt

public void updateAttempt(TaskAttemptStatusUpdateEvent.TaskAttemptStatus status,
                          long timestamp)
Specified by:
updateAttempt in interface TaskRuntimeEstimator

enrollAttempt

public void enrollAttempt(TaskAttemptStatusUpdateEvent.TaskAttemptStatus status,
                          long timestamp)
Specified by:
enrollAttempt in interface TaskRuntimeEstimator

attemptEnrolledTime

public long attemptEnrolledTime(org.apache.hadoop.mapreduce.v2.api.records.TaskAttemptId attemptID)
Specified by:
attemptEnrolledTime in interface TaskRuntimeEstimator

dataStatisticsForTask

protected DataStatistics dataStatisticsForTask(org.apache.hadoop.mapreduce.v2.api.records.TaskId taskID)

thresholdRuntime

public long thresholdRuntime(org.apache.hadoop.mapreduce.v2.api.records.TaskId taskID)
Description copied from interface: TaskRuntimeEstimator
Find a maximum reasonable execution wallclock time. Includes the time already elapsed. If the projected total execution time for this task ever exceeds its reasonable execution time, we may speculate it.

Specified by:
thresholdRuntime in interface TaskRuntimeEstimator
Parameters:
taskID - the TaskId of the task we are asking about
Returns:
the task's maximum reasonable runtime, or MAX_VALUE if we don't have enough information to rule out any runtime, however long.
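
The comparison described above can be sketched as a small helper. This is an illustration under stated assumptions, not code from Hadoop: the class SpeculationCheckSketch and the method exceedsThreshold are hypothetical, and the two arguments stand in for values a caller would obtain from estimatedRuntime and thresholdRuntime. The sketch mainly shows how the two sentinel values (-1 and MAX_VALUE) might be handled before comparing.

```java
/**
 * Illustrative sketch only; names are hypothetical and not part of Hadoop.
 * Decides whether an attempt's projected runtime exceeds the task's threshold.
 */
final class SpeculationCheckSketch {
    /**
     * @param estimatedRuntime projected total runtime in ms, or -1 if unknown
     * @param thresholdRuntime maximum reasonable runtime in ms, or Long.MAX_VALUE
     *                         if no runtime can be ruled out
     * @return true if the attempt is a candidate for speculation
     */
    static boolean exceedsThreshold(long estimatedRuntime, long thresholdRuntime) {
        if (estimatedRuntime < 0) {
            return false; // no estimate yet, so nothing to compare
        }
        if (thresholdRuntime == Long.MAX_VALUE) {
            return false; // no runtime can be ruled out for this task
        }
        return estimatedRuntime > thresholdRuntime;
    }
}
```

Checking the sentinels explicitly keeps the intent visible, even though a plain comparison against Long.MAX_VALUE could never succeed.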

estimatedNewAttemptRuntime

public long estimatedNewAttemptRuntime(org.apache.hadoop.mapreduce.v2.api.records.TaskId id)
Description copied from interface: TaskRuntimeEstimator
Estimates how long a new attempt on this task will take if we start one now.

Specified by:
estimatedNewAttemptRuntime in interface TaskRuntimeEstimator
Parameters:
id - the TaskId of the task we are asking about
Returns:
our best estimate of a new attempt's runtime, or -1 if we don't have enough information yet to produce an estimate.
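
One plausible use of this estimate is to check whether a replacement attempt started now would finish before the attempt that is already running. The sketch below assumes that use; the class ReplacementBenefitSketch, the method projectedSavingsMillis, and its parameters are hypothetical names, and the inputs stand in for values returned by attemptEnrolledTime, estimatedRuntime, and estimatedNewAttemptRuntime.

```java
/**
 * Illustrative sketch only; names are hypothetical and not part of Hadoop.
 * Compares the projected end of a running attempt with that of a replacement.
 */
final class ReplacementBenefitSketch {
    /**
     * @param attemptStart               when the running attempt started (ms)
     * @param now                        the current time (ms)
     * @param estimatedRuntime           projected total runtime of the running attempt, or -1
     * @param estimatedNewAttemptRuntime projected runtime of a fresh attempt, or -1
     * @return ms saved by starting a replacement now, 0 if there is no benefit,
     *         or -1 if either estimate is unavailable
     */
    static long projectedSavingsMillis(long attemptStart, long now,
                                       long estimatedRuntime,
                                       long estimatedNewAttemptRuntime) {
        if (estimatedRuntime < 0 || estimatedNewAttemptRuntime < 0) {
            return -1L; // not enough information to compare
        }
        long currentEnd = attemptStart + estimatedRuntime;
        long replacementEnd = now + estimatedNewAttemptRuntime;
        return Math.max(0L, currentEnd - replacementEnd);
    }
}
```

For example, an attempt started at time 0 with a projected runtime of 300000 ms, checked at 60000 ms against a new-attempt estimate of 100000 ms, gives a projected saving of 140000 ms.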


Copyright © 2014 Apache Software Foundation. All Rights Reserved.