com.nvidia.spark.rapids

GpuHashAggregateIterator

class GpuHashAggregateIterator extends Iterator[ColumnarBatch] with Arm with AutoCloseable with Logging

Iterator that takes another columnar batch iterator as input and emits new columnar batches that are aggregated based on the specified grouping and aggregation expressions. This iterator tries to perform a hash-based aggregation but is capable of falling back to a sort-based aggregation which can operate on data that is either larger than can be represented by a cudf column or larger than can fit in GPU memory.

The iterator starts by pulling all batches from the input iterator, performing an initial projection and aggregation on each individual batch via aggregateInputBatches(). The resulting aggregated batches are cached in memory as spillable batches. Once all input batches have been aggregated, tryMergeAggregatedBatches() is called to attempt a merge of the aggregated batches into a single batch. If the merge succeeds, the resulting batch is returned; otherwise buildSortFallbackIterator is used to sort the aggregated batches by the grouping keys and perform a final merge aggregation pass on the sorted batches.
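The flow described above can be sketched with plain Scala collections. This is only an illustration under stated assumptions, not the plugin's implementation: batches are modeled as vectors of (key, partial sum) rows, the size limit is modeled as a hypothetical row count `maxRows`, and the names `aggregateBatch`, `tryMerge`, and `sortFallback` are invented stand-ins for the steps named in the text. Spilling and GPU memory are not modeled at all.

```scala
object AggregationFlowSketch {
  type Row = (String, Long)   // (grouping key, partial sum); a hypothetical stand-in for a columnar row
  type Batch = Vector[Row]

  // Step 1: aggregate a single batch on its own (sum per grouping key).
  def aggregateBatch(batch: Batch): Batch =
    batch.groupBy(_._1).map { case (k, rows) => (k, rows.map(_._2).sum) }.toVector

  // Step 2: try to merge all partial results into one batch.
  // Returns None when the merged result exceeds the assumed limit.
  def tryMerge(partials: Seq[Batch], maxRows: Int): Option[Batch] = {
    val merged = aggregateBatch(partials.flatten.toVector)
    if (merged.size <= maxRows) Some(merged) else None
  }

  // Step 3 (fallback): sort partial rows by key, combine adjacent rows that
  // share a key, and emit the result in batches of at most maxRows rows.
  def sortFallback(partials: Seq[Batch], maxRows: Int): Iterator[Batch] = {
    val sorted = partials.flatten.sortBy(_._1)
    val merged = sorted.foldLeft(Vector.empty[Row]) { (acc, row) =>
      acc.lastOption match {
        case Some((k, v)) if k == row._1 => acc.init :+ (k -> (v + row._2))
        case _                           => acc :+ row
      }
    }
    merged.grouped(maxRows)
  }

  // Full flow: per-batch aggregation, merge attempt, then sort fallback.
  def aggregate(input: Iterator[Batch], maxRows: Int): Iterator[Batch] = {
    val partials = input.map(aggregateBatch).toVector
    tryMerge(partials, maxRows) match {
      case Some(single) => Iterator.single(single)
      case None         => sortFallback(partials, maxRows)
    }
  }
}
```

With a generous `maxRows` the sketch returns one merged batch; with a small `maxRows` it takes the sort path and returns several batches ordered by key, which mirrors the single-batch versus fallback outcomes described above.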

Linear Supertypes
Logging, AutoCloseable, Arm, Iterator[ColumnarBatch], TraversableOnce[ColumnarBatch], GenTraversableOnce[ColumnarBatch], AnyRef, Any

Instance Constructors

  1. new GpuHashAggregateIterator(cbIter: Iterator[ColumnarBatch], inputAttributes: Seq[Attribute], groupingExpressions: Seq[NamedExpression], aggregateExpressions: Seq[GpuAggregateExpression], aggregateAttributes: Seq[Attribute], resultExpressions: Seq[NamedExpression], modeInfo: AggregateModeInfo, metrics: GpuHashAggregateMetrics, configuredTargetBatchSize: Long)

    cbIter

    iterator providing the input columnar batches

    inputAttributes

    input attributes to identify the input columns from the input batches

    groupingExpressions

    expressions used for producing the grouping keys

    aggregateExpressions

    GPU aggregate expressions used to produce the aggregations

    aggregateAttributes

    attribute references to each aggregate expression

    resultExpressions

    output expressions for the aggregation

    modeInfo

    identifies which aggregation modes are being used

    metrics

    metrics that will be updated during aggregation

    configuredTargetBatchSize

    user-specified value for the targeted input batch size

Type Members

  1. class AggHelper extends AnyRef

    Internal class used in computeAggregates for the pre, agg, and post steps

  2. class GroupedIterator[B >: A] extends AbstractIterator[Seq[B]] with Iterator[Seq[B]]
    Definition Classes
    Iterator

Value Members

  1. final def !=(arg0: Any): Boolean
    Definition Classes
    AnyRef → Any
  2. final def ##(): Int
    Definition Classes
    AnyRef → Any
  3. def ++[B >: ColumnarBatch](that: ⇒ GenTraversableOnce[B]): Iterator[B]
    Definition Classes
    Iterator
  4. final def ==(arg0: Any): Boolean
    Definition Classes
    AnyRef → Any
  5. def addString(b: StringBuilder): StringBuilder
    Definition Classes
    TraversableOnce
  6. def addString(b: StringBuilder, sep: String): StringBuilder
    Definition Classes
    TraversableOnce
  7. def addString(b: StringBuilder, start: String, sep: String, end: String): StringBuilder
    Definition Classes
    TraversableOnce
  8. def aggregate[B](z: ⇒ B)(seqop: (B, ColumnarBatch) ⇒ B, combop: (B, B) ⇒ B): B
    Definition Classes
    TraversableOnce → GenTraversableOnce
  9. final def asInstanceOf[T0]: T0
    Definition Classes
    Any
  10. def buffered: BufferedIterator[ColumnarBatch]
    Definition Classes
    Iterator
  11. def clone(): AnyRef
    Attributes
    protected[lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( ... ) @native() @HotSpotIntrinsicCandidate()
  12. def close(): Unit
    Definition Classes
    GpuHashAggregateIterator → AutoCloseable
  13. def closeOnExcept[T <: AutoCloseable, V](r: Option[T])(block: (Option[T]) ⇒ V): V

    Executes the provided code block, closing the resources only if an exception occurs

    Definition Classes
    Arm
  14. def closeOnExcept[T <: AutoCloseable, V](r: ArrayBuffer[T])(block: (ArrayBuffer[T]) ⇒ V): V

    Executes the provided code block, closing the resources only if an exception occurs

    Definition Classes
    Arm
  15. def closeOnExcept[T <: AutoCloseable, V](r: Array[T])(block: (Array[T]) ⇒ V): V

    Executes the provided code block, closing the resources only if an exception occurs

    Definition Classes
    Arm
  16. def closeOnExcept[T <: AutoCloseable, V](r: Seq[T])(block: (Seq[T]) ⇒ V): V

    Executes the provided code block, closing the resources only if an exception occurs

    Definition Classes
    Arm
  17. def closeOnExcept[T <: AutoCloseable, V](r: T)(block: (T) ⇒ V): V

    Executes the provided code block, closing the resource only if an exception occurs

    Definition Classes
    Arm
  18. def collect[B](pf: PartialFunction[ColumnarBatch, B]): Iterator[B]
    Definition Classes
    Iterator
    Annotations
    @migration
    Migration

    (Changed in version 2.8.0) collect has changed. The previous behavior can be reproduced with toSeq.

  19. def collectFirst[B](pf: PartialFunction[ColumnarBatch, B]): Option[B]
    Definition Classes
    TraversableOnce
  20. def contains(elem: Any): Boolean
    Definition Classes
    Iterator
  21. def copyToArray[B >: ColumnarBatch](xs: Array[B], start: Int, len: Int): Unit
    Definition Classes
    Iterator → TraversableOnce → GenTraversableOnce
  22. def copyToArray[B >: ColumnarBatch](xs: Array[B]): Unit
    Definition Classes
    TraversableOnce → GenTraversableOnce
  23. def copyToArray[B >: ColumnarBatch](xs: Array[B], start: Int): Unit
    Definition Classes
    TraversableOnce → GenTraversableOnce
  24. def copyToBuffer[B >: ColumnarBatch](dest: Buffer[B]): Unit
    Definition Classes
    TraversableOnce
  25. def corresponds[B](that: GenTraversableOnce[B])(p: (ColumnarBatch, B) ⇒ Boolean): Boolean
    Definition Classes
    Iterator
  26. def count(p: (ColumnarBatch) ⇒ Boolean): Int
    Definition Classes
    TraversableOnce → GenTraversableOnce
  27. def drop(n: Int): Iterator[ColumnarBatch]
    Definition Classes
    Iterator
  28. def dropWhile(p: (ColumnarBatch) ⇒ Boolean): Iterator[ColumnarBatch]
    Definition Classes
    Iterator
  29. def duplicate: (Iterator[ColumnarBatch], Iterator[ColumnarBatch])
    Definition Classes
    Iterator
  30. final def eq(arg0: AnyRef): Boolean
    Definition Classes
    AnyRef
  31. def equals(arg0: Any): Boolean
    Definition Classes
    AnyRef → Any
  32. def exists(p: (ColumnarBatch) ⇒ Boolean): Boolean
    Definition Classes
    Iterator → TraversableOnce → GenTraversableOnce
  33. def filter(p: (ColumnarBatch) ⇒ Boolean): Iterator[ColumnarBatch]
    Definition Classes
    Iterator
  34. def filterNot(p: (ColumnarBatch) ⇒ Boolean): Iterator[ColumnarBatch]
    Definition Classes
    Iterator
  35. def find(p: (ColumnarBatch) ⇒ Boolean): Option[ColumnarBatch]
    Definition Classes
    Iterator → TraversableOnce → GenTraversableOnce
  36. def flatMap[B](f: (ColumnarBatch) ⇒ GenTraversableOnce[B]): Iterator[B]
    Definition Classes
    Iterator
  37. def fold[A1 >: ColumnarBatch](z: A1)(op: (A1, A1) ⇒ A1): A1
    Definition Classes
    TraversableOnce → GenTraversableOnce
  38. def foldLeft[B](z: B)(op: (B, ColumnarBatch) ⇒ B): B
    Definition Classes
    TraversableOnce → GenTraversableOnce
  39. def foldRight[B](z: B)(op: (ColumnarBatch, B) ⇒ B): B
    Definition Classes
    TraversableOnce → GenTraversableOnce
  40. def forall(p: (ColumnarBatch) ⇒ Boolean): Boolean
    Definition Classes
    Iterator → TraversableOnce → GenTraversableOnce
  41. def foreach[U](f: (ColumnarBatch) ⇒ U): Unit
    Definition Classes
    Iterator → TraversableOnce → GenTraversableOnce
  42. def freeOnExcept[T <: RapidsBuffer, V](r: T)(block: (T) ⇒ V): V

    Executes the provided code block, freeing the RapidsBuffer only if an exception occurs

    Definition Classes
    Arm
  43. final def getClass(): Class[_]
    Definition Classes
    AnyRef → Any
    Annotations
    @native() @HotSpotIntrinsicCandidate()
  44. def grouped[B >: ColumnarBatch](size: Int): GroupedIterator[B]
    Definition Classes
    Iterator
  45. def hasDefiniteSize: Boolean
    Definition Classes
    Iterator → TraversableOnce → GenTraversableOnce
  46. def hasNext: Boolean
    Definition Classes
    GpuHashAggregateIterator → Iterator
  47. def hashCode(): Int
    Definition Classes
    AnyRef → Any
    Annotations
    @native() @HotSpotIntrinsicCandidate()
  48. def indexOf[B >: ColumnarBatch](elem: B, from: Int): Int
    Definition Classes
    Iterator
  49. def indexOf[B >: ColumnarBatch](elem: B): Int
    Definition Classes
    Iterator
  50. def indexWhere(p: (ColumnarBatch) ⇒ Boolean, from: Int): Int
    Definition Classes
    Iterator
  51. def indexWhere(p: (ColumnarBatch) ⇒ Boolean): Int
    Definition Classes
    Iterator
  52. def initializeLogIfNecessary(isInterpreter: Boolean, silent: Boolean): Boolean
    Attributes
    protected
    Definition Classes
    Logging
  53. def initializeLogIfNecessary(isInterpreter: Boolean): Unit
    Attributes
    protected
    Definition Classes
    Logging
  54. def isEmpty: Boolean
    Definition Classes
    Iterator → TraversableOnce → GenTraversableOnce
  55. final def isInstanceOf[T0]: Boolean
    Definition Classes
    Any
  56. def isTraceEnabled(): Boolean
    Attributes
    protected
    Definition Classes
    Logging
  57. def isTraversableAgain: Boolean
    Definition Classes
    Iterator → GenTraversableOnce
  58. def length: Int
    Definition Classes
    Iterator
  59. def log: Logger
    Attributes
    protected
    Definition Classes
    Logging
  60. def logDebug(msg: ⇒ String, throwable: Throwable): Unit
    Attributes
    protected
    Definition Classes
    Logging
  61. def logDebug(msg: ⇒ String): Unit
    Attributes
    protected
    Definition Classes
    Logging
  62. def logError(msg: ⇒ String, throwable: Throwable): Unit
    Attributes
    protected
    Definition Classes
    Logging
  63. def logError(msg: ⇒ String): Unit
    Attributes
    protected
    Definition Classes
    Logging
  64. def logInfo(msg: ⇒ String, throwable: Throwable): Unit
    Attributes
    protected
    Definition Classes
    Logging
  65. def logInfo(msg: ⇒ String): Unit
    Attributes
    protected
    Definition Classes
    Logging
  66. def logName: String
    Attributes
    protected
    Definition Classes
    Logging
  67. def logTrace(msg: ⇒ String, throwable: Throwable): Unit
    Attributes
    protected
    Definition Classes
    Logging
  68. def logTrace(msg: ⇒ String): Unit
    Attributes
    protected
    Definition Classes
    Logging
  69. def logWarning(msg: ⇒ String, throwable: Throwable): Unit
    Attributes
    protected
    Definition Classes
    Logging
  70. def logWarning(msg: ⇒ String): Unit
    Attributes
    protected
    Definition Classes
    Logging
  71. def map[B](f: (ColumnarBatch) ⇒ B): Iterator[B]
    Definition Classes
    Iterator
  72. def max[B >: ColumnarBatch](implicit cmp: Ordering[B]): ColumnarBatch
    Definition Classes
    TraversableOnce → GenTraversableOnce
  73. def maxBy[B](f: (ColumnarBatch) ⇒ B)(implicit cmp: Ordering[B]): ColumnarBatch
    Definition Classes
    TraversableOnce → GenTraversableOnce
  74. def min[B >: ColumnarBatch](implicit cmp: Ordering[B]): ColumnarBatch
    Definition Classes
    TraversableOnce → GenTraversableOnce
  75. def minBy[B](f: (ColumnarBatch) ⇒ B)(implicit cmp: Ordering[B]): ColumnarBatch
    Definition Classes
    TraversableOnce → GenTraversableOnce
  76. def mkString: String
    Definition Classes
    TraversableOnce → GenTraversableOnce
  77. def mkString(sep: String): String
    Definition Classes
    TraversableOnce → GenTraversableOnce
  78. def mkString(start: String, sep: String, end: String): String
    Definition Classes
    TraversableOnce → GenTraversableOnce
  79. final def ne(arg0: AnyRef): Boolean
    Definition Classes
    AnyRef
  80. def next(): ColumnarBatch
    Definition Classes
    GpuHashAggregateIterator → Iterator
  81. def nonEmpty: Boolean
    Definition Classes
    TraversableOnce → GenTraversableOnce
  82. final def notify(): Unit
    Definition Classes
    AnyRef
    Annotations
    @native() @HotSpotIntrinsicCandidate()
  83. final def notifyAll(): Unit
    Definition Classes
    AnyRef
    Annotations
    @native() @HotSpotIntrinsicCandidate()
  84. def padTo[A1 >: ColumnarBatch](len: Int, elem: A1): Iterator[A1]
    Definition Classes
    Iterator
  85. def partition(p: (ColumnarBatch) ⇒ Boolean): (Iterator[ColumnarBatch], Iterator[ColumnarBatch])
    Definition Classes
    Iterator
  86. def patch[B >: ColumnarBatch](from: Int, patchElems: Iterator[B], replaced: Int): Iterator[B]
    Definition Classes
    Iterator
  87. def product[B >: ColumnarBatch](implicit num: Numeric[B]): B
    Definition Classes
    TraversableOnce → GenTraversableOnce
  88. def reduce[A1 >: ColumnarBatch](op: (A1, A1) ⇒ A1): A1
    Definition Classes
    TraversableOnce → GenTraversableOnce
  89. def reduceLeft[B >: ColumnarBatch](op: (B, ColumnarBatch) ⇒ B): B
    Definition Classes
    TraversableOnce
  90. def reduceLeftOption[B >: ColumnarBatch](op: (B, ColumnarBatch) ⇒ B): Option[B]
    Definition Classes
    TraversableOnce → GenTraversableOnce
  91. def reduceOption[A1 >: ColumnarBatch](op: (A1, A1) ⇒ A1): Option[A1]
    Definition Classes
    TraversableOnce → GenTraversableOnce
  92. def reduceRight[B >: ColumnarBatch](op: (ColumnarBatch, B) ⇒ B): B
    Definition Classes
    TraversableOnce → GenTraversableOnce
  93. def reduceRightOption[B >: ColumnarBatch](op: (ColumnarBatch, B) ⇒ B): Option[B]
    Definition Classes
    TraversableOnce → GenTraversableOnce
  94. def reversed: List[ColumnarBatch]
    Attributes
    protected[this]
    Definition Classes
    TraversableOnce
  95. def sameElements(that: Iterator[_]): Boolean
    Definition Classes
    Iterator
  96. def scanLeft[B](z: B)(op: (B, ColumnarBatch) ⇒ B): Iterator[B]
    Definition Classes
    Iterator
  97. def scanRight[B](z: B)(op: (ColumnarBatch, B) ⇒ B): Iterator[B]
    Definition Classes
    Iterator
  98. def seq: Iterator[ColumnarBatch]
    Definition Classes
    Iterator → TraversableOnce → GenTraversableOnce
  99. def size: Int
    Definition Classes
    TraversableOnce → GenTraversableOnce
  100. def sizeHintIfCheap: Int
    Attributes
    protected[collection]
    Definition Classes
    GenTraversableOnce
  101. def slice(from: Int, until: Int): Iterator[ColumnarBatch]
    Definition Classes
    Iterator
  102. def sliceIterator(from: Int, until: Int): Iterator[ColumnarBatch]
    Attributes
    protected
    Definition Classes
    Iterator
  103. def sliding[B >: ColumnarBatch](size: Int, step: Int): GroupedIterator[B]
    Definition Classes
    Iterator
  104. def span(p: (ColumnarBatch) ⇒ Boolean): (Iterator[ColumnarBatch], Iterator[ColumnarBatch])
    Definition Classes
    Iterator
  105. def sum[B >: ColumnarBatch](implicit num: Numeric[B]): B
    Definition Classes
    TraversableOnce → GenTraversableOnce
  106. final def synchronized[T0](arg0: ⇒ T0): T0
    Definition Classes
    AnyRef
  107. def take(n: Int): Iterator[ColumnarBatch]
    Definition Classes
    Iterator
  108. def takeWhile(p: (ColumnarBatch) ⇒ Boolean): Iterator[ColumnarBatch]
    Definition Classes
    Iterator
  109. def to[Col[_]](implicit cbf: CanBuildFrom[Nothing, ColumnarBatch, Col[ColumnarBatch]]): Col[ColumnarBatch]
    Definition Classes
    TraversableOnce → GenTraversableOnce
  110. def toArray[B >: ColumnarBatch](implicit arg0: ClassTag[B]): Array[B]
    Definition Classes
    TraversableOnce → GenTraversableOnce
  111. def toBuffer[B >: ColumnarBatch]: Buffer[B]
    Definition Classes
    TraversableOnce → GenTraversableOnce
  112. def toIndexedSeq: IndexedSeq[ColumnarBatch]
    Definition Classes
    TraversableOnce → GenTraversableOnce
  113. def toIterable: Iterable[ColumnarBatch]
    Definition Classes
    TraversableOnce → GenTraversableOnce
  114. def toIterator: Iterator[ColumnarBatch]
    Definition Classes
    Iterator → GenTraversableOnce
  115. def toList: List[ColumnarBatch]
    Definition Classes
    TraversableOnce → GenTraversableOnce
  116. def toMap[T, U](implicit ev: <:<[ColumnarBatch, (T, U)]): Map[T, U]
    Definition Classes
    TraversableOnce → GenTraversableOnce
  117. def toSeq: Seq[ColumnarBatch]
    Definition Classes
    TraversableOnce → GenTraversableOnce
  118. def toSet[B >: ColumnarBatch]: Set[B]
    Definition Classes
    TraversableOnce → GenTraversableOnce
  119. def toStream: Stream[ColumnarBatch]
    Definition Classes
    Iterator → GenTraversableOnce
  120. def toString(): String
    Definition Classes
    Iterator → AnyRef → Any
  121. def toTraversable: Traversable[ColumnarBatch]
    Definition Classes
    Iterator → TraversableOnce → GenTraversableOnce
  122. def toVector: Vector[ColumnarBatch]
    Definition Classes
    TraversableOnce → GenTraversableOnce
  123. final def wait(arg0: Long, arg1: Int): Unit
    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  124. final def wait(arg0: Long): Unit
    Definition Classes
    AnyRef
    Annotations
    @throws( ... ) @native()
  125. final def wait(): Unit
    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  126. def withFilter(p: (ColumnarBatch) ⇒ Boolean): Iterator[ColumnarBatch]
    Definition Classes
    Iterator
  127. def withResource[T <: AutoCloseable, V](h: CloseableHolder[T])(block: (CloseableHolder[T]) ⇒ V): V

    Executes the provided code block and then closes the resource

    Definition Classes
    Arm
  128. def withResource[T <: AutoCloseable, V](r: ArrayBuffer[T])(block: (ArrayBuffer[T]) ⇒ V): V

    Executes the provided code block and then closes the array buffer of resources

    Definition Classes
    Arm
  129. def withResource[T <: AutoCloseable, V](r: Array[T])(block: (Array[T]) ⇒ V): V

    Executes the provided code block and then closes the array of resources

    Definition Classes
    Arm
  130. def withResource[T <: AutoCloseable, V](r: Seq[T])(block: (Seq[T]) ⇒ V): V

    Executes the provided code block and then closes the sequence of resources

    Definition Classes
    Arm
  131. def withResource[T <: AutoCloseable, V](r: Option[T])(block: (Option[T]) ⇒ V): V

    Executes the provided code block and then closes the Option[resource]

    Definition Classes
    Arm
  132. def withResource[T <: AutoCloseable, V](r: T)(block: (T) ⇒ V): V

    Executes the provided code block and then closes the resource

    Definition Classes
    Arm
  133. def withResourceIfAllowed[T, V](r: T)(block: (T) ⇒ V): V

    Executes the provided code block and then closes the value if it is AutoCloseable

    Definition Classes
    Arm
  134. def zip[B](that: Iterator[B]): Iterator[(ColumnarBatch, B)]
    Definition Classes
    Iterator
  135. def zipAll[B, A1 >: ColumnarBatch, B1 >: B](that: Iterator[B], thisElem: A1, thatElem: B1): Iterator[(A1, B1)]
    Definition Classes
    Iterator
  136. def zipWithIndex: Iterator[(ColumnarBatch, Int)]
    Definition Classes
    Iterator
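
The withResource and closeOnExcept members inherited from Arm are documented above only by one-line summaries. The sketch below re-implements the single-resource variants to make the difference concrete. It is a minimal illustration written from those summaries, not the library's source: the object name `ArmSketch`, the helper class `Tracked`, and the handling of an exception thrown by close() (attached as a suppressed exception) are all assumptions.

```scala
object ArmSketch {
  // Runs the block and always closes the resource afterwards,
  // whether the block returns normally or throws.
  def withResource[T <: AutoCloseable, V](r: T)(block: T => V): V =
    try block(r) finally r.close()

  // Runs the block and closes the resource only if the block throws.
  // On success the resource stays open and ownership passes to the caller.
  def closeOnExcept[T <: AutoCloseable, V](r: T)(block: T => V): V =
    try block(r) catch {
      case t: Throwable =>
        // Assumption: a failure in close() is recorded on the original exception.
        try r.close() catch { case s: Throwable => t.addSuppressed(s) }
        throw t
    }

  // Hypothetical resource that records whether close() was called.
  final class Tracked extends AutoCloseable {
    var closed = false
    override def close(): Unit = closed = true
  }
}
```

A typical reading of the two helpers is that withResource suits a resource that is fully consumed inside the block, while closeOnExcept suits a resource that the block hands back to the caller on success and that must not leak if something fails partway through.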

Deprecated Value Members

  1. def /:[B](z: B)(op: (B, ColumnarBatch) ⇒ B): B
    Definition Classes
    TraversableOnce → GenTraversableOnce
    Annotations
    @deprecated
    Deprecated

    (Since version 2.12.10) Use foldLeft instead of /:

  2. def :\[B](z: B)(op: (ColumnarBatch, B) ⇒ B): B
    Definition Classes
    TraversableOnce → GenTraversableOnce
    Annotations
    @deprecated
    Deprecated

    (Since version 2.12.10) Use foldRight instead of :\

  3. def finalize(): Unit
    Attributes
    protected[lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( classOf[java.lang.Throwable] ) @Deprecated
    Deprecated
