com.nvidia.spark.rapids

GpuOutOfCoreSortIterator

case class GpuOutOfCoreSortIterator(iter: Iterator[ColumnarBatch], sorter: GpuSorter, cpuOrd: LazilyGeneratedOrdering, targetSize: Long, opTime: GpuMetric, sortTime: GpuMetric, outputBatches: GpuMetric, outputRows: GpuMetric, peakDevMemory: GpuMetric, spillCallback: SpillCallback) extends Iterator[ColumnarBatch] with Arm with AutoCloseable with Product with Serializable

Sorts incoming batches of data, spilling if needed.
The algorithm is a modified version of an external merge sort, with multiple passes for large data. See https://en.wikipedia.org/wiki/External_sorting#External_merge_sort
The main difference is that the data cannot be streamed during the merge. Instead, the data is divided into batches small enough that N of them can be merged with the output still fitting within the target batch size. Because whole batches are merged rather than individual rows, the merged result is not guaranteed to be globally sorted. Ideally most of it is, but the first row of the next pending batch is needed to determine the cutoff between globally sorted data and data that still has to be merged with other batches. The globally sorted portion is put into a sorted queue, while the rest of the merged data is split and put back into a pending queue. The process repeats until there is enough data to output.
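
The cutoff idea described above can be illustrated with a small CPU-only sketch. It is not the plugin's implementation: there is no GPU work and no spilling, a batch is modeled as a plain Vector[Int], the merge is done by concatenating and re-sorting rather than by a true k-way merge, and the names OutOfCoreSortSketch, sortOutOfCore, targetRows and mergeWidth are hypothetical.

```scala
import scala.collection.mutable

object OutOfCoreSortSketch {
  // Hypothetical stand-in for a columnar batch: a run of Ints.
  type Batch = Vector[Int]

  def sortOutOfCore(input: Iterator[Batch], targetRows: Int, mergeWidth: Int): Iterator[Batch] = {
    require(targetRows > 0 && mergeWidth >= 2)
    // Pieces are sized so that mergeWidth of them fit in one target batch.
    val pieceRows = math.max(1, targetRows / mergeWidth)
    // Pending queue: sorted pieces, the one with the smallest first row comes out first.
    val pending = mutable.PriorityQueue.empty[Batch](Ordering.by[Batch, Int](_.head).reverse)
    // First pass: sort each incoming batch on its own and split it into pieces.
    input.filter(_.nonEmpty).foreach { b =>
      b.sorted.grouped(pieceRows).foreach(p => pending.enqueue(p))
    }
    // Sorted queue: rows already known to be globally sorted.
    val sorted = mutable.Queue.empty[Int]

    new Iterator[Batch] {
      def hasNext: Boolean = sorted.nonEmpty || pending.nonEmpty

      def next(): Batch = {
        if (!hasNext) throw new NoSuchElementException("no more batches")
        while (sorted.size < targetRows && pending.nonEmpty) {
          val n = math.min(mergeWidth, pending.size)
          val taken = Vector.fill(n)(pending.dequeue())
          // Stand-in for a merge of the taken pieces.
          val merged = taken.flatten.sorted
          if (pending.isEmpty) {
            sorted ++= merged
          } else {
            // Every row still pending is >= the first row of the next pending piece,
            // so merged rows up to that cutoff are globally sorted.
            val cutoff = pending.head.head
            val (done, rest) = merged.span(_ <= cutoff)
            sorted ++= done
            // The remainder is split again and returned to the pending queue.
            rest.grouped(pieceRows).foreach(p => pending.enqueue(p))
          }
        }
        Vector.fill(math.min(targetRows, sorted.size))(sorted.dequeue())
      }
    }
  }
}
```

For instance, OutOfCoreSortSketch.sortOutOfCore(Iterator(Vector(9, 1, 5), Vector(4, 8, 2), Vector(7, 3, 6, 0)), targetRows = 4, mergeWidth = 2) yields batches of at most four rows whose concatenation is 0 through 9 in order. Each pass makes progress because the smallest merged row is never greater than the cutoff.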

Linear Supertypes
Serializable, Serializable, Product, Equals, AutoCloseable, Arm, Iterator[ColumnarBatch], TraversableOnce[ColumnarBatch], GenTraversableOnce[ColumnarBatch], AnyRef, Any

Instance Constructors

  1. new GpuOutOfCoreSortIterator(iter: Iterator[ColumnarBatch], sorter: GpuSorter, cpuOrd: LazilyGeneratedOrdering, targetSize: Long, opTime: GpuMetric, sortTime: GpuMetric, outputBatches: GpuMetric, outputRows: GpuMetric, peakDevMemory: GpuMetric, spillCallback: SpillCallback)

Type Members

  1. class GroupedIterator[B >: A] extends AbstractIterator[Seq[B]] with Iterator[Seq[B]]
    Definition Classes
    Iterator

Value Members

  1. final def !=(arg0: Any): Boolean
    Definition Classes
    AnyRef → Any
  2. final def ##(): Int
    Definition Classes
    AnyRef → Any
  3. def ++[B >: ColumnarBatch](that: ⇒ GenTraversableOnce[B]): Iterator[B]
    Definition Classes
    Iterator
  4. final def ==(arg0: Any): Boolean
    Definition Classes
    AnyRef → Any
  5. def addString(b: StringBuilder): StringBuilder
    Definition Classes
    TraversableOnce
  6. def addString(b: StringBuilder, sep: String): StringBuilder
    Definition Classes
    TraversableOnce
  7. def addString(b: StringBuilder, start: String, sep: String, end: String): StringBuilder
    Definition Classes
    TraversableOnce
  8. def aggregate[B](z: ⇒ B)(seqop: (B, ColumnarBatch) ⇒ B, combop: (B, B) ⇒ B): B
    Definition Classes
    TraversableOnce → GenTraversableOnce
  9. final def asInstanceOf[T0]: T0
    Definition Classes
    Any
  10. def buffered: BufferedIterator[ColumnarBatch]
    Definition Classes
    Iterator
  11. def clone(): AnyRef
    Attributes
    protected[lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( ... ) @native() @HotSpotIntrinsicCandidate()
  12. def close(): Unit
    Definition Classes
    GpuOutOfCoreSortIterator → AutoCloseable
  13. def closeOnExcept[T <: AutoCloseable, V](r: Option[T])(block: (Option[T]) ⇒ V): V

    Executes the provided code block, closing the resources only if an exception occurs

    Definition Classes
    Arm
  14. def closeOnExcept[T <: AutoCloseable, V](r: ArrayBuffer[T])(block: (ArrayBuffer[T]) ⇒ V): V

    Executes the provided code block, closing the resources only if an exception occurs

    Definition Classes
    Arm
  15. def closeOnExcept[T <: AutoCloseable, V](r: Array[T])(block: (Array[T]) ⇒ V): V

    Executes the provided code block, closing the resources only if an exception occurs

    Definition Classes
    Arm
  16. def closeOnExcept[T <: AutoCloseable, V](r: Seq[T])(block: (Seq[T]) ⇒ V): V

    Executes the provided code block, closing the resources only if an exception occurs

    Definition Classes
    Arm
  17. def closeOnExcept[T <: AutoCloseable, V](r: T)(block: (T) ⇒ V): V

    Executes the provided code block, closing the resource only if an exception occurs

    Definition Classes
    Arm
  18. def collect[B](pf: PartialFunction[ColumnarBatch, B]): Iterator[B]
    Definition Classes
    Iterator
    Annotations
    @migration
    Migration

    (Changed in version 2.8.0) collect has changed. The previous behavior can be reproduced with toSeq.

  19. def collectFirst[B](pf: PartialFunction[ColumnarBatch, B]): Option[B]
    Definition Classes
    TraversableOnce
  20. def contains(elem: Any): Boolean
    Definition Classes
    Iterator
  21. def copyToArray[B >: ColumnarBatch](xs: Array[B], start: Int, len: Int): Unit
    Definition Classes
    Iterator → TraversableOnce → GenTraversableOnce
  22. def copyToArray[B >: ColumnarBatch](xs: Array[B]): Unit
    Definition Classes
    TraversableOnce → GenTraversableOnce
  23. def copyToArray[B >: ColumnarBatch](xs: Array[B], start: Int): Unit
    Definition Classes
    TraversableOnce → GenTraversableOnce
  24. def copyToBuffer[B >: ColumnarBatch](dest: Buffer[B]): Unit
    Definition Classes
    TraversableOnce
  25. def corresponds[B](that: GenTraversableOnce[B])(p: (ColumnarBatch, B) ⇒ Boolean): Boolean
    Definition Classes
    Iterator
  26. def count(p: (ColumnarBatch) ⇒ Boolean): Int
    Definition Classes
    TraversableOnce → GenTraversableOnce
  27. val cpuOrd: LazilyGeneratedOrdering
  28. def drop(n: Int): Iterator[ColumnarBatch]
    Definition Classes
    Iterator
  29. def dropWhile(p: (ColumnarBatch) ⇒ Boolean): Iterator[ColumnarBatch]
    Definition Classes
    Iterator
  30. def duplicate: (Iterator[ColumnarBatch], Iterator[ColumnarBatch])
    Definition Classes
    Iterator
  31. final def eq(arg0: AnyRef): Boolean
    Definition Classes
    AnyRef
  32. def exists(p: (ColumnarBatch) ⇒ Boolean): Boolean
    Definition Classes
    Iterator → TraversableOnce → GenTraversableOnce
  33. def filter(p: (ColumnarBatch) ⇒ Boolean): Iterator[ColumnarBatch]
    Definition Classes
    Iterator
  34. def filterNot(p: (ColumnarBatch) ⇒ Boolean): Iterator[ColumnarBatch]
    Definition Classes
    Iterator
  35. def find(p: (ColumnarBatch) ⇒ Boolean): Option[ColumnarBatch]
    Definition Classes
    Iterator → TraversableOnce → GenTraversableOnce
  36. def flatMap[B](f: (ColumnarBatch) ⇒ GenTraversableOnce[B]): Iterator[B]
    Definition Classes
    Iterator
  37. def fold[A1 >: ColumnarBatch](z: A1)(op: (A1, A1) ⇒ A1): A1
    Definition Classes
    TraversableOnce → GenTraversableOnce
  38. def foldLeft[B](z: B)(op: (B, ColumnarBatch) ⇒ B): B
    Definition Classes
    TraversableOnce → GenTraversableOnce
  39. def foldRight[B](z: B)(op: (ColumnarBatch, B) ⇒ B): B
    Definition Classes
    TraversableOnce → GenTraversableOnce
  40. def forall(p: (ColumnarBatch) ⇒ Boolean): Boolean
    Definition Classes
    Iterator → TraversableOnce → GenTraversableOnce
  41. def foreach[U](f: (ColumnarBatch) ⇒ U): Unit
    Definition Classes
    Iterator → TraversableOnce → GenTraversableOnce
  42. def freeOnExcept[T <: RapidsBuffer, V](r: T)(block: (T) ⇒ V): V

    Executes the provided code block, freeing the RapidsBuffer only if an exception occurs

    Definition Classes
    Arm
  43. final def getClass(): Class[_]
    Definition Classes
    AnyRef → Any
    Annotations
    @native() @HotSpotIntrinsicCandidate()
  44. def grouped[B >: ColumnarBatch](size: Int): GroupedIterator[B]
    Definition Classes
    Iterator
  45. def hasDefiniteSize: Boolean
    Definition Classes
    Iterator → TraversableOnce → GenTraversableOnce
  46. def hasNext: Boolean
    Definition Classes
    GpuOutOfCoreSortIterator → Iterator
  47. def indexOf[B >: ColumnarBatch](elem: B, from: Int): Int
    Definition Classes
    Iterator
  48. def indexOf[B >: ColumnarBatch](elem: B): Int
    Definition Classes
    Iterator
  49. def indexWhere(p: (ColumnarBatch) ⇒ Boolean, from: Int): Int
    Definition Classes
    Iterator
  50. def indexWhere(p: (ColumnarBatch) ⇒ Boolean): Int
    Definition Classes
    Iterator
  51. def isEmpty: Boolean
    Definition Classes
    Iterator → TraversableOnce → GenTraversableOnce
  52. final def isInstanceOf[T0]: Boolean
    Definition Classes
    Any
  53. def isTraversableAgain: Boolean
    Definition Classes
    Iterator → GenTraversableOnce
  54. val iter: Iterator[ColumnarBatch]
  55. def length: Int
    Definition Classes
    Iterator
  56. def map[B](f: (ColumnarBatch) ⇒ B): Iterator[B]
    Definition Classes
    Iterator
  57. def max[B >: ColumnarBatch](implicit cmp: Ordering[B]): ColumnarBatch
    Definition Classes
    TraversableOnce → GenTraversableOnce
  58. def maxBy[B](f: (ColumnarBatch) ⇒ B)(implicit cmp: Ordering[B]): ColumnarBatch
    Definition Classes
    TraversableOnce → GenTraversableOnce
  59. def min[B >: ColumnarBatch](implicit cmp: Ordering[B]): ColumnarBatch
    Definition Classes
    TraversableOnce → GenTraversableOnce
  60. def minBy[B](f: (ColumnarBatch) ⇒ B)(implicit cmp: Ordering[B]): ColumnarBatch
    Definition Classes
    TraversableOnce → GenTraversableOnce
  61. def mkString: String
    Definition Classes
    TraversableOnce → GenTraversableOnce
  62. def mkString(sep: String): String
    Definition Classes
    TraversableOnce → GenTraversableOnce
  63. def mkString(start: String, sep: String, end: String): String
    Definition Classes
    TraversableOnce → GenTraversableOnce
  64. final def ne(arg0: AnyRef): Boolean
    Definition Classes
    AnyRef
  65. def next(): ColumnarBatch
    Definition Classes
    GpuOutOfCoreSortIterator → Iterator
  66. def nonEmpty: Boolean
    Definition Classes
    TraversableOnce → GenTraversableOnce
  67. final def notify(): Unit
    Definition Classes
    AnyRef
    Annotations
    @native() @HotSpotIntrinsicCandidate()
  68. final def notifyAll(): Unit
    Definition Classes
    AnyRef
    Annotations
    @native() @HotSpotIntrinsicCandidate()
  69. val opTime: GpuMetric
  70. val outputBatches: GpuMetric
  71. val outputRows: GpuMetric
  72. def padTo[A1 >: ColumnarBatch](len: Int, elem: A1): Iterator[A1]
    Definition Classes
    Iterator
  73. def partition(p: (ColumnarBatch) ⇒ Boolean): (Iterator[ColumnarBatch], Iterator[ColumnarBatch])
    Definition Classes
    Iterator
  74. def patch[B >: ColumnarBatch](from: Int, patchElems: Iterator[B], replaced: Int): Iterator[B]
    Definition Classes
    Iterator
  75. val peakDevMemory: GpuMetric
  76. def product[B >: ColumnarBatch](implicit num: Numeric[B]): B
    Definition Classes
    TraversableOnce → GenTraversableOnce
  77. def reduce[A1 >: ColumnarBatch](op: (A1, A1) ⇒ A1): A1
    Definition Classes
    TraversableOnce → GenTraversableOnce
  78. def reduceLeft[B >: ColumnarBatch](op: (B, ColumnarBatch) ⇒ B): B
    Definition Classes
    TraversableOnce
  79. def reduceLeftOption[B >: ColumnarBatch](op: (B, ColumnarBatch) ⇒ B): Option[B]
    Definition Classes
    TraversableOnce → GenTraversableOnce
  80. def reduceOption[A1 >: ColumnarBatch](op: (A1, A1) ⇒ A1): Option[A1]
    Definition Classes
    TraversableOnce → GenTraversableOnce
  81. def reduceRight[B >: ColumnarBatch](op: (ColumnarBatch, B) ⇒ B): B
    Definition Classes
    TraversableOnce → GenTraversableOnce
  82. def reduceRightOption[B >: ColumnarBatch](op: (ColumnarBatch, B) ⇒ B): Option[B]
    Definition Classes
    TraversableOnce → GenTraversableOnce
  83. def reversed: List[ColumnarBatch]
    Attributes
    protected[this]
    Definition Classes
    TraversableOnce
  84. def sameElements(that: Iterator[_]): Boolean
    Definition Classes
    Iterator
  85. def scanLeft[B](z: B)(op: (B, ColumnarBatch) ⇒ B): Iterator[B]
    Definition Classes
    Iterator
  86. def scanRight[B](z: B)(op: (ColumnarBatch, B) ⇒ B): Iterator[B]
    Definition Classes
    Iterator
  87. def seq: Iterator[ColumnarBatch]
    Definition Classes
    Iterator → TraversableOnce → GenTraversableOnce
  88. def size: Int
    Definition Classes
    TraversableOnce → GenTraversableOnce
  89. def sizeHintIfCheap: Int
    Attributes
    protected[collection]
    Definition Classes
    GenTraversableOnce
  90. def slice(from: Int, until: Int): Iterator[ColumnarBatch]
    Definition Classes
    Iterator
  91. def sliceIterator(from: Int, until: Int): Iterator[ColumnarBatch]
    Attributes
    protected
    Definition Classes
    Iterator
  92. def sliding[B >: ColumnarBatch](size: Int, step: Int): GroupedIterator[B]
    Definition Classes
    Iterator
  93. val sortTime: GpuMetric
  94. val sorter: GpuSorter
  95. def span(p: (ColumnarBatch) ⇒ Boolean): (Iterator[ColumnarBatch], Iterator[ColumnarBatch])
    Definition Classes
    Iterator
  96. val spillCallback: SpillCallback
  97. def sum[B >: ColumnarBatch](implicit num: Numeric[B]): B
    Definition Classes
    TraversableOnce → GenTraversableOnce
  98. final def synchronized[T0](arg0: ⇒ T0): T0
    Definition Classes
    AnyRef
  99. def take(n: Int): Iterator[ColumnarBatch]
    Definition Classes
    Iterator
  100. def takeWhile(p: (ColumnarBatch) ⇒ Boolean): Iterator[ColumnarBatch]
    Definition Classes
    Iterator
  101. val targetSize: Long
  102. def to[Col[_]](implicit cbf: CanBuildFrom[Nothing, ColumnarBatch, Col[ColumnarBatch]]): Col[ColumnarBatch]
    Definition Classes
    TraversableOnce → GenTraversableOnce
  103. def toArray[B >: ColumnarBatch](implicit arg0: ClassTag[B]): Array[B]
    Definition Classes
    TraversableOnce → GenTraversableOnce
  104. def toBuffer[B >: ColumnarBatch]: Buffer[B]
    Definition Classes
    TraversableOnce → GenTraversableOnce
  105. def toIndexedSeq: IndexedSeq[ColumnarBatch]
    Definition Classes
    TraversableOnce → GenTraversableOnce
  106. def toIterable: Iterable[ColumnarBatch]
    Definition Classes
    TraversableOnce → GenTraversableOnce
  107. def toIterator: Iterator[ColumnarBatch]
    Definition Classes
    Iterator → GenTraversableOnce
  108. def toList: List[ColumnarBatch]
    Definition Classes
    TraversableOnce → GenTraversableOnce
  109. def toMap[T, U](implicit ev: <:<[ColumnarBatch, (T, U)]): Map[T, U]
    Definition Classes
    TraversableOnce → GenTraversableOnce
  110. def toSeq: Seq[ColumnarBatch]
    Definition Classes
    TraversableOnce → GenTraversableOnce
  111. def toSet[B >: ColumnarBatch]: Set[B]
    Definition Classes
    TraversableOnce → GenTraversableOnce
  112. def toStream: Stream[ColumnarBatch]
    Definition Classes
    Iterator → GenTraversableOnce
  113. def toString(): String
    Definition Classes
    Iterator → AnyRef → Any
  114. def toTraversable: Traversable[ColumnarBatch]
    Definition Classes
    Iterator → TraversableOnce → GenTraversableOnce
  115. def toVector: Vector[ColumnarBatch]
    Definition Classes
    TraversableOnce → GenTraversableOnce
  116. final def wait(arg0: Long, arg1: Int): Unit
    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  117. final def wait(arg0: Long): Unit
    Definition Classes
    AnyRef
    Annotations
    @throws( ... ) @native()
  118. final def wait(): Unit
    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  119. def withFilter(p: (ColumnarBatch) ⇒ Boolean): Iterator[ColumnarBatch]
    Definition Classes
    Iterator
  120. def withResource[T <: AutoCloseable, V](h: CloseableHolder[T])(block: (CloseableHolder[T]) ⇒ V): V

    Executes the provided code block and then closes the resource

    Definition Classes
    Arm
  121. def withResource[T <: AutoCloseable, V](r: ArrayBuffer[T])(block: (ArrayBuffer[T]) ⇒ V): V

    Executes the provided code block and then closes the array buffer of resources

    Definition Classes
    Arm
  122. def withResource[T <: AutoCloseable, V](r: Array[T])(block: (Array[T]) ⇒ V): V

    Executes the provided code block and then closes the array of resources

    Definition Classes
    Arm
  123. def withResource[T <: AutoCloseable, V](r: Seq[T])(block: (Seq[T]) ⇒ V): V

    Executes the provided code block and then closes the sequence of resources

    Definition Classes
    Arm
  124. def withResource[T <: AutoCloseable, V](r: Option[T])(block: (Option[T]) ⇒ V): V

    Executes the provided code block and then closes the Option[resource]

    Definition Classes
    Arm
  125. def withResource[T <: AutoCloseable, V](r: T)(block: (T) ⇒ V): V

    Executes the provided code block and then closes the resource

    Definition Classes
    Arm
  126. def withResourceIfAllowed[T, V](r: T)(block: (T) ⇒ V): V

    Executes the provided code block and then closes the value if it is AutoCloseable

    Definition Classes
    Arm
  127. def zip[B](that: Iterator[B]): Iterator[(ColumnarBatch, B)]
    Definition Classes
    Iterator
  128. def zipAll[B, A1 >: ColumnarBatch, B1 >: B](that: Iterator[B], thisElem: A1, thatElem: B1): Iterator[(A1, B1)]
    Definition Classes
    Iterator
  129. def zipWithIndex: Iterator[(ColumnarBatch, Int)]
    Definition Classes
    Iterator
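
The difference between the withResource and closeOnExcept helpers inherited from Arm can be shown with a minimal sketch of the single-resource overloads. This is an assumption about their behavior based only on the descriptions above ("closes the resource" versus "closing the resource only if an exception occurs"), not the library source. ArmSketch and Tracked are hypothetical names introduced for illustration.

```scala
object ArmSketch {
  // Runs the block and always closes the resource afterwards,
  // whether the block returns normally or throws.
  def withResource[T <: AutoCloseable, V](r: T)(block: T => V): V =
    try block(r) finally r.close()

  // Runs the block and closes the resource only if the block throws.
  // On success the resource stays open and the caller keeps ownership.
  def closeOnExcept[T <: AutoCloseable, V](r: T)(block: T => V): V =
    try block(r) catch {
      case t: Throwable =>
        // A failure during close is attached to the original exception.
        try r.close() catch { case s: Throwable => t.addSuppressed(s) }
        throw t
    }
}

// Hypothetical resource that records whether close() was called.
final class Tracked extends AutoCloseable {
  var closed = false
  override def close(): Unit = closed = true
}
```

Under these assumptions, withResource suits temporaries that are consumed inside the block, while closeOnExcept suits a resource that is built inside the block and handed back to the caller, so it must be released only when construction fails partway.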

Deprecated Value Members

  1. def /:[B](z: B)(op: (B, ColumnarBatch) ⇒ B): B
    Definition Classes
    TraversableOnce → GenTraversableOnce
    Annotations
    @deprecated
    Deprecated

    (Since version 2.12.10) Use foldLeft instead of /:

  2. def :\[B](z: B)(op: (ColumnarBatch, B) ⇒ B): B
    Definition Classes
    TraversableOnce → GenTraversableOnce
    Annotations
    @deprecated
    Deprecated

    (Since version 2.12.10) Use foldRight instead of :\

  3. def finalize(): Unit
    Attributes
    protected[lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( classOf[java.lang.Throwable] ) @Deprecated
    Deprecated
