class GpuSorter extends Arm with Serializable

A class that provides convenience methods for sorting batches of data. A Spark SortOrder typically references a single column using an AttributeReference. This is the simplest situation, so we only need to bind the attribute references to where they go. It is possible, however, for a SortOrder to require some computation, for example sorting strings by their length instead of in lexicographical order. Because cudf does not support this directly, we go through the SortOrder instances that are a part of this sorter and find the ones that require computation.

We then do the sort in a few stages. First we compute any needed columns from the SortOrder instances that require some computation and add them to the original batch. The method appendProjectedColumns does this. The class then provides a number of methods that operate on a batch that has these new columns added to it. These include sorting, merge sorting, and finding bounds, and they can be combined in various ways to implement different algorithms. When you are done with these operations, you can drop the temporary columns that were added just for computation by using removeProjectedColumns.

Sometimes you may want to pull data back to the CPU and sort rows there too. We provide cpuOrdering, which lets you do this on rows that have had the extra ordering columns added to them.

This class also provides fullySortBatch as an optimization. If all you want to do is sort a batch, you don't want to have to sort the temporary columns too, and this method provides that.
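
The staged flow described above can be modeled on the CPU with plain Scala collections. The sketch below is not the GpuSorter implementation and uses neither cudf nor Spark. `Batch`, `appendProjected`, `sortOrderOf`, `gather`, and `removeProjected` are hypothetical names chosen to mirror appendProjectedColumns, computeSortOrder, sort, and removeProjectedColumns, and the sketch assumes a single computed key: the length of the string in column 0.

```scala
// CPU-only model of the staged sort. Every name here is hypothetical and
// only illustrates the idea; none of it is part of the GpuSorter API.
object StagedSortSketch {
  // A toy columnar batch: each inner Vector is one column, all the same length.
  final case class Batch(columns: Vector[Vector[Any]]) {
    def numRows: Int = columns.headOption.map(_.length).getOrElse(0)
  }

  // Stage 1: materialize the computed sort key (length of the string in
  // column 0) as an extra trailing column.
  def appendProjected(batch: Batch): Batch = {
    val lengths: Vector[Any] = batch.columns(0).map(v => v.asInstanceOf[String].length)
    Batch(batch.columns :+ lengths)
  }

  // Stage 2a: compute a gather map, i.e. the row indices in sorted order,
  // using only the appended key column. sortBy is a stable sort.
  def sortOrderOf(projected: Batch): Vector[Int] = {
    val key = projected.columns.last.map(v => v.asInstanceOf[Int])
    Vector.range(0, projected.numRows).sortBy(i => key(i))
  }

  // Stage 2b: apply the gather map to every column, temporary ones included.
  def gather(batch: Batch, gatherMap: Vector[Int]): Batch =
    Batch(batch.columns.map(col => gatherMap.map(i => col(i))))

  // Stage 3: drop the temporary columns, keeping the original ones.
  def removeProjected(sorted: Batch, originalWidth: Int): Batch =
    Batch(sorted.columns.take(originalWidth))
}
```

With a first column of "pear", "fig", "banana", "kiwi", gathering the projected batch with the result of `sortOrderOf` and then calling `removeProjected` gives "fig", "pear", "kiwi", "banana" in that column. Rows are ordered by length, ties keep their input order in this sketch (no claim is made about tie ordering on the GPU), and the temporary length column is gone.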

Linear Supertypes
Serializable, Serializable, Arm, AnyRef, Any

Instance Constructors

  1. new GpuSorter(sortOrder: Seq[SortOrder], inputSchema: Seq[Attribute])

    A class that provides convenience methods for sorting batches of data

    sortOrder

    The unbound sorting order requested (Should be converted to the GPU)

    inputSchema

    The schema of the input data

  2. new GpuSorter(sortOrder: Seq[SortOrder], inputSchema: Array[Attribute])

    sortOrder

    The unbound sorting order requested (Should be converted to the GPU)

    inputSchema

    The schema of the input data
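
Both constructors take an unbound sort order together with the input schema. As a rough illustration of what binding means in the simple case, where each order references one column, the sketch below resolves each referenced attribute to its position in the schema. `Attr`, `UnboundOrder`, `BoundOrder`, and `bind` are hypothetical stand-ins, not Spark or GpuSorter types, and matching attributes by name is a simplifying assumption made only for this example.

```scala
// Hypothetical model of binding an unbound sort order to an input schema.
// None of these types exist in Spark or in the plugin.
object BindSketch {
  final case class Attr(name: String)                            // stand-in for a schema attribute
  final case class UnboundOrder(attr: Attr, ascending: Boolean)  // refers to a column by attribute
  final case class BoundOrder(ordinal: Int, ascending: Boolean)  // refers to a column by position

  // Replace each attribute reference with the index of that attribute in the
  // schema. Fails fast if the order mentions a column the schema lacks.
  def bind(orders: Seq[UnboundOrder], schema: Seq[Attr]): Seq[BoundOrder] =
    orders.map { o =>
      val ordinal = schema.indexOf(o.attr)
      require(ordinal >= 0, s"${o.attr.name} is not in the input schema")
      BoundOrder(ordinal, o.ascending)
    }
}
```

An order that needs computation (such as the string-length example in the class description) cannot be bound this way, which is why the class adds projected columns first.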

Value Members

  1. final def !=(arg0: Any): Boolean
    Definition Classes
    AnyRef → Any
  2. final def ##(): Int
    Definition Classes
    AnyRef → Any
  3. final def ==(arg0: Any): Boolean
    Definition Classes
    AnyRef → Any
  4. final def appendProjectedAndSort(inputBatch: ColumnarBatch, sortTime: GpuMetric): Table

    Append any columns needed for sorting the batch and sort it. Be careful: a batch that has rows but no columns will cause errors and must be special-cased.

    inputBatch

    the batch to sort

    sortTime

    metric for the sort time

    returns

    a sorted table.

  5. final def appendProjectedColumns(inputBatch: ColumnarBatch): ColumnarBatch

    Append any columns to the batch that need to be materialized for sorting to work.

    inputBatch

    the batch to add columns to

    returns

    the batch with columns added

  6. final def asInstanceOf[T0]: T0
    Definition Classes
    Any
  7. def clone(): AnyRef
    Attributes
    protected[lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( ... ) @native() @HotSpotIntrinsicCandidate()
  8. def closeOnExcept[T <: AutoCloseable, V](r: Option[T])(block: (Option[T]) ⇒ V): V

    Executes the provided code block, closing the resources only if an exception occurs

    Definition Classes
    Arm
  9. def closeOnExcept[T <: AutoCloseable, V](r: ArrayBuffer[T])(block: (ArrayBuffer[T]) ⇒ V): V

    Executes the provided code block, closing the resources only if an exception occurs

    Definition Classes
    Arm
  10. def closeOnExcept[T <: AutoCloseable, V](r: Array[T])(block: (Array[T]) ⇒ V): V

    Executes the provided code block, closing the resources only if an exception occurs

    Definition Classes
    Arm
  11. def closeOnExcept[T <: AutoCloseable, V](r: Seq[T])(block: (Seq[T]) ⇒ V): V

    Executes the provided code block, closing the resources only if an exception occurs

    Definition Classes
    Arm
  12. def closeOnExcept[T <: AutoCloseable, V](r: T)(block: (T) ⇒ V): V

    Executes the provided code block, closing the resource only if an exception occurs

    Definition Classes
    Arm
  13. final def computeSortOrder(inputBatch: ColumnarBatch, sortTime: GpuMetric): ColumnVector

    Get the sort order for a batch of data that is the output of appendProjectedColumns. Be careful: a batch that has rows but no columns will cause errors and must be special-cased.

    inputBatch

    the batch to sort

    sortTime

    metric for the sort time (really the sort order time here)

    returns

    a gather map column

  14. def cpuOrdering: Seq[SortOrder]

    A sort order that the CPU can use to sort data that is the output of appendProjectedColumns. You cannot use the regular sort order directly because it has been translated to the GPU when computation is needed.

  15. final def eq(arg0: AnyRef): Boolean
    Definition Classes
    AnyRef
  16. def equals(arg0: Any): Boolean
    Definition Classes
    AnyRef → Any
  17. def freeOnExcept[T <: RapidsBuffer, V](r: T)(block: (T) ⇒ V): V

    Executes the provided code block, freeing the RapidsBuffer only if an exception occurs

    Definition Classes
    Arm
  18. final def fullySortBatch(inputBatch: ColumnarBatch, sortTime: GpuMetric, peakDevMemory: GpuMetric): ColumnarBatch

    Sort a batch start to finish. Add any projected columns that are needed to sort, sort the data, and drop the added columns.

    inputBatch

    the batch to sort

    sortTime

    metric for the amount of time taken to sort.

    peakDevMemory

    metric for the peak memory usage

    returns

    the sorted batch

  19. final def getClass(): Class[_]
    Definition Classes
    AnyRef → Any
    Annotations
    @native() @HotSpotIntrinsicCandidate()
  20. def hashCode(): Int
    Definition Classes
    AnyRef → Any
    Annotations
    @native() @HotSpotIntrinsicCandidate()
  21. final def isInstanceOf[T0]: Boolean
    Definition Classes
    Any
  22. def lowerBound(findIn: Table, find: Table): ColumnVector

    Find the lower bounds on data that is the output of appendProjectedColumns. Be careful: a batch that has rows but no columns will cause errors and must be special-cased.

    findIn

    the data to look in for lower bounds

    find

    the data to look for and get the lower bound for

    returns

    the rows where the insertions would happen.

  23. def lowerBound(findIn: ColumnarBatch, find: ColumnarBatch): ColumnVector

    Find the lower bounds on data that is the output of appendProjectedColumns. Be careful: a batch that has rows but no columns will cause errors and must be special-cased.

    findIn

    the data to look in for lower bounds

    find

    the data to look for and get the lower bound for

    returns

    the rows where the insertions would happen.

  24. final def mergeSort(batches: Array[ColumnarBatch], sortTime: GpuMetric): ColumnarBatch

    Merge multiple batches together. All of these batches should be the output of appendProjectedColumns and the output of this will also be in that same format.

    batches

    the batches to sort

    sortTime

    metric for the time spent doing the merge sort

    returns

    the sorted data.

  25. final def ne(arg0: AnyRef): Boolean
    Definition Classes
    AnyRef
  26. final def notify(): Unit
    Definition Classes
    AnyRef
    Annotations
    @native() @HotSpotIntrinsicCandidate()
  27. final def notifyAll(): Unit
    Definition Classes
    AnyRef
    Annotations
    @native() @HotSpotIntrinsicCandidate()
  28. lazy val originalTypes: Array[DataType]

    The original input types without any temporary columns added to them needed for sorting.

  29. lazy val projectedBatchSchema: Seq[Attribute]

    Some SortOrder instances require adding temporary columns which is done as a part of the appendProjectedColumns method. This is the schema for the result of that method.

  30. lazy val projectedBatchTypes: Array[DataType]

    The types and order for the columns returned by appendProjectedColumns

  31. final def removeProjectedColumns(input: Table): ColumnarBatch

    Convert a sorted table into a ColumnarBatch and drop any columns added by appendProjectedColumns

    input

    the table to convert

    returns

    the ColumnarBatch

  32. final def sort(inputBatch: ColumnarBatch, sortTime: GpuMetric): Table

    Sort a batch of data that is the output of appendProjectedColumns. Be careful: a batch that has rows but no columns will cause errors and must be special-cased.

    inputBatch

    the batch to sort

    sortTime

    metric for the sort time

    returns

    a sorted table.

  33. val sortOrder: Seq[SortOrder]
  34. final def synchronized[T0](arg0: ⇒ T0): T0
    Definition Classes
    AnyRef
  35. def toString(): String
    Definition Classes
    AnyRef → Any
  36. def upperBound(findIn: ColumnarBatch, find: ColumnarBatch): ColumnVector

    Find the upper bounds on data that is the output of appendProjectedColumns. Be careful: a batch that has rows but no columns will cause errors and must be special-cased.

    findIn

    the data to look in for upper bounds

    find

    the data to look for and get the upper bound for

    returns

    the rows where the insertions would happen.

  37. final def wait(arg0: Long, arg1: Int): Unit
    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  38. final def wait(arg0: Long): Unit
    Definition Classes
    AnyRef
    Annotations
    @throws( ... ) @native()
  39. final def wait(): Unit
    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  40. def withResource[T <: AutoCloseable, V](h: CloseableHolder[T])(block: (CloseableHolder[T]) ⇒ V): V

    Executes the provided code block and then closes the resource

    Definition Classes
    Arm
  41. def withResource[T <: AutoCloseable, V](r: ArrayBuffer[T])(block: (ArrayBuffer[T]) ⇒ V): V

    Executes the provided code block and then closes the array buffer of resources

    Definition Classes
    Arm
  42. def withResource[T <: AutoCloseable, V](r: Array[T])(block: (Array[T]) ⇒ V): V

    Executes the provided code block and then closes the array of resources

    Definition Classes
    Arm
  43. def withResource[T <: AutoCloseable, V](r: Seq[T])(block: (Seq[T]) ⇒ V): V

    Executes the provided code block and then closes the sequence of resources

    Definition Classes
    Arm
  44. def withResource[T <: AutoCloseable, V](r: Option[T])(block: (Option[T]) ⇒ V): V

    Executes the provided code block and then closes the Option[resource]

    Definition Classes
    Arm
  45. def withResource[T <: AutoCloseable, V](r: T)(block: (T) ⇒ V): V

    Executes the provided code block and then closes the resource

    Definition Classes
    Arm
  46. def withResourceIfAllowed[T, V](r: T)(block: (T) ⇒ V): V

    Executes the provided code block and then closes the value if it is AutoCloseable

    Definition Classes
    Arm
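
The lowerBound and upperBound members listed above return "the rows where the insertions would happen". A minimal sketch of that idea follows, assuming the conventional binary-search meaning of the two terms on data sorted in ascending order by a single integer key. The `BoundsSketch` object and its signatures are hypothetical. The real methods take tables or batches that may have several sort keys and directions, and return a ColumnVector of positions.

```scala
// Hypothetical single-key model of lower and upper bounds on sorted data.
object BoundsSketch {
  // Index of the first element that is >= key: the earliest position where
  // key could be inserted while keeping `sorted` in ascending order.
  def lowerBound(sorted: IndexedSeq[Int], key: Int): Int = {
    var lo = 0
    var hi = sorted.length
    while (lo < hi) {
      val mid = lo + (hi - lo) / 2
      if (sorted(mid) < key) lo = mid + 1 else hi = mid
    }
    lo
  }

  // Index of the first element that is > key: the latest position where
  // key could be inserted while keeping `sorted` in ascending order.
  def upperBound(sorted: IndexedSeq[Int], key: Int): Int = {
    var lo = 0
    var hi = sorted.length
    while (lo < hi) {
      val mid = lo + (hi - lo) / 2
      if (sorted(mid) <= key) lo = mid + 1 else hi = mid
    }
    lo
  }
}
```

For the sorted keys 1, 3, 3, 3, 7 and the key 3, the lower bound is index 1 and the upper bound is index 4, so the two results bracket the run of equal keys. For a key that is absent, such as 5, both return the same insertion index.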

Deprecated Value Members

  1. def finalize(): Unit
    Attributes
    protected[lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( classOf[java.lang.Throwable] ) @Deprecated
    Deprecated
